Extension talk:Graph/Plans
Update April 2024 (discussion)
Hello everyone -- I posted an update with a proposal for moving forward with graphs here on the project page. I posted on the project page (instead of the talk page) so that it can be marked for translation to other languages. This is a thread to discuss, so please join in the discussion!
Also pinging some of the people who have been active in the discussions on graphs so far (but everyone else is also encouraged to speak up!) @Sj @TheDJ @Levivich @Bawolff @Theklan @Aaron Liu @Snævar @Nux @HaeB @Iniquity @Strainu @John Broughton MMiller (WMF) (talk) 23:04, 10 April 2024 (UTC)
- I very much like the proposed approach - the initial efforts will generate at least 80% of the value (the less complex graphs that the vast majority of editors might create, or have created) with the proverbial 20% of the effort that would be required to offer a solution that covers a much wider range of functionality. I definitely agree that "the larger topic of interactive content is worthy of separate, continued conversations moving forward". John Broughton (talk) 01:34, 11 April 2024 (UTC)
- Thanks for the update. The proposed approach of server-rendering the content and serving it as a static image sounds similar to how SVGs, LilyPond and LaTeX are handled today. I am wondering whether this essentially boils down to implementing phab:T334372? I'm no expert with SVG but I believe that's the format usually used for static graphs. Adding the ability to server-render SVG markup provided as an inline parser tag (or as a Commons data page) and then providing Lua libraries to easily author modules for turning graph data to SVG would make for a very extensible solution. SD0001 (talk) 04:50, 11 April 2024 (UTC)
- @SD0001 That’s an interesting idea, thanks for sharing that task, and yes I believe we would be rendering the static graphs as SVGs. My understanding is that there are still security concerns regarding inlining SVGs, but I see some workarounds are proposed in the comments. With this new graph extension, our goal is to have an editor-friendly graph definition interface that can be easily shared across projects. Another advantage would be that the code would be testable and reviewable. Essentially, if you wanted to add a new visualization type, you could do that by contributing to the extension rather than writing a new Lua module. Does that approach also seem extensible enough? CCiufo-WMF (talk) 23:10, 17 April 2024 (UTC)
- It should be noted that there is nothing preventing testing/reviewing in lua. It is somewhat less likely due to technical and political factors, but plenty of on wiki modules have test cases and get changes reviewed. Similarly, the original graph extension was developed by WMF, but i wouldn't say the test/review process was all that it should have been. I agree that sharing of lua between wikis is a major issue with on-wiki templates. Bawolff (talk) 19:06, 18 April 2024 (UTC)
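To make the idea in this sub-thread concrete (turning plain graph data into SVG markup that a server could render as a static image), here is a minimal sketch. It is written in Python rather than Lua purely for illustration; the function name `render_bar_chart_svg`, its parameters, and the layout constants are assumptions and are not part of any existing extension or module.

```python
from xml.sax.saxutils import escape


def render_bar_chart_svg(series, width=320, height=160, padding=20):
    """Return a standalone SVG document (as a string) for a simple bar chart.

    `series` is a list of (label, value) pairs with non-negative values.
    All names and sizes here are hypothetical, chosen for illustration only.
    """
    if not series:
        raise ValueError("series must contain at least one data point")
    # Scale bars against the largest value; fall back to 1 if all values are 0.
    max_value = max(value for _, value in series) or 1
    plot_height = height - 2 * padding
    slot = (width - 2 * padding) / len(series)  # horizontal space per bar
    parts = [
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" '
        f'height="{height}" viewBox="0 0 {width} {height}" role="img">'
    ]
    for index, (label, value) in enumerate(series):
        bar_height = plot_height * value / max_value
        x = padding + index * slot + slot * 0.1
        y = height - padding - bar_height
        # The <title> child is the SVG way to attach a textual description
        # to a shape; labels are escaped so data cannot inject markup.
        parts.append(
            f'<rect x="{x:.1f}" y="{y:.1f}" width="{slot * 0.8:.1f}" '
            f'height="{bar_height:.1f}" fill="#36c">'
            f"<title>{escape(label)}: {value}</title></rect>"
        )
        parts.append(
            f'<text x="{x + slot * 0.4:.1f}" y="{height - 5}" font-size="10" '
            f'text-anchor="middle">{escape(label)}</text>'
        )
    parts.append("</svg>")
    return "".join(parts)


svg = render_bar_chart_svg([("2021", 120), ("2022", 180), ("2023", 150)])
```

The point of the sketch is only that the data stays separate from the drawing code, which is the property both the Lua-module approach and the extension approach are after.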
- Oppose. It solves the issue neither in time nor in purpose. I'm not going to say that this is disappointing, because I didn't expect too much. But it is definitely a bad move (not even a solution). Theklan (talk) 06:10, 11 April 2024 (UTC)
- Thanks for the update. I'm going to trust the process that got you to this conclusion and focus on the practicalities:
- what I'll miss if not implemented: several data series on a single graph. Beyond that, the basic graphs you describe are OK for known usages on my wiki.
- what I would like to know: a timeline for re-activating the feature.
- Strainu (talk) 09:34, 11 April 2024 (UTC)
- If I read it correctly, the proposal dismisses the possibility of having graphs from Wikidata data. In Basque Wikipedia we have around 35K articles loading interactive graphs from population data that comes from Wikidata. Theklan (talk) 12:24, 11 April 2024 (UTC)
- For population data, Lua + inline data input for graphs (which is promised) should be okay. What’s lost is the ability to use SPARQL, but that’s not needed for population data: hu:Modul:Népességdiagram reads data from Wikidata and displays a graph from it (previously using Extension:Graph, now using CSS trickery) without a single line of SPARQL, only using Wikidata’s Lua interface. —Tacsipacsi (talk) 01:28, 12 April 2024 (UTC)
- I appreciate the update. Static images are the worst solution, as you cannot hover to see hidden labels (it's not possible to have a compact image with all labels in large charts).
- Not sure where you got this piece of information:
- and we tried wrapping the Vega canvas in a sandboxed iframe (which caused significant performance issues)
- This is not true. Caching is possible (even in Chrome). It's not trivial, but also not that hard. If you would dismiss iframe for accessibility reasons I would understand, but also static images are not accessible at all (not that you can really make a generic solution to make graphs accessible)...
- Please talk to devs, I think there is some misunderstanding around actual iframe problems. Nux (talk) 11:02, 11 April 2024 (UTC)
- I want to clarify that static images are the worst long-term solution. Wikipedia should, long term, be more interactive, not less. Static images would be OK as a temporary workaround. Nux (talk) 11:08, 11 April 2024 (UTC)
- With iframes we have been there, done that, as per phab:T169027. Read it and move on. Snævar (talk) 12:08, 11 April 2024 (UTC)
- @Snævar I know that task, but this is not the correct one. Also a lot has changed since 2017. In 2017 loading jQuery from common CDN still made sense and that was before Spectre...
- But even though caching changed a lot - as I said - loading in an iframe works fine.
- I tested this myself, and as long as you set up the iframe correctly it is both fast and secure. Nux (talk) 08:50, 12 April 2024 (UTC)
- I think it is a bit ambitious. It is really only known whether this will work, when an appropriate graphic software is found. Personally I would put a caveat on all of this, pending on said software, just to avoid setting expectations that may or may not happen. I like that the proposal admits this will take months, it is the truth anyway.
- I have heard some people being excited for having more functionality than has been used for Graph, but in the end, this is fine for WMF wikis. Snævar (talk) 12:07, 11 April 2024 (UTC)
- I suggest visiting the Our World In Data climate change page to see why this approach is not good now, and would have been out of date even five years ago. The future is going in another direction; spending another year making a patch that will add little value is a waste of time, money and effort. Theklan (talk) 12:31, 11 April 2024 (UTC)
- Since many past graphs won't survive the transition, and we are now committing to definitely, positively, not resurrecting existing Graph tooling: please render static snapshots of all of the past/current graphs.
- If starting from scratch: implementing the full OWID library seems a good idea. They have an active community of development, practice, and knowledge suitable for our projects; they have a best-in-class approach to making data visible alongside a visual; we have an active community of crossover use trying keenly to incorporate that knowledge.
- As interim ways to explore future interactivity: enable task T303853 to see what happens; other wikis could have a server-rendered implementation of OWID, with a link that takes you to a Toolforge exploration of that same graph to get a custom interactive view of the data, and a way from that tool to render a new server-side image + template text to embed that. Sj (talk) 14:03, 11 April 2024 (UTC)
- We have deployed the OWID gadget at Basque Wikipedia, and the result is impressive. We have added in one day interactive graphs to 24 articles (eu:Kategoria:Our World in Data grafikoak dituzten artikuluak). This is a really powerful software piece, and having the ability to reuse it with other data sources would be a huge step forward. Now, we need to figure out how to translate the software itself (and also the data pieces). Theklan (talk) 19:26, 17 April 2024 (UTC)
- Thanks for the suggestions @Sj, I’ve tried to address some of these points in my general comment. Regarding what to do about existing graphs in the meantime, one of the options we’ve considered is to render static snapshots. One unknown here is what to do if the graph definitions are changed, since we wouldn’t be able to update the rendered image. I’m also not sure we’d be able to render all the existing graphs as static images. Assuming we were able to though, would you expect the images to be inserted alongside the existing graph definitions instead of replacing them? What do you think should happen with the existing error messages communicating that graphs are unavailable? CCiufo-WMF (talk) 23:17, 17 April 2024 (UTC)
- Hello CCiufo, I expect the static render would be from the last point in time that had a graph definition, as of the render. Not updated after that. I would expect the static images to replace the graph definitions, with the full image description (on commons) and possibly the caption (where appropriate) linking to the replacing diff. That would highlight in diff view the graph definition that was used. The existing error messages are an eyesore and should go away once the graph definition is replaced by an image. Sj (talk) 03:36, 18 April 2024 (UTC)
- Thanks @Sj, I appreciate the clarification and suggestions. I’ll include a section about this option when I update Extension:Graph/Plans, after we’ve had a chance to evaluate the technical implications in more detail. I’m assuming different wikis may want to handle this in different ways and even within a given wiki, I think getting consensus about doing this type of mass-update will be tricky and might distract from focusing on the development of the replacement extension. Do you get the sense that there already is consensus to do this on some projects? CCiufo-WMF (talk) 20:54, 25 April 2024 (UTC)
- Hi CCiufo, I don't know that anyone would object to replacing an error message with a static image. But a small % of renders might be wrong and need reversion; and some communities would certainly update the replacement to apply a better style guide. In general, high-volume updates should be done by a script monitored by a person, which could be a local implementer. So e.g. you could render all of the static images, put them in a category on Commons with the metadata about which graph revision from which article they were a render of, and host a single-purpose tool for editors to use to replace broken graphs with those images. That's a self-contained task [outside of using the tool], doesn't require waiting for local decisions about how to do the replacement, and, crucially, preserves knowledge that is about to be lost once Graph goes away and this renderer stops working.
- Optimally you would look through page histories to find past Graphs that were removed without replacement once the extension wasn't promptly fixed. If this doesn't happen, then every month of delay leads to more lost knowledge: past Graphs that are now no longer memorialized in this update process. Sj (talk) 18:11, 21 May 2024 (UTC)
- [The following is a bit rambly and incoherent. Sorry] The most recent update is a bit vague to be honest. It would perhaps be helpful to include some anti-goals - what do we not want to do, as well as maybe some persona-style use case discussion. Then again perhaps this is not the right venue for that. Ultimately though I feel like I'm pretty unclear on what the solution will look like just based on this. It's also still a bit unclear to me what subset of the "graph" problem we are trying to solve. Maybe giving some concrete examples of use cases would help. Some questions to think about:
- This will be static images only:
- I assume we aren't going to just resurrect graphoid and call it a day. While I personally agree with that decision, i think the rationale for not doing that should be fleshed out.
- It feels like we are talking about interactivity as a binary - it's either just an image, or we do full-blown Turing-complete scriptability. I think it would be useful here to distinguish levels of interactivity and the various use cases they have:
- Just an <img> tag, like an uploaded file.
- Links that can be clicked on (e.g. Easytimeline)
- Links, tooltips and hover effects (Essentially what you can do with wikitext using CSS pseudo-selectors :hover, :active, etc. HTML title attribute)
- declarative animations. Like CSS animations (@keyframes and friends) or SMIL used in SVG
- Full blown scripting.
- It seems like we are going with just an <img> tag. I'm not sure that is sufficient. It seems like one of the main selling point of old graphs was the ability to do hover effects - mouse over the bar in the bar graph and it shows more data about that data point, and that sort of thing. I've also seen drill-down effects be very effectively used on graphs elsewhere on the internet (e.g. flame graphs)
- What actually is the value proposition of graphs (in general)? The way I see it, it's a combination of the following things:
- Easy to edit histories integrated like normal page edits so users can easily track changes
- Automatic Ingestion of dynamic data sources (e.g. page views. WDQS). [This however is problematic in its own way]
- semi-interactive displays that look nicer than static uploaded images [Even if not fully interactive, things like tooltips and highlighting the line you are hovering on]
- Separation of presentation & formatting concerns (You can put the data in the data namespace on commons, and have the code separate. Unlike an uploaded svg file, where editing it might be complicated because everything is mixed together).
- I can't help but wonder - perhaps there is no need to do anything special here. Why not just whitelist in wikitext a subset of useful safe SVG tags, allow template styles to do CSS animations [already allowed], allow lua to format data sources, and call it a day? (Like what SD0001 is suggesting) We are basically already there, and it seems like this would already go beyond what is being proposed here. The only additional things to do would be to make JsonConfig data namespace suck less, and optionally allow lua to fetch dynamic data sources (if we decided we actually wanted that).
- Bawolff (talk) 20:47, 11 April 2024 (UTC)
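As a rough illustration of what "whitelist a subset of useful safe SVG tags" could mean in practice, here is a minimal allowlist filter in Python. The tag and attribute lists and the function name `sanitize_svg` are assumptions made up for this sketch; it is not a vetted sanitizer, it does not validate attribute values (for example `url(...)` references inside `fill`), and it is not how MediaWiki's own sanitizer works.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
# Hypothetical allowlists: anything not named here is dropped.
ALLOWED_TAGS = {"svg", "g", "rect", "circle", "line", "path", "text", "title"}
ALLOWED_ATTRS = {
    "x", "y", "width", "height", "cx", "cy", "r", "x1", "y1", "x2", "y2",
    "d", "fill", "stroke", "viewBox", "transform", "font-size", "text-anchor",
}
ALLOWED_QNAMES = {f"{{{SVG_NS}}}{name}" for name in ALLOWED_TAGS}

# Serialize SVG elements without an "ns0:" prefix.
ET.register_namespace("", SVG_NS)


def _clean(element):
    """Strip disallowed attributes and child elements, recursively."""
    for name in list(element.attrib):
        # Namespaced attributes (e.g. xlink:href) appear as "{ns}href"
        # and therefore never match the plain names in the allowlist.
        if name not in ALLOWED_ATTRS:
            del element.attrib[name]
    for child in list(element):
        if child.tag in ALLOWED_QNAMES:
            _clean(child)
        else:
            # Removes the whole subtree (e.g. <script>, <foreignObject>).
            element.remove(child)


def sanitize_svg(markup):
    """Return `markup` with everything outside the allowlists removed."""
    root = ET.fromstring(markup)
    if root.tag != f"{{{SVG_NS}}}svg":
        raise ValueError("root element must be <svg> in the SVG namespace")
    _clean(root)
    return ET.tostring(root, encoding="unicode")


dirty = (
    '<svg xmlns="http://www.w3.org/2000/svg" width="10" height="10">'
    "<script>alert(1)</script>"
    '<rect x="0" y="0" width="5" height="5" onclick="alert(1)" fill="red"/>'
    "</svg>"
)
clean = sanitize_svg(dirty)
```

An allowlist (rather than a blocklist) is used so that unknown or newly introduced SVG features are rejected by default.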
- To allow more interactivity with CSS we should solve this: task T360725 -Theklan (talk) 06:05, 12 April 2024 (UTC)
- While that would be nice, i don't really think that's critical for graph-like usecases you would expect to find on wikipedia. Bawolff (talk) 19:13, 12 April 2024 (UTC)
- You are right, it's not critical, but having an up-to-date CSS would be interesting if we want to achieve some kind of modern view, which this proposal dismiss. Theklan (talk) 19:22, 12 April 2024 (UTC)
- @Bawolff I like the way you’ve broken down the different problems the legacy extension + templates were trying to solve, and for highlighting the nuance of interactivity. I think we can borrow a lot of that to better communicate what we’re trying to achieve with this new extension. I’ve attempted to clarify things in my general comment, but when I update Extension:Graph/Plans, would you mind if I pinged you to see if the goals, scope, and audience is clear?
- Regarding SVGs + Lua, please see my reply to SD0001. In short: there are advantages to going the extension route, but I’m happy to continue discussing this. FWIW, I don’t think the proposal to create a new extension excludes the possibility of phab:T334372 happening in the future. Maybe a combination of both will cover a wider net of use cases over time. CCiufo-WMF (talk) 23:15, 17 April 2024 (UTC)
- Yes, feel free to ping me whenever you want. I agree that the two approaches are orthogonal and we can potentially pursue both. I personally think things where we put as much as possible into the hands of the user work best. There are always a lot of barriers once stuff is being done in Gerrit. Sometimes that is necessary, but I think the lesson of Lua is that giving users the low-level tools directly and ways to abstract on top of them is very powerful. We just get so much more creativity when users are allowed to experiment freely. Bawolff (talk) 18:52, 18 April 2024 (UTC)
- I second John Broughton's overall optimism about the proposal. I've been missing the graphs over at enwiki this past year, and the proposed solution would replace most of what I found most editors (even if clearly not all) to use the Graph module for. Tserton (talk) 10:34, 13 April 2024 (UTC)
- I really strongly disagree with this idea. As far as I'm aware, the update to using a newer version of Vega was almost complete. Now that's all going to be thrown away for who knows how many years again. Meanwhile all the graphs on Wikipedia that relied on it still remain broken. This is ridiculous. Please fix this so that the data can be seen. THEN work on a replacement. 73.162.189.54 21:09, 24 May 2024 (UTC)
- Agreed - it's been way too long. Merko (talk) 12:18, 21 October 2024 (UTC)
- I think a lot of us have "graph extension fatigue" so to speak. It has been so long and several seemingly-false starts ... I know I'm not excited about contributing at this point. I follow the topic because I had made a fancy graph on one of the pages that has now been hidden for *years*. Hopefully there will be progress again, but I don't have the willpower to put in more effort. (I have been largely low-key following and trying to not get in the way.)
- If I have a question at all it would be "why is this time different?" Jason Olshefsky (talk) 20:24, 20 June 2024 (UTC)
Hey everyone, thanks for taking the time to read the proposal and continuing to provide your thoughts here. As Marshall mentioned, I’ll be leading this effort and I’m eager to work with you all to help shape the future of graphs in Wikimedia projects. I’m glad to hear some of you believe we’re on the right track with this.
As a first step, I’ve distilled some of the common questions I’m seeing in the conversation so far and have provided answers to them below. For some of the more specific questions, I’ll be replying directly.
Why only images? What about interactive graphs?
The primary motivation for sticking with static visualizations is to get to a working graphs solution as soon as possible. We want to be realistic about what we can deliver in the next fiscal year and we are confident that serving images will be performant and secure.
I know that not all existing use cases will be covered by this solution at first. @Bawolff has done a great job explaining that “interactivity” is a spectrum, and for now we are targeting the simple side. That’s why we plan on designing a system that leaves the door open for adding what we’ve been calling “light interactivity”. This could include things such as hovering over charts and maps.
Why not enable use of inline SVGs or some other visualization library directly?
We intentionally want to provide our own graph definition interface (i.e. how you would actually specify a graph in wikitext), for a few reasons:
- We learned from our security team when trying to re-enable the legacy extension that the ability to directly access the underlying visualization library can be fundamentally insecure, so we won’t be repeating this pattern going forward.
- Creating our own graph definition interface at the extension level will provide stability and freedom to upgrade or even change the underlying library in the future without having to worry about breaking existing graphs. For instance, if we were to start out using Library A, but then years later discover a major problem with Library A, we would want to be able to switch over to Library B without having to rebuild the whole extension. This was a challenge when we tried to upgrade from Vega 2 to Vega 5. The syntax had changed and therefore required a migration of the graph definitions too.
- Changes to graph definitions and supported graph types then become observable and testable through standard software development processes in a way not possible currently with templates and modules. Essentially, if you wanted to add a new visualization type, you could do that by contributing code to the extension where knowledgeable volunteer developers or WMF staff could review it. This invites those with the necessary technical expertise to extend functionality in a safe and scalable way.
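The second reason above (being able to swap the underlying library without touching existing graph definitions) can be sketched as an adapter layer. This is only an illustration under stated assumptions: `ChartSpec`, `Renderer`, `LibraryARenderer`, `LibraryBRenderer`, and the JSON shapes they emit are all hypothetical and do not describe the actual extension or any real library's configuration format.

```python
import json
from dataclasses import dataclass
from typing import Protocol

# Hypothetical set of chart types the stable interface exposes to editors.
SUPPORTED_TYPES = {"line", "bar", "pie"}


@dataclass(frozen=True)
class ChartSpec:
    """Library-independent graph definition, as an editor would write it."""
    chart_type: str
    title: str
    points: tuple  # tuple of (label, value) pairs


class Renderer(Protocol):
    """Anything that can turn a ChartSpec into a backend-specific config."""
    def render(self, spec: ChartSpec) -> str: ...


class LibraryARenderer:
    """Translates a ChartSpec into the (made-up) config of 'Library A'."""
    def render(self, spec: ChartSpec) -> str:
        return json.dumps({
            "mark": spec.chart_type,
            "title": spec.title,
            "values": [{"x": label, "y": value} for label, value in spec.points],
        })


class LibraryBRenderer:
    """Translates the same ChartSpec into the (made-up) config of 'Library B'."""
    def render(self, spec: ChartSpec) -> str:
        return json.dumps({
            "type": spec.chart_type,
            "labels": [label for label, _ in spec.points],
            "data": [value for _, value in spec.points],
            "options": {"title": spec.title},
        })


def render_chart(spec: ChartSpec, renderer: Renderer) -> str:
    """Validate against the stable interface, then delegate to the backend."""
    if spec.chart_type not in SUPPORTED_TYPES:
        raise ValueError(f"unsupported chart type: {spec.chart_type}")
    return renderer.render(spec)


spec = ChartSpec("bar", "Population", (("2020", 100.0), ("2021", 110.0)))
config_a = render_chart(spec, LibraryARenderer())
config_b = render_chart(spec, LibraryBRenderer())
```

In this sketch the stored graph definition (`spec`) never mentions a library, so moving from Library A to Library B means writing one new renderer instead of migrating every graph, which is the difficulty described for the Vega 2 to Vega 5 upgrade.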
What will this actually look like? Which visualization types are we thinking about?
We’re thinking that it would be most important to address the use cases previously covered by Graph:Chart and related templates like Graph:Lines, but without the ability to make MediaWiki API calls or SPARQL queries to Wikidata Query Service for now. The reason I haven’t specifically called out which types of visualizations we want to pursue first is because this is a key area we’re looking to get your input on. Without looking at every single graph definition, it’s hard to know for sure what would be most useful. We’ve been looking at template usage across all projects to help identify important use cases, but are there other factors you think we should be considering? Are these the most important graphs for readers? For editors?
For some use cases we don’t think we’re likely to support, like rendering pageview data, could it make sense to use existing tools like https://pageviews.wmcloud.org/pageviews/ instead? Let me know what you think!
What is the plan?
The rough plan is as follows:
1. Before the new fiscal year starts in July 2024, we hope to select a visualization library, finalize staffing, and work with you all to identify the initial graph types we want to support and define what the interface could look like. This will all be included in a refreshed version of Extension:Graph/Plans.
2. When the team starts work in July, set up the infrastructure needed to render a single graph type. In the interest of transparency, it’s hard to estimate how long this work will take before we have a clearer idea of what the system architecture will be, but we will keep the project page updated with our best estimates.
3. Pick a graph type and prototype it with community members to finalize the graph definition interface.
4. Once the graph definition interface and design are settled, test it and make it available in production.
5. Repeat steps 3-4 for subsequent graph types identified in step 1.
At some point we’ll also decommission the legacy graph extension, but I’m not sure when it would make sense to do that yet. What’s also missing in this plan is at what point existing graph definitions are migrated to use the new extension. We really can’t do this part without your help, even if we find ways to automate some of it. I’m thinking it makes sense to do it iteratively as each new visualization type becomes available (step 4), instead of a mass migration effort at the end. Let me know if you have suggestions about this.
I’ll share more information about where we are in the process and communicate any decisions we make as early as I can. Like I mentioned in the rough plan above, we’ll be seeking your input specifically on which visualization types are most important to start with and what the graph definition interface should look like. For now, I’d like to continue the conversation here. Thanks again for taking the time to help us think through this! CCiufo-WMF (talk) 23:06, 17 April 2024 (UTC)
- So, as I understand, your answer is that you don't care about what we discussed above, and you will be doing what you thought first. Why are you and the team asking for feedback if the purpose of the feedback is to dismiss it? Don't waste our time, if there's no point on doing that. Theklan (talk) 09:06, 18 April 2024 (UTC)
- Disagree. Talking to the volunteers is always a good thing, even if WMF chooses to go in a different direction. –Novem Linguae (talk) 09:30, 18 April 2024 (UTC)
- I appreciate providing a concrete plan for how the work will be done going forward. I do hope that as part of this, some time will be spent gathering user stories for more ordinary Wikipedia editors. I'm a bit worried that people participating in this discussion might not be representative of Wikipedians writing articles (This is always true, but i feel like it might be even more true in the Graph discussions than most discussions where a lot of people have an image of an ideal future in their mind of interactive content that might be disconnected from the needs of the moment. Myself especially included in that). p.s. As a small nitpick, I am unsure what you mean by "decommissioning" the legacy graph extension. It has already been decommissioned for about a year now. Bawolff (talk) 19:01, 18 April 2024 (UTC)
- You raise an important point about who we design for. This is always a challenge when we build things at WMF and I’m going to be explicit about the audiences we’re focusing on for this new extension as part of outlining the project scope in more detail on Extension:Graph/Plans. As you’ve pointed out already, keeping focused allows us to avoid building for such a wide range of users and use cases that we end up not solving for any of them at all. It would be great to chat with more editors who’ve used the legacy extension (either directly or through templates) but aren’t active in these discussions. If you have ideas about the best way to identify and engage such editors, I’m all ears!
- Re: “decommissioning”, I meant formally deprecating it and pointing people to the new extension or other alternatives. I think there’s a lot of documentation / phab task cleanup needed here to communicate that the legacy extension is not coming back. CCiufo-WMF (talk) 20:56, 25 April 2024 (UTC)
- Hmm... Well, if there were a server-side rendered SVG with labels shown on hover (and tap), that might actually provide a decent user experience. I think stacked graphs might be something worth doing first. They were the hardest to replace (IIRC I degraded most of them to line charts, an unfortunate loss for readers). Stacked graphs are also probably a good starting point to solve some problems with how to define series and how to display a compound data point on hover. Nux (talk) 21:20, 18 April 2024 (UTC)
- ^timeline charts, not line charts. Nux (talk) 16:57, 19 April 2024 (UTC)
- I can’t speak to exactly what SVG capabilities will be available yet, though that would be ideal. I think it’ll depend on the underlying visualization library we use to render the graphs. Regarding “stacked graphs”, do you mean a stacked bar chart, like this? CCiufo-WMF (talk) 20:57, 25 April 2024 (UTC)
- Yes, or stacked line charts, which should mostly have the same problems of displaying labels. Some utils draw a vertical line to better show which points are highlighted and then show labels for all of them. Nux (talk) 22:39, 25 April 2024 (UTC)
- On the subject of page views. I do not think the toolforge tool is a suitable replacement. That said, i don't think its that important a usecase. However since we already have stuff to get the data on wiki, i think we should just expose it to lua. I filed phab:T362937 for that. At some point, we may want to replace the ?action=info code in extension:PageViewInfo to use the new system whenever it exists. Bawolff (talk) 21:59, 18 April 2024 (UTC)
- I agree that porting Extension:PageViewInfo to the new system would make sense instead of introducing a graph type specific to pageview info, for the reasons you mention. CCiufo-WMF (talk) 20:58, 25 April 2024 (UTC)
- Could there be a stopgap measure whilst the new extension is developed? Such as:
- Displaying the information in a table (simple, only use for low data simple graphs such as bar, line or pie, have a cap on number of "rows", maybe 20-30, this could also be an expando)
- Implement a solution to have a user click on a link which takes them to an external/internal service which will graph the information on a separate page. There are open source services for this which wouldn't take *too* long to first audit security and code wise, then set up either internally or externally hosted. A simple example showing it's technically possible: Quickchart.io, which just takes the information in the URL and gives you a chart.
- I'm sure this has probably already been discussed, but could you statically render the existing graphs offline (to mitigate security issues) then show them on the pages with a note "This graph has not changed since 19 April 2023 and is unable to be changed due to security issues, so information might be outdated". And just bar any changes to existing graphs. I'm not sure how technically involved this would be and if it would be worth it versus how long the new extension will take. For example, if its going to take 1 year and this stopgap would take 1 month, it'd be worth it, but if this stopgap would take 6 months and the dev time would be 1 year, it wouldn't in my opinion.
- Building a whole new extension is quite obviously going to take some time, not starting until July, and as of today it has been 1 year since Graph was disabled. Information is not going to be available for potentially years after it last was. I feel as if there should be some way to at least see the underlying data whilst the new extension is developed, even if not visually. MarkiPoli (talk) 07:27, 19 April 2024 (UTC)
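- A rough sketch of the "information in the URL" idea from the list above, for illustration only. It assumes QuickChart's public `/chart` endpoint with a Chart.js configuration passed in the `c` query parameter; the function name `chart_link` and the sample data are made up, and nothing here is an audited or endorsed setup.

```python
import json
from urllib.parse import urlencode, urlparse, parse_qs

# Assumption: the service accepts a Chart.js config in the "c" query parameter.
QUICKCHART_ENDPOINT = "https://quickchart.io/chart"


def chart_link(labels, values, title, chart_type="bar"):
    """Return a URL that carries a small chart definition in its query string."""
    config = {
        "type": chart_type,
        "data": {
            "labels": labels,
            "datasets": [{"label": title, "data": values}],
        },
    }
    # Compact JSON keeps the link short; urlencode percent-escapes it safely.
    query = urlencode({"c": json.dumps(config, separators=(",", ":"))})
    return f"{QUICKCHART_ENDPOINT}?{query}"


# Hypothetical data, only to show the shape of the resulting link.
url = chart_link(["2021", "2022", "2023"], [120, 150, 90], "Example series")
print(url)
```

- The code only builds the link and makes no network request, so the article would carry the data while rendering happens on whichever service is eventually hosted.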
- The community could probably do a table of the data. Like I covered in #Stats, around half of the pages that use graphs on en.wp (your main wiki) are using Module:Graph, so it should go there. It is something you should discuss on en.wp.
- Making a tool that displays old graphs on wmflabs.org is technically possible, but we are not going to show old graphs on Wikipedia. There is too much information on the security risks the old graph system (Vega2) has, so that is just out of the question. A user could create this tool on wmflabs.org.
- Taking time from devs to do a stopgap is going to delay the main extension. Looking at other WMF projects, by at least 3 months, possibly 6 months. I am primarily looking at small and medium sized Wikipedias, some of which have limited technical knowledge. They are not going to put up a stopgap, but just wait for the main one, and waiting longer does not make sense to them. Snævar (talk) 08:52, 19 April 2024 (UTC)
- We’re supportive of finding intermediate/stopgap solutions where there’s a clear need and buy-in to actually implement them. I provided some more context in my reply to Sj, but yes delaying work on the new graph extension is one of the concerns. CCiufo-WMF (talk) 21:00, 25 April 2024 (UTC)
- I would say the order of graph types is Lines, stacked bars, pie. My section at #Stats explains why. Probably normal bars need to go in-between lines and stacked bars, since stacked bars depend on bars. Any precise timeline of how long it takes the community to convert graphs is dependent on the graph definition. As a refresher, I did say moving from Vega2 to Vega5 would take 3 months, and this move is probably going to take at least that, likely more. Snævar (talk) 08:58, 19 April 2024 (UTC)
- Those stats might be a bit misleading. It's been a year and communities probably decided to move on. I know plwiki did.
- I created the piechart module which might even be better than Vega in some cases and I think all piecharts on plwiki are now replaced. Ported this to enwiki too.
- Timeline can act as a replacement for bar-charts. Those are static images so they get crowded quite fast... But it is possible and I've done that for population charts, I'm guessing enwiki did too.
- Line-charts can be replaced with bar-charts if you can remove some data points (e.g. show 5 year periods instead of showing progress every year).
- Stacking: you could do that with timeline, but the calculations are messy; maybe some communities found ways to craft that. I think I just separated most of the stacked charts on plwiki.
- I also remember removing some charts because there were just too many data points and the charts wouldn't be readable as a static image... So yeah, I think those stats are probably off by quite a lot. They can be helpful, but don't read them too literally. Nux (talk) 16:16, 19 April 2024 (UTC)
- (sarcasm)Right(sarcasm ends). You have not seen phab:T137291, where EasyTimeline would be removed. It is up to each community what they do; just because one does one thing does not mean that all of them need to follow suit. EasyTimeline handles big numbers poorly; they need to be scaled for it to work at all. Both D3 and OWID are more feature rich than EasyTimeline, or I should say Ploticus, the software behind it. Ploticus is not being developed and EasyTimeline gets minimal attention, just so it is known. I am not looking for suggestions for solutions at all. You are welcome to update those stats, I will not do it. Snævar (talk) 23:29, 19 April 2024 (UTC)
- Yes, timeline was worse, that I actually agree with 😉. On plwiki we did migrate to Graph... And then migrated back. I'm just saying most uses of timeline were probably Graph/Vega uses a year ago. Hence those stats need to be taken with a grain of salt. Nux (talk) 11:39, 20 April 2024 (UTC)
- Thanks for the suggestions on which graph types to start with @Snævar, that’s really helpful! Regarding converting legacy graphs, can you elaborate more on what your concerns are about why it would take longer? Do you mean that if the new definition is quite different from Vega’s, it’ll just take longer because each graph will need more attention to rework? CCiufo-WMF (talk) 21:01, 25 April 2024 (UTC)
- If the new definition is functionally different from Vega's, then yes, that takes longer. If the new definition has the same functions, but they are named differently, then that is just a simple replace job, easily done en masse by bots. Snævar (talk) 12:53, 2 May 2024 (UTC)
- There is a problem with Vega's definitions though, and that is they are too complex. EasyTimeline is simpler and en:Module:Graph even simpler still. It is probably just worth "biting the bullet" and making the definition user friendly at the cost of it taking longer to convert. There would be benefits in the future for doing so. Snævar (talk) 13:10, 2 May 2024 (UTC)
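- To illustrate the "same functions, different names" case mentioned above, here is a minimal sketch of the kind of rename a bot could apply to a JSON graph definition. The mapping in `RENAMES` is purely hypothetical; no names for the new extension's definition have been announced in this discussion.

```python
# Hypothetical old-name -> new-name mapping; these are NOT real keys of any new extension.
RENAMES = {"marks": "layers", "scales": "axes"}


def rename_keys(node, renames):
    """Recursively rename dictionary keys in a parsed JSON definition."""
    if isinstance(node, dict):
        return {renames.get(key, key): rename_keys(value, renames)
                for key, value in node.items()}
    if isinstance(node, list):
        return [rename_keys(item, renames) for item in node]
    return node  # strings, numbers, booleans and None are left untouched


old_definition = {"width": 300, "marks": [{"type": "rect"}], "scales": [{"type": "linear"}]}
new_definition = rename_keys(old_definition, RENAMES)
print(new_definition)
```

- A functionally different definition would need per-graph logic instead of a lookup table, which is where the extra conversion time would come from.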
- It just feels like nothing has moved on in the year since the vuln was found. I mean, from what I read, the idea for most of that time was to update Graphs to use a newer version of Vega but now it's "replace Graphs"? Sounds like scope creep to me. Eilidhmax (talk) 13:31, 19 April 2024 (UTC)
- Essentially they looked into it, and came to the conclusion that updating to a newer version of vega would not satisfactorily fix the problem. Bawolff (talk) 20:26, 21 April 2024 (UTC)
- As someone who works professionally in data visualisation, I'd be very sad to lose the Vega interface, which (via inspiration from Wilkinson's grammar of graphics and Wickham's R implementation ggplot2) is the product of a great deal of thought and refinement by statisticians. Without at all meaning to denigrate the WMF tech team (it's just a difficult problem), I don't think that they can realistically develop anything that is anywhere near as good in-house. The update mentions rendering server-side would avoid known or substantial security risks, such as those in the legacy Graphs extension – so why not just use server-side Vega? Joe Roe (talk) 16:04, 21 April 2024 (UTC)
- Statisticians by and large aren't the people editing articles, and theirs is not the only use case a graph extension needs to meet to be good. Vega could be great for data professionals and still a bad fit for us. I think this is demonstrated by how, for all Vega's power, essentially none of that was used in all the years it was actually enabled. Almost all previous uses were of one or two basic types. If the power of Vega was actually useful to Wikipedia's use case, someone presumably would have used it in the last decade. The fact nobody did suggests it was not useful in practise. Bawolff (talk) 20:24, 21 April 2024 (UTC)
- We did at Basque Wikipedia using interactive climate graphs. So someone used it. The fact that we didn't use it more is also related to having Vega 2 instead of Vega 5, with higher capacities. That said, the sentence "could be great for data professionals and still a bad fit for us" needs a definition of "us". Because, as far as I know, "us" should be the central repository of free knowledge. Theklan (talk) 06:16, 22 April 2024 (UTC)
- Sure, that is a reasonable definition of "us", although I would probably just simplify "us" to mean "Wikipedia" (sorry sister projects). I think it's good to limit use cases to what we really need, since if we try to do everything we end up not being able to do anything well. Bawolff (talk) 06:28, 22 April 2024 (UTC)
- There are two problems with that definition. First, we are not "Wikipedia". Second, the discussion here is not about what we need, but about what we currently can do, which is the worst scenario for strategic thinking. Our needs can't be constrained by what our engineers could solve within a fiscal year. Theklan (talk) 07:22, 22 April 2024 (UTC)
- The two aren't disconnected though. The riskiest part of any software project is a disconnect between what users need and what developers think users want. It's one of the main reasons software projects fail across the industry. The lack of a coherent vision around graphs between different stakeholders makes this an incredibly risky project. The risk reduces the amount of things that the foundation can do in a fiscal year. We could do a lot more if we all figured out what we actually want/need. Bawolff (talk) 19:14, 22 April 2024 (UTC)
- The main risk is having something unusable for one full year, and then bringing back something that doesn't solve the issue, and makes it even more difficult to improve things in the future. It is going backward, not just trying to be where we were. The idea of doing things that "fit in a fiscal year" is also the worst possible way of thinking. We can't afford a Foundation that only thinks about things that are solvable in a 9-month window. That's completely destructive, contrary to any strategic thinking, and the way projects die. Theklan (talk) 06:40, 23 April 2024 (UTC)
- I'm not sure WMF said they were stopping at the fiscal year - they just wanted to have a first version out by that time and thought that would be a good milestone to plan to. Presumably after that point they would reassess, see if they are on track or need to change direction, and whether further development is warranted. Planning development effort for like the next 5 years is generally a bad plan. It's hard to predict the future and it's better to plan in short intervals (Agile!) so you can adjust to changing circumstances. That doesn't mean once the plan ends you are done. Regardless, timeboxing is a very common risk mitigation strategy to deal with uncertainty in software development amidst competing goals and would totally be reasonable here. I think you have the cause and effect backwards here. Limiting the planning to 9 months isn't the risk - it is what one does when nobody can agree on what should be done and the foundation needs to limit the risk of an open ended project that might never complete and might never make anyone happy. Smaller projects limit the risk of going too far down the wrong road before course correction can take place. Like you said earlier - it has been a year since graphs have been removed. In that year all anyone has done on the community side is talk past each other on what is needed (see the wikimedia-l mailing list for example). It would be different if there was widespread agreement among the community about what is needed along with evidence that such a solution would be applicable to a large number of articles (say >1% overall or >5% of featured articles). But that is not what happened. I suspect that makes it hard to justify doing a really large multi-year project like you want. Bawolff (talk) 15:11, 23 April 2024 (UTC)
- Users could talk among themselves and reach an agreement among themselves. Most of them can be asked the question of what is useful, but not what software to use. That is the problem: the questions in September or August were becoming increasingly software related, so there was no answer. The percentage of graphs on English Wikipedia would not reach 1% of all articles. On English Wikipedia, there are 18k pages with Vega graphs, some of which are on talk pages. If I were asked whether I care about a feature that affects less than 1% of articles, I would say no. So any statistical post I have made is intentionally in numbers, not percentages, because that is a better selling (convincing) point. Percentage of wikis with more than 10 graphs would also be a high number.
- In terms of finding those counts, we do not have the tools for it. Even with this graph feature, which is on less than one percent of articles, I was close to hitting the limits of Global search (global-search.toolforge.org). Really, the only way to get results for all graphs is to visit the search of every project, and there are hundreds of them. So I am forced to take a subset of graphs and make stats out of those. Snævar (talk) 18:47, 23 April 2024 (UTC)
- A proper search could search for these, in the existing Vega graphs:
- line graph (either linear or point type):
- scales[0].type = linear
- scales[1].type = linear
- scales[0].type = point
- scales[1].type = point
- bar chart:
- marks[0].type = rect
- pie chart:
- marks[0].type = arc Snævar (talk) 05:37, 26 April 2024 (UTC)
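- The search criteria listed above could be sketched roughly as follows, assuming each graph definition has already been fetched and parsed from JSON. The function names and the "other" bucket are illustrative assumptions, and so is checking `marks` before `scales` (done here because a bar chart may also carry a linear scale).

```python
from collections import Counter


def classify_vega_spec(spec):
    """Guess the chart type of a parsed Vega spec using the criteria listed above."""
    marks = spec.get("marks") or []
    scales = spec.get("scales") or []
    mark_type = marks[0].get("type") if marks else None
    # Assumption: test marks first, so bar and pie charts are not counted as line graphs.
    if mark_type == "rect":
        return "bar"
    if mark_type == "arc":
        return "pie"
    # Line graph: scales[0] or scales[1] is of type "linear" or "point".
    scale_types = {scale.get("type") for scale in scales[:2]}
    if scale_types & {"linear", "point"}:
        return "line"
    return "other"


def tally(specs):
    """Count chart types across an iterable of parsed specs."""
    return Counter(classify_vega_spec(spec) for spec in specs)


# Made-up sample specs, only to show the expected input shape.
sample = [
    {"marks": [{"type": "rect"}], "scales": [{"type": "ordinal"}, {"type": "linear"}]},
    {"marks": [{"type": "arc"}]},
    {"marks": [{"type": "line"}], "scales": [{"type": "linear"}, {"type": "linear"}]},
]
print(tally(sample))
```

- This only covers the classification step; collecting the definitions from every project is still limited by the search tooling described in the comments above.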
- To be clear, I am not saying users should be asked what software to use. On the contrary, I generally think it's a very bad thing when users are asked that type of question (to be clear, I also do not think WMF has asked anyone that question). The part that users should answer is what they want to do with whatever software is eventually chosen: for example, what sort of things do they want to be able to communicate using graphs, what things are hard now that they hope will not be hard in the future, etc. WMF is made up of people, and those people do not read minds. The ideal case would be that the community is explicit enough about what its needs are that an outside observer would be able to read through this discussion now, wait a year, see what WMF makes, and then be able to say whether or not what WMF makes actually did what the community said needed to be done (in terms of "users editing the wiki can do X", not in terms of "software package Y was deployed"). As it stands, I don't think there is very much consensus on the community side over what the goal of all this is. Thus no matter what WMF does here, people will probably be unhappy because there is no concrete ask and thus there is no objective way to fulfill it. Bawolff (talk) 01:11, 2 May 2024 (UTC)
- That's a good point, @Bawolff. What do we need to be able to do? At least, this. Theklan (talk) 06:17, 2 May 2024 (UTC)
- I would argue that the pacman game on your link [1] is probably not a core requirement for most of our users. Bawolff (talk) 01:35, 3 May 2024 (UTC)
- No, because it is copyrighted. But having a usable Pacman at en:Pac-Man (66 language versions available) is not out of scope, as it is also knowledge. Anyhow, take Pacman out of the equation. All other use cases are possible. Theklan (talk) 05:13, 3 May 2024 (UTC)
- I don't see how anyone would want to hand-craft anything similar. I did manually draw a snooker table to be really accurate on the dimensions and because computers were slow then (I would do it in Inkscape today). So, no, I don't think anything similar is in scope. It's much better to create such complicated things in graphical software. For graphs we need things that would be easily updated, look good and have some amount of interactivity to better show data. If it's not easy to update it would be just someone's crazy exercise like brainfck language. Nux (talk) 07:29, 3 May 2024 (UTC)
- A lot of things might be nice but that is a very different question from what is needed to accomplish some goal. Pointing at a list of basically every graph type isn't really believable as a user need as it seems unlikely users really need every graph type ever (Or if they do, that's something that would require evidence and justification because I at least personally don't believe it and I imagine others doubt it too). Saying we need everything says nothing about what is actually important and what is not, so is likely to be ignored by the people actually making the solution. Bawolff (talk) 07:36, 3 May 2024 (UTC)
- Vega5 supports everything there. Theklan (talk) 07:40, 3 May 2024 (UTC)
- I would certainly hope so, given it is a link to the vega website. However, just because something exists does not necessarily imply that it is needed. Bawolff (talk) 07:56, 3 May 2024 (UTC)
- I think the minimum ask is clear: restore functionality that was already present with the Graph extension. The community has also asked for better interactive graphs and easier editing for a long time, see [2] and [3] for example. Ita140188 (talk) 06:19, 2 May 2024 (UTC)
- That is not an "ask" in the sense I mean (it's an implementation detail and not a use case) Bawolff (talk) 01:33, 3 May 2024 (UTC)
- How is previous functionality not a use case? I think what is needed is pretty clear Ita140188 (talk) 06:59, 3 May 2024 (UTC)
- Hmm, in fairness I suppose that could be a use case. I think there are two ways to interpret your statement - either the use case is compatibility with existing content or the use case is that that particular syntax is important for some reason, much more so than the actual graphs. I guess the question would be is this the general view, and why? For the former interpretation, converting the graphs to some other format seems annoying, sure. However many are template generated, so all in all it doesn't seem like an undue burden in the grand scheme of things. So if that is your view, I would ask why you think this is a core requirement? For the latter interpretation: I would again ask why? Why is the Vega syntax so special that no other syntax could possibly meet the needs of users? Bawolff (talk) 07:53, 3 May 2024 (UTC)
- I never mentioned vega, or even same syntax. I said we should restore previous functionality. As in, we should be able to create (at least) the same types of charts. The proposed solution (although not really clear) seems to suggest that functionality will be less than what we had before. I also added that another long-term ask from the community (since at least 2020) is interactive graphs. An example of basic interactive graph functionality can be a chart where you can selectively enable some lines and not others, or change from absolute to relative in a stacked bar chart. Ita140188 (talk) 07:58, 3 May 2024 (UTC)
- This still raises the question of why should we "restore previous functionality"? Is this an argument for always restoring functionality regardless of how useful it is? I would find that pretty unconvincing. If we're limiting the argument to useful functionality then we are back to trying to define what the use cases of graphs are. Bawolff (talk) 08:31, 3 May 2024 (UTC)
- By this reasoning we may as well abandon the whole project and do nothing. We should do it because it was already widely used and especially because a selection of basic chart types is an essential tool for visualizing data and information, which itself is a core mission of Wikipedia. But I think the real problem is that the WMF and part of the community are by now more preoccupied with endless discussions and debates than with actually working on problems. Nothing has been done for a year; in a normal (small) software company this should have been fixed in a matter of days or weeks. Ita140188 (talk) 08:36, 3 May 2024 (UTC)
- Let me put this in Phabricator terminology so you understand it, Bawolff. Any question about how much a feature is used is stalled on increasing the limit in global search. The limit of search results would need to go from 5k to 100k, or there needs to be a proper JSON search capable of that kind of search. Until then, you are not getting a response to your questions, unless it is one WMF site at a time. By site, I mean things like the "French Wikipedia", not all Wikipedias or even all Wikiquotes. Snævar (talk) 15:35, 3 May 2024 (UTC)
- On my project is.wikipedia, there is a need for layered graphs. About 6% of our articles have old TimedText graphs that need that. Vega 2 did not have layers, that was added in Vega 3. Other than that, there are graphs that pull data from wikidata.
- I do agree with restoring functionality that was present in the Graph extension, but to Vega 3 like I mentioned, not the Vega 2 version that was in place prior to it being disabled. I do however not believe that WMF can pull it off, so I am settling for the next best thing. Snævar (talk) 12:57, 2 May 2024 (UTC)
- Oppose The way to solve this is not to reinvent the wheel and build a new visualization tool. There are zero chances that a self developed tool can be as good as existing tools that are well-maintained and developed for years. This proposal is actually kind of concerning for me, because it goes against any good practices in software development and gives me even less hope (if possible at this point) that the WMF knows what they are doing --Ita140188 (talk) 06:29, 25 April 2024 (UTC)
- So javascript is scary, and iframes don't cache; let's just create a static image and be done with it. What is the advantage of creating a static graph image in the page editor? Why not just put an image in commons and then use it? In fact, if you want to add graph editing tools, why not add them to commons upload? I upload data and have the option to create a graph. What I miss is interactivity, such as OurWorldInData. I have assumed that we need javascript for that interactivity, but based on comments here I am wondering about CSS. Graph interactivity is mostly about drilling down, by popping up details about a line or section of a map on mouse over. In the OSM example above CSS is used to add labels and markers. What about revealing those things on hover? The graph generator could embed all the popups as hidden sections, since the data is static and known at build time, and even embed the CSS inline in the graph. I'm sure there are others who know SVG and CSS better than I do. Is this possible? (PS I still think there needs to be a future world in which interactive content that uses javascript is permissible.)Tim-moody (talk) 14:34, 3 May 2024 (UTC)
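- As a rough illustration of the CSS-only idea raised above, the sketch below generates an SVG bar chart in which each bar's label is embedded at build time and revealed on hover, with no JavaScript. The function name, class names and styling are assumptions made for this example. It also assumes the SVG is embedded in the page itself, since hover styling does not apply to an SVG loaded through an img tag.

```python
from xml.sax.saxutils import escape

# Inline CSS: labels are hidden by default and shown while the bar group is hovered.
STYLE = (
    ".tip{visibility:hidden;font:12px sans-serif}"
    ".bar:hover .tip{visibility:visible}"
)


def hover_bar_chart(data, width=300, height=150, color="#36c"):
    """Build an SVG string from (label, value) pairs; the largest value must be positive."""
    top = max(value for _, value in data)
    slot = width / len(data)  # horizontal space reserved for each bar
    parts = [
        f'<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 {width} {height}">',
        f"<style>{STYLE}</style>",
    ]
    for index, (label, value) in enumerate(data):
        bar_height = (height - 20) * value / top  # leave 20px headroom for the label
        x = index * slot + slot * 0.1
        y = height - bar_height
        parts.append(
            '<g class="bar">'
            f'<rect x="{x:.1f}" y="{y:.1f}" width="{slot * 0.8:.1f}" '
            f'height="{bar_height:.1f}" fill="{color}"/>'
            f'<text class="tip" x="{x:.1f}" y="{y - 4:.1f}">{escape(label)}: {value}</text>'
            "</g>"
        )
    parts.append("</svg>")
    return "".join(parts)


# Made-up data, only to show the call.
svg = hover_bar_chart([("A & B", 2), ("C", 4)])
print(svg)
```

- Because the data is static and known at build time, every popup can be written into the file up front; whether such styling would survive sanitisation on a wiki is a separate question that this sketch does not answer.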
- A lot of graphs (the percentage differs by WMF site) have the layout of the graph set in a template and then the data is semi or fully unique on each page. I am not willing to duplicate the same graph layout in thousands of graphs. JavaScript is for those that need to make it more interactive than the base model is.
- Our thumbnailing system does return PNGs from SVGs, so that is a policy change on WMF's side that would need to happen. Snævar (talk) 15:32, 3 May 2024 (UTC)
- There is a task about the possibility to add SVG inline (phab:T334372). AFAIK there has been no decision to go for it or even to implement it in a specific way (there are a few ways to add SVG to a page). Technically this has been a possibility since at least 2019 when IE8 flatlined...
- But SVG is either scary (because JS) or mostly static (if used inside an img tag). But also not all JS is scary. We are writing HTML here and it works because of algorithms that make it safe. User generated JS would be off limits probably, but library generated JS might be OK. Nux (talk) 22:37, 3 May 2024 (UTC)
- I would like a graph system that outputs SVG, not raw SVG. Converting Vega graphs to raw SVG has some similarities with moving from a Lua module to a wikicode template. Some of this also applies when going from EasyTimeline to raw SVG. There is a lot of duplication, of colors for each bar, colors for the x and y marks; including individual color definitions for each mark on those axis. Those marks also need to be created from scratch. Then there is scaling, to make sure the graph fits within the total width and height given. All of that is done in a graph program for you, where as a raw SVG would need an template to help you. Instead of specifying a color for each component like this, Vega allows you to specify one color or a set of colors, one set for bars, one for marks. EasyTimeline allows you to specify one color for bars, one for marks. Both Vega and EasyTimeline are much easier. Snævar (talk) 11:16, 4 May 2024 (UTC)
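- The difference between a compact graph definition and raw SVG can be sketched as below: one color set for bars and one color for marks are expanded into the per-element attributes that raw SVG would otherwise need written out by hand, and values are scaled to fit the given height. The spec keys (`bar_colors`, `mark_color`, `ticks`) are invented for this illustration and do not correspond to Vega, EasyTimeline or any announced format.

```python
from itertools import cycle


def linear_scale(domain_max, range_max):
    """Return a function mapping values in 0..domain_max onto 0..range_max pixels."""
    def scale(value):
        return value / domain_max * range_max
    return scale


def expand_spec(spec):
    """Expand a compact spec into the per-bar and per-tick attributes raw SVG needs."""
    values = spec["values"]
    top = max(values)
    scale = linear_scale(top, spec["height"])
    slot = spec["width"] / len(values)
    colors = cycle(spec["bar_colors"])  # one color set, reused across all bars
    bars = [
        {"x": index * slot, "width": slot, "height": scale(value), "fill": next(colors)}
        for index, value in enumerate(values)
    ]
    ticks = [
        # Tick positions measured from the top edge, as in SVG coordinates.
        {"y": spec["height"] - scale(top * step / spec["ticks"]), "stroke": spec["mark_color"]}
        for step in range(spec["ticks"] + 1)
    ]
    return {"bars": bars, "ticks": ticks}


# Hypothetical compact definition: two color settings instead of one per element.
spec = {"width": 200, "height": 100, "values": [5, 10],
        "bar_colors": ["#36c"], "mark_color": "#000", "ticks": 2}
geometry = expand_spec(spec)
print(geometry)
```

- Everything in `geometry` is what an editor would have to maintain by hand in raw SVG, which is the duplication described in the comment above.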
- Strong support This is the right approach: have a tool that is useful for editors, which addresses what communities want. A WMF-maintained graph extension is perfect. Good job. Cremastra (talk) 21:59, 10 May 2024 (UTC)
- I like the idea of a community-extensible and -maintainable framework. I think we should find an existing visualization community we want to work with, and figure out how to integrate their library into MW + Wikipedia, as the top-line goal. That seems significantly more aligned with our mission than treating this like an internal feature request meriting a new internal tool + library, which only we [or MediaWiki users] use and maintain.
- This identifies an existing community of practice who care about knowledge and its visualization (like Joe Roe notes above), and about designing and maintaining tools for it (like the communities of open devs who help maintain those tools). They can help name and solve open problems, can talk about our use case at their own events and come to ours.
- This would consolidate efforts across the free knowledge software community, supporting an existing open ecosystem that is also relied on by fellow travelers (researchers, writers, educators) pursuing our mission, rather than trying to recruit open developers to our own. It also provides future-proofing for users of the tool, if this stops being a MediaWiki or Wikimedia priority.
- We would benefit from ongoing work by that existing community to develop new visuals and features, and find bugs or vulnerabilities (as was the case with Vega, no?)
- I suggested the OWID library because they are highly mission-aligned and extremely popular. But many others would do (including finding ways to improve and to secure an improved Vega). Rather than choosing "which simple visuals to implement first", we might be identifying "which existing functions in this library to secure first". Sj (talk) 18:11, 21 May 2024 (UTC)
- Exactly this, @Sj. This is the right approach to the issue. Theklan (talk) 06:17, 22 May 2024 (UTC)
- As someone who's been involved in many software projects I think your timelines are highly naive. The existing upgrade from one version of Vega to another was ongoing for years and was close to completion. Now that work is all going to be thrown away because of "not made here"-itis from ruling-on-high that is the plague of any organization. This does not help editors and it does nothing but waste WMF funds that could be better spent elsewhere. 73.162.189.54 21:16, 24 May 2024 (UTC)
I just find it funny that it's been dead for 14 months and you're barely starting to plan a replacement. Imagine if you worked for a real business 🤣 DimeCadmium (talk) 17:41, 9 June 2024 (UTC)
- To be fair, in big business if something is problematic it just gets removed. Remember when Facebook used to have all these games (like FarmVille, MouseHunt etc.)? Facebook was actually a game platform and then they just stopped. And of course there's Google's graveyard 😉. I'm sad it happened to graphs, as so much community effort went into this, but it's not uncommon or unseen... Nux (talk) 20:19, 9 June 2024 (UTC)
Consider stats-communications research
I will be quite glad to have anything. Thank you for your work on this.
There is a large body of research on what types of graphical representations of data are most effective for accurately conveying information. For instance, estimating area and volume understandably has a larger error than estimating length. People can also judge relative lengths far better than relative angles, so estimating proportions from a bar chart is more accurate than estimating them from a pie chart.
I'd strongly suggest consulting with some researchers familiar with this field. Limited effort is best deployed on the most effective and versatile ways of conveying information.
As someone who has sometimes wanted some unusual-but-useful graph forms, like violin plots, but never got around to implementing them, I strongly favour easy extensibility. That said, existing plotting software, like R, has a lot of inconsistencies, and cookbooks are often the most useful way to learn it. This may suggest that an overarching ontology for graphs, with a consistent, easy-to-learn syntax, is next to impossible.
As a relatively simple mitigating measure, I'd like a structure for posting instructions and code for making a plot to Commons. Not running the code, just posting the source, for studying and reuse. For instance, in my abysmally-titled Commons:File:Citrus tern cb simplified.svg, I included very poorly-formatted replication instructions. This enabled another editor to update the plot when full genome sequences of the fruit were published.
For large datasets, ease of data import is important.
I also suggest a text rendering of the graphs, to be used as an alt-text for screenreaders.
Again, thanks! HLHJ (talk) 03:41, 19 May 2024 (UTC)
- In many cases alt is not needed and might be misleading. For example, when you have a table and then a graph, the graph is like an icon on a text button (the icon repeats the text). This is a classic example of where you don't use alt. I also hope that graphs won't be in an `img` tag, so that they can have some interactivity. Nux (talk) 19:24, 19 May 2024 (UTC)
A brief informal early June update
Hey everyone, this is Szymon, you may know me from my work on Vector 2022. I'm joining the task force, so from now on, when you want to talk/ask about graphs, you may also ping me on wikis and platforms like Discord or Telegram. 🖖
I've just added our group on the Wikimedia Product page, so now it's official and transparent. We're forming the engineering part of the group, and we're close to getting it done. In addition, we're beginning to sketch out what the new service would look like. You will hear more from us in the coming weeks. Thanks! SGrabarczuk (WMF) (talk) 23:25, 4 June 2024 (UTC)
- Hi Szymon, glad to see you here. Just amplifying from above, in addition to "what the new service looks like" (a nice exercise) please consider sketching "what backfill will be made available for people replacing past uses that will no longer be supported" and "how this uses existing libraries and infrastructure, and invites community co-maintenance, so WMF staff are not the only ones who can / do routinely maintain the core of this service". Sj (talk) 23:52, 4 June 2024 (UTC)
- Glad to see a promising early May update posted in early June! I used to exploit (or abuse?) the graph extension to create a module that displays QR codes on-wiki. Now I truly wish to have a clear view of what features would be retained and what would be dropped, and how it would affect such creative uses. MilkyDefer 14:42, 10 June 2024 (UTC)
- Lol, I meant to write "early June", obviously this was about early June. Thanks for pointing this out @MilkyDefer! SGrabarczuk (WMF) (talk) 16:21, 10 June 2024 (UTC)
- So what are the updates on this? In the update in mid-April the plan was to start working on the new extension from the start of July. We are almost at the end of June, but none of the decisions have been taken and no discussions of what library or what types of graphs to include have taken place. Not even a summary of the result of the above discussion to see if the community agrees with this approach (does anybody care anyway? Is this something we have a say on? I'm not sure). What's going on? Ita140188 (talk) 07:27, 21 June 2024 (UTC)
- Hey @Ita140188 (and others reading this!), both Chris and I weren't available lately, otherwise you'd definitely have had updates before Friday last week :) First, thanks for these comments and your continued involvement. I'm positive we're going to have many discussions together over the next months.
- Secondly, here is some of the latest news:
- We have picked a name for the new project—Charts. There's a new Gerrit repo, a Phab tag, and an Extension page. We are also listed in the annual plan as WE3.3 (our part was added last week).
- Everyone is onboarded, and we're kicking off regular work this week. We have defined all major steps ("epics" in the Phabricator language), and deciding on the library will be among the first ones.
- I'm working on a project page where we will have an outline of the project scope, a list of task force members, an FAQ, and whatnot. There, we'll post our next official update. Of course I will let you know when this page is posted.
- I know I'm not addressing all of your questions, Ita140188. Lots of things will be clarified within just a couple of weeks, so I need to ask you for just a bit of patience. I hope what I've shared is useful already, though! Thanks, SGrabarczuk (WMF) (talk) 23:03, 24 June 2024 (UTC)
- Great to see. Thanks Szymon! Sj (talk) 01:06, 25 June 2024 (UTC)
- Hello, today I'd like to invite you to look at the new project page: Extension:Chart/Project and add it to your watchlists. In the lead and the first section, you will find information about our strategy, audience, etc., so lots of things you have helped us to shape by sharing your thoughts. There's also our first project update capturing where we are at now. You will find news about libraries there (@Ita140188).
- In addition, I'd like to invite you to watch the project on Phab (technically, "watch" or "join") and add comments there.
- Finally, I have a question for you ... but let's discuss that one on Extension talk:Chart/Project. Thanks! SGrabarczuk (WMF) (talk) 02:39, 3 July 2024 (UTC)
- SGrabarczuk : Could someone please update Extension:Graph and Extension:Graph/Plans ? Sj (talk) 16:23, 30 August 2024 (UTC)
- Hey @Sj, what part of the page are you hoping to see changed? Is this edit the kind of update you had in mind? SGrabarczuk (WMF) (talk) 16:57, 30 August 2024 (UTC)
Misreasoning in 'Rationale'
The article says "In English Wikipedia, graphs are used on about 10,000 articles, which is 0.15% of all articles, and across all Wikipedias, they are used on about 178,000, which is 0.28% of all articles." That is a rather silly way to build a rationale.
Wikipedia contains countless (or, more precisely, millions of) articles that are seldom viewed by humans and mostly by bots and crawlers. Far fewer articles are in high demand. Yet those popular articles are what we should care about most. Please take this into account, and use an average weighted by pageviews instead.
Here are popular articles with a graphical timeline, with pageviews over the last 30 days:
The_Beatles (323,000), Julius Caesar (282,000), History_of_Jerusalem (229,000), Queen_(band) (161,000), Imagine Dragons (135,000), The_Beach_Boys (111,000), Supreme_Court_of_the_United_States (100,000), Arctic_Monkeys (65,000), History_of_video_game_consoles (277,000), Star_Trek:_The_Next_Generation (75,000), History_of_Apple_Inc (22,000), History_of_New_York_City (16,000), History_of_Jerusalem (15,000)
Here are a few examples of the millions with fewer than 15 views over the last 30 days (and those might well be unidentified bots and crawlers after all): Nematopoa (13), Edell (13), Loxoconchidae (1)
Please correct the rationale and ask translators to do likewise. Thanks Erik Zachte (talk) 20:08, 15 October 2024 (UTC)
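As a rough illustration of the difference between counting articles and weighting by pageviews, here is a small Python sketch. The function names and the sample numbers are invented for the example; they are not real Wikipedia statistics and not part of any existing tool.

```python
def article_share(pages):
    """Fraction of articles that contain a graph (every article counts once)."""
    return sum(1 for _, has_graph in pages if has_graph) / len(pages)


def pageview_weighted_share(pages):
    """Fraction of pageviews that land on an article containing a graph."""
    total_views = sum(views for views, _ in pages)
    if total_views == 0:
        return 0.0
    graph_views = sum(views for views, has_graph in pages if has_graph)
    return graph_views / total_views


# Invented sample: (pageviews over 30 days, article has a graph?)
sample = [
    (300_000, True),   # one popular article with a graph
    (100_000, False),  # one popular article without
    (13, False),       # rarely viewed articles
    (1, False),
]

print(article_share(sample))            # 0.25
print(pageview_weighted_share(sample))  # roughly 0.75
```

With these made-up numbers, only a quarter of the articles carry a graph, yet about three quarters of the pageviews reach one, which is the kind of gap the weighted figure is meant to expose.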
- Extension:EasyTimeline is completely unrelated and is working completely fine. Aaron Liu (talk) 20:18, 15 October 2024 (UTC)
- It's working, sure. I wouldn't say it's working fine. On plwiki we replaced some of the graphs previously made with Vega to have anything rather than nothing. But rendering is mostly inferior. Fonts are of very bad quality and much less readable than almost any font you would find in a browser.
- Kerning issues: The glyphs are too tightly spaced and also inconsistent. Letters appear mostly cramped, making it hard to distinguish letters.
- Poor antialiasing: The font suffers from bad antialiasing, with jagged or pixelated edges on the curves and diagonals of the glyphs.
- Inconsistent widths of strokes: The lines within the glyphs are uneven, with some strokes appearing thinner or thicker than others. This inconsistency detracts from the font's overall clarity and coherence.
- Legibility Problems: The combination of tight spacing, rough edges, and uneven strokes reduces legibility, making the font difficult to read, in particular for smaller fonts.
- You'll find examples of these problems right at the top of Extension:EasyTimeline#Charts_examples (look more closely at labels). The problems are actually even worse on mobile. In principle, some of these font problems would be less noticeable on mobile, as more device pixels are available (compared to a typical desktop). But they are not: all EasyTimeline images are static at 1x resolution, with no 2x alternative for higher-resolution screens. In fact, charts should ideally be vector images to be more readable and potentially more dynamic. Nux (talk) 22:05, 15 October 2024 (UTC)
- Yeah, I agree that rendering is inferior. What I mean is that it works as intended. Aaron Liu (talk) 23:44, 15 October 2024 (UTC)