Topic on Talk:Growth/Personalized first day/Structured tasks/Add an image/Flow

Update 2021-08-09: starting Iteration 1

MMiller (WMF) (talkcontribs)

Hello @HLHJ @Mike Peel @Czar @John Cummings @T Cells @Pigsonthewing @Sdkb @John Broughton @Pelagic @NickK @LittlePuppers @Zoozaz1 -- thank you all for being part of the discussion around the "add an image" structured task! It's been several months since I posted an update on this work, but I hope to re-engage you all to help think about the next steps we're taking. In short, after about a year of community discussions, user tests, and prototypes, the Growth team has decided to build a first iteration of this task for the web. We're going to build it quickly and minimally, and try it just on our four pilot wikis (Arabic, Czech, Vietnamese, and Bengali Wikipedias). While there are still lots of risks and open questions, all the information we've gathered makes us think we should continue to proceed and learn -- we think if we are smart about it, this structured task can be successful. So we're going to build carefully, and we're going to be looking to pivot if things aren't going well.

The materials about our plans and our mockups are under the new "Iteration 1" heading on the project page. To see the validation work we did leading up to this decision, see the "Idea validation" section. I know it's been a while, so here's a quick refresher on the main steps we've taken so far:

  1. Tested a simple algorithm for matching images to Wikipedia articles based on Wikidata to make sure it is accurate.
  2. Had conversations on this page and on Arabic, Czech, Bengali, and Vietnamese Wikipedias.
  3. Ran user tests with newcomers on a prototype to see if newcomers like and understand the task.
  4. Learned from the Android team's simple version of the workflow for the Android app, which did not save edits, but did collect a lot of data on how users do the task and how they perform.

The data from that last step (the Android version) is what allowed us to decide to move forward with the Growth team version.

I hope you all can take a look at the plans and mockups we've posted. We have multiple design concepts and a few decisions we've made (that are not set in stone!) around how we want to scope this first version so that it will be simple for users and quick to build.

What do you think of these plans? Which design concepts stand out to you as strong and weak? What pitfalls do you see or ideas do you have? In particular, we're hoping to hear ideas about how best to implement "quality gates" (you'll see what I mean on the project page). Thank you all!

Czar (talkcontribs)

Thanks for reaching out!

  • Feed A/B: If the image+question mark logo overlaid Concept A, I think you'd serve both intents of an enticing illustration that doesn't lead the viewer to assume its association
  • Onboarding Concept B, no doubt. I'd be curious what users would sit and consider that full page tutorial before seeing the actual task. Concept B at least shows the editor in context, though it is much more cluttered. Alas, that is why I would consider this task a high risk implementation for mobile!
  • Adding the image: Hard to tell what's happening. Concept A gives more context, but if Concept B is easier to implement, I think the trade-offs are in favor of testing this faster rather than completely
  • Caption: I'd recommend skipping this for the MVP. I wouldn't expect new users to know enough about either the article subject or the photo subject to make a meaningful connection between the illustration and the text. So either leave the caption blank or machine translate the most basic description (i.e., translate text from another Wikipedia's caption or pull from Commons structured data). The hypothesis you're testing is mainly whether new users will engage in adding photos on mobile and whether their assessments are generally accepted by the community ("engagement and efficacy", not "captions" IMO)
  • Rejection Concept B, no doubt. Checkboxes better than radio buttons (multi-select) and puts users back in the flow rather than adding another "submit" step.
  • Quality gates: This might be controversial but I'd put this on the community rather than the tool. When a Wikipedia signs up for this trial, there should be some common area where they are regularly reviewing submissions and have a shut-off valve to either pause the experiment or bar specific users until structural issues are resolved. This could be as simple as a wikitext json page where active editors can easily add/remove usernames and/or designate what threshold they'd like to use to limit participation.
  • Side note re: this finding from the validation stage, "community members are cautiously optimistic about this task": this summary seems overly generous from my read of these talk page discussions. :) From my end, typo correction assistance would be a better lead for editor growth and a lower lift for WMF design and editor participation (easier to pull in casual readers/less onboarding needed). This image stuff is extremely complex, high risk to implement for the development time involved, and (I won't harp on it) too complex for new editors to do well.
MMiller (WMF) (talkcontribs)

Thank you for the detailed feedback, @Czar. I'm going to pass it all along to our team's designer, but I'll also respond to some of your points here, in case you have additional thoughts:

  • Onboarding: our inclination toward the full page tutorial is that we want the user to absorb the importance of what they're doing -- they'll be making a real edit by actually adding an image to an article for all readers to see. But because it covers the whole screen, it teaches more of the "why" than the "how". We've been talking about how we might get both kinds of onboarding across. I agree that we are at risk of clutter and that the interface is quite busy. That's one of the reasons we wanted to plunge into mobile first -- to make sure that we would figure out how this should work on mobile, as opposed to designing mobile as an afterthought of desktop.
  • Captions: we're definitely concerned that this will be the hardest part for the newcomers, since it requires more effort and skill than just selecting "Yes" or "No" for the image. But we thought it necessary to include captions because it's our understanding that if images are added to articles without captions, they'll likely be reverted. Do you think that's the case? Can you think of a way to get around it? If we do implement with captions, we'll want to make the Commons metadata clearly available so the user can reference it while writing the one that belongs in the article.
  • Rejection: it's interesting to hear that you're in favor of no "summary" step if the user isn't publishing an edit. Our team has been divided on this: on the one hand, we don't want to burden them with unnecessary steps, but on the other hand we want them to get in the habit of seeing a summary screen before finalizing some work.
  • Quality gates: we'll be enabling patrollers to keep an eye on edits from this task via an edit tag. For instance, here's the French recent changes feed filtered to just "add a link" edits. But smaller wikis may not have enough patrollers to keep up with all the incoming edits, and we're worried about patrollers waking up in the morning to hundreds of image additions that need to be reverted. Therefore, we think those wikis will likely appreciate some automated gates. But, as you say, that doesn't mean every wiki needs to have the same rules! Have you seen our work on "community configuration"? It's a way for local administrators to configure the Growth features for all users on their wikis, and it's a perfect place to let them set thresholds on things like "how many image edits per day are allowed" -- some wikis could just set it to zero, to monitor and regulate things themselves.
  • Cautious optimism: thanks for calling this out. I definitely don't want to be mischaracterizing community opinions to take too rosy of a view. Our team really relies on community input, and I take it seriously. I think that if I had to summarize my read of the conversations it would be something like, "If this task works, it would be great. But that's a big 'if' because there are many potential pitfalls." Does that sound right to you? I think it's important to mention that beyond the English discussions on this page, we also received feedback on Arabic, Czech, Vietnamese, and Bengali Wikipedias, which was translated by our team's ambassadors. Some of those smaller wikis are more optimistic about this task because their wikis are trying to grow quickly, and are willing to tolerate more bumps and messiness along the way. Taking it all together, our team felt that this task was worth building a first iteration as cheaply as we can. We'll be reusing a lot of the "add a link" work, and I'm going to be very open to realizing that things aren't working, and then changing course. Also, you mention that working with images is extremely complex -- I am hoping that in doing some of this work we can start to see ways to make it not so complex. Now, about typo correction -- I have good news! We're just starting the research on that, to figure out what may be possible algorithmically. I hadn't announced it yet because I wanted to focus on this image conversation first. But please feel free to check out how we're thinking about it and add any thoughts to the talk page.
Czar (talkcontribs)
  • The more I think about it, the more I think my hypothesis is that editing on mobile is a fundamentally different experience than editing from desktop and I don't know what kind of transference there is betwixt the two. Like in the case of the full-screen onboarding, the task is extremely context-heavy, hence the onboarding, but can there ever be enough context to make the task feel natural? I'd wager not but the experiment will tell. Focused tasks that are bite-sized (low-context) in nature with high-profile impact (user immediately sees the effects on the article page) would seem to be a good fit if there are specific workflows built for that purpose, e.g., add short description, add a link, copy edit a single word. Those low-context tasks do not require full-screen introductions. With this onboarding design, I wonder if this task is combining too many desires into one: Educating the user on how to do this well (such that they can contribute productively and hopefully become a long-term contributor) which is the direct opposite of giving the fastest route into editing. Again I'd wager that simple editing tasks that pique the user's curiosity and make them return for multiple sessions would be a better route to growth than giving them the info upfront.
  • On images being reverted without captions—this will really depend on the community and you might want to ask them directly. I've dropped images in other language Wikipedias before (usually without captions) because I didn't take the time to learn how their infobox standards work in their community. To my knowledge the edits stuck, or at least I didn't receive messages that they were reverted. It depends how clear/obvious the addition is (i.e., does it depict the subject or is it a form of decoration?) This is what makes Wikidata tasks interesting for newcomers, as having them build out structural connections based on what is already written/given/depicted has the potential to autopopulate sections in multiple Wikipedias at once. To the question, a productive editor would take a new user's captionless addition and upgrade it into an infobox or add a caption if needed rather than outright reverting it, but it depends on the local editing standards and what the community considers productive.
  • Totally agreed with your team that it's better to get editors into the practice of seeing and writing specific edit summaries, but per above, I'm also of the mind that it's more important for the native app editing experience to be frictionless than to replicate the desktop experience.
  • For quality gates, that "community configuration" sounds great—exactly what I was picturing! One idea is to push decision making on your indicator metric(s) to the community, to have them set an acceptable revert ratio, like if more than 1 of 10 edits are reverted, trigger a notification or talk page message and shut off access, etc.
  • re: adding images being complex, I was thinking of w:en:WP:JOBS (I know you posted on the talk page), which rates image-related tasks as "intermediate-level" based on the complexity of image copyright, syntax, and rules. I'd sooner just say it's a high-context action as opposed to typos, copy editing, categorization, reverting vandalism, all of which are ostensibly simpler to accomplish in short sessions on a tiny mobile screen.
  • My characterization of the talk page responses for this feature would be "lukewarm" more than cautious optimism, with about half giving feedback on improving the experience (potentially influenced by social desirability) and half suggesting that this may be too complex a direction for newcomers to do well, and altogether few outright endorsing the direction. Mind that, especially for English-language users, we're used to most proposals being killed in the cradle, which might appear as curmudgeonly to outsiders, but from experience I can say it's a really effective way to make sure time is well spent (if something isn't going to work, wouldn't you rather know well in advance?) On a more personal note, I've really enjoyed seeing this project evolve over the last year and want to give you and your team kudos for being so open. My understanding is that few WMF projects have solicited editor feedback like this (or at least that's the stereotype) and I imagine there's a connection with a lesson I know from my own profession, how endless commentary from the cheap seats ends up stifling projects, so I know it's a tough line to walk and am grateful that you and your team still see value in engaging. :) This all said, of course I'm really excited about typo correction!
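For anyone who wants to picture the revert-ratio gate suggested above ("more than 1 of 10 edits are reverted"), here is a minimal sketch. It is not how the Growth features or community configuration are implemented: the `GateConfig` fields, the function name, and the returned action strings are all hypothetical, chosen only to show a community-set threshold driving a pause.

```python
from dataclasses import dataclass


@dataclass
class GateConfig:
    # Hypothetical community-set values, not real configuration keys.
    max_revert_ratio: float = 0.1      # "1 of 10 edits" from the discussion
    min_edits_before_check: int = 10   # assumed: avoid judging on tiny samples


def check_revert_gate(total_edits: int, reverted_edits: int, config: GateConfig) -> str:
    """Return the action to take for one user of the task."""
    if total_edits < config.min_edits_before_check:
        # Too few edits to compute a meaningful ratio yet.
        return "allow"
    ratio = reverted_edits / total_edits
    if ratio > config.max_revert_ratio:
        # More than the allowed share was reverted: notify and shut off access.
        return "notify_and_pause"
    return "allow"
```

A wiki preferring a stricter or looser rule would only change the two numbers in `GateConfig`; the check itself stays the same.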
MMiller (WMF) (talkcontribs)

Thanks for the notes, @Czar. Obviously there are still a ton of open questions, but as you say, the "experiment will tell". I'm glad you're in support of us diving in and starting to see what does and doesn't work in the real world on the wikis.

Regarding mobile, do you feel like there are some editing tasks that are just not suited for mobile, no matter how well we design the interfaces? That would be a tough pill to swallow, because of the growing number of people in the world who are mobile-only. I'm hopeful that any wiki task (including writing articles wholesale) could theoretically be accomplished from a mobile device as our designs evolve.

Mike Peel (talkcontribs)

I still think you're going in the wrong direction with this, and in a way that will overshadow Wikidata integration directly into the wikis. This is particularly true for the Portuguese Wikipedia, which I see in your list of potential wikis to expand to - they have Wikidata infoboxes already live, and it would be much better to add infoboxes to articles rather than just images.

The ideal thing I'd like to see is something that goes through Commons categories, asks users to select the best image from a selection, and then adds that to the Wikidata item. That would fit in much better with existing Wikidata-based workflows, and is something that can't easily be automated. I know that's a very different tool and workflow from what you're thinking of: this is just me wishing. :-)

MMiller (WMF) (talkcontribs)

Hi @Mike Peel -- thanks for bringing this back up, and for originally teaching me about Wikidata infoboxes during the last round of this conversation. After learning more about it, we decided that this first iteration would not add images to articles that have infoboxes at all. That way, we at least won't be placing images outside of the infoboxes they belong in.

The main way the Growth team is coming at this task is that we're trying to build editing workflows that get newcomers engaged, and make them want to come back and continue participating. While I think they might be engaged by adding images, and could be capable of doing it well, I think that adding infoboxes is a taller order, requiring a deeper understanding of Wikidata and the rest of the ecosystem. Does that sound right, or do you think such a task could also be newcomer-friendly? For this first iteration of images work, I think we want to follow the shortest path to discovering whether we're "on to something" here -- i.e. do we have a task that is engaging for newcomers? I think if the answer is "yes", then it would make sense to talk about the insides of the task, in terms of which images get placed onto articles in what way. Perhaps we'll be able to rework it so that we do enable the newcomers to get images onto the articles via Wikidata infoboxes, designed in such a way that it is understandable for them. Does that make sense?

In terms of how much Wikidata infoboxes are being used -- is it your sense that all Wikipedias are starting to use them broadly? Or are they popular on some wikis and not catching on in others?

I like your idea of going through Commons categories to choose a P18! Just to make sure I understand, would these be two Wikidata items that are candidates for such a task?

I'll bring this idea up with the Structured Data team, who have more experience working on Commons, and see what they think.

Mike Peel (talkcontribs)

Yes, those examples are correct - although you're normally best looking at the 'multilingual sites' link rather than P373 (I'm hoping we'll get rid of the property at some point, since it's duplication and often wrong).

Sdkb (talkcontribs)

I'm not as familiar with infoboxes that draw images from Wikidata, but I do share a somewhat similar concern as Mike. What we seem to need is additional images uploaded to Commons, better data on Commons images to make them findable, and better tools to help editors do the finding. This task doesn't seem to do any of those things directly, rather just facilitating the spreading around of images an experienced editor has already identified. The data on the quality of the algorithmic suggestions looks promising, but I'm still a bit skeptical.

MMiller (WMF) (talkcontribs)

@Sdkb -- I think you're touching on something important. The algorithm is quite simple, in that all it does is aggregate up connections between images and entities that have already been established by human editors -- no computer vision or AI. An advantage of that is that the algorithm is very transparent: it will be easy for any of us to see why an image was recommended for an article. But the disadvantage is that it doesn't really generate any new knowledge. Therefore, I've been thinking that the ideal scenario in the future would be if upstream features and workflows encourage people to add images to Commons and to describe and tag them well -- going all the way to placing them on Wikidata items. Then the algorithm can keep picking up the new images and new connections, and offering them as additions to articles. Basically, the algorithm and this "add an image" workflow would be opening up a pipeline between Commons and Wikipedia -- but the images would have to keep flowing from Commons. Does that sound right to you? It sort of gets at how the labor might be divided up, with some people supplying and classifying images in Commons/Wikidata, and others down the assembly line attaching them to the right articles.
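To illustrate what "aggregating connections already established by human editors" could look like, here is a rough sketch. It is an assumption-laden toy, not the actual matching algorithm: the input format, the source labels (`wikidata_item`, `other_wiki_article`, `commons_category`), and the "most independent sources wins" ranking are all made up for illustration. The point is only that each suggestion carries the list of human-made links behind it, which is what makes it transparent.

```python
from collections import defaultdict


def suggest_images(connections):
    """Pick one image per article from existing human-made links.

    `connections` is an iterable of (article, image, source) tuples,
    where `source` is a hypothetical label for where editors made the link.
    """
    # article -> image -> set of sources that connect them
    evidence = defaultdict(lambda: defaultdict(set))
    for article, image, source in connections:
        evidence[article][image].add(source)

    suggestions = {}
    for article, images in evidence.items():
        # Prefer the image backed by the most distinct sources;
        # break ties by file name so the result is deterministic.
        best = max(images, key=lambda img: (len(images[img]), img))
        # Keep the sources so anyone can see why the image was suggested.
        suggestions[article] = (best, sorted(images[best]))
    return suggestions
```

Nothing here creates new knowledge: if no editor has connected an image to the entity somewhere, the article simply gets no suggestion.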

I think that Wikidata infoboxes do some of this automatically, by placing the most important image on the article. But longer articles are enhanced through having many images illustrating the various sections, a task that probably takes a good deal of human judgment (until our capabilities with structured data and structured content are much more advanced). While this first iteration of "add an image" only puts a single image on an unillustrated article, I could imagine it evolving to enable placing subsequent images in specific sections of an article.

Sdkb (talkcontribs)

That idea of linking within a broader ultimate system makes sense. Given that the ultimate system hasn't been developed yet, it's a little hard to know how essential or beginner-friendly this component of it will be, but there's possibility.

Sdkb (talkcontribs)

Re the quality gates, I think we ought to take a multi-pronged approach. Editors moving too quickly, accepting too high a percentage of suggestions, and getting reverted are all signs they may be doing a poor job. Especially initially as this rolls out, the software should be aggressive about prompting them to improve, and if they don't, cutting off their access to the tool. The one thing that doesn't look like a red flag to me is doing a lot in a single day—if people get sucked into it, we can let them be.
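As a rough sketch of the multi-pronged check described above (speed, acceptance rate, reverts, and deliberately not daily volume): every name and threshold below is a placeholder assumption, since nobody in this thread has proposed concrete numbers, and this is not an existing Growth feature.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class TaskRecord:
    seconds_spent: float  # time the user spent on one suggestion
    accepted: bool        # user said "yes" to the suggested image
    reverted: bool        # the resulting edit was later reverted


def red_flags(records, min_median_seconds=5.0, max_accept_rate=0.9, max_revert_rate=0.1):
    """Return the list of warning signs for one user's recent suggestions.

    All three thresholds are placeholders. The number of suggestions
    handled per day is intentionally not treated as a signal.
    """
    if not records:
        return []
    flags = []
    if median(r.seconds_spent for r in records) < min_median_seconds:
        flags.append("too_fast")
    accepted = [r for r in records if r.accepted]
    if len(accepted) / len(records) > max_accept_rate:
        flags.append("over_accepting")
    if accepted and sum(r.reverted for r in accepted) / len(accepted) > max_revert_rate:
        flags.append("reverted")
    return flags
```

The software could prompt the user to improve on the first flag and cut off access only if the flags persist.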

MMiller (WMF) (talkcontribs)

Thanks, @Sdkb. I agree that the three behaviors you listed (speed, overacceptance, and reverts) are the best ways to guess if someone is editing poorly. We're going to see which of these we'll be able to implement. Restricting people to a certain number of suggestions per day is the bluntest way (and likely the easiest to implement technically), but as you say, it restricts the productive editors as well as the unproductive ones.

Off the top of your head, what do you think might be the minimum amount of time someone should spend on the task, such that if they're spending less, we should be concerned?

Sdkb (talkcontribs)

Makes sense. And I'm not quite sure about the minimum time; it's hard to tell without the context of how long most editors spend.

Pigsonthewing (talkcontribs)
MMiller (WMF) (talkcontribs)

Hi all -- I just wanted to let you all know that I posted the findings from our user tests of the designs for Iteration 1. There are a few interesting things to look at in the Design section (with more details on the project page):

  • Interactive prototypes of the design concepts we tested. You can actually click through them and try out real image suggestions.
  • Findings from the user tests when we asked 32 newcomers to try out the prototypes.
  • Link to the final mockups based on the user test findings.

Designs have not changed dramatically from what you last saw in August, but we did learn some important things from the tests. Users understand the task quite well: they understand that they are adding images to Wikipedia articles, that they need to use their judgment while considering the article text and metadata, and that the caption will be posted with the image in the article.

We're now doing the engineering work to build these designs. We're interested in your thoughts and reactions at this point (and any point in the process), but I will say that right now our next big learning moment will come when we actually deploy Iteration 1 to a small number of pilot wikis (likely Arabic, Czech, Bengali, and Spanish Wikipedias), all of which will have community buy-in for the test. We've taken in lots of community thoughts, and learned through several rounds of tests. We're starting to get to the point (about two months away) when we'll finally have some real results to discuss together and to decide whether/how to proceed! There are definitely still risks and open questions, and we'll need to learn how those pan out in reality so that we can adjust. I also recognize that you all have brought up the idea of directing the efforts toward adding and populating Wikidata infoboxes -- that's still in the backs of our minds as we use Iteration 1 to understand how well newcomers can handle tasks like this around matching images to content (and we're building it in the least expensive way first, to speed up the learning).

Thank you all for following along.
