Wikimedia Apps/iOS Suggested edits project/Alt Text Experiment

From mediawiki.org

Background

Why are we working on this?

Images are increasingly prevalent, and can provide information: Recent research shows that images within Wikipedia articles drive higher levels of engagement with articles, and provide information in addition to illustration.[1] If a user cannot view the image due to low or no vision, or internet connectivity, they can utilize the alternative text (or alt text). Alt text is text that can be associated with an image that serves the same purpose and conveys the same essential information as the image.[2]

Many images in Wikipedia articles lack alt text: In 2021, research showed that 46% of images on English Wikipedia had captions, 10% of images had alternative text, and only 3% had effective alternative text.[3] Addressing the lack of alt text for images in Wikipedia articles and on Commons has long been a topic of discussion, research, and organizing. Various alt text editing events have been organized by Affiliates in Poland, Ireland, Argentina, and through global events such as Image Description Week 2022.

Improving the accessibility of Wikipedia: Recommendation #2 from the Movement Strategy is to improve user experience by supporting compliance with the most advanced accessibility guidelines using free and open-source software (WCAG for web, W3C mobile web best practices, etc.). A recent study of image accessibility on Wikipedia across languages suggested that one part of the solution is to invest in tools that can surface articles and images without accessibility coverage.[4]

The iOS Wikipedia app was Apple's Editors' Choice in 2017 as a result of the app's accessibility features. Accessibility is an important factor in our design and development process on the apps teams, so a task to close the gap in images with quality alt text is a good fit for our team.

Our early prototype showed promise: We created a proof-of-concept experimental build for adding alt text to images on Wikipedia in the context of their article, and shared it with users at the 2023 GLAM conference.

After getting feedback, we feel confident about scaling our experiment in the production version of the app. Part of this experiment builds on top of our recent addition to the iOS app of the "Add an image" suggested edit.

User stories

  • As a Spanish Wikipedia editor in LATAM using the Add an Image feature for the first time on the Wikipedia iOS app, I would like to build confidence to add alt-text while on the bus, so that I can add alt-text to future image related tasks without concern of making the state of alt-text worse on Wikipedia for low vision users.
  • As a Portuguese Wikipedia editor in Brazil, I would like to become aware of opportunities to add alt-text to images in articles I care about, so that I can ensure all editors are able to gain the full context of those articles.
  • As a user navigating articles with a screen reader, I want quality alt-text to be available, so that I have the same additional context about an article as users who are not using screen readers.
  • As a user with limited data, I want to read alt-text, so that even if images are not loaded, I am aware of what is in the image.

Objective

If we conduct an A/B/C test with the alt-text suggested edits prototype in the production version of the iOS app, we can learn whether adding alt-text to images is a task newcomers are successful with and, ultimately, decide whether it is impactful enough to implement as a suggested edit on the Web and/or in the Apps.

This work is part of the 2024-2025 annual plan Wiki Experiences 1, focused on Contributor Experience.

The scaled experiment will not be a feed of images in need of alt text. Rather, we will prompt users to add alt text to images after they have made a related edit: either adding the image using image recommendations, or after editing an article. If the user accepts the prompt, they will be taken through a dedicated flow to add alt text to the image, with guidance and examples.

Through this experiment, we have the opportunity to

  • Learn if newcomers and experienced editors can successfully add alt text through a structured task with guidance
  • Try out 2 different moments to suggest additional edits to users:
    • After a user completes a suggested edit
    • After a user completes an article edit
  • Increase the number of images on target Wikipedias that contain useful alt text

Experiment Design

Research questions

  • Does reminding users to add alt text prove an effective way to increase the number of images with alt text, and number of "Add an image" edits that include alt text?
  • Should adding Alt text for images in articles be a stand-alone suggested edit? (And if so, would it be an appropriate task for newcomers?)
  • Would users appreciate prompts for other improvements they can make to the article they are editing?
  • Does guidance help newcomers and experienced editors complete the task?

Experiment B: Add alt text Prompt after "Add an Image" flow

Currently, only around 16% of edits completed through Image Recommendations add alt text. We want to learn if reminding users to add alt text proves an effective way to increase the number of edits that include alt text. Editors will only be shown the prompt once.

  • Hypothesis: if we prompt users once to add missing alt text to their image after they have published an image using the Image Recommendations suggested edit, we'll see 60% of editors choose to additionally publish alt text, and 10% will add alt-text for subsequent image recommendation edits made in the next 15 days.
  • Audience: logged-in editors who have completed an edit in the "Add an image" feature where alt text = null, and who have not already been sorted into experiment group C
  • Group A = 50% of "Add an image" editors are sorted into Control group when they forget alt text. They are not prompted to add missing alt text.
  • Group B = 50% of "Add an image" editors are sorted into Experiment group when they forget alt text. They are prompted once to add missing alt text.
  • Assignment into group A/B happens when a user completes an "Add an image" suggested edit where alt text = null.

Experiment C: Add alt text Prompt after standard editing flow

We want to learn if users would appreciate prompts for other improvements they can make to the article they are editing. This is a type of in-time suggested edit. Editors will only be shown the prompt once.

  • Hypothesis: if we prompt users to add missing alt text after they have published an edit on any article containing an image in need of alt text, 4% of editors will go on to add alt text to the image.
  • Audience: logged-in editors who made any type of edit on an article with an image lacking alt text, and who have not already been sorted into experiment group B
  • Group D = 50% of editors are sorted into Control group, they do not see any prompt after completing an eligible edit
  • Group C = 50% of editors are sorted into Experiment group, they see 1 alt-text prompt after completing an eligible edit.
  • Sampling into group D/C should happen immediately after a user publishes an edit that’s eligible for the prompt.
  • Eligible edits for follow-up prompt: any edit made by a logged-in user on an article in the main namespace
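The assignment rules for Experiments B and C can be sketched as follows. This is a minimal illustration, not the app's implementation: the hash-based 50/50 split, the function names, and the `groups` dictionary are assumptions made for the example. The page only specifies the split, the eligibility conditions, and that experiment groups B and C exclude each other.

```python
import hashlib
from typing import Dict, Optional


def _bucket(user_id: str, experiment: str) -> int:
    """Map a user to 0 or 1 deterministically (hypothetical 50/50 scheme)."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).digest()
    return digest[0] % 2


def assign_group(
    user_id: str,
    logged_in: bool,
    trigger: str,            # "image_rec" or "article_edit"
    alt_text_missing: bool,  # the relevant image has no alt text
    groups: Dict[str, str],  # assignments already stored for this user
) -> Optional[str]:
    """Return "A"/"B" (Experiment B), "C"/"D" (Experiment C), or None if ineligible."""
    if trigger not in ("image_rec", "article_edit"):
        raise ValueError(f"unknown trigger: {trigger}")
    if not logged_in or not alt_text_missing:
        return None
    if trigger in groups:
        # Assumption: once sorted, a user keeps the same group.
        return groups[trigger]
    if trigger == "image_rec":
        if groups.get("article_edit") == "C":
            return None  # already in experiment group C
        group = "B" if _bucket(user_id, trigger) else "A"
    else:
        if groups.get("image_rec") == "B":
            return None  # already in experiment group B
        group = "C" if _bucket(user_id, trigger) else "D"
    groups[trigger] = group
    return group
```

Keeping the stored assignment in `groups` means a user who triggers the same flow again gets the same group, which is assumed here so that the prompt is shown at most once.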

Feature Requirements

Must haves

  • Entry point will vary based on experiment group (see below)
  • If a user answers "no" to the "Add alt text" prompt, ask for the reason why
  • Prominent guidance for writing good alt-text
  • Users' ability to get context about the image from the article
  • Users' ability to access relevant metadata (can take user to Web)
  • Detection of which images do not have alt-text
  • Ability to publish alt-text
  • Alt-text edit should count as a separate edit ONLY when submitted separately
  • Alt text is published with automatic edit summary
  • Users should be prompted to provide feedback about the feature
  • Instrumentation that allows us to evaluate the alt-text submitted
  • Input field for alt-text in context of image, preview of alt-text in context of our existing editor
  • Warning when alt-text exceeds 125 characters
  • Entry field should not allow users to add line breaks
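The last two requirements above (the 125-character warning and the ban on line breaks) can be illustrated with a small sketch. The function names and the choice to collapse line breaks into spaces are assumptions for illustration. The requirement only says that line breaks should not be allowed, and an input field could equally block the key press.

```python
from typing import List, Tuple

ALT_TEXT_SOFT_LIMIT = 125  # threshold taken from the requirement above


def normalize_alt_text(raw: str) -> str:
    """Collapse line breaks and other whitespace runs into single spaces."""
    return " ".join(raw.split())


def check_alt_text(raw: str) -> Tuple[str, List[str]]:
    """Return the normalized text and any warnings (a warning does not block publishing)."""
    text = normalize_alt_text(raw)
    warnings = []
    if len(text) > ALT_TEXT_SOFT_LIMIT:
        warnings.append(
            f"Alt text is {len(text)} characters; "
            f"consider keeping it to {ALT_TEXT_SOFT_LIMIT} or fewer."
        )
    return text, warnings
```

For example, `check_alt_text("A grey cat\nasleep on a sofa")` returns the single-line text with no warnings, while a 126-character entry returns one warning.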

Nice to have

  • Positive Reinforcement
  • Do not allow users to copy and paste in image caption
  • Suggest Alt-text (think Machine Assisted Article Descriptions)
  • Can playback what was written in preview
  • Surface Categories and Depicts


Experiment B Must Haves

Experiment Group B (50% of Image Recs Suggested Edits users). They will enter the dedicated flow for Alt Text AFTER adding an image to an article using Image Recommendations.

Entry point: After image recommendations edit is submitted; if alt-text = null:

  1. Educate user on importance of alt-text and ask if they’d be willing to add alt-text to image
  2. If user selects yes; launch alt-text adding flow for most recently added image

After task completion:

  1. Show a survey to ask users whether it should be a separate task
  2. Return users to next image recommendations suggestion


Experiment C Must Haves

Experiment Group C: 50% of all editors receive encouragement to add alt-text to an image when editing an article with an image

Entry point:

  • After an edit where either of the following is true:
  1. No edit was made to an image, but there is an image in the article without alt text
  2. Edit made to image but alt-text = null
  • Show a prompt that an image in the article is in need of alt-text, explain the importance of adding alt-text, and ask users if they’re willing to add it
  • If the user selects Yes, launch the alt-text adding flow for one image in the article in need of alt text
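Both entry conditions depend on detecting images whose wikitext has no `alt=` parameter. The sketch below shows one way to do that for standard `[[File:...|alt=...|caption]]` syntax. It is a simplified assumption-laden example, not the app's detection logic: the function names are hypothetical, only the English `File:` and `Image:` prefixes are recognized, and localized namespace names, galleries, and images inside templates are ignored.

```python
import re
from typing import List, Optional

IMAGE_START = re.compile(r"\[\[(?:File|Image):", re.IGNORECASE)


def _extract_link(wikitext: str, start: int) -> Optional[str]:
    """Return the inside of the [[...]] link that opens at `start`, allowing nested links."""
    depth = 0
    i = start
    while i < len(wikitext) - 1:
        pair = wikitext[i:i + 2]
        if pair == "[[":
            depth += 1
            i += 2
        elif pair == "]]":
            depth -= 1
            i += 2
            if depth == 0:
                return wikitext[start + 2:i - 2]
        else:
            i += 1
    return None  # unbalanced brackets


def _split_params(inner: str) -> List[str]:
    """Split on '|' at the top level only, so pipes inside nested links or templates are kept."""
    parts, current, depth, i = [], [], 0, 0
    while i < len(inner):
        pair = inner[i:i + 2]
        if pair in ("[[", "{{"):
            depth += 1
            current.append(pair)
            i += 2
        elif pair in ("]]", "}}"):
            depth = max(depth - 1, 0)
            current.append(pair)
            i += 2
        elif inner[i] == "|" and depth == 0:
            parts.append("".join(current))
            current = []
            i += 1
        else:
            current.append(inner[i])
            i += 1
    parts.append("".join(current))
    return parts


def images_missing_alt(wikitext: str) -> List[str]:
    """List file names of images with no alt parameter or an empty one."""
    missing = []
    for match in IMAGE_START.finditer(wikitext):
        inner = _extract_link(wikitext, match.start())
        if inner is None:
            continue
        params = _split_params(inner)
        title = params[0].split(":", 1)[1].strip()
        has_alt = any(
            p.strip().lower().startswith("alt=") and p.strip()[4:].strip()
            for p in params[1:]
        )
        if not has_alt:
            missing.append(title)
    return missing
```

An empty `alt=` is treated as missing here, matching the "alt-text = null" condition above.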

After task completion

  1. Ask satisfaction
  2. Ask if this should be a dedicated task
  3. Bring user back to article they were reading

Experiment C Nice to Haves

  • Affordance to reveal / review alt-text on images in an article
  • If someone is abandoning an edit, launch the prompt for adding an image if they have over 50 edits

Target Wikis

While we welcome feedback from everyone, we are especially interested in hearing from:

  • Spanish, French and Portuguese speakers in the Americas (North, South and Central) and Caribbean
  • Chinese speakers in North America

We are planning to run this experiment in partnership with the following wikis:

  • Spanish Wikipedia
  • French Wikipedia
  • Portuguese Wikipedia
  • Chinese Wikipedia

How will we know we are successful?

Evaluating the quality of Alt Text:

In our preliminary research, we learned that alt text additions are not formally patrolled, so we may not be able to rely on revert rate alone to understand the quality of alt text being added. We plan to partner with an accessibility organization to evaluate the quality of the alt text produced through this experiment. We will ask them to rate each alt text entry from 1-5 with 5 being the highest quality, and 1 being lowest quality. They will also provide a predicted revert score, answering the question: if they saw that alt text added to an image, would they remove it from Wikipedia?

Leading indicators to be measured after 15 days

  1. 100 edits with alt text values, from at least 25 unique editors. At least 25 edits are from newer editors
  2. More than 15 unique editors have been assigned to each experiment group
  3. 70% task acceptance rate for group B, at least 10% acceptance rate for group C (# of people who enter the flow / impressions of prompt)
  4. Revert rate for newer editors' edits in any single group does not exceed 18%
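As a rough illustration, leading indicators 1 and 3 could be checked from exported event data as shown below. The record layout, function names, and the reading of "70%" as "at least 70%" are assumptions for the example. "Newer editors" follows this page's definition of fewer than 10 edits on the wiki when entering the experiment.

```python
from typing import Dict, List, Tuple

NEWER_EDITOR_MAX_EDITS = 10  # fewer than 10 edits at entry, per this page's definition


def acceptance_rate(entered_flow: int, impressions: int) -> float:
    """Number of people who enter the flow divided by impressions of the prompt."""
    return entered_flow / impressions if impressions else 0.0


def leading_indicators(
    alt_text_edits: List[dict],            # each: {"editor": str, "edits_at_entry": int}
    prompts: Dict[str, Tuple[int, int]],   # group -> (entered_flow, impressions)
) -> Dict[str, bool]:
    editors = {edit["editor"] for edit in alt_text_edits}
    newer_edits = [
        edit for edit in alt_text_edits
        if edit["edits_at_entry"] < NEWER_EDITOR_MAX_EDITS
    ]
    return {
        "at_least_100_edits": len(alt_text_edits) >= 100,
        "at_least_25_editors": len(editors) >= 25,
        "at_least_25_newer_edits": len(newer_edits) >= 25,
        "group_b_acceptance": acceptance_rate(*prompts["B"]) >= 0.70,
        "group_c_acceptance": acceptance_rate(*prompts["C"]) >= 0.10,
    }
```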

Key Indicators

  1. 60% of group B editors publish an additional edit with alt text for the image they were prompted on
  2. 4% of group C users add alt text when prompted after editing an article
  3. Of group B editors who make a subsequent image recommendations edit in the next 15 days, 25% add alt-text as a part of that edit
  4. 200 images are improved with Alt text, by at least 50 unique editors

Decision Matrix for Next Steps

  1. If 71% of edits are scored a 3 or higher* we will scale the feature. If less than 70% of edits are scored a 3 or higher we will improve guidance or use AI to better assist users.
  2. If quality scores* for newer editors are more than 50% worse than quality scores for experienced editors, we will not recommend this task be available to newer editors.
  3. If we see at least 60% say they would use a feature that provided a feed of images in need of alt text, then we will have the confidence to pursue a feed of alt-text suggested edits.
  4. If 60% or more of respondents say they would be interested in similar edit notifications for articles they are working on, and 60% of respondents are satisfied with the feature (Group C survey responses), we share this information and consider future edit prompts.
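Rule 1 of the matrix can be sketched as a small function over the 1-5 quality grades. The function names and return strings are illustrative assumptions. The thresholds are used as written above, reading "71%" as "at least 71%", so a share between 70% and 71% is reported as unspecified rather than guessed.

```python
from typing import List


def share_scored_at_least(scores: List[int], minimum: int = 3) -> float:
    """Fraction of graded edits with a quality score of `minimum` or higher."""
    if not scores:
        raise ValueError("no graded edits")
    return sum(1 for score in scores if score >= minimum) / len(scores)


def quality_decision(scores: List[int]) -> str:
    share = share_scored_at_least(scores)
    if share >= 0.71:
        return "scale the feature"
    if share < 0.70:
        return "improve guidance or add AI assistance"
    return "unspecified by the decision matrix"
```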

Guardrails

  1. Edit return rate of editors in group B or C who have received an Alt text prompt does not differ from controls by more than 10%
  2. Revert rate for edits from experiment groups does not exceed controls by more than 5 percentage points
  3. Human-graded* or actual Revert rate for newer editors in experiment groups does not exceed 18%
  4. Alt text task completion rate for newer editors is above 25% (Completion rate = number of alt text edits published / those who said “yes” to the prompt and started the flow)
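Guardrails 2 and 4 can be expressed as simple rate checks. The sketch below is illustrative only, with hypothetical function and parameter names. It shows that guardrail 2 compares revert rates in percentage points (an absolute difference), and that the completion rate divides published alt text edits by the number of users who started the flow.

```python
from typing import Dict


def rate(numerator: int, denominator: int) -> float:
    return numerator / denominator if denominator else 0.0


def check_guardrails(
    experiment_reverted: int,
    experiment_edits: int,
    control_reverted: int,
    control_edits: int,
    newer_published: int,   # alt text edits published by newer editors
    newer_started: int,     # newer editors who said "yes" and started the flow
) -> Dict[str, bool]:
    # Absolute difference between the two revert rates, in percentage points.
    revert_gap_pp = 100 * (
        rate(experiment_reverted, experiment_edits)
        - rate(control_reverted, control_edits)
    )
    completion = rate(newer_published, newer_started)
    return {
        "revert_gap_within_5pp": revert_gap_pp <= 5,
        "newer_completion_above_25pct": completion > 0.25,
    }
```

With an 8% revert rate in the experiment groups and 4% in controls, the gap is 4 percentage points and the guardrail holds, even though the experiment rate is double the control rate in relative terms.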

Curiosities

  1. How does the task completion rate and revert rate for newer editors' alt text edits compare with that of experienced editors? With that of comparable rates from Growth suggested edits?
  2. How does the human-graded* revert rate compare to Android's Image Captions Suggested Edit?
  3. Is there a difference in metrics by language and geography? (For example, breaking down edits from Latin America vs Europe for Spanish)
  4. What is the most common reason that users decide not to act on the prompt?

* Note: for the quality scores and human-graded revert scores, we will partner with an accessibility organization who will be reviewing and grading alt text.

Definition of newer editors: Editors who had fewer than 10 edits on the wiki they were editing at the point they entered the experiment

Designs

Experiment B (Image Recommendations Flow)

Experiment C (Article Editing Flow)

How to follow along

We have created T357437, Alt-Text Suggested Edit Scaled Experiment on iOS as our Phabricator epic. We encourage your collaboration there or on our Talk Page. There will also be periodic updates to this page as we make progress on the experiment.

Updates

September 2024

  • The alternative text experiment is now live in production! You can view the work that went into this release on this task: T357440
  • Our preliminary 15-day analysis shows that:
    • 150 app users have been assigned into the experiment.
    • For those shown the prompt to add alt text, 55.3% are accepting the task after having completed an Image recommendation, and 16.0% are accepting the task after making an article edit. The acceptance rate for image recommendations was lower than we expected, but for article editing it was higher than expected.
    • 7 successful alternative text edits have been published by 7 unique users, and 0% have been reverted so far.
  • We are planning to extend the length of the experiment to 60 days, and to display another announcement about “Add an image” to motivate more users to enter the experiment. The experiment is now scheduled to run until 5 November.
  • We will be partnering with the accessibility organization Fundación Dalat to evaluate the results of the experiment. Native speakers of Chinese, French, Spanish, and Portuguese will grade the alternative text entered in their respective languages and review the edits.

August 2024 Special Update

  • The alt text experiment has been released to Beta Testers, on Spanish, Portuguese, French, and Chinese Wikipedias. Editors making edits on articles containing images without alt text, or adding images using “Add an Image” may see one prompt asking them to add alternative text to an image. The experiment will run until early November.
  • Check out a demo of the flow on Portuguese Wikipedia:
    Two A/B tests running in the iOS App that prompt users to add alternative text after a relevant edit
  • If you would like to download the beta version of the app to test, please follow these testing instructions.

August 2024

  • We began work on the Alternative Text experiment. We added a new capability in the app: developer settings. This will allow us to move quickly and deploy things behind feature flags more in the future. Tasks completed included:
    • Add developer settings option in the app. T334848
    • Add feature flag and toggle into Developer settings for alt text experiment. T370221
    • Update ABTestsController, Assign groups for alt text experiment. T370228
    • Publish Wikitext for Alt Text experiment - Flow B. T370236
    • Trigger Alt-Text task from Image Recommendations (flow B) and create Modal. T370229
    • Bypass edit count logic in existing explore feed card logic for target wikis. T370224
    • Add ability to send analytics events to wmflabs. T371412
  • All tasks are visible in Sub epic: T357437

July 2024

  • We held a Deep dive meeting with the team and stakeholders from the GLAM and Research teams to review the experiment plan, designs, and timeline. Most recent designs have been added above
  • We defined our comparison group for learning how newer editors performed on the task as: editors who had fewer than 10 edits on the wiki they were editing at the point they entered the experiment. We’ll use the term “newer editors” to refer to this group, so as not to confuse them with brand new accounts.
  • The team began development on the Alt text experiment. You can follow our work on the Engineering Epic: T357440
  • We learned that in iOS, the baseline for image recommendation edits that contain alt text is 16%.
  • We reviewed a sample of the alt text that had been entered by users in Image Recommendations. We saw that 15% of edits evaluated repeated the exact same text in the image caption as in the alt text, and that only 14% of edits evaluated had alt text of reasonable quality. This stresses the importance of adding guidance for writing effective alt text in our flow, and for preventing users from copy / pasting between the fields.
  • We learned about a new form of AI that is creating alt text for images on the London Museum’s website. If results from this experiment show that users struggle with writing alt text, this could be an avenue to explore for adding AI assistance.

June 2024

  • We completed analysis of usability testing for flows B and C in English, and will be updating the designs based on these results and design review. T360567
    • Summary of results:
      • There was clear understanding on what the alt text prompt was asking them to do
        • For flow B: 5/5 understood that they were being asked to add alt text to the previous image because it's missing
        • For flow C: 5/5 understood that they were adding alt text for an image from the article in which they had just corrected a misspelled word
      • The onboarding, tooltips, and guidance were clear and helpful to users, with some room for small improvements
        • 10/10 thought the onboarding was clear
        • 9/10 thought the tooltips were clear and helpful
      • Users would be able to zoom, but their first choice was not supported
        • 5/10 tapped on the image first, then pinched to zoom
        • 5/10 would pinch and zoom first on thumbnail, then tap to open
      • Most users could find the guidance they needed
        • 6/10 would open the "Guidance for writing alt text" link
      • Users could find additional information about the image
        • 6/10 knew to click on the info-i next to the image, 1/10 found it after being prompted.
        • 1/10 tapped on the image first, then found the info-i button
      • Most testers thought the image information provided was complete, while others noted room for small improvements
        • 6/10 felt the information was complete and not missing anything important
      • Testers understood the task, and were able to successfully summarize it afterwards
      • All testers felt confident they could add alt text to a new image

May 2024

  • Analysis of User testing is underway in English and Chinese. The testing was done with two prototypes of the flows for adding Alt text:
    • Flow B prompting users to add alt text to the image they have just finished adding using “Add an image”
      Prototype for Flow B of the Alt text experiment
    • Flow C prompting users to add alt text to an image in an article they have just fixed a typo or made an edit on:
      Prototype for Flow C of the Alt text experiment

September 2023 - April 2024

For earlier updates about the Alt Text suggested edit, see the overview page's updates.

Below are designs for the experiment as of April 2024:

References