Future of Language Incubation
Future of Language Incubation is a new experiment to implement and test recommendations to support language incubation during the 2024–2025 fiscal year. This is a cross-departmental effort involving members from various teams: Language and Product Localization, Research, Product Analytics and Data Persistence SRE, and Community Programs.
The lessons from this experiment will help assess the recommendations' impact on communities and set a direction for language incubation. See Future of Language Incubation#Strategy and approach.
Below, you can find information about the goals of this project, the history that has informed it, and why the Wikimedia Foundation's Product Department is prioritizing this work.
Objectives
- Communities are supported to effectively close knowledge gaps through tools and support systems that are easier to access, adapt, and improve, ensuring increased growth in trustworthy encyclopedic content. Part of the Wikimedia Foundation's Annual Plan 2024–25.
- There is a clear picture of the state of languages and the process of supporting existing and new languages in the Wikimedia movement. Part of the Wikimedia Foundation's Annual Plan 2023–24.
Current status of languages incubation
As of April 2024, Wikipedia has 326 active language editions. Yet there are many more living languages in the world (7,164 according to Ethnologue), including many spoken or signed by millions of people, for which there is no Wikipedia and no wiki at all. This is a blocker to fulfilling our vision that every single human being can freely share in the sum of all knowledge. Currently, Incubator serves as a centralized system for language creation within the Wikimedia ecosystem. It has been operating for 18 years without a platform owner. The current process of language incubation is divided into phases, each involving a series of complex manual processes:
Before Incubation: Requesting a new language in Incubator involves various manual steps, including understanding project principles, creating Meta and Translatewiki.net accounts, confirming language eligibility, and translating essential messages. Tracking, approving, and rejecting requests is also a labor-intensive manual process for Language Committee members.
During Incubation: Incubator faces technical limitations and lacks many of the modern features found in other Wikipedia wikis (e.g., ContentTranslation and Wikidata integration are missing). This deficiency leads to a poor editing experience for contributors, as highlighted in numerous Wikimedia convenings and previous research. Language wikis often remain in the Incubator for several years before graduating, primarily due to the poor editing experience, fewer community contributors, and a shortage of native speakers. According to April 2024 statistics, the average duration for a language wiki to graduate from the Incubator is 4.4 years (e.g., Fon Wikipedia). Other restrictions include the inability to search for content, add citations, and upload files. Additionally, many small and underserved languages lack support for essential tools such as appropriate keyboards, online dictionaries, spell checkers, and grammar tools, which hinders the editing process. Machine translation is not available for smaller languages.
After Incubation: Upon approval of a language by the Language Committee, the setup of the wiki site, content importing, and ongoing maintenance entail a series of manual steps carried out through collaboration among community members, the Language Committee, and server maintainers, which sometimes takes several days or even weeks.
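As an illustration of how a figure like the average time to graduation mentioned above could be derived, here is a minimal Python sketch. It assumes each wiki has a known Incubator start date and graduation date; the wiki names and dates are hypothetical placeholders, not real Wikimedia statistics, and the function names are invented for this example.

```python
from datetime import date
from statistics import mean

# Hypothetical (start_in_incubator, graduation) date pairs.
# These are placeholders for illustration only, NOT real Wikimedia data.
SAMPLE_WIKIS = {
    "wiki_a": (date(2015, 1, 1), date(2019, 1, 1)),
    "wiki_b": (date(2018, 6, 1), date(2021, 6, 1)),
    "wiki_c": (date(2012, 3, 1), date(2020, 3, 1)),
}

DAYS_PER_YEAR = 365.25  # average year length, accounting for leap years


def years_in_incubator(start: date, end: date) -> float:
    """Return the time between two dates, expressed in years."""
    return (end - start).days / DAYS_PER_YEAR


def average_graduation_years(wikis: dict) -> float:
    """Return the mean time to graduation, in years, across all wikis."""
    return mean(years_in_incubator(start, end) for start, end in wikis.values())


if __name__ == "__main__":
    avg = average_graduation_years(SAMPLE_WIKIS)
    print(f"Average time to graduation: {avg:.1f} years")
```

A mean like this is sensitive to a few wikis that stay in the Incubator for a very long time, so a median may be worth reporting alongside it.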
Envisioned future of languages incubation
In December 2023, several teams initiated discussions on enhancing the language onboarding processes, documented here. Various stakeholders from the community and staff shared their insights, contributing to the recommendations listed here.
Editing on Incubator should feel similar to editing on normal wikis, but we are far from achieving this goal.
We should forget about Incubator completely. And, find another way of starting wiki. Because of the complexities around it, it might take time to improve the technical side of it.
These recommendations aim to establish a streamlined technical infrastructure for creating language wikis and improving the complex processes involved in each of the distinct phases of language incubation: before, during and after. The recommendations cover various approaches, such as automating the addition and approval of new languages within Incubator, extending access to modern wiki features beyond Incubator, enhancing the editing experience within Incubator, and streamlining backend site creation processes. On the social front, they focus on fostering community growth and inclusivity within Wikimedia projects. Additionally, they propose exploring social pathways for language onboarding, including enhancing the discoverability of Incubator, creating welcoming pages, and orienting communities to relevant Wikimedia projects.
Strategy and approach
2024–25
Hypotheses (experiments we want to run in order to meet our key result):
- Collaborating with the Language Committee: If we create selection criteria and collaborate with the Language Committee to identify 5 languages that will receive access to a full-fledged wiki with modern features, we will be able to co-articulate feasibility and readiness requirements for implementation and success measurement across the selected language communities.
- Modern Features for Incubator Wikis: If we provide production wiki access to 5 new languages, with or without Incubator, we will learn whether access to a full-fledged wiki with modern features such as those available on English Wikipedia (including ContentTranslation and Wikidata support, advanced editing and search results) aids in faster editing. Ultimately, this will inform us if this approach can be a viable direction for language incubation for new or existing languages, justifying further investigation.
- Automate Backend Site Creation: If we move addwiki.php to MediaWiki Core and customize it for Wikimedia, we will improve the code quality of our wiki creation system, making it testable and robust, and make it easier for creators of new wikis to use, thereby taking a significant step towards simplifying the wiki creation process.
- Mapping Onboarding Journeys: If we document the pre-incubator, incubator, and post-incubator journeys for the five pilot wikis with quantitative and qualitative data, we will be able to better support new languages in the future.
Resources
- Research findings from the Language Diversity Hub examining challenges faced by contributors to small language versions of Wikipedia.
- Session on improving the Incubator led by Amir Aharoni at the Celtic Knot Conference.
- Wikipedia for Indigenous Communities session by Peter Gallert at Wikimania.
- Proposal for a wikiproject incubator to enable innovation by Gergő Tisza.
- Session on increasing language diversity on Wikimedia projects by Sadik Shahadu at WikiIndaba 2023.
- Slides from a Celtic Knot 2024 session that highlighted the current and future state of language incubation, research findings and potential improvements.
- Slides from a Wikimania 2024 session that focused on the state of language technology and onboarding at Wikimedia. Link to the video recording.
See also
- Gathering at WikiIndaba organized by User:MMunyoki (WMF) focusing on the Incubator.
- WMF's study on Incubator and language representation across Wikimedia projects.
- Previous Google Summer of Code (GSoC) project related to Incubator enhancements.