Language tools/Requirements/Indic language support

From mediawiki.org

This page gives an overview of the parameters that are essential for proper support of a language in MediaWiki. It lists the requirements that should be fulfilled for each language in scope.

Requirements

Requirements consist of two parts:

  • Languages in scope
  • Support properties

Languages in scope

Current list (taken from en:Languages with official status in India[1]):

Language and community information

Language | ISO 639-3 code[2] | ISO 639-1 code[3] | MediaWiki code[4] | Alternate names[5] | Autonym | Speakers (in M)[6] | Language written? | Literacy rate of speakers | Wikimedia community[7] | Standard body | Other communities | Language contacts | Other sources
Assamese | asm | as | as | Asambe, Asami, Asamiya | অসমীয়া | 17 | Yes | N | N | N | N | N | N
Bengali | ben | bn | bn | Bānglā-Bhāshā, Bāngālā, Bānglā | বাংলা | 181 | Yes | 85% | N | N | N | N | N
Bodo | brx | N | N | Bara, Bodi, Boro, Boroni, Kachari, Mech, Meche, Mechi, Meci | बोडो, Bodo, (Assamese script missing) | 1 | Yes | 61% | N | N | N | N | N
Chhattisgarhi | hne | N | N | Khaltahi, Laria | छत्तीसगढ़ी | 17 | Yes | N | N | N | N | N | N
Dogri | dgo | N | N | Dhogaryali, Dogari, Dogri Jammu, Dogri Pahari, Dogri-Kangri, Dongari, Hindi Dogri, Tokkaru | डोगरी or ڈوگرى | 4.7 | Yes | 18% | N | N | N | N | N
English | eng | en | en | Sekgoa, Anglit | English | 328 | Yes | N | en.wp | N | N | N | N
French | fra | fr | fr | Français | Français | 68 | N | N | fr.wp | N | N | N | N
Garo | grt | N | N | Garrow, Mande, Mandi | Mande, (Bengali missing) | 1 | Yes | 55%+ | N | N | N | N | N
Gujarati | guj | gu | gu | Gujarati | ગુજરાતી | 46 | Yes | 70% | gu.wp | N | N | N | N
Hindi | hin | hi | hi | Khadi Boli, Khari Boli | मानक हिन्दी | 181 | Yes | N | hi.wp | N | N | N | N
Kannada | kan | kn | kn | Banglori, Canarese, Kanarese, Madrassi | ಕನ್ನಡ | 35 | Yes | 60% | kn.wp | N | N | N | N
Khasi | kha | N | N | Kahasi, Kassi, Khasa, Khashi, Khasiyas, Khuchia | Khasi | 1 | Yes | 63%+ | N | N | N | N | N
Konkani | knn | N | N | Bankoti, Concorinum, Cugani, Central Konkan, North Konkan, Konkan Standard, Konkanese, Konkani Mangalorean, Kunabi | कोंकणी | 4 | Yes | N | N | N | N | N | N
Kok Borok | trp | N | N | Kakbarak, Kokbarak, Tipura, Tripura, Tripuri, Usipi Mrung | Kok-borok, (Bengali missing) | 1 | Yes | 74%+ | N | N | N | N | N
Maithili | mai | N | N | Apabhramsa, Bihari, Maitili, Maitli, Methli, Tirahutia, Tirhuti, Tirhutia | मैथिली | 35 | Yes | 37% | N | N | N | N | N
Malayalam | mal | ml | ml | Alealum, Malayalani, Malayali, Malean, Maliyad, Mallealle, Mopla | മലയാളം | 36 | Yes | 100 | N | N | http://smc.org.in | Santhosh | N
Meitei | mni | N | mni | Kathe, Kathi, Manipuri, Meiteilon, Meiteiron, Meithe, Meithei, Menipuri, Mitei, Mithe, Ponna | ꯃꯤꯇꯩꯂꯣꯟ | 1 | Yes | 73% | Wp.mni.ꯋꯤꯀꯤꯄꯦꯗꯤꯌꯥ | N | N | Awangba | N
Marathi | mar | mr | mr | Maharashtra, Maharathi, Malhatee, Marthi, Muruthu | मराठी | 68 | Yes | 77% | mr.wp | N | N | N | N
Mizo | lus | N | N | Duhlian Twang, Dulien, Hualngo, Lukhai, Lusago, Lusai, Lusei, Lushai, Lushei, Sailau, Whelngo | Mizo | 1 | Yes | 82% | N | N | N | N | N
Nepali | nep | ne | ne | Eastern Pahari, Gorkhali, Gurkhali, Khaskura, Nepalese, Parbatiya | नेपाली | 14 | Yes | 65% | ne.wp (correct page for VP?) | N | N | N | N
Oriya | ori | or | or | Odri, Odrum, Oliya, Orissa, Uriya, Utkali, Vadiya, Yudhia | ଓଡ଼ିଆ | 32 | Yes | 64% | or.wp | N | N | N | N
Eastern Panjabi | pan | pa | pa | Gurmukhi, Gurumukhi, Punjabi | ਪੰਜਾਬੀ | 28 | Yes | N | N | N | N | N | N
Malvi | mup | N | N | Malavi, Mallow, Malwada, Malwi, Ujjaini | (Devanagari script missing) | 10 | Yes | 58%+ | N | N | N | N | N
Sanskrit | san | sa | sa |  | संस्कृतम् | 0 | Yes | 80% | sa.wp | N | N | N | N
Santali | sat | N | N | Har, Hor, Samtali, Sandal, Sangtal, Santal, Santhali, Santhiali, Satar, Sentali, Sonthal | (Bengali script missing), (Devanagari script missing), Santhali, (Ol Chiki script missing), (Oriya script missing) | 6 | Yes | 20% | N | N | N | N | N
Sindhi | snd | sd | sd | Asambe, Asami, Asamiya | سنڌي सिन्धी | 21.3 | Yes | N | sd.wp | N | N | N | N
Tamil | tam | ta | ta | Damulian, Tamal, Tamalsan, Tambul, Tamili | தமிழ் | 65 | Yes | N | ta.wp | N | http://www.thamizha.com | ta:User:Logicwiki | N
Telugu | tel | te | te | Andhra, Gentoo, Tailangi, Telangire, Telegu, Telgi, Tengu, Terangi, Tolangan | తెలుగు | 70 | Yes | N | te.wp (?) | N | N | N | N
Urdu | urd | ur | ur | Bihari | اردو | 61 | Yes | N | ur.wp | N | N | N | N
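
In the rows above that list a MediaWiki code, the code matches the ISO 639-1 code where one exists and the ISO 639-3 code otherwise (Meitei, for example). The page does not state this as a rule, so the following is only a minimal sketch of that observed pattern. The function name candidate_mediawiki_code is hypothetical and is not part of MediaWiki.

```python
from typing import Optional


def candidate_mediawiki_code(iso639_1: Optional[str], iso639_3: str) -> str:
    """Suggest a language code following the pattern seen in the table.

    Assumption (not stated on this page): prefer the two-letter
    ISO 639-1 code, and fall back to the three-letter ISO 639-3 code
    when no two-letter code exists.
    """
    if iso639_1:
        return iso639_1.lower()
    return iso639_3.lower()


# Bengali has an ISO 639-1 code; Meitei does not.
print(candidate_mediawiki_code("bn", "ben"))
print(candidate_mediawiki_code(None, "mni"))
```

A suggested code is only a starting point. Whether a language is actually supported under that code still has to be checked against MediaWiki itself.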

Language support status

Language ISO 639-3 code[8] ISO 639-1 code[9] MediaWiki code[10] ISO 15924 scripts In Unicode 5.1? In CLDR? In glibc? Trunk support? Wikimedia support? Active translators? Most used translated? Core 90%+ Wikimedia 90%+ Plural added? Numerals added? Date/time added? Toolbar images? Gender added? Free fonts? WebFonts? Collection support? Narayam mappings Search working? Search in Wikimedia? Script conversion possible? Conversion tables available? Conversion implemented? Supported in Kiwix? Mobile support?
Assamese asm as as Beng Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes N No N No N No Yes Yes N No N No N No Yes Yes Yes Yes N   Yes Yes N   N   N   N   N   N No N  
Bengali ben bn bn Beng Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes N No N No Yes Yes Yes Yes N No N No N No Yes Yes Yes Yes N   Yes Yes N   N   N   N   N   Yes Yes N  
Bodo brx N   N No Deva Yes Yes Yes Yes N   N   N No N No N No N No N No N No N No N No N No N No N No N   N   N   N   N   N   N   N   N No N No
Chhattisgarhi hne N   N No Deva Yes Yes N   Yes Yes N No N No N No N No N No N No N No N No N No N No N No N   N   N   N   N   N   N   N   N   N No N  
Dogri dgo N   dgo Deva, Arab Yes Yes Yes Done Yes Done Yes Done Yes Done Yes Done Yes Done N No N No N No Yes Done N No N No N No N   Yes Done N   N   N   N   N   N   N   N No N  
English eng en en Latn Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes NA N   N   N   N   N   Yes Yes Yes Yes
French fra fr fr Latn Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes N   Yes Yes N   Yes Yes N   N   N   N   N   N   Yes Yes N  
Garo grt N   N No Beng, Latn N   N   N   N No N No N No N No N No N No N No N No N No N No N No N   N   N   N   N   N   N   N   N   N No N No
Gujarati guj gu gu Gujr Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes N No N No Yes Yes Yes Yes N No N No N No Yes Yes Yes Yes N   Yes Yes N   N   N   N   N   N No N  
Hindi hin hi hi Deva Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes N No N No Yes Yes Yes Yes N No N No N No Yes Yes Yes Yes N   Yes Yes N   N   N   N   N   N No N  
Kannada kan kn kn Knda Yes Yes Yes Yes Yes Yes N   N   N No N No N No N No N No Yes Yes N No N No N No Yes Yes Yes Yes N   Yes Yes N   N   N   N   N   N No N  
Khasi kha N   N No Latn (modern texts), Beng (old texts) N   N   N   N No N No N No N No N No N No N No N No N No N No N No N   N   N   N   N   N   N   N   N   N No N  
Konkani knn N   N No Deva, Knda, Mlym, Arab, Latn (depends on area) Yes Yes N   N   N No N No N No N No N No N No N No Yes Yes Yes Yes N No N No N   N   N   N   N   N   N   N   N   N No N  
Kok Borok trp N   N No Beng, Latn N   N   N   N No N No N No N No N No N No N No N No N No N No N No N   N   N   N   N   N   N   N   N   N No N  
Maithili mai N   mai Tirh, Kthi, Deva N   N   N   Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes N No N No N No N No N No N No N   N   N   N   N   N   N   N   N   N No N  
Malayalam mal ml ml Mlym Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes N No N No N No Yes Yes Yes Yes N   Yes Yes N   N   N   N   N   Yes Yes N  
Meitei mni N   mni Mtei, Beng Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes N No N No Yes Yes N No N No N No Yes Yes N   N   Yes Yes Yes Yes N   Yes Yes N   Yes Yes Yes Yes N  
Marathi mar mr mr Deva (also "Modi", not encoded) N   Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes N No Yes Yes Yes Yes N No N No N No Yes Yes Yes Yes N   Yes Yes N   N   N   N   N   N No N  
Mizo lus N   N No Latn Yes Yes N   N   N No N No N No N No N No N No N No N No N No N No N No N   N   N   N   N   N   N   N   N   N No N  
Nepali nep ne ne Deva. According to Wikipedia in older texts also Takr; and Bhujimol and Ranjana (not encoded) N   Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes N No Yes Yes Yes Yes N No N No N No N   N   N   Yes Yes N   N   N   N   N   Yes Yes N  
Oriya ori or or Orya Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes N No N No Yes Yes Yes Yes N No N No N No Yes Yes Yes Yes N   Yes Yes N   N   N   N   N   Yes Yes N  
Eastern Panjabi pan pa pa Guru, Deva Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes N No N No N No N No Yes Yes Yes Yes N No N No N No Yes Yes Yes Yes N   Yes Yes N   N   N   N   N   N No N  
Malvi mup N   N No Deva N   N   N   N No N No N No N No N No N No N No N No N No N No N No N   N   N   N   N   N   N   N   N   N No N  
Sanskrit san sa sa Deva, Latn, others Yes Yes N   Yes Yes Yes Yes Yes Yes Yes Yes N No N No N No N   Yes Yes N   N   N   N   Yes Yes N   Yes Yes N   N   N   N   N   N No N  
Santali sat N   N No Olck Yes Yes N   N   N No N No N No N No N No N No N No N No N No N No N No N   N   N   N   N   N   N   N   N   N No N  
Tamil tam ta ta Taml Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes N No N No N No Yes Yes Yes Yes N No N No N No Yes Yes Yes Yes N   Yes Yes N   N   N   N   N   N No N  
Telugu tel te te Telu Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes N No N No Yes Yes Yes Yes N No N No N No Yes Yes Yes Yes N   Yes Yes N   N   N   N   N   N No N  
Urdu urd ur ur Arab Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes N No N No N No N No Yes Yes N No N No N No N No Yes Yes N   N   Yes Yes N   N   N   N   N   N No N  

Support properties

This list is to be discussed with the team. The current version was created by Gerard, Niklas, Santhosh and Siebrand.

  1. Information to include in all overviews
    1. What is the language name (in English)?
    2. What is the ISO 639-3 code (with link to Wikipedia article and Ethnologue)?
    3. What is the ISO 639-1 code (if any)?
    4. What is the MediaWiki language code (if any)?
  2. Language and community information
    1. What are the autonyms for the language in the scripts it can be written in?
    2. Which alternate names does the language have?
    3. What is the number of speakers (per Ethnologue)?
    4. What is the literacy rate (L1 only; leave blank if not available)?
    5. Does the language have a writing system?
    6. Is there an active Wikimedia community for this language (links to project's village pump page/embassy)?
      1. Is there a standard body for the language? If so, which?
      2. If no active Wikimedia community, is there an active online community (links to projects)?
      3. Who are the 1-3 go-to people for testing language support for this language?
      4. Which initiatives can we use to speed up our language support?
  3. Language support information
    1. In which ISO 15924 scripts is the language being written? Add RTL if right-to-left.
      1. Are the characters used in the various scripts of the language part of the Unicode 5.1 standard[11]?
      2. Do the language/script combinations have CLDR[12] presence?
      3. Do the language/script combinations have glibc[13] presence?
    2. Is the language currently supported in MediaWiki trunk (1.19 alpha)?
      1. Is the language currently supported in the Wikimedia deployment?
      2. Are there active translators for the language/script combinations?
      3. Have the most often used messages been translated?
      4. Have more than 90% of MediaWiki core messages been translated?
      5. Have more than 90% of the MediaWiki extensions used by Wikimedia been translated?
      6. Is plural correctly defined for the language?
      7. Are numerals, number grouping and separators implemented for all scripts?
      8. Are time and date formatting implemented for all scripts/locales?
      9. Are localised images for the toolbar needed and added?
      10. If the language uses gender, has support been added (namespaces)?
      11. Is collation correctly supported for all scripts?
    3. Are there freely licensed fonts available that support the language/script combinations (add names/URLs)?
      1. Is support for WebFonts needed? Explain why if not.
      2. Is there support for the fonts in the WebFonts extension?
      3. Are the scripts/fonts supported in PDF export/Collection extension?
    4. Which keyboard[14]/script mappings in use need to be available in Narayam and as an on-screen keyboard?
    5. If the language uses multiple scripts, is automated script conversion feasible?
      1. Are script conversion tables available?
      2. Has script conversion been implemented?
    6. Is search working properly for the language's scripts in standard MediaWiki (confirmed by go-to person)?
      1. Is search working properly for the language's scripts in the Wikimedia setup (confirmed by go-to person)?
    7. Is the language supported for offline use (Kiwix 80%+, proper directionality support)?
    8. Mobile support (which questions need answering?)
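
Two of the questions above lend themselves to a quick automated check: whether a script is written right-to-left, and how numerals look in a given script. The following is a minimal sketch using only the Python standard library. The names is_rtl, localize_digits and DIGIT_ZERO are hypothetical helpers for illustration, not MediaWiki APIs, and only two scripts are covered. Whether a language actually uses script-specific digits in its interface is a per-language decision that this sketch does not answer.

```python
import unicodedata

# Code point of DIGIT ZERO per ISO 15924 script code.
# Illustrative subset only; other scripts would need their own entries.
DIGIT_ZERO = {
    "Deva": 0x0966,  # DEVANAGARI DIGIT ZERO
    "Beng": 0x09E6,  # BENGALI DIGIT ZERO
}


def is_rtl(text: str) -> bool:
    """Guess directionality from the first strongly directional character.

    "R" and "AL" are the right-to-left bidirectional classes in Unicode;
    "L" is left-to-right. Text with no strong character counts as LTR here.
    """
    for ch in text:
        bidi = unicodedata.bidirectional(ch)
        if bidi in ("R", "AL"):
            return True
        if bidi == "L":
            return False
    return False


def localize_digits(text: str, script: str) -> str:
    """Replace ASCII digits 0-9 with the digits of the given script."""
    zero = DIGIT_ZERO[script]
    table = {ord("0") + i: chr(zero + i) for i in range(10)}
    return text.translate(table)


print(is_rtl("اردو"))                    # Urdu autonym, Arabic script
print(is_rtl("हिन्दी"))                   # Devanagari text
print(localize_digits("2011", "Deva"))
```

Number grouping and separators, which the list also asks about, are a separate concern and are not handled by the digit mapping.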

References
