Content translation/Templates support rewrite 2016

From mediawiki.org

This is a planning and architecture document for the project to rewrite CX's template support in mid-2016. For documentation for translators and template maintainers, see Content translation/Templates.

Current state

  1. All block-level templates are filtered out from the source page, unless a template mapping exists.
  2. Inline templates are not filtered out from the source page.
    1. Example: Cite web templates
  3. While translating, we check whether there is a corresponding template in the target wiki (using template mapping or Wikidata). If there is, we transfer the template as read-only. The template name is adapted, and so are the parameter names if they are given in the template mapping.
  4. There is no editing support for templates.
  5. The user is shown no message when a template is removed from the source page, nor any indication of the adaptation status in the translation.

Problems

Template mapping inside CX source code is not scalable

Keeping the template mapping inside the CX source code does not scale. In practice, the JSON configuration has not been expanded. It is possible that people are not aware of its existence.

Adaptation of template name alone is problematic

Adapting the template name alone is highly problematic. For example, in a translation from Spanish to Catalan, the template name is adapted to Ref-web, but the mandatory parameters titol and url are not adapted. The published article will then show the following error: Mandatory parameter titol not found

The translator then has extra work cleaning up such templates manually after publishing.
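
The failure can be reproduced with a minimal sketch. The set of mandatory parameters and the source parameter name `título` are assumptions made for illustration, not the real template definitions; `missing_mandatory` is a hypothetical helper.

```python
# Hypothetical description of the target template: its mandatory parameters.
TARGET_REQUIRED = {"Ref-web": {"titol", "url"}}


def missing_mandatory(template_name, params):
    """Return the mandatory target parameters that are absent from params."""
    required = TARGET_REQUIRED.get(template_name, set())
    return sorted(required - set(params))


# The template name was adapted to Catalan, but the parameter
# names were left as they were in the (assumed) Spanish source.
name_only = {"título": "Ejemplo", "url": "http://example.org"}
print(missing_mandatory("Ref-web", name_only))
```

A check of this kind, run before publishing, would flag the name-only adaptation instead of letting the error surface in the published article.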

Translator is not informed

Template adaptation is silent: the UI gives no warning or information about whether a template was removed from the source article, or about the adaptation status in the translation.

Non-editable areas inside sections are not visually marked, which may confuse the translator.

Fallback is not really sensible

Using the text inside the template as the fallback in the translation does not always give the desired outcome.

Challenges

Templates are local to wikis

Some templates exist in most wikis. Some don’t. It is possible that a template has no corresponding template in any other language.

Connected does not mean same template

A single template in one wiki may be implemented in a completely different way in another wiki.

For example, a single template in one wiki may correspond to a group of three templates in another wiki.

Nested templates

A single template can contain multiple templates as parameter values.
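
A minimal sketch of how nesting could be detected, assuming that plain brace matching is good enough for illustration. The function name `template_names_with_depth` and the sample wikitext are hypothetical. Real wikitext (triple-brace parameters, `<nowiki>`, parser functions) needs a real parser such as Parsoid.

```python
def template_names_with_depth(wikitext):
    """Return (depth, name) for each {{...}} transclusion; outermost depth is 0."""
    results = []
    depth = 0
    i = 0
    while i < len(wikitext) - 1:
        pair = wikitext[i:i + 2]
        if pair == "{{":
            # The template name runs up to the first '|', '{' or '}'.
            j = i + 2
            while j < len(wikitext) and wikitext[j] not in "|{}":
                j += 1
            results.append((depth, wikitext[i + 2:j].strip()))
            depth += 1
            i += 2
        elif pair == "}}":
            depth = max(depth - 1, 0)
            i += 2
        else:
            i += 1
    return results


sample = "{{Infobox|name=X|ref={{Cite web|url=u|date={{Date|2016}}}}}}"
for depth, name in template_names_with_depth(sample):
    print("  " * depth + name)
```

Each nested template would have to be adapted on its own before the enclosing template can be considered adapted.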

Knowing template parameters

The TemplateData extension documents template parameters in a machine-readable way. But this is incomplete: it exists only for a subset of templates, and rarely in smaller wikis.

As a fallback, we can extract the template parameters from the template source code.
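
As a rough sketch of this fallback, parameter names can be pulled out of the `{{{name}}}` and `{{{name|default}}}` placeholders in the template source. The regular expression, the helper name `infer_parameters` and the sample template source are assumptions for illustration. A list inferred this way carries only names, without type, description or mandatory/optional information.

```python
import re

# Matches the name part of {{{name}}} and {{{name|default}}}.
PARAM_RE = re.compile(r"\{\{\{\s*([^{}|]+?)\s*(?=\||\}\}\})")


def infer_parameters(template_source):
    """Return parameter names in order of first appearance, without duplicates."""
    seen = []
    for name in PARAM_RE.findall(template_source):
        if name not in seen:
            seen.append(name)
    return seen


# Hypothetical template source, for illustration only.
source = "<cite>[{{{url}}} {{{title|{{{titol|}}}}}}] {{{access-date|}}} {{{url}}}</cite>"
print(infer_parameters(source))
```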

Parameter correspondence

Even if we identify a template pair, knowing which parameter maps to which is not easy. Many wikis just use English parameter names, often because the template is a copy of the English one. In this case we can do the template mapping easily.

Minor variations in parameter names, for example URL vs url, or access-date vs accessdate, access date or access_date, can be resolved by a normalized search.

Some wikis use the English parameter name as an alias for the local parameter, so we can also do a lookup in the template parameter aliases (example).
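
The normalized search and the alias lookup could be combined roughly as follows. This is a sketch, not the CX implementation: it assumes the target template has TemplateData with a `params` object whose entries may list `aliases`, and the Catalan parameter names in the example are invented for illustration.

```python
import re


def normalize(name):
    """Lower-case and drop spaces, hyphens and underscores."""
    return re.sub(r"[\s_-]+", "", name).lower()


def map_parameters(source_params, target_templatedata):
    """Map each source parameter to a target parameter name, or None."""
    lookup = {}
    params = target_templatedata["params"]
    # Register aliases first so that a canonical name wins on collision.
    for target_name, spec in params.items():
        for alias in spec.get("aliases", []):
            lookup[normalize(alias)] = target_name
    for target_name in params:
        lookup[normalize(target_name)] = target_name
    return {p: lookup.get(normalize(p)) for p in source_params}


# Hypothetical TemplateData for a target template.
target = {
    "params": {
        "títol": {"aliases": ["title"], "required": True},
        "url": {"required": True},
        "consulta": {"aliases": ["accessdate"]},
    }
}
mapping = map_parameters(["Title", "URL", "access_date", "publisher"], target)
print(mapping)
```

Parameters that map to `None` are the ones a translator would still have to handle by hand.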

Notable WMF initiatives

Increasing TemplateData coverage

  1. TemplateData: Implement inferred templatedata as fallback based on the contents of the template (T54581) – We can extract template parameters from template source code. But extra details like the type, description, default value, mandatory/optional values will be missing.
  2. Capture interlanguage template information as part of Extension:TemplateData – “We briefly discussed this during the development process, and decided that it was better to wait for the perennial proposal for cross-wiki templates to be implemented (and have TemplateData work seamlessly from that) rather than try to hack it in locally.” – James Forrester

Centralized template repository / Global templates

  1. Central Global Repository for Templates, Lua modules, and Gadgets (T121470) – This is one of the top 10 wishes from the Community Wishlist Survey.
  2. RFC: Shadow namespaces (T91162) – An idea similar to InstantCommons, where templates that do not exist locally fall back to another wiki, for example mediawiki.org.
  3. RFC: Sharing templates and modules between wikis (T122086) – poor man's version (investigation)
  4. Support crosswiki template inclusion (T6547) – transclusion => interwiki templates, etc.

Wikidata driven templates

  1. Mainly infoboxes. CX will be able to do this adaptation as more templates move in this direction.

CX Template adaptation strategy (Draft)

Note: This is just an idea dump, written by Santhosh for discussion only. It will be refined and expanded soon.

  1. Use an incremental approach: support the templates that we can.
  2. Do not generate partially or wrongly adapted templates in the published article: we are doing this now, and it needs to change. In particular, adaptations of the template name alone need to change.
  3. Avoid publishing errors caused by wrongly adapted templates: there are bugs in our code that cause publishing errors with templates. This is mainly technical work; some of it has already started.
  4. Do not silently drop block-level templates from the source: in the UI, we may show them grayed out or in a similar style and inform the translator that we cannot adapt the template. We may offer an optional way to copy the template to the translation, with a clear indication that the translator will have to edit it after publishing. The missing context may also be an issue. But this should not interfere with the translation: having a lot of such immovable objects in the way can be distracting.
  5. Visually represent inline templates with their adaptation status: clicking a template can open a card saying whether it was partially or fully adapted (all parameters mapped) and whether the translator needs to edit it manually after publishing, plus a button to remove it. People often complain that CX-generated articles create work for experienced users. If we hint at this to translators, they may edit the articles themselves after publishing, or at least be aware that some cleanup work is needed.
  6. Enhance the adaptation module so that it knows the adaptation status, instead of taking the “this will be ok, if not others will edit it later” approach. For this, use TemplateData and target template information. We need to use this to inform users about a possible need for extra manual edits. This is mainly technical work; some of it has already started.
  7. Do not diverge from mainstream WMF initiatives for cross-wiki templates: there are multiple volunteer efforts to do template translation (TemplateTranslation Lua module, a bot by Arnau Duran Ferrero at ca.wiki, templateParamWizard from the Hebrew wiki). We need to adopt the good parts of these ideas, but since template adaptation potentially involves a large, community-driven mapping project, relying on a smaller set of such data sources may not be productive (or can even be risky) in the longer term. The efforts in WMF seem to be mostly converging on centralized templates and on TemplateData defining machine-readable data. TemplateData will also abstract implementation changes to templates, since VE relies heavily on it. Since CX plans to switch to VE, we don’t want to diverge from the TemplateData approach.
  8. Crowdsource the template mapping inside CX: while the discussion and development towards a centralized template repository is going on, we can ask translators to do a one-time manual template mapping and remember it. We can offer this mapping to other translators working on the same template in the same language pair. People can optionally correct the mapping if we inform them that a specific template was partially adapted. Even better if this data can be stored in a JSON namespace on Meta or mediawiki.org, similar to how template mappings are stored in wiki pages.
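
Points 5, 6 and 8 can be tied together in a small sketch: a remembered mapping per language pair, and a function that reports the adaptation status so the UI can tell the translator what is left to do. Everything here is an assumption for discussion: the JSON layout, the `es>ca` key format, the source template name `Cita web` and its parameter names, and the three status labels.

```python
import json

# Hypothetical stored mapping, keyed by language pair and source template.
STORED = json.loads("""
{
  "es>ca": {
    "Cita web": {
      "target": "Ref-web",
      "params": {"título": "titol", "url": "url"}
    }
  }
}
""")


def adaptation_status(source_params, mapping, target_required):
    """Classify an adaptation as 'full', 'partial' or 'none'."""
    mapped = {mapping.get(p) for p in source_params} - {None}
    unmapped = [p for p in source_params if not mapping.get(p)]
    missing_required = sorted(set(target_required) - mapped)
    if not mapped:
        status = "none"
    elif unmapped or missing_required:
        status = "partial"
    else:
        status = "full"
    return {
        "status": status,
        "unmapped": unmapped,
        "missing_required": missing_required,
    }


entry = STORED["es>ca"]["Cita web"]
result = adaptation_status(
    ["título", "url", "fechaacceso"], entry["params"], ["titol", "url"]
)
print(entry["target"], result)
```

With a result like this, the card for an inline template could list the unmapped parameters, and a translator who fills them in would be contributing a correction back to the stored mapping.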