
Help:System message

From mediawiki.org
Note: When you edit this page, you agree to release your contribution under the CC0. See Public Domain Help Pages for more info.
Labelled diagram of the Special:Upload form, showing various system messages.

A system message is a snippet of plain text (nowiki), wikitext, CSS, or JavaScript that can be used to customize the behavior of MediaWiki and its appearance for each language and locale. MediaWiki uses messages for any user-facing part of the interface, allowing for internationalization and localization of the MediaWiki UI, for both core and extensions. All messages used in MediaWiki are defined in a messages file.

Overriding messages on-wiki

Messages can be overridden from their default values by editing them on-wiki. Each message has a wiki page in the MediaWiki namespace with its message key as the name of the page. For example, the "aboutsite" message is stored at MediaWiki:aboutsite. A list of all message pages can be found on Special:AllMessages. Editing an interface message works just like editing a normal wiki page, but the namespace is restricted to users with the "editinterface" permission, which is assigned to administrators (and interface administrators) by default.

Example row on the old Special:AllMessages.

The Special:AllMessages table contains two columns: the linked interface name, and the text. The text is horizontally split to show the default text above, and the customized text below. When a custom message does not exist, only the default will be shown. To customize a message, click the upper link in the left column (the name of the message). This link is red if the default text is in use, because the edit page is empty.

The lower links in the left column cells lead to the discussion pages for that message.

Overriding messages on the wiki is recommended only in the following cases:

  • The message has a severe mistake that must be fixed as soon as possible. In this case, it's recommended to also fix this mistake in the source code if it's in English or in the translation on translatewiki if it's not. When the correction is deployed, the page with the local customization should be deleted.
  • The local wiki uses different terminology. For example, many messages use the word "page", but the English Wikipedia often says "article" instead.
  • The local message is trying to add some unique functionality, for example for a gadget or a template. (In such a case, it is still recommended to consider changing the source message or to encapsulate this functionality in an extension, so that other wikis would be able to enjoy it conveniently, without having to copy customizations manually.)

Finding messages and documentation

How each message is used by MediaWiki, which variables and parameters are available, its limitations, and so on are explained in the message documentation in the qqq pseudo-language files, as per the message documentation guidelines. Longer explanation pages may exist for some interface messages in the older Category:Interface messages.

On translatewiki.net, the /qqq subpage of a message holds its documentation, written in English because the same text is shown to all readers.

Like /en, /zu, /fr, and so on, /qqq is a subpage of the message page and can be viewed directly.

In this sense, qqq is treated as a language and can be given in the language= parameter of a request.
MediaWiki version: 1.18

In MediaWiki 1.18 and above, you can find a message key by browsing a wiki in the special pseudo-language code qqx, which can be done by appending ?uselang=qqx to the URL, or &uselang=qqx if the URL already contains a ? character. All the messages will then be replaced by their message keys, so you can identify which message is responsible. Messages that are always in the content language will not be shown using qqx.

If the page uses tabs, as the special page "Preferences" does, add the tab's fragment after the uselang parameter, for example Special:Preferences?uselang=qqx#mw-prefsection-rendering.
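The URL rules above can be sketched in a few lines. This is only an illustration using Python's standard library; the helper name with_qqx and the example.org URLs are assumptions made for the example, not part of MediaWiki.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_qqx(url: str) -> str:
    """Return the URL with uselang=qqx added, keeping any #fragment last."""
    parts = urlsplit(url)
    # Drop an existing uselang value so the result carries exactly one.
    query = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name != "uselang"
    ]
    query.append(("uselang", "qqx"))
    # urlunsplit puts "?" before the query and "#" before the fragment,
    # so the tab anchor always ends up after the uselang parameter.
    return urlunsplit(parts._replace(query=urlencode(query)))


# Hypothetical wiki URLs, for illustration only.
print(with_qqx("https://example.org/w/index.php?title=Main_Page"))
print(with_qqx("https://example.org/wiki/Special:Preferences#mw-prefsection-rendering"))
```

Parsing the URL instead of appending text avoids having to decide by hand between ? and &, and keeps the fragment in the right place.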

MediaWiki version: 1.38 (Gerrit change 765385)

Before MediaWiki 1.38, fallback message keys were not shown, which made it difficult to identify the source of some messages, notably the page navigation tabs. Since MediaWiki 1.38 fallback message keys are shown separated by slashes (/).

MediaWiki version: 1.43 (Gerrit change 1025837)

Before MediaWiki 1.43, override message keys (using hooks like MessageCacheFetchOverrides) were not shown either, which made it difficult to identify the source of messages overridden by extensions (such as WikimediaMessages). Since MediaWiki 1.43, the override message key is shown after an equals sign (=).

Localisation file format

All messages used in MediaWiki are defined in a messages file.

There are two types of message files in MediaWiki: JSON and PHP. As of April 2014, core MediaWiki and most of the maintained extensions were migrated to the JSON format. You should use JSON for all new development. For more information about the migration to JSON see Requests for comment/Localisation format.

JSON

Starting from late 2013 a new file format for messages was introduced: JSON. This is plain JSON, familiar as a common generic data storage format. Every key in it is a message key, and the value is the message text. In addition, the special @metadata key is used to store information about the translation, such as the translation authors.

Using JSON makes the localisation files more secure because JSON is not executable. It is also compatible with jquery.i18n, a JavaScript library developed as part of Project Milkshake, which provides MediaWiki-like frontend localisation capabilities and is used by some extensions that want to be less dependent on MediaWiki, such as VisualEditor and UniversalLanguageSelector.

Because the wider suite of internationalisation and localisation tools was called "Project Milkshake", some people call this format "banana".

File location

In MediaWiki core, localisation files are placed in the languages/i18n directory. MediaWiki extensions usually place theirs in an i18n/ subdirectory. If a large number of messages exist within a project, one may want to split these into two or more topical subdirectories for maintainability. In MediaWiki context, the $wgMessagesDirs configuration key is used to list these subdirectories. Here's an example from the VisualEditor extension for MediaWiki:

{
  "MessagesDirs": {
    "VisualEditor": [
      "lib/ve/modules/ve/i18n",
      "modules/ve-mw/i18n",
      "modules/ve-wmf/i18n",
      "lib/ve/lib/oojs-ui/i18n"
    ]
  }
}

You add new messages to the English "en" messages file en.json and document them in the message documentation file with the special pseudo-language code "qqq" – qqq.json. See also: Adding new messages.

Metadata

Currently the following metadata fields are used in the files:

authors
A JSON list of the authors of the messages. For English (en) and message documentation (qqq) these are added manually when the messages file is edited. For all other languages this is inserted automatically when the message file is exported from translatewiki.net. Message documentation can be edited on translatewiki.net, and documentation editors are inserted to the qqq.json file automatically as well.
message-documentation
This is the pseudo-language code for storing the message documentation. For MediaWiki this is always qqq. (This appears in some extensions, but it's not actually processed in any way. It's not mandatory.)

Conventions

Special characters like line breaks are escaped ("\n").

Unicode characters that represent letters in different alphabets are stored as real characters and not as character codes, because these files are sometimes read by people and because this makes the files smaller ("誼" and not "\u8ABC"). In any case, developers have few reasons to edit messages in any languages except English, because these are usually edited through translatewiki.net.

HTML code is not escaped either, so "<strong>Warning</strong>" and not "\u003cstrong\u003eWarning\u003c/strong\u003e".

The JSON files are indented using tabs.
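As a rough sketch of these conventions, the snippet below writes a message file with Python's json module. The message keys and the author name are hypothetical, and this is not the tool MediaWiki or translatewiki.net uses for export.

```python
import json

# Hypothetical messages for an imaginary "myextension" extension.
messages = {
    "@metadata": {"authors": ["Example Author"]},
    "myextension-warning": "<strong>Warning</strong>",
    "myextension-two-lines": "First line\nSecond line",
    "myextension-letter": "誼",
}


def dump_messages(data: dict) -> str:
    """Serialise messages following the conventions described above."""
    # ensure_ascii=False keeps letters such as 誼 as real characters;
    # indent="\t" indents with tabs; line breaks inside values are
    # still escaped as \n, and HTML is left unescaped.
    return json.dumps(data, ensure_ascii=False, indent="\t") + "\n"


text = dump_messages(messages)
print(text)
```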

PHP

This section refers to the use of MessagesXx.php files for localizing messages, which was deprecated in 2014. However, the files are still used for other language-specific configuration.

The older localisation file format is PHP. This is essentially a PHP array with all the messages. In core MediaWiki each language resides in its own file in the languages/messages directory of the MediaWiki source code. In the extensions all the languages and the message documentation (qqq) are in the same file: ExtensionName.i18n.php, usually in the main directory of the extension.

To migrate system messages from PHP to JSON, use the generateJsonI18n.php script. It will move the messages to JSON files and replace the text of the PHP file with a shim that points to the JSON files. This boilerplate code is needed for backwards compatibility with MediaWiki 1.19. It is not used in new extensions that do not require MediaWiki 1.19 compatibility.

Using messages

MediaWiki uses a central repository of messages which are referenced by keys in the code. This is different from, for example, the gettext system, which extracts the translatable strings from the source files. The key-based system makes some things easier, like refining the original texts and tracking changes to messages. The drawback is that the list of used messages and the list of source texts for those keys can get out of sync. In practice this isn't a big problem, and the only significant problem is that sometimes extra messages that are not used anymore still stay up for translation.

To make message keys more manageable and easier to search for, always write them completely and don't rely too much on creating them dynamically. You may concatenate parts of message keys if you feel that it gives your code better structure — but only do this when there definitely are multiple possibilities,[1] and be sure to put a comment nearby with a list of the possible resulting keys. For example:

// Messages that can be used here:
// * myextension-connection-success
// * myextension-connection-warning
// * myextension-connection-error
$text = wfMessage( 'myextension-connection-' . $status )->parse();

See also the coding conventions for dynamic identifiers.

To use a message in JavaScript, you have to list it in the definition of your ResourceLoader module, in the "messages" property.

The detailed use of message functions in PHP and JavaScript is on Manual:Messages API . This is an important documentation page, and you should read it before you write code that uses messages.

Message sources

Code looks up system messages from these sources:

  • The MediaWiki namespace. This allows wikis to adopt, or override, all of their messages, when standard messages do not fit or are not desired.
    • MediaWiki:Message-key is the default message,
    • MediaWiki:Message-key/language-code is the message to be used when a user has selected a language other than the wiki's default language.
  • From message files:
    • Core MediaWiki itself and most currently maintained extensions use a file per language, named zyx.json, where zyx is the language code for the language.
    • Some older extensions use a combined message file holding all messages in all languages, usually named MyExtensionName.i18n.php.
    • Many Wikimedia Foundation wikis access some messages from the WikimediaMessages extension, allowing them to standardise messages across WMF wikis without imposing them on every MediaWiki installation.
    • A few extensions use other techniques.

Caching

System messages are one of the more significant components of MediaWiki, primarily because they are used in every web request. The PHP message files are large, since they store thousands of message keys and values. Loading this file (and possibly multiple files, if the user's language is different from the content language) has a large memory and performance cost. An aggressive, layered caching system is used to reduce this performance impact.

MediaWiki has lots of caching mechanisms built in, which make the code somewhat more difficult to understand. Since 1.16 there is a new caching system, which caches messages either in CDB files or in the database. Customised messages are cached in the filesystem and in memcached (or an alternative), depending on the configuration.

The table below gives an overview of the settings involved:

Location of cache storage, by $wgLocalisationCacheConf 'store' value (columns) and $wgCacheDirectory (rows):

$wgCacheDirectory  | 'store' => 'db'   | 'store' => 'detect' (default) | 'store' => 'files'     | 'store' => 'array' (experimental since MW ≥ 1.26)
= false (default)  | l10n cache table  | l10n cache table              | error (undefined path) | error (undefined path)
= path             | l10n cache table  | local filesystem (CDB)        | local filesystem (CDB) | local filesystem (PHP array)
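
The table above can also be read as a small decision function. The sketch below only restates the table; resolve_l10n_store is a hypothetical name, not a MediaWiki function, and it does not model the 1.27.0 and 1.27.1 behaviour.

```python
from typing import Optional


def resolve_l10n_store(store: str = "detect",
                       cache_directory: Optional[str] = None) -> str:
    """Return where the localisation cache ends up, following the table.

    cache_directory=None stands for $wgCacheDirectory = false.
    """
    if store == "db":
        return "l10n cache table"
    if store == "detect":
        # Autodetection: use files only when a cache directory is set.
        return "local filesystem (CDB)" if cache_directory else "l10n cache table"
    if store in ("files", "array"):
        if not cache_directory:
            raise ValueError("undefined path: $wgCacheDirectory is not set")
        if store == "files":
            return "local filesystem (CDB)"
        return "local filesystem (PHP array)"
    raise ValueError(f"unknown store type: {store!r}")


print(resolve_l10n_store())                       # defaults for both settings
print(resolve_l10n_store("detect", "/tmp/cache"))  # hypothetical path
```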
MediaWiki versions: 1.27.0 – 1.27.2 (Gerrit #Id3e2d2)

In MediaWiki 1.27.0 and 1.27.1, the autodetection was changed to favor the file backend. In case 'store' => 'detect' (the default), the file backend is used with the path from $wgCacheDirectory. If this value is not set (which is the default), a temporary directory determined by the operating system is used. If a temporary directory cannot be detected, the database backend is used as a fallback. This was reverted in 1.27.2 and 1.28.0 because of file conflicts on shared hosts and security issues (see T127127 and T161453).

Function backtrace

To depict the layers of caching, here is a function backtrace of the methods that are called when retrieving a message. See the sections below for an explanation of each layer.

  • Message::fetchMessage()
  • MessageCache::get()
  • Language::getMessage()
  • LocalisationCache::getSubitem()
  • LCStore::get()

MessageCache

The MessageCache class is the top level of caching for messages. It is called from the Message class and returns the final raw contents of a message. This layer checks the database for overridden messages and integrates them into the language fallback chain.

Language fallbacks allow MediaWiki to fall back on another language if the original does not have the message being asked for. As mentioned in the next section, most of the language fallback resolution occurs at a lower level. However, only the MessageCache layer checks the database for overridden messages, so integrating overridden messages from the database into the fallback chain is done here. If not using the database, this entire layer can be disabled.
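
To make the idea concrete, here is a deliberately simplified model of a lookup that consults on-wiki overrides and file messages along a fallback chain. It is a sketch under assumptions, not the actual MessageCache code: the function name, the data layout, the fallback chain, and the message keys are all made up for illustration.

```python
from typing import Optional

# Illustrative data only.
FALLBACKS = {"de-at": ["de", "en"], "de": ["en"]}
FILE_MESSAGES = {
    "en": {"myextension-title": "Title", "myextension-greeting": "Hello"},
    "de": {"myextension-greeting": "(de) greeting"},
}
# On-wiki customisations, keyed by (message key, language code).
OVERRIDES = {("myextension-title", "en"): "Custom title"}


def get_message(key: str, lang: str) -> Optional[str]:
    """Walk the language and its fallbacks; overrides win within a language."""
    for code in [lang, *FALLBACKS.get(lang, [])]:
        if (key, code) in OVERRIDES:
            return OVERRIDES[(key, code)]
        if key in FILE_MESSAGES.get(code, {}):
            return FILE_MESSAGES[code][key]
    return None  # message does not exist anywhere in the chain


print(get_message("myextension-greeting", "de-at"))
print(get_message("myextension-title", "de-at"))
```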

LocalisationCache

See LocalisationCache.php

LCStore

The LCStore class is merely a back-end implementation used by the LocalisationCache class for actually caching and retrieving messages. Like the BagOStuff class, which is used for general caching in MediaWiki, there are a number of different cache types (configured using $wgLocalisationCacheConf):

  • "db" (default) - Caches messages in the database
  • "files" (default if $wgCacheDirectory is set) - Uses CDB to cache messages in a local file
  • "accel" - Uses APC or another opcode cache to store the data

The "files" option is used by the Wikimedia Foundation, and is recommended because it is faster than going to the database and more reliable than the APC cache, especially since APC is incompatible with PHP versions 5.5 or later.

Adding new messages

Choosing the message key

See also: Manual:Coding conventions

The message key must be globally unique. This includes core MediaWiki and all the extensions and skins.

Stick to lower case letters, numbers, and dashes in message names; most other characters range from impractical to not working at all. Per MediaWiki convention, the first character is case-insensitive and the other characters are case-sensitive.

Please follow global or local conventions for naming. For extensions, use a standard prefix, preferably the extension name in lower case, followed by a hyphen (-). Exceptions are:

Messages used by the API
These must begin with apihelp-, apiwarn-, or apierror-. Put the extension prefix after this prefix. (Note that these messages should be in a separate file, usually under includes/i18n/api.)
Log-related messages
These must begin with logentry-, log-name-, or log-description-.
User rights
The key for the name of the right as displayed on Special:ListGroupRights must begin with right-. The name of the action that completes the sentence "You do not have permission to $2, for the following reasons:" must begin with action-.
Revision tags
Revision tags must begin with tag-.
Special page titles
Special page titles must begin with special-.
Extension descriptions
Extension descriptions must begin with the extension name and end with -desc.

They appear in the table on Special:Version, and their content must briefly explain what the extension does.
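
The naming rules above lend themselves to an automated check. The following is a minimal sketch, assuming an extension whose prefix is "myextension"; the function name and the decision to treat the listed exception prefixes as always acceptable are choices made for this example, not an existing MediaWiki tool.

```python
import re

# Lower case letters, numbers, and dashes, with no leading, trailing,
# or doubled dash.
KEY_PATTERN = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")

# Prefixes from the list of exceptions above.
SPECIAL_PREFIXES = (
    "apihelp-", "apiwarn-", "apierror-",
    "logentry-", "log-name-", "log-description-",
    "right-", "action-", "tag-", "special-",
)


def is_well_formed_key(key: str, extension_prefix: str) -> bool:
    """Check the character set and the expected prefix of a message key."""
    if not KEY_PATTERN.fullmatch(key):
        return False
    if key.startswith(extension_prefix + "-"):
        return True
    return key.startswith(SPECIAL_PREFIXES)


for candidate in ("myextension-connection-error", "MyExtension_Error",
                  "right-myextension-edit", "connection-error"):
    print(candidate, is_well_formed_key(candidate, "myextension"))
```

A check like this cannot prove that a key is globally unique; it only catches keys that ignore the conventions.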

Gender

English messages almost never need different words that change because of a user's gender. English only needs this in the third-person pronouns "he" and "she", but these are surprisingly rare in messages. When this is necessary, use he or they.

However, many other languages need different words depending on the user's gender, not only for third-person pronouns, but also for other pronouns, as well as for verbs in different tenses (e.g. "created", "deleted"), nouns (e.g. "mentor", "administrator"), adjectives (e.g. "new"), etc. It is therefore often useful to use GENDER in English messages, even when there's only one English word. This gives translators a hint that GENDER can be used in a message. It also avoids warnings on translatewiki about missing parameters when an optional username parameter is missing (this happens especially often in log entry messages).
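
As an illustration of how a GENDER switch selects a form, here is a toy expander. It assumes the {{GENDER:$1|male form|female form|neutral form}} syntax and takes the gender directly as an argument, whereas MediaWiki derives it from the user named in the parameter; the function name and the form placeholders are hypothetical.

```python
import re

GENDER_RE = re.compile(r"\{\{GENDER:\$1\|([^{}]*)\}\}")


def expand_gender(message: str, gender: str) -> str:
    """Replace each GENDER switch with the form matching the given gender."""
    def pick(match: "re.Match[str]") -> str:
        forms = match.group(1).split("|")
        if gender == "male":
            return forms[0]
        if gender == "female" and len(forms) > 1:
            return forms[1]
        # Unknown gender: use the third form if present, else the first.
        return forms[2] if len(forms) > 2 else forms[0]
    return GENDER_RE.sub(pick, message)


# English source with a single form: same output for every gender.
print(expand_gender("$1 {{GENDER:$1|created}} the page", "female"))
# A translation could supply up to three forms (placeholders here).
print(expand_gender("$1 {{GENDER:$1|form-m|form-f|form-n}}", "unknown"))
```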

Other things to note when creating messages

  1. Make sure that you are using suitable handling for the message (parsing, {{-replacement, escaping for HTML, etc.)
  2. If your message is part of core, it should usually be added to languages/i18n/en.json, although some components, such as Installer, EXIF tags, and ApiHelp have their own message files.
  3. If your message is in an extension add it to the i18n/en.json file or the en.json file in the appropriate subdirectory. In particular, API messages that are only seen by developers and not by most end users are usually in a separate file, such as i18n/api/en.json. If an extension has a lot of messages, you may create subdirectories under i18n. All the message directories, including the default i18n/, must be listed in the MessagesDirs section in extension.json or in the $wgMessagesDirs variable.
  4. Take a pause and consider the wording of the message. Is it as clear as possible? Can it be misunderstood? Ask for comments from other developers or localisers if possible. Follow the Internationalisation hints.
  5. Add documentation to qqq.json in the same directory.
  6. The sequence of the messages in the file should roughly conform to the features of your project. Put messages from the same feature next to each other. This helps translators stay focused and be efficient and consistent.
  7. Put the messages that are expected to be the most basic and the most frequently used in the beginning of the file, and the messages that are rarer and more technically advanced towards the end.

Messages that should not be translated

  1. Ignored messages are those which should exist only in the English messages file. They are messages that should not need translation, because they reference only other messages or language-neutral features, e.g. a message of "{{SITENAME}}".
  2. Optional messages may be translated only if changed in the target language.

To flag such messages:

Removing existing messages

Remove the message from en.json and qqq.json. Don't bother with other languages; updates from translatewiki.net will handle those automatically.

In addition, check whether the message appears anywhere in translatewiki configuration, for example in the list of optional or most used messages (a simple git grep should be enough). Remove it from these lists if needed.

Changing existing messages

  1. Consider updating the message documentation.
  2. Change the message key if old translations are not suitable for the new meaning. This also includes changes in message handling (parsing, escaping, parameters, etc.). Improving the phrasing of a message without technical changes is usually not a reason for changing a key. At translatewiki.net, the translations will be marked as outdated so that they can be targeted by translators. Changing a message key does not require talking to the i18n team or filing a support request. However, if you have special circumstances or questions, ask in #translatewiki or on the support page at translatewiki.net.
  3. If the extension is supported by translatewiki.net, please only change the English source message and/or key, and the accompanying entry in qqq.json. If needed, the translatewiki.net team will take care of updating the translations, marking them as outdated, cleaning up the file or renaming keys where possible. This also applies when you're only changing things like HTML tags which you could change in other languages without speaking those languages. Most of these actions will take place in translatewiki.net and will reach Git with about one day of delay.

Message documentation

There is a pseudo-language code qqq for message documentation. It is one of the ISO 639 codes reserved for private use. There, we do not keep translations of each message, but collect English sentences about each message: telling us where it is used, giving hints about how to translate it, enumerating and describing its parameters, linking to related messages, and so on. In translatewiki.net, these hints are shown to translators when they edit messages.

Programmers must document each and every message. Message documentation is an essential resource – not just for translators, but for all the maintainers of the module. Whenever a message is added to the software, a corresponding qqq entry must be added as well; revisions which don't do so are marked "V-1" until the documentation is added.
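
A check for missing documentation is easy to sketch. The example below compares two dictionaries as they would be loaded from en.json and qqq.json; the function name and the sample keys are hypothetical, and this is not the checker used in MediaWiki's own review process.

```python
def undocumented_keys(en: dict, qqq: dict) -> list:
    """Return message keys present in en.json without usable qqq text."""
    return sorted(
        key for key in en
        if key != "@metadata" and not str(qqq.get(key, "")).strip()
    )


# Sample data standing in for json.load() of the two files.
en = {
    "@metadata": {"authors": ["Example Author"]},
    "myextension-title": "Title",
    "myextension-save": "Save",
}
qqq = {
    "@metadata": {"authors": ["Example Author"]},
    "myextension-title": "Page title of the special page.",
}

print(undocumented_keys(en, qqq))
```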

Documentation in qqq files should be edited directly only when adding new messages or when changing an existing English message in a way that requires a documentation change, for example adding or removing parameters. In other cases, documentation should usually be edited in translatewiki. Each documentation string is accessible at https://translatewiki.net/wiki/MediaWiki:message-key/qqq, as if it were a translation. These edits will be exported to the source repositories along with the translations.

Useful information that should be in the documentation includes:

  1. Message handling (parsing, escaping, plain text).
  2. Type of parameters with example values.
  3. Where the message is used (pages, locations in the user interface).
  4. How the message is used where it is used (a page title, button text, etc.).
  5. What other messages are used together with this message, or which other messages this message refers to.
  6. Anything else that could be understood when the message is seen on the context, but not when the message is displayed alone (which is the case when it is being translated).
  7. If applicable, notes about grammar. For example, "open" in English can be both a verb and an adjective. In many other languages the words are different and it's impossible to guess how to translate them without documentation.
  8. Adjectives that describe things, such as "disabled", "open" or "blocked", must always say what they are describing. In many languages adjectives must have the gender of the noun that they describe. It may also happen that different kinds of things need different adjectives.
  9. If the message has special properties, for example, if it is a page name, or if it should not be a direct translation, but adapted to the culture or the project.
  10. Whether the message appears near other messages, for example in a list or a menu. The wording or the grammatical features of the words should probably be similar to the messages nearby. Also, items in a list may have to be properly related to the heading of the list.
  11. Parts of the message that must not be translated, such as generic namespace names, URLs or tags.
  12. Explanations of potentially unclear words, for example abbreviations, like "CTA", or specific jargon, like "template", "suppress" or "stub". (Note that it's best to avoid such words in the first place!)
  13. Screenshots are very helpful. Don't crop – an image of the full screen in which the message appears gives complete context and can be reused in several messages.

A few other hints:

  • Remember that very, very often translators translate the messages without actually using the software.
  • Translators usually have no context information, either about your module or about the other messages in it.
  • A rephrased message alone is useless in most circumstances.
  • Don't use designers' jargon like "hamburger", "nav", or "comps".
  • Consider writing a glossary of the technical terms that are used in your module. If you do it, link to it from the messages' documentation.

You can link to other messages by using {{msg-mw|message key}}. Please do this if parts of the messages come from other messages (if this cannot be avoided), or if some messages are shown together or in same context.

translatewiki.net provides some default templates for documentation:

  • {{doc-action|[...]}} - for action- messages
  • {{doc-right|[...]}} - for right- messages
  • {{doc-group|[...]|[...]}} - for messages around user groups (group, member, page, js and css)
  • {{doc-accesskey|[...]}} - for accesskey- messages

Have a look at the template pages for more information.

Internationalisation hints

Besides documentation, translators ask developers to consider some hints that make their work easier and more efficient and make good localisation possible for all languages. Even if only adding or editing messages in English, one should be aware of the needs of all languages. Each message is translated into more than 300 languages and this should be done in the best possible way. Correct implementation of these hints will very often help you write better messages in English, too.

Localisation#Help and contact info lists the main places where you can find the assistance of experienced and knowledgeable people regarding i18n.

Use message parameters and switches properly

This is a prerequisite for wording your messages correctly.

Avoid message re-use

The translators discourage message re-use. This may seem counter-intuitive, because copying and duplicating code is usually a bad practice, but in system messages it is often needed. Although two concepts can be expressed with the same word in English, this doesn't necessarily mean they can be expressed with the same word in every language. "OK" is a good example: in English this is used for a generic button label, but in some languages they prefer to use a button label related to the operation which will be performed by the button. Another example is practically any adjective: a word like "multiple" changes according to gender in many languages, so you cannot reuse it to describe several different things, and you must create several separate messages.

If you are adding multiple identical messages, please add message documentation to describe the differences in their contexts. Don't worry about the extra work for translators. Translation memory helps a lot in these while keeping the flexibility to have different translations if needed.

Avoid fragmented or "patchwork" messages

Languages have varying word orders, and complex grammatical and syntactic rules. Messages formed by multiple pieces of text, possibly with some indirection, also called "string concatenation", in code that cannot be directly controlled by translators, are called "lego" or "patchwork" messages in developers' jargon. It's practically impossible to translate "lego" messages correctly.

Make every message a complete phrase. Several sentences can usually be combined much more easily into a text block, if needed. When you want to combine several strings in one message, pass them in as parameters, as translators can order them correctly for their language when translating.
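
The difference between a complete message with parameters and a "lego" message can be shown with a small sketch. The substitute helper, the "xx" language code, and both templates are hypothetical; the point is only that a complete template lets each language choose its own word order.

```python
import re


def substitute(template: str, *params: str) -> str:
    """Replace $1, $2, ... in a complete message with positional parameters."""
    return re.sub(r"\$(\d+)", lambda m: params[int(m.group(1)) - 1], template)


# One complete message per language; "xx" is a made-up language whose
# word order puts the user last.
TEMPLATES = {
    "en": "$1 moved page $2 to $3",
    "xx": "$2 → $3 ($1)",
}

for code, template in TEMPLATES.items():
    print(code, substitute(template, "Alice", "Draft", "Final"))

# A "lego" version such as user + " moved page " + old + " to " + new
# fixes the English word order in code, where translators cannot change it.
```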

Messages quoting each other

An exception from the rule may be messages referring to one another: 'Enter the original author's name in the field labelled "{{int:name}}" and click "{{int:proceed}}" when done'. This makes the message consistent when a software developer or wiki operator alters the messages "name" or "proceed" later. Without the int-trick, developers and operators would have to be aware of all related messages needing adjustment, when they alter one.

Write messages in natural language

As much as possible, write messages in natural, human language. Try reading the message aloud and think: is this something that sounds like correct, grammatical English that humans speak? If it's complex, hard to pronounce, or in any way unnatural in English, it will be even harder for translators and for users in other languages.

Avoid punctuation that is too technical or bureaucratic or that can't be read aloud. Slash (/) should usually be replaced with or. And/or should be replaced with and or or. Sentences with a comma splice should be split into shorter sentences.

Don't use terms and templates that are specific to particular projects

MediaWiki is used by very diverse people, within the Wikimedia movement and outside of it. Even though it was originally built for an encyclopedia, it is now used for various kinds of content. Therefore, use general terms. For example, avoid terms like "article", and use "page" instead, unless you are absolutely sure that the feature you are developing will only be used on a site where pages are called "articles". Don't use "village pump", which is the name of an English Wikipedia community page, and use a generic term, such as "community discussion page", instead.

Don't assume that a certain template exists on all wikis. Templates are local to wikis. This applies to both the source messages and to their translations. If messages use templates, they will only work if a template is created on each wiki where the feature is deployed. It's best to avoid using templates in messages completely. If you really have to use them, you must document this clearly in the message documentation and in the extension installation instructions.

Separate times from dates in sentences

Some languages have to insert something between a date and a time which grammatically depends on other words in a sentence. Thus, they will not be able to use date/time combined. Others may find the combination convenient, thus it is usually the best choice to supply three parameter values (date/time, date, time) in such cases, and in each translation leave either the first one or last two unused as needed.
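
Supplying all three values might look like the sketch below. The helper names, the fixed ISO-style formats, and the two templates are assumptions for the example; in MediaWiki the date and time strings would be formatted for the user's language before being passed in.

```python
import re
from datetime import datetime


def substitute(template: str, *params: str) -> str:
    """Replace $1, $2, ... with positional parameters."""
    return re.sub(r"\$(\d+)", lambda m: params[int(m.group(1)) - 1], template)


def last_edited(template: str, when: datetime) -> str:
    """Pass $1 = date and time, $2 = date only, $3 = time only."""
    date = when.strftime("%Y-%m-%d")
    time = when.strftime("%H:%M")
    return substitute(template, f"{date} {time}", date, time)


when = datetime(2024, 5, 17, 9, 30)
# One translation uses the combined value, another the separate parts.
print(last_edited("Last edited on $1", when))
print(last_edited("Last edited on $2, at $3", when))
```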

Avoid {{SITENAME}} in messages

{{SITENAME}} has several disadvantages. It can be anything (acronym, word, short phrase, etc.) and, depending on language, may need the use of {{GRAMMAR}} on each occurrence. No matter what, each message having {{SITENAME}} will need review in most wiki languages for each new wiki on which your code is installed. In the majority of cases, when there is not a general GRAMMAR configuration for a language, wiki operators will have to add or amend PHP code so as to get {{GRAMMAR}} for {{SITENAME}} working. This requires both more skills, and more understanding, than otherwise. It is more convenient to have generic references like "this wiki". This does not keep installations from locally altering these messages to use {{SITENAME}}, but at least they don't have to, and they can postpone message adaption until the wiki is already running and used.

Avoid references to visual layout and positions

What is rendered where depends on skins. Most often screen layouts of languages written from left to right are mirrored compared to those used for languages written from right to left, but not always, and for some languages and wikis, not entirely. Handheld devices, narrow windows, and so on may show blocks underneath each other that would appear side by side on larger displays. Since site- and user-written JavaScript scripts and gadgets can, and do, hide parts, or move things around in unpredictable ways, there is no reliable way of knowing the actual layout.

It is wrong to tie layout information to content languages, since the user interface language may not be the page's content language, and layout may be a mixture of the two depending on circumstances. Non-visual user agents like acoustic screen readers and other auxiliary devices do not even have a concept of visual layout. Thus, you should not refer to visual layout positions in the majority of cases, though semantic layout terms may still be used ("previous steps in the form", etc.).

MediaWiki does not support showing different messages or message fragments based on the current directionality of the interface (see T30997).

The upcoming browser and MediaWiki support for East and North Asian top-down writing[2] will make screen layouts even more unpredictable, with at least eight possible layouts (left/right starting position, top/bottom starting position, and which happens first).

Avoid references to screen colours

The colour in which something is rendered depends on many factors, including skins, site- and user-written JavaScript scripts and gadgets, and local user agent overrides for reasons of accessibility or technological limitations. Non-visual user agents like acoustic screen readers and other auxiliary devices do not even have a concept of colour. Thus, you should not refer to screen colours. (You should also not rely on colour alone as a mechanism for informing the user of state, for the same reason.)

Avoid markup that doesn't need to be translated

HTML markup not requiring translation, such as enclosing <div> tags, horizontal rules above or below, and similar, should usually not be part of messages. It places an unnecessary burden on translators, and is often accidentally altered or skipped in the translation process. The translation interface has no syntax highlighting or validation, and mistakes are common.

Avoid complex wikitext markup as well. Wikitext is sometimes terser than writing the same thing in PHP, and it's tempting to write something like:

This is the [[{{MediaWiki:Validationpage}}|stable version]], [{{fullurl:{{#Special:Log}}|type=review&page={{FULLPAGENAMEE}}}} checked] on <i>$2</i>.
[{{fullurl:{{FULLPAGENAMEE}}|oldid=$1&diff=cur}} $3 pending {{PLURAL:$3|change|changes}}] {{PLURAL:$3|awaits|await}} review.

However, this is difficult for translators, especially when translating to right-to-left languages, because parts of the message must remain in English, resulting in text direction changing many times in one line:

هذه هي [[{{MediaWiki:Validationpage}}|النسخة المستقرة]]، [{{fullurl:{{#Special:Log}}|type=review&page={{FULLPAGENAMEE}}}} المفحوصة] في <i>$2</i>.
[{{fullurl:{{FULLPAGENAMEE}}|oldid=$1&diff=cur}} {{PLURAL:$3||تغيير واحد معلق|تغييران معلقان|$3 تغييرات معلقة|$3 تغييرا معلقا|$3 تغيير معلق}}] {{PLURAL:$3||ينتظر|ينتظران|تنتظر|ينتظر}} المراجعة.

It's best to pass any link targets as message parameters, and use only simple markup like [$1 Label] and [[$1|Label]].
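
As a rough standalone sketch of this advice, the calling code can build the link targets and hand them to a message that contains only labels and placeholders. The message text, host name, page titles, and date below are invented, and strtr() merely stands in for MediaWiki's own parameter substitution.

```php
<?php
// Invented message text: the translator sees only plain labels and
// placeholders. $1 = wiki page title, $2 = full URL, $3 = formatted date.
$message = 'This is the [[$1|stable version]], [$2 checked] on <i>$3</i>.';

// The calling code, not the message, builds the link targets.
// The host name, page titles and query values are placeholders.
$logUrl = 'https://wiki.example.org/index.php?' . http_build_query( [
	'title' => 'Special:Log',
	'type' => 'review',
	'page' => 'Main_Page',
] );

// strtr() stands in for MediaWiki's own parameter substitution.
$wikitext = strtr( $message, [
	'$1' => 'Project:Stable versions',
	'$2' => $logUrl,
	'$3' => '12 March 2024',
] );

echo $wikitext . "\n";
```

A right-to-left translation of such a message only has to keep three short placeholders intact, instead of several nested parser functions.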

Translated messages are often longer than you think!

Skimming foreign language message files, you almost never find translated messages shorter than Chinese ones and rarely shorter than English ones. However, you will often find translations that are much longer than English ones.

Especially in forms, in front of input fields, English messages tend to be terse. That is often not preserved in translations. Languages may lack the technical vocabulary present in English, and may require multiple words or even complete sentences to explain some concepts. For example, the brief English message "TSV file:" may have to be translated into some language literally as:

Please type a name here which denotes a collection of computer data that is comprised of a sequentially organised series of typewritten lines which themselves are organised as a series of informational fields each, where said fields of information are fenced, and the fences between them are single signs of the kind that slips a typewriter carriage forward to the next predefined position each. Here we go: _____ (thank you)

This is, admittedly, an extreme example, but you get the idea. Imagine this sentence in a column of a form where each word occupies a line of its own, and the input field is vertically centered in the next column. :-(

Avoid using very close, similar, or identical words to denote different things or concepts

For example, pages may have older revisions (of a specific date, time, and edit), comprising past versions of said page. The words "revision" and "version" can be used interchangeably here. A problem arises when versioned pages are revised and the revision, i.e. the process of revising them, is mentioned too. This may not pose a serious problem when the two senses of "revision" have different translations. Do not rely on that, however. In such a case it is better to avoid using "revision" in the sense of "version" altogether, so that it cannot be misinterpreted.

Basic words may have unforeseen connotations, or not exist at all

There are some words that are hard to translate because of their very specific use in MediaWiki. Some may not be translatable at all. For example, several languages have no word for "user" in the sense of "someone who uses something". Similarly, in Kölsch the English words "namespace" and "apartment" translate to the same word. Also in Kölsch, "user" is rendered as a single word meaning "corroborator and participant", since any reference to "use" would too strongly imply "abuse". The term "wiki farm" is translated as "stable full of wikis", since a single-crop farm would be a contradiction in terms in that language and would not be understood.

Use <code>, <var>, and <kbd> tags where needed

When talking about technical parameters, values, or keyboard inputs, mark them appropriately as such using the HTML tags <code>, <var>, or <kbd>. This sets them off typographically from the normal text, which clarifies their sense to readers and avoids confusion, errors, and misrepresentations. Ensure that your message handler allows such markup.

Symbols, colons, brackets, etc. are parts of messages

Many symbols are localisable, too. Some scripts have other kinds of brackets than the Latin script has. A colon may not be appropriate after a label or input prompt in some languages. Having those symbols included in messages helps to make better and less Anglo-centric translations, and also reduces code clutter.

For example, there are different quotation mark conventions used in «Norwegian», ”Swedish”, »Danish«, „German“, and 「Japanese」.[3]

If you need to wrap some text in localized parentheses, brackets, or quotation marks, you can use the parentheses ($1) or brackets [$1] or quotation-marks "$1" messages like so:

wfMessage( 'parentheses' )->rawParams( /* text to go inside parentheses */ )->escaped()
wfMessage( 'brackets' )->rawParams( /* text to go inside brackets */ )->escaped()
wfMessage( 'quotation-marks' )->rawParams( /* text to go inside quotation marks */ )->escaped()

Do not expect symbols and punctuation to survive translation

Languages written from right to left (unlike English) usually swap the arrow symbols shown with "next" and "previous" links, and their placement relative to the message text may or may not be inverted as well. An ellipsis may be translated to "etc." or to words. Question marks, exclamation marks, and colons may be placed somewhere other than at the end of a sentence, omitted entirely, or used twice. As a consequence, always include all of these in the text of your messages, and never try to insert them programmatically.
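
A small standalone illustration of why appending punctuation in code goes wrong, using the colon after a form label. The label texts are invented example data. French typography puts a no-break space before the colon, which code that appends a bare ":" cannot know.

```php
<?php
// Invented example data: the same form label in two languages.
// French typography puts a no-break space (U+00A0) before the colon.
$labelWithColon = [
	'en' => 'File name:',
	'fr' => "Nom du fichier\u{00A0}:",
];
$labelWithoutColon = [
	'en' => 'File name',
	'fr' => 'Nom du fichier',
];

foreach ( [ 'en', 'fr' ] as $code ) {
	// Discouraged: the code appends an English-style colon for everyone.
	$appended = $labelWithoutColon[$code] . ':';
	// Preferred: the colon is part of the message, so translators control it.
	$translated = $labelWithColon[$code];
	echo $code . ': ' . ( $appended === $translated ? 'same' : 'differs' ) . "\n";
}
```

For English the two variants happen to coincide. For French the appended colon yields a label that no translator wrote.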

Use full stops

Do terminate normal sentences with full stops. This is often the only indicator for a translator to know that they are not headlines or list items, which may need to be translated differently.

Use meaningful link anchors

Make sure that the anchor describes the target page well. Always avoid commonplace and generic words. For example, "Click here" is an absolute no-go,[4] since target pages are almost never about "click here". Instead, use precise action words telling users what they will get to when following the link, such as "You can upload a file if you wish."

See also Help users predict where they are going, and mystery meat navigation, and The main reasons why we shouldn't use click here as link text.

Avoid jargon and slang

Avoid developer and power user jargon in messages. Try to use simple language whenever possible. Avoid saying "success", "successfully", "fail", "error occurred while", etc., when you want to notify the user that something happened or didn't happen. This comes from the developer's perspective of seeing everything as true or false, but users usually just want to know what actually happened or didn't, and what they should do about it (if anything). So:

  • "The file was successfully renamed" -> "The file was renamed"
  • "File renaming failed" -> "There is a file with this name already. Please choose a different name."

Be aware of whitespace and line breaks

MediaWiki's localised messages usually get edited within the wiki, either by wiki operators on live wikis, or by the translators on translatewiki.net. You should be aware of how whitespace, especially at the beginning or end of your message, will affect editors:

  • Spaces and line breaks (newlines) at the end of the message are always automatically removed by the wikitext editor. Your message must not end with a space or line break, as it will be lost when it's edited on the wiki.
  • Spaces and line breaks at the beginning are not automatically removed, but they are likely to be removed by accident during editing, and should be avoided.

Start and end your message with active text; if you need a newline or paragraph break around it, your surrounding code should deal with adding it to the returned text.

There are some messages which require a space at the end, such as 'word-separator' (which consists of just a space character in most languages). To support such use cases, the following HTML entities are allowed in messages and transformed to the actual characters, even if the message otherwise doesn't allow wikitext or HTML formatting:[5]

  • &#32; (space)
  • &nbsp; and &#160; (non-breaking space, U+00A0)
  • &shy; (soft hyphen, U+00AD)

On a related note, any other syntax elements affected by pre-save transforms also must not be used in messages, as they will be transformed when the message is edited on the wiki.
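
The effect of trailing whitespace can be sketched in standalone PHP. Both functions below are hypothetical stand-ins, not MediaWiki code: saveThroughEditor() imitates the editor dropping trailing spaces and line breaks, and expandEntities() imitates the entity handling, assuming &#32; is among the allowed entities and modelling only that one.

```php
<?php
// Hypothetical stand-in for saving a message page in the wiki editor:
// trailing spaces and line breaks are dropped.
function saveThroughEditor( string $text ): string {
	return rtrim( $text, " \n" );
}

// Hypothetical stand-in for the entity handling; only &#32; is modelled.
function expandEntities( string $text ): string {
	return str_replace( '&#32;', ' ', $text );
}

// A literal trailing space does not survive an edit on the wiki.
$literal = expandEntities( saveThroughEditor( 'and ' ) );

// The entity survives the edit and becomes a space when the message is used.
$entity = expandEntities( saveThroughEditor( 'and&#32;' ) );

var_dump( $literal === 'and' );
var_dump( $entity === 'and ' );

// A line break after a message is added by the surrounding code instead.
$output = $literal . "\n";
```

Both var_dump() calls print bool(true): the literal space is gone after the simulated edit, while the entity form still yields the trailing space.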

Use standard capitalisation

Capitalisation gives hints to translators as to what they are translating, such as single words, list or menu items, phrases, or full sentences. Correct (standard) capitalisation may also play a role in search engines' assessment of your pages. MediaWiki uses sentence case (The quick brown fox jumps over the lazy dog) in interface messages.

Always remember that many writing systems don't have capital letters at all, and some of those that do have them use them differently from English. Therefore, don't use ALL-CAPS for emphasis. Use CSS, or HTML <em> or <strong> per below:

Emphasis

In normal text, emphasis like boldface or italics and similar should be part of message texts. Local conventions on emphasis often vary; some Asian scripts in particular have their own. Translators must be able to adjust emphasis to their target languages and areas. Try to use "<em>" and "<strong>" in your user interface to allow mark-up on a per-language or per-script basis.

In modern screen layouts of English and European styles, emphasis is used less and less. Still, do convey it in your message documentation, as it may give valuable hints as to how to translate. Emphasis can and should be used in other cultural contexts as appropriate, provided that translators know about it.


See also

Notes