Extension:ParserFunctions/String functions/zh

From mediawiki.org
MediaWiki extensions manual
StringFunctions
Release status: stable
Implementation Parser function
Description Enhances the parser with string functions
Author(s) Ross McClure, Juraj Simlovic
Latest version 2.0.3 (2008-11-30)
MediaWiki 1.7+
Database changes No
License MIT License
Download See extension ParserFunctions
  • $wgStringFunctionsLimitSearch
  • $wgStringFunctionsLimitReplace
  • $wgStringFunctionsLimitPad
Quarterly downloads 310 (Ranked 11th)
Public wikis using 15,766 (Ranked 4th)

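This extension defines additional parser functions for working with strings. Version 2.0 resolved the conflicts with <nowiki> and removed the dependency on the PHP mbstring extension being installed on the server.

Note: Wikimedia users
In 2013, it was decided not to enable this extension on any Wikimedia wiki (see phabricator:T8455). As a workaround, use Module:String.
Note: This extension is obsolete. Most of these functions have been integrated into the ParserFunctions extension (the exceptions are #pad, #urlencode and #urldecode), but they are available only when an administrator sets $wgPFEnableStringFunctions = true; in LocalSettings.php. The documentation for these functions is preserved here.
MediaWiki core magic words provide urlencode and padleft/padright.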

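Functions

This module defines the following functions: #len, #pos, #rpos, #sub, #pad, #replace, #explode, #urlencode and #urldecode.

All of these functions are designed to operate in O(n) time complexity, which makes them safe against DoS attacks.

Notes:

  1. Some parameters of these functions are limited through global settings to prevent abuse. See the Limits section below.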
  2. For functions that are case sensitive, you may use the magic word {{lc:your_string_here}} as a workaround in some cases.
  3. To determine whether a MediaWiki server enables these functions, check the list of supported Extended parser functions in Special:Version.

#len:

The #len parser function was merged into the ParserFunctions extension as of version 1.2.0.

The #len function returns the length of the given string. The syntax is:

{{#len:string}}

The return value is always the number of characters in the source string (after expansion of template invocations, but before conversion to HTML). If no string is specified, the return value is zero.

Notes:

  • This function is safe with UTF-8 multibyte characters. Example:
    • {{#len:Žmržlina}}
      returns 8.
  • Leading and trailing spaces or newlines are not counted, but intermediate spaces and newlines are taken into account. Examples:
    • {{#len:Icecream     }}
      returns 8.
    • {{#len: a   b }}
      returns 5.
  • Characters given by reference are not converted, but counted according to their source form.
    • {{#len:&nbsp;}}
      returns 6 (named character reference).
    • {{#len:&#32;}}
      returns 5 (numeric character reference, which is not trimmed even though it designates a space here).
  • Tags such as <nowiki> and other tag extensions will always have a length of zero, since their content is hidden from the parser. Example:
    • {{#len:<nowiki>This is a </nowiki>test}}
      returns 4.
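
The counting rules above can be modelled outside MediaWiki. The following Python sketch is an illustration only, not the extension's code: the name wiki_len and the regular expression that stands in for the parser hiding tag content are assumptions made for this example.

```python
import re

# Hypothetical stand-in for the parser hiding <nowiki> content; not MediaWiki code.
_NOWIKI = re.compile(r"<nowiki>.*?</nowiki>", re.DOTALL)


def wiki_len(text: str) -> int:
    """Model of #len as documented: trim the ends, drop hidden tags, count characters."""
    visible = _NOWIKI.sub("", text.strip())
    # Python counts Unicode characters, not bytes, so multibyte letters count once.
    return len(visible)


print(wiki_len("Icecream     "))                        # trailing spaces are ignored
print(wiki_len(" a   b "))                              # inner spaces are counted
print(wiki_len("&nbsp;"))                               # references count in source form
print(wiki_len("<nowiki>This is a </nowiki>test"))      # tag content contributes nothing
```

The four calls mirror the documented examples and should print 8, 5, 6 and 4.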

#pos:

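The #pos parser function was merged into the ParserFunctions extension as of version 1.2.0.

The #pos function returns the position of a given search term within the string. The syntax is:

{{#pos:string|search term|offset}}

The offset parameter, if specified, tells the function the position at which to start searching.

If the search term is found, the return value is a zero-based integer giving the position of its first occurrence within the string.

If the search term is not found, the function returns an empty string.

Notes:

  • This function is case sensitive.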
  • The maximum allowed length of the search term is limited through the $wgStringFunctionsLimitSearch global setting.
  • This function is safe with UTF-8 multibyte characters. Example: {{#pos:Žmržlina|lina}} returns 4.
  • <nowiki> and other tag extensions are treated as having a length of 1 for the purposes of character position (note that #len counts them as zero). Example: {{#pos:<nowiki>This is a </nowiki>test|test}} returns 1.
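
As a rough model of the lookup described above, the Python sketch below returns the zero-based position as text, or an empty string when nothing is found. The name wiki_pos is hypothetical, and neither the search-term limit nor the special treatment of tag extensions is modelled.

```python
def wiki_pos(text: str, needle: str, offset: int = 0) -> str:
    """Model of #pos: zero-based index of the first match at or after offset."""
    index = text.find(needle, offset)  # str.find returns -1 when there is no match
    return "" if index < 0 else str(index)


print(wiki_pos("Žmržlina", "lina"))      # positions are counted in characters
print(repr(wiki_pos("Icecream", "x")))   # a failed search gives an empty string
```

Parser functions produce text, which is why the sketch returns a string rather than an integer; the empty result for a failed search is what distinguishes #pos from #rpos.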

#rpos:

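The #rpos parser function was merged into the ParserFunctions extension as of version 1.2.0.

The #rpos function returns the last position of a given search term within the string. The syntax is:

{{#rpos:string|search term}}

If the search term is found, the return value is a zero-based integer giving the position of its last occurrence within the string.

If the search term is not found, the function returns -1.

Tip: When using this to search for the last delimiter, add 1 to the result to get the position just after the last delimiter. This also works when the delimiter is not found, because -1 + 1 is zero, which is the beginning of the given value.

Notes: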

  • This function is case sensitive.
  • The maximum allowed length of the search term is limited through the $wgStringFunctionsLimitSearch global setting.
  • This function is safe with UTF-8 multibyte characters. Example: {{#rpos:Žmržlina|lina}} returns 4.
  • <nowiki> and other tag extensions are treated as having a length of 1 for the purposes of character position (note that #len counts them as zero). Example: {{#rpos:<nowiki>This is a </nowiki>test|test}} returns 1.
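
The tip about adding 1 to the result can be illustrated with a short Python sketch. Both function names are hypothetical, the code is a model rather than the extension's implementation, and the add-one step assumes a single-character delimiter.

```python
def wiki_rpos(text: str, needle: str) -> int:
    """Model of #rpos: zero-based index of the last match, or -1 if absent."""
    return text.rfind(needle)


def after_last(text: str, delimiter: str) -> str:
    """Return what follows the last single-character delimiter.

    When the delimiter is missing, -1 + 1 is 0, so the whole text is returned.
    """
    return text[wiki_rpos(text, delimiter) + 1:]


print(after_last("String/Functions/Code", "/"))  # text after the last slash
print(after_last("Code", "/"))                   # no slash: the whole value
```

Both calls should print Code, which shows why the -1 return value is convenient: the same expression works whether or not the delimiter occurs.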

#sub:

The #sub parser function was merged into the ParserFunctions extension as of version 1.2.0.

The #sub function returns a substring of the given string. The syntax is:

{{#sub:string|start|length}}

The start parameter, if positive (or zero), specifies a zero-based index of the first character to be returned.

Example: {{#sub:Icecream|3}} returns cream.

{{#sub:Icecream|0|3}} returns Ice.

If the start parameter is negative, it specifies how many characters from the end should be returned.

Example: {{#sub:Icecream|-3}} returns eam.

The length parameter, if present and positive, specifies the maximum length of the returned string.

Example: {{#sub:Icecream|3|3}} returns cre.

If the length parameter is negative, it specifies how many characters will be omitted from the end of the string.

Example: {{#sub:Icecream|3|-3}} returns cr.

Notes:

  • If the length parameter is zero, no truncation is applied and the result extends to the end of the string.
    • Example: {{#sub:Icecream|3|0}} returns cream. {{#sub:Icecream|0|3}} returns Ice.
  • If start lies beyond the point where a negative length parameter truncates the string, an empty string is returned.
    • Example: {{#sub:Icecream|3|-6}} returns an empty string.
  • This function is safe with UTF-8 multibyte characters. Example: {{#sub:Žmržlina|3}} returns žlina.
  • <nowiki> and other tag extensions are treated as having a length of 1 for the purposes of character position (note that #len counts them as zero). Example: {{#sub:<nowiki>This is a </nowiki>test|1}} returns test.
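
The interplay of positive, negative and zero values for start and length is easier to follow in code. The Python sketch below models only the rules stated in this section; the name wiki_sub is hypothetical and tag extensions are not handled.

```python
def wiki_sub(text: str, start: int = 0, length: int = 0) -> str:
    """Model of #sub as documented in this section."""
    size = len(text)
    # A negative start counts back from the end of the string.
    begin = start if start >= 0 else max(size + start, 0)
    if length > 0:
        end = begin + length   # positive length: maximum size of the result
    elif length < 0:
        end = size + length    # negative length: omit characters from the end
    else:
        end = size             # zero length: no truncation at all
    # An end at or before the start means nothing is left to return.
    return text[begin:end] if end > begin else ""


for args in [(3,), (0, 3), (-3,), (3, 3), (3, -3), (3, 0), (3, -6)]:
    print(args, repr(wiki_sub("Icecream", *args)))
```

The loop runs through the examples given in this section, in order: cream, Ice, eam, cre, cr, cream, and finally an empty string.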

#pad:

The #pad parser function was NOT merged into the ParserFunctions extension. As an alternative see the padleft and padright parser functions provided by MediaWiki core.

The #pad function returns the given string extended to a given width. The syntax is:

{{#pad:string|length|padstring|direction}}

The length parameter specifies the desired length of the returned string.

The padstring parameter, if specified, is used to fill the missing space. It may be a single character, which will be used as many times as necessary, or a string, which will be concatenated as many times as necessary and then trimmed to the required length. Example: {{#pad:Ice|10|xX}} returns xXxXxXxIce.

If the padstring is not specified, spaces are used for padding.

The direction parameter, if specified, can be one of these values:

  • left - the padding will be on the left side of the string. Example: {{#pad:Ice|5|x|left}} returns xxIce.
  • right - the padding will be on the right side of the string. Example: {{#pad:Ice|5|x|right}} returns Icexx.
  • center - the string will be centered in the returned string. Example: {{#pad:Ice|5|x|center}} returns xIcex.

If the direction is not specified, the padding will be on the left side of the string.

The return value is the given string extended to length characters, using the padstring to fill the missing part(s). If the given string is already longer than length, it is neither extended nor truncated.

Notes:

  • The maximum allowed value for the length is limited through the $wgStringFunctionsLimitPad global setting.
  • This function is only partially safe with UTF-8 multibyte characters. These characters will be treated appropriately if they appear in the original string, but will not be respected if they appear in the padding. Examples:
    • {{#pad:Zmrzlina|12|z}} returns zzzzZmrzlina
    • {{#pad:Žmržlina|12|z}} returns zzzzŽmržlina
    • {{#pad:Žmržlina|12|ž}} returns žžŽmržlina (padded to less than the specified length, since only half the required padding characters are being used)
  • Tags such as <nowiki> and other tag extensions are not permitted in the padding. If the padstring contains such a tag, it will be truncated.
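
The padding rules can be sketched in Python as follows. This is a model under stated assumptions, not the extension's code: the name wiki_pad is hypothetical, the split of an odd amount of centre padding (smaller half on the left) is an assumption, and the $wgStringFunctionsLimitPad check is left out. Because Python counts characters rather than bytes, the sketch does not reproduce the multibyte shortfall described in the notes.

```python
def wiki_pad(text: str, length: int, padstring: str = " ", direction: str = "left") -> str:
    """Model of #pad: extend text to length characters using padstring."""
    missing = length - len(text)
    if missing <= 0 or not padstring:
        return text  # already long enough: neither extended nor truncated

    def fill(count: int) -> str:
        # Repeat the pad string often enough, then trim to the exact size.
        repeats = count // len(padstring) + 1
        return (padstring * repeats)[:count]

    if direction == "right":
        return text + fill(missing)
    if direction == "center":
        left = missing // 2  # assumption: the smaller half goes on the left
        return fill(left) + text + fill(missing - left)
    return fill(missing) + text  # "left" is the documented default


print(wiki_pad("Ice", 10, "xX"))
print(wiki_pad("Ice", 5, "x", "right"))
print(wiki_pad("Ice", 5, "x", "center"))
```

The three calls correspond to the documented examples and should print xXxXxXxIce, Icexx and xIcex.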

#replace:

The #replace parser function was merged into the ParserFunctions extension as of version 1.2.0.

The #replace function returns the given string with all occurrences of a search term replaced with a replacement term. The syntax is:

{{#replace:string|search term|replacement term}}

If the search term is unspecified or empty, a single space will be searched for.

If the replacement term is unspecified or empty, all occurrences of the search term will be removed from the string.

Notes:

  • This function is case sensitive.
  • The maximum allowed length of the search term is limited through the $wgStringFunctionsLimitSearch global setting.
  • The maximum allowed length of the replacement term is limited through the $wgStringFunctionsLimitReplace global setting.
  • Even if the replacement term is a space, an empty string is used. This is a side-effect of the MediaWiki parser. To use a space as the replacement term, put it in nowiki tags.
    • Example: {{#replace:My_little_home_page|_|<nowiki> </nowiki>}} returns My little home page.
    • Note that this is the only acceptable use of nowiki in the replacement term, as otherwise nowiki could be used to bypass $wgStringFunctionsLimitReplace, injecting an arbitrarily large number of characters into the output. For this reason, all occurrences of <nowiki> or any other tag extension within the replacement term are replaced with spaces.
  • This function is safe with UTF-8 multibyte characters. Example: {{#replace:Žmržlina|ž|z}} returns Žmrzlina.
  • If multiple items in a single text string need to be replaced, one could also consider Extension:ReplaceSet. It adds a parser function for a sequence of replacements.

Case insensitive replace

The syntax currently provides no switch for toggling case sensitivity. As a workaround, you can use formatting magic words (e.g. {{lc:your_string_here}}). For example, to remove the word "Category:" from a string regardless of its case, you may type:

{{#replace:{{lc:{{{1}}}}}|category:|}}

The disadvantage is that the output becomes entirely lowercase. To keep the original casing after replacement, you have to nest multiple replace calls, one for each case variant to be matched.
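
The trade-off between the two workarounds can be seen in a small Python model. Both function names are hypothetical, and Python's str.lower() is only an approximation of what the lc magic word does in a given content language.

```python
def replace_lowercased(text: str, search: str, replacement: str = "") -> str:
    """Lowercase first, then replace: matches any casing but loses the original case."""
    return text.lower().replace(search.lower(), replacement)


def replace_variants(text: str, variants: list[str], replacement: str = "") -> str:
    """One case-sensitive replacement per listed variant: the casing is kept."""
    for variant in variants:
        text = text.replace(variant, replacement)
    return text


print(replace_lowercased("Category:Ice Cream", "category:"))
print(replace_variants("Category:Ice Cream", ["Category:", "category:"]))
```

The first call should print ice cream and the second Ice Cream. The second approach corresponds to nesting one #replace call per variant, so a spelling that is not listed (for example CATEGORY:) is left untouched.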

#explode:

The #explode parser function was merged into the ParserFunctions extension as of version 1.2.0

The #explode function splits the given string into pieces and then returns one of the pieces. The syntax is:

{{#explode:string|delimiter|position}}

The delimiter parameter specifies a string to be used to divide the string into pieces. This delimiter string is then not part of any piece, and when two delimiter strings are next to each other, they create an empty piece between them. If this parameter is not specified, a single space is used.

The position parameter specifies which piece is to be returned. Pieces are counted from 0. If this parameter is not specified, the first piece is used (piece with number 0). When a negative value is used as position, the pieces are counted from the end. In this case, piece number -1 means the last piece. Examples:

  • {{#explode:And if you tolerate this| |2}} returns you.
  • {{#explode:String/Functions/Code|/|-1}} returns Code.
  • {{#explode:Split%By%Percentage%Signs|%|2}} returns Percentage.

The return value is the position-th piece. If there are fewer pieces than the position specifies, an empty string is returned.

Notes:

  • This function is case sensitive.
  • The maximum allowed length of the delimiter is limited through the $wgStringFunctionsLimitSearch global setting.
  • This function is safe with UTF-8 multibyte characters. Example: {{#explode:Žmržlina|ž|1}} returns lina.
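
The splitting and indexing rules above can be modelled in a few lines of Python. The name wiki_explode is hypothetical, treating an empty delimiter like a missing one is an assumption, and the delimiter length limit is not enforced.

```python
def wiki_explode(text: str, delimiter: str = " ", position: int = 0) -> str:
    """Model of #explode: split on delimiter and return the piece at position."""
    if delimiter == "":
        delimiter = " "  # assumption: an empty delimiter behaves like a missing one
    pieces = text.split(delimiter)  # adjacent delimiters yield empty pieces
    # Negative positions count from the end; out-of-range positions give "".
    if -len(pieces) <= position < len(pieces):
        return pieces[position]
    return ""


print(wiki_explode("And if you tolerate this", " ", 2))
print(wiki_explode("String/Functions/Code", "/", -1))
print(repr(wiki_explode("a,,b", ",", 1)))
```

The first two calls should print you and Code. The third prints an empty string, because two adjacent delimiters create an empty piece between them.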

#urlencode: and #urldecode:

The #urlencode and #urldecode parser functions were NOT merged into the ParserFunctions extension. As an alternative for #urlencode see the urlencode parser function provided by MediaWiki core. There is no replacement for the #urldecode parser function.

These two functions operate in tandem: #urlencode converts a string into a URL-safe syntax, and #urldecode converts such a string back. The syntax is:

{{#urlencode:value}}
{{#urldecode:value}}

Notes:

  • These functions work by directly exposing PHP's urlencode() and urldecode() functions.
  • For anchors within a page, use {{anchorencode}} instead of {{#urlencode}}. The results of a call to {{anchorencode}} are compatible with intra-page references generated with [[#link]] syntax, while {{#urlencode}}-generated values are not necessarily so.
  • urlencode is also available as a MediaWiki core parser function, which is called with {{urlencode:value}} instead of {{#urlencode:value}}. The ParserFunctions extension has been bundled with MediaWiki since version 1.18; for examples see Help:Magic Words.
  • urldecode works the other way round and turns URL-encoded strings into readable strings. A character code reference can be found at www.w3schools.com.
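
To get a feel for what the two functions produce, the Python standard library offers close analogues of PHP's urlencode() and urldecode(). The sketch below uses them as an approximation; the wrapper names are hypothetical, and the output may differ from PHP's for a few characters such as ~.

```python
from urllib.parse import quote_plus, unquote_plus


def url_encode(value: str) -> str:
    """Approximation of #urlencode: spaces become +, other unsafe bytes become %XX."""
    return quote_plus(value)


def url_decode(value: str) -> str:
    """Approximation of #urldecode: reverses the encoding above."""
    return unquote_plus(value)


encoded = url_encode("Ice cream & waffles")
print(encoded)
print(url_decode(encoded))
```

The first line printed should be Ice+cream+%26+waffles, and decoding it gives back the original text, which is the sense in which the two functions operate in tandem.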

Limits

This module defines three global settings:

  • $wgStringFunctionsLimitSearch
  • $wgStringFunctionsLimitReplace
  • $wgStringFunctionsLimitPad

These are used to limit some parameters of some functions so that the functions operate in O(n) time complexity and are therefore safe against DoS attacks.

$wgStringFunctionsLimitSearch

This setting is used by #pos, #rpos, #replace, and #explode. All these functions search for a substring in a larger string while they operate, which can run in O(n*m) and therefore make the software more vulnerable to DoS attacks. By setting this value to a specific small number, the time complexity is decreased to O(n).

This setting limits the maximum allowed length of the string being searched for.

The default value is 30 multibyte characters.

$wgStringFunctionsLimitReplace

This setting is used by #replace. This function replaces all occurrences of one string with another, which can be used to quickly generate very large amounts of data, and therefore makes the software more vulnerable to DoS attacks. This setting limits the maximum allowed length of the replacing string.

The default value is 30 multibyte characters.

$wgStringFunctionsLimitPad

This setting is used by #pad. This function creates a string of the specified length, which can be used to quickly generate very large amounts of data, and therefore makes the software more vulnerable to DoS attacks. This setting limits the maximum allowed length of the resulting padded string.

The default value is 100 multibyte characters.
