Manual:Parser.php

From mediawiki.org

Description

This file defines the Parser class, whose parse method converts wikitext to HTML.

Getting the Parser

In many contexts, such as when creating a parser function or special page, you should have access to a Parser instance.

How to create a new Parser instance

If you do not have access to one, you can create a new instance by calling the create method of a ParserFactory.[1] The constructor of ParserFactory takes several arguments, so to get a ParserFactory with default options, use MediaWikiServices::getInstance()->getParserFactory().

One-liner for getting a new Parser instance:

$localParser = MediaWikiServices::getInstance()->getParserFactory()->create();

This new instance is not ready for use yet. The next step is to set ParserOptions.
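
A minimal sketch of the whole sequence, assuming it runs inside MediaWiki (for example in an extension or a maintenance script). The function name exampleRenderWikitext is hypothetical, ParserOptions::newFromAnon() is only one of several ways to obtain options, and the accessor used to read the HTML from the ParserOutput differs between MediaWiki versions:

```php
<?php
use MediaWiki\MediaWikiServices;

/**
 * Hypothetical helper: render a wikitext string to HTML with a fresh Parser.
 *
 * @param string $wikitext Wikitext to render
 * @param mixed $page Page context, e.g. a Title from Title::newFromText( 'Sandbox' )
 * @return string Rendered HTML
 */
function exampleRenderWikitext( string $wikitext, $page ): string {
	// Create a new Parser instance through the factory service.
	$parser = MediaWikiServices::getInstance()->getParserFactory()->create();

	// The parser needs ParserOptions; anonymous-user defaults are assumed here.
	$options = ParserOptions::newFromAnon();

	// parse() returns a ParserOutput object rather than a plain string.
	$output = $parser->parse( $wikitext, $page, $options );

	// Assumption: getRawText() is available; newer releases may prefer
	// a different accessor for the rendered HTML.
	return $output->getRawText();
}
```

When an existing context is available, building the options from it (for example with ParserOptions::newFromContext()) is likely a better fit than anonymous defaults.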

The parsing cycle

  1. Parser::parse() first calls the helper function Parser::internalParse(), which in turn calls:
    1. Parser::replaceVariables, which replaces magic variables, templates, and template arguments with the appropriate text.
      1. It calls Parser::preprocessToDom, which preprocesses some wikitext and returns the document tree.
      2. Next it creates a PPFrame object and calls its expand() method to do the actual template magic.
    2. Sanitizer::removeHTMLtags(), which cleans up HTML, removes dangerous tags and attributes, and removes HTML comments.
    3. Parser::handleTables, which handles and renders the wikitext for tables.
    4. Parser::handleDoubleUnderscore, which removes valid double-underscore items, like __NOTOC__, and records them in array $Parser->mDoubleUnderscores.
    5. Parser::handleHeadings, which parses and renders section headers.
    6. Parser::handleInternalLinks, which processes internal links ([[ ]]) and stores them in $Parser->mLinkHolders (a LinkHolderArray object).
    7. Parser::handleAllQuotes, which replaces single quotes with HTML markup (<i>, <b>, etc.).
    8. Parser::handleExternalLinks, which replaces and renders external links.
    9. Parser::handleMagicLinks, which replaces special strings like "ISBN xxx" and "RFC xxx" with magic external links.
    10. Parser::finalizeHeadings, which:
      • auto-numbers headings if that option is enabled,
      • adds an [edit] link to sections for users who have enabled the option and can edit the page,
      • adds a table of contents at the top for users who have enabled the option, and
      • auto-anchors headings.
  2. Next, parse() calls Parser::doBlockLevels, which renders lists from lines starting with ':', '*', '#', etc.
  3. Parser::replaceLinkHolders is called, which calls LinkHolderArray::replace on $Parser->mLinkHolders to replace the link placeholders in the buffer with actual links. The placeholders are created in Skin::makeLinkObj().
  4. Next, the text is language converted (when applicable) using the convert method of the appropriate Language object.
  5. Parser::replaceTransparentTags used to be called, which replaced transparent tags with values which are provided by the callback functions in $Parser->mTransparentTagHooks. Transparent tag hooks are like regular XML-style tag hooks, except they operate late in the transformation sequence, on HTML instead of wikitext.
  6. Sanitizer::normalizeCharReferences is called, which ensures that any entities and character references are legal for XML and XHTML specifically.
  7. If HTML tidy is enabled, MWTidy::tidy is called to do the tidying.
  8. Finally, the rendered HTML result of the parse process is stored in the ParserOutput object $Parser->mOutput, which is returned to the caller of Parser::parse.
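
The placeholder technique in steps 1.6 and 3 can be illustrated with a toy pipeline. This is not MediaWiki's code: the toy* function names, the TOYLINK marker and the /wiki/ URL prefix are all invented for the sketch, and the real parser handles far more syntax. It only shows the idea of collecting links early, running other passes, and substituting the final markup late (which, among other things, lets all link targets be looked up together):

```php
<?php
// Toy illustration only; none of these functions exist in MediaWiki.

// Early pass: swap each [[Target]] for a numbered placeholder and remember the target.
function toyHandleInternalLinks( string $text, array &$holders ): string {
	return preg_replace_callback(
		'/\[\[([^\[\]|]+)\]\]/',
		static function ( array $m ) use ( &$holders ): string {
			$holders[] = trim( $m[1] );
			return '<!--TOYLINK ' . ( count( $holders ) - 1 ) . '-->';
		},
		$text
	);
}

// Late pass: turn every placeholder back into a real anchor element.
function toyReplaceLinkHolders( string $text, array $holders ): string {
	return preg_replace_callback(
		'/<!--TOYLINK (\d+)-->/',
		static function ( array $m ) use ( $holders ): string {
			$target = $holders[(int)$m[1]];
			return '<a href="/wiki/' . rawurlencode( $target ) . '">'
				. htmlspecialchars( $target ) . '</a>';
		},
		$text
	);
}

// Miniature "parse": links first, then quote markup, then placeholder replacement.
function toyParse( string $wikitext ): string {
	$holders = [];
	$text = toyHandleInternalLinks( $wikitext, $holders );
	$text = preg_replace( "/'''(.+?)'''/", '<b>$1</b>', $text );
	$text = preg_replace( "/''(.+?)''/", '<i>$1</i>', $text );
	return toyReplaceLinkHolders( $text, $holders );
}
```

For example, toyParse( "See [[Main Page]] for ''details''." ) yields a link to Main Page followed by an italicised word, with the link markup produced only in the final pass.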

The following hooks are made available at various stages in the parsing cycle:

Version | Hook | Description
1.5.0 | ParserAfterTidy | Used to add some final processing to the fully-rendered page output.
1.6.0 | ParserBeforeInternalParse |
1.6.0 | ParserClearState | Called at the end of Parser::clearState().
1.6.0 | ParserGetVariableValueSwitch | Assigns a value to a user defined variable.
1.6.0 | ParserGetVariableValueTs | Used to change the value of the time for the {{LOCAL...}} magic word.
1.6.0 | ParserGetVariableValueVarCache | Used to change the value of the variable cache or return false to not use it.
1.6.0 | ParserTestParser | Called when creating a new instance of Parser for parser tests.
1.10.0 | InternalParseBeforeLinks | Used to process the expanded wiki code after <nowiki>, HTML-comments, and templates have been treated.
1.10.1 | BeforeParserFetchTemplateAndtitle | Allows an extension to specify a version of a page to get for inclusion in a template.
1.10.1 | BeforeParserrenderImageGallery | Allows an extension to modify an image gallery before it is rendered.
1.12.0 | ParserFirstCallInit | Called when the parser initialises for the first time.
1.12.0 | ParserMakeImageParams | Alter the parameters used to generate an image before it is generated.
1.18.0 | BeforeParserFetchFileAndTitle | Before an image is rendered by Parser.
1.19.0 | ParserSectionCreate | Called each time the parser creates a document section from wikitext.
1.22.0 | ParserLimitReportFormat | Replacement for deprecated ParserLimitReport
1.22.0 | ParserLimitReportPrepare | Replacement for deprecated ParserLimitReport
1.36.0 | BeforeParserFetchTemplateRevisionRecord | Replacement for deprecated BeforeParserFetchTemplateAndtitle
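
As an illustration of how one of these hooks might be used, here is a sketch of a ParserAfterTidy handler. The handler name and the appended comment are hypothetical, and the registration line uses the legacy $wgHooks style that would go in LocalSettings.php; extensions normally declare handlers under "Hooks" in extension.json instead:

```php
<?php
/**
 * Hypothetical handler for the ParserAfterTidy hook.
 *
 * @param mixed $parser The Parser instance (unused in this sketch)
 * @param string &$text Fully rendered HTML, passed by reference so it can be changed
 * @return bool True to let other handlers run
 */
function exampleOnParserAfterTidy( $parser, &$text ) {
	// Final processing step: append a marker comment to the rendered output.
	$text .= "\n<!-- checked by ExampleExtension -->";
	return true;
}

// Legacy-style registration (assumption: placed in LocalSettings.php).
$wgHooks['ParserAfterTidy'][] = 'exampleOnParserAfterTidy';
```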

Other methods

Accessors

  • getCustomDefaultSort(): Accessor for $mDefaultSort. Unlike getDefaultSort(), it returns false if none is set.
  • getDefaultSort(): Accessor for $mDefaultSort. Returns the empty string if none is set.
  • getOptions(): Get the ParserOptions object.
  • getOutput(): Get the ParserOutput object.
  • getPreprocessor(): Get a Preprocessor object.
  • Relating to revisions:
    • getRevisionId(): Get the ID of the revision we are parsing.
    • getRevisionTimestamp(): Get the timestamp associated with the current revision, adjusted for the default server-local timestamp.
    • getRevisionUser(): Get the name of the user that edited the last revision.
  • getTags(): Accessor.
  • getTargetLanguage(): Get the target language for the content being parsed.
  • getTitle(): Accessor for the Title object.
  • getUser(): Get a User object either from $this->mUser, if set, or from the ParserOptions object otherwise.
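
A short sketch of how a callback might read state through these accessors. The function name and the output format are hypothetical; the sketch assumes getRevisionId() may return null when no revision is being parsed and that the object returned by getTargetLanguage() offers getCode():

```php
<?php
/**
 * Hypothetical helper: summarise the current parse using Parser accessors.
 *
 * @param mixed $parser A Parser instance (left untyped so the sketch loads anywhere)
 * @return string Summary such as "rev=42 lang=en"
 */
function exampleDescribeParse( $parser ): string {
	// ID of the revision being parsed; treated as optional in this sketch.
	$revId = $parser->getRevisionId();

	// Language code of the target language for the parsed content.
	$langCode = $parser->getTargetLanguage()->getCode();

	return sprintf( 'rev=%s lang=%s', $revId ?? 'none', $langCode );
}
```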

Set

  • Parser::setFunctionHook(): Create a parser function, e.g. {{#expr: 1 + 1}} or {{sum:1|2|3}}. The callback function can have the form: function myParserFunction( &$parser, $arg1, $arg2, $arg3 ) { ... }.
  • Parser::setHook(): Create an HTML-style tag, e.g. <yourtag>special text</yourtag>. The callback should have the following form: function myParserHook( $text, $params, $parser, $frame ) { ... }
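
The two registration methods might be combined as in the following sketch. The names shout, spoiler and the example* functions are hypothetical. The sketch assumes registration happens in a ParserFirstCallInit handler and that a magic word for shout has been defined in the extension's magic-word i18n file, which setFunctionHook() needs in order to recognise the function name:

```php
<?php
// Hypothetical parser function callback, used as {{#shout:some text}}.
function exampleShout( $parser, $text = '' ) {
	// The return value is treated as wikitext by default.
	return strtoupper( trim( $text ) );
}

// Hypothetical tag callback, used as <spoiler>some text</spoiler>.
function exampleSpoilerTag( $text, array $params, $parser, $frame ) {
	// Tag hooks return HTML, so the raw input is escaped here.
	return '<span class="example-spoiler">'
		. htmlspecialchars( $text ?? '' ) . '</span>';
}

// Hypothetical ParserFirstCallInit handler that registers both callbacks.
function exampleRegisterParserHooks( $parser ) {
	$parser->setFunctionHook( 'shout', 'exampleShout' );
	$parser->setHook( 'spoiler', 'exampleSpoilerTag' );
	return true;
}
```

Keeping the callbacks free of side effects, as here, makes them easy to exercise outside a running wiki.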

Dynamic properties

Do not add class properties to the Parser that have not been declared by the Parser. Exceptions aside, dynamic properties are deprecated in PHP 8.2, and they may not sit well with the shift to parallelised parsing envisaged for Parsoid. If you are writing an extension and need to store custom data in the Parser, see extension data for one possible way out.
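
A sketch of the extension-data approach, assuming "extension data" refers to the setExtensionData() and getExtensionData() methods of the ParserOutput returned by getOutput(). The function name and the example-use-count key are hypothetical:

```php
<?php
/**
 * Hypothetical helper: count how often an extension feature is used on a page,
 * storing the counter in the ParserOutput instead of a dynamic Parser property.
 *
 * @param mixed $parser A Parser instance
 * @return int Number of uses recorded so far, including this one
 */
function exampleCountUse( $parser ): int {
	$output = $parser->getOutput();

	// Assumption: getExtensionData() returns null when the key was never set.
	$count = (int)( $output->getExtensionData( 'example-use-count' ) ?? 0 ) + 1;

	$output->setExtensionData( 'example-use-count', $count );
	return $count;
}
```

Prefixing the key with the extension's name, as above, reduces the chance of colliding with data stored by other extensions.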

See also


References

  1. Prior to MediaWiki 1.36, it was still possible to construct a Parser class directly.