Parsoid/Todo:PHP parser integration
Extension expansion
Most extensions do not depend on expansion order or frame state, so they can be expanded in parallel and out of order. The extensions listed below are the exceptions among the 455 extensions in the Wikimedia extensions.git repository.
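A minimal sketch of this split, assuming a caller-supplied expand function and a hand-maintained set of order-dependent extension names (both hypothetical, not part of Parsoid or MediaWiki). Order-independent calls are handed to a thread pool, while order-dependent ones run one at a time in document order. Frame-state dependence is a separate concern and is not modelled here.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical set of extensions that must be called sequentially, in document order.
ORDER_DEPENDENT = {"Cite", "Arrays", "HashTables", "Loops"}


def expand_all(calls, expand):
    """Expand a list of (extension_name, source) calls.

    `expand` is a caller-supplied function (name, source) -> str.
    Results are returned in document order regardless of completion order.
    """
    results = [None] * len(calls)
    independent = [i for i, (name, _) in enumerate(calls) if name not in ORDER_DEPENDENT]
    dependent = [i for i, (name, _) in enumerate(calls) if name in ORDER_DEPENDENT]

    with ThreadPoolExecutor() as pool:
        # Order-independent calls may run concurrently and finish in any order.
        futures = {i: pool.submit(expand, *calls[i]) for i in independent}
        # Order-dependent calls run one at a time, in document order.
        for i in dependent:
            results[i] = expand(*calls[i])
        for i, future in futures.items():
            results[i] = future.result()
    return results
```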
Extension tags depending on frame state
The following extensions define extension tags (which are not run by the PHP preprocessor) that depend on the frame state (grep -r 'frame->expand' extensions; grep -r 'frame->getArguments' extensions):
- Arrays (frame->expand, shared state so order-dependent)
- Carp (debugging extension, low-level frame access)
- ExtTab / ET_ParserFunction (frame->expand)
- FacebookOpenGraph (parser->replaceVariables, parser->recursiveTagParse)
- HTMLTags (parser->replaceVariables)
- HashTables (frame->expand, frame->getArguments, order-dependent)
- LabeledSectionTransclusion (frame->expand)
- Loops (frame->expand, frame->getArgument, order/nesting-dependent)
- Poem (parser->recursiveTagParse)
- RSS (parser->recursiveTagParse on an optional per-RSS-item wikitext-based template)
- SelectTag (parser->recursiveTagParse)
- SoundManager2Button (parser->recursiveTagParse)
- Spark (parser->replaceVariables)
- Validator (parser->recursiveTagParse)
- WikitextLoggedInOut (parser->recursiveTagParse)
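The grep-based survey above can be approximated with a short script. This is a sketch only: it assumes a local checkout with one directory per extension, looks only at .php files (the grep commands above did not filter by file type), and uses a pattern built from the call names mentioned in the list. The function name is hypothetical.

```python
import os
import re
from collections import defaultdict

# Calls that suggest a dependency on frame or parser state (names taken from the list above).
PATTERN = re.compile(
    r"frame->(?:expand|getArguments?)|parser->(?:replaceVariables|recursiveTagParse)"
)


def find_frame_dependent(extensions_dir):
    """Map each top-level extension directory to the set of matching calls in its PHP files."""
    hits = defaultdict(set)
    for root, _dirs, files in os.walk(extensions_dir):
        rel = os.path.relpath(root, extensions_dir)
        if rel == ".":
            continue  # files directly in the checkout root belong to no extension
        extension = rel.split(os.sep)[0]
        for name in files:
            if not name.endswith(".php"):
                continue
            path = os.path.join(root, name)
            with open(path, encoding="utf-8", errors="replace") as handle:
                hits[extension].update(PATTERN.findall(handle.read()))
    # Drop extensions whose PHP files contained no matching call.
    return {ext: calls for ext, calls in hits.items() if calls}
```

A match only flags a candidate. Whether an extension registers a tag hook or a parser function, and whether it is order-dependent, still has to be checked by hand.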
Parser functions depending on frame state
These extensions only define parser functions (which are run by the preprocessor) that depend on the frame state:
- CreatePage (frame->expand)
- GeoData (frame->expand)
- PageInCat (frame->expand)
- ParserFun (frame->expand, frame stack access, ...)
- ParserFunctions (frame->expand etc)
- RegexFun (low-level frame access)
- ReplaceSet (frame->expand)
- Scribunto (frame->getArguments)
- SemanticForms (frame->expand)
- SemanticMediaWiki (frame->expand etc.; not 100% sure whether it also registers tags)
- SubpageFun (frame->expand)
- WikiLovesMonuments (frame->expand, frame->getArgument)
Order-dependent parser functions:
- UserFunctions (dynamic user-defined parser functions)
Parser functions adding global state:
- Description2 (frame->expand); also adds an output hook which adds a global meta tag to the parser output
Order-dependent extensions
These typically maintain internal state between calls and expect all hooks in a page to be called sequentially.
Enabled on WMF wikis
- Cite: We have a strategy for handling this: re-render the references sections and the numbering as a post-processing step on the full DOM.
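A rough sketch of such a post-processing pass, under simplifying assumptions: the expanded document is well-formed XML, each footnote is left as a ref placeholder element with an optional name attribute, and an empty references element marks where the list belongs. The element layout and the function name are hypothetical and do not reflect Cite's or Parsoid's actual DOM. Reference groups and multiple independent references sections are ignored.

```python
import xml.etree.ElementTree as ET


def renumber_references(root):
    """Number <ref> placeholders in document order and fill every <references> element."""
    numbers = {}  # reference key -> assigned number
    notes = []    # footnote texts, in order of first use
    for index, ref in enumerate(root.iter("ref")):
        # Named references share one number; anonymous ones each get their own key.
        key = ref.get("name") or f"__anon_{index}"
        if key not in numbers:
            numbers[key] = len(notes) + 1
            notes.append(ref.text or "")
        ref.text = f"[{numbers[key]}]"
    # Materialise the list first: the tree is modified while the list is filled.
    for references in list(root.iter("references")):
        for number, text in enumerate(notes, start=1):
            item = ET.SubElement(references, "li")
            item.text = f"{number}. {text}"
    return root
```

Because the numbering is derived from the final document order, the individual ref expansions no longer need to share state or run sequentially.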
Third party
Preprocessor (function hooks):
- Arrays: explicitly defines mutable state in arrays. WONTFIX.
Possible solutions
- Add an expandTemplatesAndMostTagExtensions API method to the MW API that expands all templates and most extensions (possibly all except Cite).
  - Top-level template expansion is probably an acceptable granularity for incremental updates. Highly dynamic extensions can still be inserted without a template wrapper to avoid template re-expansions.
  - Avoids the need to serialize and send back frame information for most extensions.
  - Should add encapsulation tags around extension output so that we can treat it differently for sanitization purposes.
- Parse all templates in a single action=parse call, separated by unique strings so that the results can be split per template transclusion.
  - Problem: single-threaded, and hides a lot of information we would like to have.
- Instrument the PHP preprocessor to provide a serialized frame parameter for unexpanded extension tags.
  - Lets us perform the expansion independently.
- Add an API method for direct extension calls rather than using action=parse.
  - Can support wikitext-returning tag extensions (TODO: find those!).
  - Extension calls are still needed for top-level extensions, even with an expandTemplatesAndMostTagExtensions API method.
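The single-call option (all transclusions in one request, separated by unique strings) can be sketched as follows. The helper names and the marker format are hypothetical, and the actual API request is left out. The sketch assumes the marker survives expansion unchanged.

```python
import uuid


def batch_transclusions(transclusions):
    """Join transclusion sources into one wikitext string, separated by a unique marker."""
    # A random hex marker is very unlikely to collide with real page content.
    marker = f"parsoid-split-{uuid.uuid4().hex}"
    separator = f"\n\n{marker}\n\n"
    return separator.join(transclusions), marker


def split_results(expanded, marker, expected):
    """Split the expanded text back into one result per transclusion."""
    parts = [part.strip() for part in expanded.split(marker)]
    if len(parts) != expected:
        raise ValueError(f"expected {expected} results, got {len(parts)}")
    return parts
```

With HTML output the marker would typically end up wrapped in a paragraph, so real splitting would also have to remove that wrapper. The count check at least detects the case where a template swallowed or duplicated a marker.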
Information we would like to get from action=expandtemplates and extension expansions
- List of templates and parser functions used in the expansion.
  - Lets us track dependencies and cacheability for selective re-rendering (examples: #time-dependent output as used on en:Main Page; which templates should trigger fragment re-rendering; etc.).
- A TTL, if the output is time-sensitive (for example, for #time): the minimum of the TTLs of all parser functions used, if any.
- Re-render events this content block depends on. Can be empty or any combination of the events listed in the fragment index section.
- (Maybe:) Serialized frame for tag extensions in template expansion output.
  - Lets us expand those extension tags with the proper frame.
  - BUT: parent frame access is not generally provided by common extensions (see User:GWicke/Test).
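One way to picture the metadata listed above is as a small record per expanded content block, merged upwards when expansions nest. The field names are hypothetical and no existing API returns this structure. The TTL rule is the one stated above: the minimum over all time-sensitive parts, or none if nothing is time-sensitive.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ExpansionInfo:
    """Hypothetical metadata record for one expanded content block."""
    templates: set = field(default_factory=set)        # templates and parser functions used
    ttl: Optional[int] = None                          # seconds; None means not time-sensitive
    rerender_events: set = field(default_factory=set)  # events that invalidate this block


def combine(infos):
    """Merge the metadata of nested expansions into one record for the enclosing block."""
    merged = ExpansionInfo()
    for info in infos:
        merged.templates |= info.templates
        merged.rerender_events |= info.rerender_events
        if info.ttl is not None:
            # The block is only valid for as long as its shortest-lived part.
            merged.ttl = info.ttl if merged.ttl is None else min(merged.ttl, info.ttl)
    return merged
```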
See also Parsoid/Page metadata.