Extension:External Data/Glossary

From mediawiki.org
Data processing functions
Data processing functions are parser functions that display or otherwise use external data: #external_value, #for_external_table, #display_external_table, #format_external_table, #store_external_table, #clear_external_data. In Standalone mode, they also retrieve the necessary data; in Legacy mode, they display data previously retrieved by Data retrieving functions.
Data retrieving functions
Data retrieving functions are parser functions, defined only in Legacy mode, that retrieve data to be processed further down the wikitext by Data processing functions. They are: #get_external_data (the universal function that can replace any of the functions below), #get_web_data, #get_soap_data, #get_file_data, #get_db_data, #get_ldap_data, #get_program_data, #get_inline_data.
Data source
A data source is a URL, a host, a second-level domain, an inline text to be parsed, a connection to a database, LDAP server, etc. configured in LocalSettings.php, or the generic source *. Texts cannot be configured; URLs can be but do not need to be; all other sources must be. For a web or SOAP connection, four data sources are analysed to obtain the necessary configuration: *, the second-level domain, the host and the URL. For other types of connection, only two are analysed: * and the specific connection itself. A data source can contain parameters that are usually passed to the parser function, overriding them. Data sources are configured in the $wgExternalDataSources array.
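The lookup order for a web data source can be illustrated with a short Python sketch. It is not the extension's code: the function names candidate_ids and effective_settings, the naive second-level-domain rule (the last two labels of the host), the setting key hypothetical_setting, and the assumption that a more specific source overrides a more general one are all hypothetical.

```python
from urllib.parse import urlsplit


def candidate_ids(url):
    """Return the four data source IDs checked for a URL, general to specific."""
    host = urlsplit(url).hostname or ""
    # Assumption: the second-level domain is the last two labels of the host.
    second_level = ".".join(host.split(".")[-2:])
    return ["*", second_level, host, url]


def effective_settings(sources, url):
    """Merge the settings of every matching source, in lookup order."""
    merged = {}
    for source_id in candidate_ids(url):
        # Assumption: a more specific source overrides a more general one.
        merged.update(sources.get(source_id, {}))
    return merged


# Hypothetical stand-in for the $wgExternalDataSources array.
sources = {
    "*": {"format": "auto", "hypothetical_setting": 10},
    "example.org": {"format": "JSON"},
    "https://api.example.org/v1/items": {"hypothetical_setting": 30},
}
settings = effective_settings(sources, "https://api.example.org/v1/items")
print(settings)
```

For a database or LDAP connection, the same merge would run over only two IDs: * and the connection's own identifier.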
Data source ID
A data source ID is the string that identifies a Data source and serves as its key in the $wgExternalDataSources array. For a URL, it is the URL itself, its host or its second-level domain; for other types of connection, it is the connection's identifier, passed in a parameter named, depending on the type, domain, db, program, text, etc. Any of these parameter names except text can be replaced with source; for Hidden data sources, only source can be used. In most cases, this parameter can be passed to the parser function as its only anonymous argument.
Format
For data sources that return text (e.g., web, SOAP, server-side program), the format of the returned text defines how it is parsed into external variables. Available formats are: 'CSV', 'CSV with header', 'GFF', 'JSON', 'JSON with JSONpath', 'YAML', 'YAML with JSONpath', 'XML', 'XML with XPath', 'HTML', 'HTML with XPath', 'ini' or 'text'.
Format auto-detection
Format auto-detection is the attempted detection of the actual text format based on the file or URL extension, some other arguments to the parser function and the parsed content itself. It is enabled by |format=auto.
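The idea of detecting a format first from the extension and then from the content can be sketched in Python as follows. This is only an illustration under stated assumptions, not the extension's algorithm: the function guess_format, the extension table and the JSON-only content check are hypothetical, and other parser function arguments are ignored.

```python
import json
from pathlib import PurePosixPath
from urllib.parse import urlsplit

# Hypothetical extension-to-format table; the extension's real rules may differ.
FORMAT_BY_EXTENSION = {
    ".csv": "CSV",
    ".json": "JSON",
    ".yaml": "YAML",
    ".yml": "YAML",
    ".xml": "XML",
    ".html": "HTML",
    ".ini": "ini",
    ".gff": "GFF",
}


def guess_format(url, text):
    """Guess a format name from the URL extension, then from the content."""
    suffix = PurePosixPath(urlsplit(url).path).suffix.lower()
    if suffix in FORMAT_BY_EXTENSION:
        return FORMAT_BY_EXTENSION[suffix]
    try:
        # Assumption: content that parses as JSON is treated as JSON.
        json.loads(text)
        return "JSON"
    except ValueError:
        # Fall back to unparsed text when nothing else matches.
        return "text"


print(guess_format("https://example.org/data.csv?x=1", "a,b\n1,2"))
print(guess_format("https://example.org/api", '{"a": 1}'))
```

The extension is checked first because it is cheap; the content is inspected only when the extension gives no answer.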
Hidden data source
A hidden data source is a Data source with the setting [ ... 'hidden' => true ... ]. It is supposed to contain all settings usually passed to the parser function, thus hiding all its details, even its type, from the wiki user. In Legacy mode, it can only be invoked with #get_external_data, and its identifier can only be passed as source.
Intuitive parsing
Intuitive parsing happens when the wikitext is passed to #for_external_table as its second parameter rather than the first, i.e., when a pipe character is inserted after the colon. In this mode, the wikitext, including variable values in {{{...}}}, is parsed correctly. This is the recommended mode.
Limited parsing
Limited parsing happens when the wikitext is passed to #for_external_table as its first parameter. In this mode, the wikitext, including variable values in {{{...}}}, is parsed counter-intuitively: parser functions in the wikitext are applied to the variables' names rather than their values, which severely limits wikitext functionality. This mode is kept for backward compatibility.
Legacy mode
Legacy mode is enabled when $wgExternalDataAllowGetters = true; is set. Only in Legacy mode are Data retrieving functions available, as well as Lua functions other than mw.ext.externalData.getExternalData(). This mode is kept for backward compatibility.
Lua functions
Lua functions retrieve external data and return it as a two-dimensional row-based Lua table. They are: mw.ext.externalData.getExternalData() (in both modes) and mw.ext.externalData.getWebData(), mw.ext.externalData.getFileData(), mw.ext.externalData.getDbData(), mw.ext.externalData.getSoapData(), mw.ext.externalData.getLdapData(), mw.ext.externalData.getProgramData(), mw.ext.externalData.getInlineData() (in Legacy mode only). Lua functions accept the same parameters as Data retrieving functions, wrapped in an associative Lua table; if the Data source ID is the only parameter, it can be passed without wrapping it in a table, e.g., mw.ext.externalData.getDbData 'some database'.
Parsing
Some data source types (web, SOAP, program, inline text) fetch external data as text, which may need to be parsed as CSV, XML, JSON, etc. The parsing is done by a Data retrieving function in Legacy mode, by a Data processing function in Standalone mode, or by a Lua function. The Format can either be set explicitly with the format parameter or be auto-detected, which works in most cases.
Preset
A preset is a preconfigured data source found in the includes/presets directory. Most presets describe a connection to a Docker container, for which Dockerfiles and docker-compose.yml snippets are also provided. Presets can be used to test the extension for regressions, to retrieve reference data or (with the Tag emulation mode) to embed media content, especially various charts, replacing many specialised MediaWiki extensions.
Standalone mode
Standalone mode is the mode of calling Data processing functions in which the parameters needed to fetch the external data are passed to them directly and Data retrieving functions are not used. The external data fetched this way is not available elsewhere in the wikitext. This mode is the only option if the wikitext is parsed by Parsoid, since Parsoid does not guarantee parsing order.
Tag emulation mode
The tag emulation mode is the mode of retrieving and displaying data with a parser tag, available if ... 'tag' => 'some tag' ... is defined for the Data source. This is useful when a program or web request returns SVG that needs to be embedded in the resulting HTML.