Extension:External Data/Glossary
- Data processing functions
- Data processing functions are parser functions that display or otherwise use external data: #external_value, #for_external_table, #display_external_table, #format_external_table, #store_external_table, #clear_external_data. In Standalone mode they also retrieve the necessary data; in Legacy mode they display data previously retrieved by Data retrieving functions.
- Data retrieving functions
- Data retrieving functions are parser functions defined in Legacy mode that retrieve the data to be processed further down the wikitext by Data processing functions. They are: #get_external_data (the universal function that can replace any of the functions below), #get_web_data, #get_soap_data, #get_file_data, #get_db_data, #get_ldap_data, #get_program_data, #get_inline_data.
- Data source
- A data source is a URL, a host, a second-level domain, an inline text to be parsed, a connection to a database, LDAP server, etc. configured in `LocalSettings.php`, or a generic source `*`. Texts cannot be configured; URLs can be but do not need to be; all other sources must be. For a web or SOAP connection, four data sources are analysed to obtain the necessary configuration: `*`, the second-level domain, the host and the URL. For other types of connection, only two are analysed: `*` and the specific connection itself. A data source can contain the parameters that are usually passed to the parser function, overriding them. Data sources are configured in the `$wgExternalDataSources` array.
- Data source ID
- A data source ID is the string that identifies a data source and serves as its key in the `$wgExternalDataSources` array. For URLs, it is the URL, its host or its second-level domain; for other types of connection, it is the connection's identifier, which is passed, depending on the type, as `domain`, `db`, `program`, `text`, etc. Any of these except `text` can be replaced with `source`, and for Hidden data sources only `source` can be used. In most cases, this parameter can be passed to the parser function as its only anonymous argument.
- Format
- For data sources that return text (e.g., web, SOAP, server-side program), the format of the returned text, which defines how it is parsed into external variables. Available formats are: 'CSV', 'CSV with header', 'GFF', 'JSON', 'JSON with JSONpath', 'YAML', 'YAML with JSONpath', 'XML', 'XML with XPath', 'HTML', 'HTML with XPath', 'ini' or 'text'.
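As a rough illustration of how data source entries and formats might fit together, the following `LocalSettings.php` fragment defines one entry under each of the four keys that are analysed for a web connection. The domain names, the URL and the chosen formats are hypothetical, and using `format` as a data source setting is an assumption based on the statement that a data source can contain the parameters usually passed to the parser function.

```php
// LocalSettings.php (fragment). All hosts, URLs and format choices below are
// hypothetical placeholders, not defaults shipped with the extension.

// Generic source.
$wgExternalDataSources['*'] = [ 'format' => 'text' ];

// Second-level domain.
$wgExternalDataSources['example.org'] = [ 'format' => 'JSON' ];

// Host.
$wgExternalDataSources['api.example.org'] = [ 'format' => 'CSV with header' ];

// Full URL.
$wgExternalDataSources['https://api.example.org/feed.xml'] = [ 'format' => 'XML' ];
```

The glossary does not state how conflicting settings from these four entries are combined, so the sketch makes no claim about precedence.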
- Format auto-detection
- An attempt to detect the actual text format from the file or URL extension, some other arguments to the parser function and the parsed content itself; enabled by `|format=auto`.
- Hidden data source
- A hidden data source is a Data source with the setting `[ ... 'hidden' => true ... ]`. It is supposed to contain all the settings usually passed to the parser function, thus hiding all its details, even its type, from the wiki user. In Legacy mode, it can only be invoked with get_external_data, and its identifier can only be passed as `source`.
- Intuitive parsing
- Intuitive parsing happens when the wikitext is passed to for_external_table as its second parameter rather than the first, i.e., when a pipe character is inserted after the colon that follows the function name. In this mode, the wikitext, including variable values in `{{{...}}}`, is parsed correctly. This is the recommended mode.
- Limited parsing
- Limited parsing happens when the wikitext is passed to for_external_table as its first parameter. In this mode, the wikitext, including variable values in `{{{...}}}`, is parsed counter-intuitively: parser functions in the wikitext are applied to the variables' names rather than their values, which severely limits wikitext functionality. This mode is kept for backward compatibility.
- Legacy mode
- Legacy mode is enabled when `$wgExternalDataAllowGetters = true;` is set. Only in Legacy mode are Data retrieving functions available, as well as Lua functions other than `mw.ext.externalData.getExternalData()`. This mode is kept for backward compatibility.
- Lua functions
- Lua functions retrieve external data and return it as a two-dimensional row-based Lua table. They are `mw.ext.externalData.getExternalData()` (in both modes) and `mw.ext.externalData.getWebData()`, `mw.ext.externalData.getFileData()`, `mw.ext.externalData.getDbData()`, `mw.ext.externalData.getSoapData()`, `mw.ext.externalData.getLdapData()`, `mw.ext.externalData.getProgramData()`, `mw.ext.externalData.getInlineData()` (in Legacy mode). Lua functions accept the same parameters as Data retrieving functions, wrapped in an associative Lua table; if the Data source ID is the only parameter, it can be passed without wrapping it in a table, e.g., `mw.ext.externalData.getDbData 'some database'`.
- Parsing
- Some data source types (web, SOAP, program, inline text) fetch external data as text, which may need to be parsed as CSV, XML, JSON, etc. The parsing is done by a Data retrieving function in Legacy mode, by a Data processing function in Standalone mode, or by a Lua function. The Format can either be set explicitly with the `format` parameter or auto-detected, which works in most cases.
- Preset
- A preset is a preconfigured data source found in the `includes/presets` directory. Most presets describe a connection to a Docker container, for which `Dockerfile`s and `docker-compose.yml` snippets are also provided. Presets can be used to test the extension for regressions, to retrieve some reference data or (with the Tag emulation mode) to embed media content, especially various charts, replacing many specialised MediaWiki extensions.
- Standalone mode
- Standalone mode is the mode of calling Data processing functions in which the parameters needed to fetch the external data are passed to them directly and Data retrieving functions are not used. External data fetched this way is not available elsewhere in the wikitext. This mode is the only option if the wikitext is parsed by Parsoid, since Parsoid does not guarantee parsing order.
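Whether Data retrieving functions are available at all depends on the Legacy mode switch. The fragment below is a minimal sketch of a wiki restricted to Standalone mode usage; it assumes that setting `$wgExternalDataAllowGetters` to `false` is the counterpart of the `true` value named under Legacy mode, which the glossary itself does not spell out.

```php
// LocalSettings.php (fragment).
// Assumption: with the getters disallowed, Data retrieving functions are not
// defined, so wikitext has to pass the fetch parameters directly to
// Data processing functions (Standalone mode).
$wgExternalDataAllowGetters = false;
```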
- Tag emulation mode
- The tag emulation mode is the mode of retrieving and displaying data with a parser tag, available if `... 'tag' => 'some tag' ...` is defined for the Data source. This is useful when a program or web request returns SVG that needs to be embedded in the resulting HTML.
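The `hidden` and `tag` settings mentioned in this glossary could be combined in a single data source definition along the following lines. This is only a sketch: the data source ID `chart`, the tag name, the endpoint and the `url` and `format` settings are hypothetical stand-ins for whatever parameters would otherwise be passed to the parser function, and nothing in the glossary requires the two settings to be used together.

```php
// LocalSettings.php (fragment). Every value below is a hypothetical placeholder.
$wgExternalDataSources['chart'] = [
    // Assumed fetch parameters, moved from the wikitext into the configuration.
    'url'    => 'https://charts.example.org/render.svg',
    'format' => 'text',
    // Hidden data source: its details, even its type, are hidden from the wiki user.
    'hidden' => true,
    // Tag emulation mode: retrieve and display the data with a parser tag.
    'tag'    => 'chart',
];
```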