Extension:External Data/Databases
You can use External Data to get data from outside databases; most relational DB types are supported, as well as MongoDB. The retrieval is done with either a set of parameters that together represent a SELECT statement, or the name of a prepared statement.
As of version 3.2, the recommended way to retrieve database data is to use one of the display functions (#external_value, #for_external_table, etc.), passing in the necessary parameters for the data retrieval, most notably "source=". You can also retrieve database data by calling the #get_db_data function, or (for version 3.0 and higher) #get_external_data. In all of these cases, you must specify the information, including login information, for the database in the variable $wgExternalDataSources in LocalSettings.php.
For any of these parser functions, you can also call its corresponding Lua function.
A note about security: If you are going to retrieve database data, you should think about the security implications. Configuring a database in LocalSettings.php will allow anyone with edit access to your wiki to run arbitrary SQL statements against that database, unless prepared statements are configured for the database connection. You should use a database user that has the minimum permissions for what you are trying to achieve. It is possible that complex SQL constructions could be passed to this function to cause it to do things vastly different from what it was designed for.
Configuration
[edit]Each database being accessed needs to be configured separately in LocalSettings.php. For normal databases (i.e., everything except for SQLite), add the following stanza for each database:
$wgExternalDataSources['ID'] = [
'server' => 'server URL',
'type' => 'DB type',
'name' => 'DB name',
'user' => 'username',
'password' => 'password'
];
Where:
- ID - a label for this database which is used when calling #get_db_data
- server URL - the hostname on which the database lives
- DB type - the type of database, i.e. mysql, postgres, mssql, oracle, sqlite, db2 or mongodb
- DB name, username, password - details for accessing the database.
An example of a set of values would be:
$wgExternalDataSources['ID'] = [
'server' => '127.0.0.1',
'type' => 'mysql',
'name' => 'employeesDatabase',
'user' => 'guest',
'password' => 'p@ssw0rd'
];
The following optional settings can also be added:
$wgExternalDataSources['ID']['flags'] = 'MediaWiki DB flags';
$wgExternalDataSources['ID']['prefix'] = 'table prefix';
Example values for these variables are:
$wgExternalDataSources['employee-db']['flags'] = DBO_NOBUFFER & DBO_TRX;
$wgExternalDataSources['employee-db']['prefix'] = 'emp_';
Support for database systems
[edit]MySQL, Postgres (i.e. PostgreSQL), DB2 and MongoDB should work fully by default (though there are syntax limitations, and differences, for MongoDB - see below). For MS SQL/SQL Server, SQLite and Oracle, you may need to perform some special handling.
Postgres
[edit]If you cannot connect to a PostgreSQL database, it may be because your PHP installation is lacking the PostgreSQL database module, php-pgsql. On many Linux systems, you can install it by calling the following, then restarting the web server:
yum install php55-php-pgsql
Amend the above configuration in LocalSettings.php to change the server type to "postgres":
$wgExternalDataSources['ID']['type'] = 'postgres';
SQLite
[edit]To connect to SQLite, you need something like the following in LocalSettings.php:
$wgExternalDataSources['ID'] = [
'type' => 'sqlite',
'directory' => '/directory/to/DB/file',
'name' => 'Name of file, without .sqlite'
];
Oracle
[edit]Connecting to Oracle may work by default. If it doesn't work, the following may help:
- Make sure that the Oracle client, and the PHP version being used, are using the same architecture: they have to either both be 32-bit, or both be 64-bit.
- Make sure that the value of $edgDBServer for the installation matches something in the corresponding Oracle client .ora files. The value may need to look like "serverName/dbName", as opposed to "serverName".
- If none of the above are the issue, you could try using the OdbcDatabase extension, which should work as well.
MongoDB
[edit]For MongoDB, there are no special connection parameters, although the username and password may be optional.
There are two optional query parameters: aggregate
and find query
.
Under PHP 7.*, the extension mongodb
and library mongodb
(composer require mongodb/mongodb
) is required. Unfortunately, due to the way that MediaWiki continuous integration is built, this library cannot be simply added to composer.json
for this extension (see T259743).
MongoDB is a non-SQL (or "NoSQL", if you prefer) database system, with its own querying language.
When accessing MongoDB, you can either pass in a standard MongoDB query, or use the standard SQL-like syntax of #get_db_data.
To use standard MongoDB querying, pass the query to the parameter |find query=
or |aggregate=
.
You can also use the standard querying functionality. There are some restrictions and differences, however, for the "where" clause:
- only "AND"s can be used, not "OR"s
- for the "LIKE" comparison, no text should be placed around the comparator - it should look like "Username LIKE Jo", not "Username LIKE '%Jo%'".
Because MongoDB returns values in JSON that may be complex, and contain compound values, you can get data that is stored in such a way by separating field names with dots. For instance, if the return data contains a value for a field called "Measurements" that is an array, holding values for fields called "Height" and "Width", then the "data=" parameter to #get_web_data could have a value like "height=Measurements.Height,width=Measurements.Width".
You can do Memcached- or APCU-based caching of values retrieved from MongoDB; to do that, you need the following two lines in LocalSettings.php:
$wgMainCacheType = CACHE_MEMCACHED /* or CACHE_ACCEL */;
$wgDataSources['MongoDB ID']['cache seconds'] = ''number of seconds'';
To enable ModgoDB under PHP 7.4, mongodb
extension should be enabled (sudo apt install php-mongodb && sudo phpenmod mongodb
; and also mongodb
library should be installed with Composer: composer require mongodb/mongodb "^1.6.0"
(this will be necessary until bug T259743 is resolved).
Microsoft SQL Server
[edit]MediaWiki formerly natively supported MS SQL Server; this support went away in version 1.34. However, starting with version 3.1, External Data provides its own support for accessing MS SQL Server via ODBC.
The typical settings for accessing MS SQL Server will be:
$wgExternalDataSources['mssql'] = [
'driver' => 'ODBC Driver 17 for SQL Server',
'server' => 'localhost,5500',
'type' => 'odbc',
'name' => 'NorthWind',
'user' => 'Test',
'password' => 'Test0000'
];
Note the 'driver'
setting and the comma between domain name and port in the 'server'
parameter. The odbc PHP extension has to be installed, as well as the Microsoft ODBC driver for SQL Server.
The file /opt/microsoft/msodbcsql17/etc/odbcinst.ini
should contain
[ODBC Driver 17 for SQL Server] Description=Microsoft ODBC Driver 17 for SQL Server Driver=/opt/microsoft/msodbcsql17/lib64/libmsodbcsql-17.5.so.2.1
Usage
[edit]The examples below all refer to #get_db_data. To retrieve DB data using any of the other parser (or Lua) functions, you can use all of the same parameters, but specify "source=", instead of "db=", for the ID stored in LocalSettings.php.
To get data from an external database, call the following:
{{#get_db_data: db=database ID. This parameter can be passed as 'source' or even anonymously, provided there are no equal signs in it |from=from clause |where=where clause |limit=limit |order by=order by clause |group by=group by clause |data=data mappings |suppress error }}
An explanation of the fields:
Parameter name | Description | Additional notes |
---|---|---|
db=
|
The identifying label configured in LocalSettings.php | |
from=
|
An SQL "FROM" clause, i.e. one or more tables | Can be as simple as tableName or as complex as tableName1 = alias1, tableName2 = alias2, etc.
|
join on=
|
Corresponds to an SQL "JOIN ... ON" clause; used if there is more than one table being queried. | An example value would be tableName1.ID = tableName2.id_field, etc.
|
where=
|
An SQL "WHERE" clause (optional) | |
limit=
|
An SQL "LIMIT" clause, i.e. a number, limiting the number of results returned (optional) | |
order by=
|
An SQL "ORDER BY" clause (optional) | |
data=
|
Mapping of database column names to local variables. | The syntax is localVariable1=databaseColumn1, localVariable2=databaseColumn2, … . In the example below, "employeeName" is the name of the database column and "name" is the local variable.
Unless the database is MongoDB, |
suppress error
|
Prevents any error message from getting displayed if there is a problem retrieving the data (optional) |
An example call, using the "employee database" example from above:
{{#get_db_data: db=employee-db |from=tblPersonalInfo |where=employeeID='{{{id}}}' |limit=50 |order by=employeeName ASC |data=dbemail=employeeEmail,name=employeeName }}
Prepared statements
[edit]A safer approach is to define one or more prepared statements for the database connections defined in LocalSettings.php
, in 'prepared'
item of the data source configuration array, which can be a string, containing a SQL query with parameters, for the only statement, or an associative array ([ 'query id' => 'SQL' ]
), for several. Items of this associative array can be either strings holding the prepared statement that take no parameters or only parameters of the default type s
(strings), or arrays of the form [ 'query' => 'SELECT ...' /* the prepared statement */, types => 'si' /* parameter types, one character per each */ ]
. See mysqli_stmt::bind_param
for parameter types. Setting parameter types is only relevant for mySQL.
Parameters to the prepared statement are passed as a comma-separated list in parser function argument parameters
. If there are several prepared statements defined for the same connection, the needed statement ID is passed as query
parameter. If prepared statements are defined, arbitrary queries will not be created for the same connection.
Examples
[edit]MySQL: only one statement allowed for the connection
[edit]$wgExternalDataSources['rfam2'] = [
'server' => 'mysql-rfam-public.ebi.ac.uk:4497',
'type' => 'mysql',
'name' => 'Rfam',
'user' => 'rfamro',
'password' => '',
'prepared' => <<<'SQL'
SELECT fr.rfam_acc, fr.rfamseq_acc, fr.seq_start, fr.seq_end
FROM full_region fr, rfamseq rf, taxonomy tx
WHERE rf.ncbi_id = tx.ncbi_id
AND fr.rfamseq_acc = rf.rfamseq_acc
AND tx.ncbi_id = ? -- the only parameter to the query.
AND is_significant = 1 -- exclude low-scoring matches from the same clan
LIMIT 20;
SQL,
'types' => 's'
];
{{#get_db_data:db = rfam2 | parameters=10116 <!-- this parameter is used to substitute question marks in the prepared statement --> | data=account=rfam_acc,sec=rfamseq_acc,start=seq_start,end=seq_end }}
MySQL: several statements per connection
[edit]$wgExternalDataSources['rfam3'] = [
'server' => 'mysql-rfam-public.ebi.ac.uk:4497',
'type' => 'mysql',
'name' => 'Rfam',
'user' => 'rfamro',
'password' => '',
'prepared' => [
// Parameter types set explicitly.
'sequences' => [
'query' => <<<'SEQ'
SELECT fr.rfam_acc, fr.rfamseq_acc, fr.seq_start, fr.seq_end
FROM full_region fr, rfamseq rf, taxonomy tx
WHERE rf.ncbi_id = tx.ncbi_id
AND fr.rfamseq_acc = rf.rfamseq_acc
AND tx.ncbi_id = ? -- the only parameter to the query.
AND is_significant = 1 -- exclude low-scoring matches from the same clan
SEQ,
'types' => 's'
],
// Default parameter types.
'sno' => <<<'SNO'
SELECT fr.rfam_acc, fr.rfamseq_acc, fr.seq_start, fr.seq_end, f.type
FROM full_region fr, rfamseq rf, taxonomy tx, family f
WHERE
rf.ncbi_id = tx.ncbi_id
AND f.rfam_acc = fr.rfam_acc
AND fr.rfamseq_acc = rf.rfamseq_acc
AND tx.tax_string LIKE ? -- the only parameter to the query.
AND f.type LIKE '%snoRNA%'
AND is_significant = 1 -- exclude low-scoring matches from the same clan
SNO
]
];
{{#get_db_data: db = rfam3 | query=sequences <!-- this parameter is used to choose one of the prepared statements --> | parameters=10116 <!-- this parameter is used to substitute question marks in the prepared statement --> | data=account=rfam_acc,sec=rfamseq_acc,start=seq_start,end=seq_end }}
PostgreSQL
[edit]Download and restore 'dvdrental' example database. Create user 'ED' with password 'ED' and grant to it read rights on 'public' scheme in 'dvdrental' database.
Create the function films_shorter_than()
:
create or replace function films_shorter_than (int) returns table (title varchar, description text, length smallint)
language plpgsql as $$
begin
return query select film.title, film.description, film.length
from film
where film.length <= $1;
end;
$$;
Add 'postgresql' database connection to LocalSettings.php:
$wgExternalDataSources['postgresql'] = [
'server' => 'localhost',
'type' => 'postgres',
'name' => 'dvdrental',
'user' => 'ED',
'password' => 'ED',
'prepared' => <<<'POSTGRE'
SELECT title, description, length
FROM films_shorter_than($1)
ORDER BY length ASC
LIMIT 25;
POSTGRE
];
Now, you can obtain a list of films shorter than 55 minutes with:
{{#get_external_data: db = postgresql | limit = 25 | data = title=title,description=description,length=length | parameters = 55 }} {| class="wikitable" ! Title !! Description !! Length {{#for_external_table:<nowiki/> {{!}}- {{!}} {{{title}}} {{!}}{{!}} {{{description}}} {{!}}{{!}} {{{length}}} }} |}
Microsoft SQL Server
[edit]It is assumed that Microsoft SQL server and the ODBC driver for it are installed, as well as the odbc extension for PHP; and the Northwind database is imported.
/opt/microsoft/msodbcsql17/etc/odbcinst.ini
should have the entry:
[ODBC Driver 17 for SQL Server] Description=Microsoft ODBC Driver 17 for SQL Server Driver=/opt/microsoft/msodbcsql17/lib64/libmsodbcsql-17.5.so.2.1
Then necessary configuration setting for External Data will be:
$wgExternalDataSources['Northwind'] = [
'driver' => 'ODBC Driver 17 for SQL Server',
'server' => 'localhost,5500',
'type' => 'odbc',
'name' => 'NorthWind',
'user' => 'Test',
'password' => 'Test0000',
'prepared' => [
'employees' => <<<'ODBC'
SELECT TOP 10 TitleOfCourtesy, FirstName, LastName, Title, City, COUNT(OrderID) AS NoOrders
FROM Employees LEFT JOIN Orders ON Employees.EmployeeID = Orders.EmployeeID
WHERE Employees.Region = ?
GROUP BY TitleOfCourtesy, FirstName, LastName, Title, City
ORDER BY NoOrders DESC;
ODBC
]
];
and the parser function call retrieving and displayng the Washington employees data will be:
{{#get_db_data: db = Northwind | query = employees | parameters = WA | data = title = TitleOfCourtesy, first = FirstName, last = LastName, position = Title, city = City, orders = COUNT(OrderID) AS NoOrders }} {| class="wikitable" ! Name !! Position !! Orders !! City{{#for_external_table:<nowiki/> {{!}}- {{!}} {{{title}}} {{{first}}} {{{last}}} {{!}} {{{position}}} {{!}} {{{orders}}} {{!}} {{{city}}} }} |}
External links
[edit]- "MediaWiki mySQL Demo" - YouTube video by Matthew Shell showing the use of External Data with prepared statements for MySQL, June 2024