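For information about the MediaWiki database layout, such as a description of the tables and their contents, see Manual:Database layout. Historically, this was also documented in maintenance/tables.sql. Starting with MediaWiki 1.35, however, it is gradually being moved to sql/tables.json as part of the Abstract Schema initiative. This means that sql/tables.json is turned into sql/mysql/tables-generated.sql by a maintenance script, which makes it easier to generate schema files for different database engines.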




php run.php sql

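You can then type database queries. Alternatively, you can supply a filename, and MediaWiki will execute that file, substituting any MW special variables as appropriate. For more information, see Manual:Sql.php.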



## Database settings
$wgDBtype           = "mysql";
$wgDBserver         = "localhost";
$wgDBname           = "your-database-name";
$wgDBuser           = "your-database-username";  // Default: root
$wgDBpassword       = "your-password";



mysql -u $wgDBuser -p --database=$wgDBname

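Replace $wgDBuser and $wgDBname with the corresponding values from LocalSettings.php. You will then be prompted for your password ($wgDBpassword), after which you will see the mysql> prompt.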



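The abstraction layer is accessed through the Wikimedia\Rdbms\IDatabase class. An instance of this class can be obtained by calling getPrimaryDatabase() on an injected IConnectionProvider (preferred) or on one obtained from MediaWikiServices. The function wfGetDB() is being phased out and should not be used in new code.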



use MediaWiki\MediaWikiServices;

$dbProvider = MediaWikiServices::getInstance()->getConnectionProvider();
$dbr = $dbProvider->getReplicaDatabase();

$res = $dbr->newSelectQueryBuilder()
	->select( /* ... */ ) // see docs
	->caller( __METHOD__ )->fetchResultSet();

foreach ( $res as $row ) {
	print $row->foo;
}

$dbw = $dbProvider->getPrimaryDatabase();
$dbw->newInsertQueryBuilder()
	->insertInto( /* ... */ ) // see docs
	->caller( __METHOD__ )->execute();
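
The snippet above obtains the provider from MediaWikiServices. As a minimal sketch of the preferred alternative, the class below receives an IConnectionProvider through its constructor. The class name CategoryCounter and its method are hypothetical and exist only for illustration; how the provider gets wired into the constructor depends on your extension's service setup and is not shown.

```php
<?php
use Wikimedia\Rdbms\IConnectionProvider;

/**
 * Hypothetical service class, used only to illustrate constructor injection.
 */
class CategoryCounter {
	private IConnectionProvider $dbProvider;

	public function __construct( IConnectionProvider $dbProvider ) {
		$this->dbProvider = $dbProvider;
	}

	/**
	 * @param string $categoryTitle Category title in DB key form
	 * @return int Value of cat_pages, or 0 if the category has no row
	 */
	public function getPageCount( string $categoryTitle ): int {
		// Reads go to a replica; use getPrimaryDatabase() only for writes.
		$dbr = $this->dbProvider->getReplicaDatabase();
		$count = $dbr->newSelectQueryBuilder()
			->select( 'cat_pages' )
			->from( 'category' )
			->where( [ 'cat_title' => $categoryTitle ] )
			->caller( __METHOD__ )
			->fetchField();
		// fetchField() returns false when no row matches.
		return $count === false ? 0 : (int)$count;
	}
}
```

Injecting the provider keeps the class free of global state, which makes it easier to substitute a mock in unit tests.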




The SelectQueryBuilder class is the preferred way to formulate read queries in new code. In older code, you might find select() and related methods of the Database class used directly. The query builder provides a modern "fluent" interface, where methods are chained until the fetch method is invoked, without intermediary variable assignments needed. For example:

$dbr = $dbProvider->getReplicaDatabase();
$res = $dbr->newSelectQueryBuilder()
	->select( [ 'cat_title', 'cat_pages' ] )
	->from( 'category' )
	->where( 'cat_pages > 0' )
	->orderBy( 'cat_title', SelectQueryBuilder::SORT_ASC )
	->caller( __METHOD__ )->fetchResultSet();

As described below, MW 1.42 introduces a helper method, expr(), which lets you wrap the field, operator and value as an expression. Using this, the where clause in the above example can be rewritten as:

->where( $dbr->expr( 'cat_pages', '>', 0 ) )

This example corresponds to the following SQL:

SELECT cat_title, cat_pages FROM category WHERE cat_pages > 0 ORDER BY cat_title ASC

JOINs are also possible; for example:

$dbr = $dbProvider->getReplicaDatabase();
$res = $dbr->newSelectQueryBuilder()
	->select( 'wl_user' )
	->from( 'watchlist' )
	->join( 'user_properties', /* alias: */ null, 'wl_user=up_user' )
	->where( [
		'wl_user != 1',
		'wl_namespace' => '0',
		'wl_title' => 'Main_page',
		'up_property' => 'enotifwatchlistpages',
	] )
	->caller( __METHOD__ )->fetchResultSet();

This example corresponds to the query:

SELECT wl_user
FROM `watchlist`
INNER JOIN `user_properties` ON ((wl_user=up_user))
WHERE (wl_user != 1)
AND wl_namespace = '0'
AND wl_title = 'Main_page'
AND up_property = 'enotifwatchlistpages'

You can access individual rows of the result using a foreach loop. Each row is represented as an object. For example:

$dbr = $dbProvider->getReplicaDatabase();
$res = $dbr->newSelectQueryBuilder()
	->select( [ 'cat_title', 'cat_pages' ] )
	->from( 'category' )
	->where( 'cat_pages > 0' )
	->orderBy( 'cat_title', SelectQueryBuilder::SORT_ASC )
	->caller( __METHOD__ )->fetchResultSet();

foreach ( $res as $row ) {
	print 'Category ' . $row->cat_title . ' contains ' . $row->cat_pages . " entries.\n";
}

There are also convenience functions to fetch a single row, a single field from several rows, or a single field from a single row:

// Equivalent of:
//     $rows = fetchResultSet();
//     $row = $rows[0];
$pageRow = $dbr->newSelectQueryBuilder()
	->select( [ 'page_id', 'page_namespace', 'page_title' ] )
	->from( 'page' )
	->orderBy( 'page_touched', SelectQueryBuilder::SORT_DESC )
	->caller( __METHOD__ )->fetchRow();

// Equivalent of:
//     $rows = fetchResultSet();
//     $ids = array_map( fn( $row ) => $row->page_id, $rows );
$pageIds = $dbr->newSelectQueryBuilder()
	->select( 'page_id' )
	->from( 'page' )
	->where( [
		'page_namespace' => 1,
	] )
	->caller( __METHOD__ )->fetchFieldValues();

// Equivalent of:
//     $rows = fetchResultSet();
//     $id = $rows[0]->page_id;
$pageId = $dbr->newSelectQueryBuilder()
	->select( 'page_id' )
	->from( 'page' )
	->where( [
		'page_namespace' => 1,
		'page_title' => 'Main_page',
	] )
	->caller( __METHOD__ )->fetchField();

In these examples, $pageRow is a row object as in the foreach example above, $pageIds is an array of page IDs, and $pageId is a single page ID.

While you can use tables() to add multiple tables, it is highly recommended to use join() or leftJoin() instead. Any aliases for additional tables must be added to join() or leftJoin(), not in tables().
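
As a rough sketch of the alias rule above, the hypothetical helper below (getWatcherNames is not part of MediaWiki) declares the alias for the joined table directly in join(). It assumes the standard watchlist and user tables and a replica connection passed in by the caller.

```php
<?php
use Wikimedia\Rdbms\IReadableDatabase;

/**
 * Hypothetical helper: names of users watching a given main-namespace page.
 *
 * @param IReadableDatabase $dbr Replica connection
 * @param string $titleDbKey Page title in DB key form, e.g. 'Main_page'
 * @return string[] User names
 */
function getWatcherNames( IReadableDatabase $dbr, string $titleDbKey ): array {
	return $dbr->newSelectQueryBuilder()
		->select( 'u.user_name' )
		->from( 'watchlist', 'w' )
		// The alias 'u' is declared here in join(), not in tables().
		->join( 'user', 'u', 'u.user_id = w.wl_user' )
		->where( [
			'w.wl_namespace' => 0,
			'w.wl_title' => $titleDbKey,
		] )
		->caller( __METHOD__ )
		->fetchFieldValues();
}
```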



SQL UPDATE statements should be done with the UpdateQueryBuilder.

$dbw = $this->dbProvider->getPrimaryDatabase();
$dbw->newUpdateQueryBuilder()
	->update( 'user' )
	->set( [ 'user_password' => $newHash->toString() ] )
	->where( [
		'user_id' => $oldRow->user_id,
		'user_password' => $oldRow->user_password,
	] )
	->caller( $fname )->execute();



SQL INSERT statements should be done with the InsertQueryBuilder.

$dbw = $this->dbProvider->getPrimaryDatabase();
$targetRow = [
	'bt_address' => $targetAddress,
	'bt_user' => $targetUserId,
	/* etc */
];
$dbw->newInsertQueryBuilder()
	->insertInto( 'block_target' )
	->row( $targetRow )
	->caller( __METHOD__ )->execute();
$id = $dbw->insertId();



SQL DELETE statements should be done with the DeleteQueryBuilder.

$dbw = $this->dbProvider->getPrimaryDatabase();
$dbw->newDeleteQueryBuilder()
	->deleteFrom( 'block' )
	->where( [ 'bl_id' => $ids ] )
	->caller( __METHOD__ )->execute();
$numDeleted = $dbw->affectedRows();



SQL REPLACE statements should be done with the ReplaceQueryBuilder.

$dbw = $this->dbProvider->getPrimaryDatabase();
$dbw->newReplaceQueryBuilder()
	->replaceInto( 'querycache_info' )
	->row( [
		'qci_type' => 'activeusers',
		'qci_timestamp' => $dbw->timestamp( $asOfTimestamp ),
	] )
	->uniqueIndexFields( [ 'qci_type' ] )
	->caller( __METHOD__ )->execute();



SQL UNION statements should be done with the UnionQueryBuilder.

$dbr = $this->dbProvider->getReplicaDatabase();
$ids = $dbr->newUnionQueryBuilder()
	->add( $dbr->newSelectQueryBuilder()
		->select( 'bt_id' )
		->from( 'block_target' )
		->where( [ 'bt_address' => $addresses ] )
	)
	->add( $dbr->newSelectQueryBuilder()
		->select( 'bt_id' )
		->from( 'block_target' )
		->join( 'user', null, 'user_id=bt_user' )
		->where( [ 'user_name' => $userNames ] )
	)
	->caller( __METHOD__ )
	->fetchFieldValues();

Batch queries

If you need to insert or update multiple rows, try to group them together into a batch query for increased efficiency. It's important to keep the table declaration (e.g. update(), insertInto(), etc.), caller(), and execute() outside the loop. Anything related to creating or updating rows can go inside the loop (e.g. row()).

$queryBuilder = $this->getDb()->newInsertQueryBuilder()
	->insertInto( 'ores_classification' )
	->caller( __METHOD__ );
foreach ( [ 0, 1, 2, 3 ] as $id ) {
	$predicted = $classId === $id;
	$queryBuilder->row( [
		'oresc_model' => $this->ensureOresModel( 'draftquality' ),
		'oresc_class' => $id,
		'oresc_probability' => $predicted ? 0.7 : 0.1,
		'oresc_is_predicted' => $predicted ? 1 : 0,
		'oresc_rev' => $revId,
	] );
}
// Send all collected rows in a single INSERT
$queryBuilder->execute();


The following helper methods should be used when appropriate, because they build SQL queries that are compatible with all supported database types, and they assist with auto escaping.



$dbr->expr() should be used in WHERE conditions whenever the comparison is anything other than a simple equality. For example, $dbr->expr( 'ptrp_page_id', '>', $start ).

This method can be chained with ->and() and ->or(). For example, $db->expr( 'ptrp_page_id', '=', null )->or( 'ptrpt_page_id', '=', null )
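
To show how a chained expression fits into a full query, here is a minimal sketch that assumes MediaWiki 1.42 or later and the core page table. The function name getLongNonRedirectPageIds and its parameters are hypothetical.

```php
<?php
use Wikimedia\Rdbms\IReadableDatabase;

/**
 * Hypothetical helper: IDs of non-redirect pages longer than $minLength bytes.
 *
 * @param IReadableDatabase $dbr Replica connection
 * @param int $minLength Minimum page length in bytes (exclusive)
 * @param int $limit Maximum number of IDs to return
 * @return array Page IDs
 */
function getLongNonRedirectPageIds( IReadableDatabase $dbr, int $minLength, int $limit ): array {
	return $dbr->newSelectQueryBuilder()
		->select( 'page_id' )
		->from( 'page' )
		->where(
			// expr() escapes the value; and() appends a second condition.
			$dbr->expr( 'page_len', '>', $minLength )
				->and( 'page_is_redirect', '=', 0 )
		)
		->orderBy( 'page_id' )
		->limit( $limit )
		->caller( __METHOD__ )
		->fetchFieldValues();
}
```

Swapping and() for or() in the chain would match rows that satisfy either condition instead of both.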


Different database engines format MediaWiki timestamps differently. Use $dbr->timestamp() to ensure compatibility. Example: $dbr->expr( 'ptrp_reviewed_updated', '>', $dbr->timestamp( $time ) )



RawSQLExpression should be used in WHERE conditions when you do not want anything to be SQL-escaped. If you are comparing a field to a user-supplied value (much more common), use $dbr->expr() instead. RawSQLExpression does not escape, so it should never be used with user input. Use sparingly! Example: $dbr->expr( new RawSQLExpression( 'rc_timestamp < fp_pending_since' ) )



RawSQLValue should be used in WHERE conditions when the value side of a comparison must not be SQL-escaped, for example when it is another field name. If you are comparing a field to a user-supplied value (much more common), pass the value to $dbr->expr() directly instead. RawSQLValue does not escape, so it should never be used with user input. Use sparingly! Example: $dbr->expr( 'fp_pending_since', '>', new RawSQLValue( $fieldName ) )


Older MediaWiki code may use wrapper functions like $dbr->select() and $dbw->insert(). Very old MediaWiki code may use $dbw->query(). None of these are considered good practice now, and should be upgraded to the query builders mentioned above.

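In some cases, they take care of things like table prefixes and escaping for you. If you really need to write your own SQL, please read the documentation for tableName() and addQuotes(). You will need both of them. Please keep in mind that failing to use addQuotes() properly can introduce severe security holes into your wiki.

Another important reason to use the high-level methods rather than constructing your own queries is to ensure that your code runs properly regardless of the database type. Currently, MySQL/MariaDB has the best support. SQLite is also well supported, although it is much slower than MySQL or MariaDB. PostgreSQL is supported, but less stably than MySQL.

The available wrapper functions are listed below. For a detailed description of their parameters, refer to the documentation of the Database class. In particular, see Database::select for an explanation of the $table, $vars, $conds, $fname, $options, and $join_conds parameters, which are used by many of the other wrapper functions.

The parameters $table, $vars, $conds, $fname, $options, and $join_conds should not be null or false (which worked until REL 1.35); pass an empty string '' or an empty array [] instead.
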
function select( $table, $vars, $conds, .. );
function selectField( $table, $var, $cond, .. );
function selectRow( $table, $vars, $conds, .. );
function insert( $table, $a, .. );
function insertSelect( $destTable, $srcTable, $varMap, $conds, .. );
function update( $table, $values, $conds, .. );
function delete( $table, $conds, .. );
function deleteJoin( $delTable, $joinTable, $delVar, $joinVar, $conds, .. );

Convenience functions


For compatibility with PostgreSQL, insert ids are obtained using nextSequenceValue() and insertId(). The parameter for nextSequenceValue() can be obtained from the CREATE SEQUENCE statement in maintenance/postgres/tables.sql and always follows the format of x_y_seq, with x being the table name (e.g. page) and y being the primary key (e.g. page_id), e.g. page_page_id_seq. For example:

$id = $dbw->nextSequenceValue( 'page_page_id_seq' );
$dbw->insert( 'page', [ 'page_id' => $id ] );
$id = $dbw->insertId();

For some other useful functions, e.g. affectedRows(), numRows(), etc., see Manual:Database.php#Functions.


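MediaWiki developers who need to write database queries should have some understanding of databases and the performance issues associated with them. Patches containing unacceptably slow features will not be accepted. Unindexed queries are generally not welcome in MediaWiki, except in special pages derived from QueryPage. It is a common pitfall for new developers to submit code containing SQL queries that examine huge numbers of rows. Remember that COUNT(*) is O(N): counting rows in a table is like counting beans in a bucket.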

Backward compatibility

Often, due to design changes to the DB, different DB accesses are necessary to ensure backward compatibility. This can be handled for example with the global constant MW_VERSION (or global variable $wgVersion before MediaWiki 1.39):

/**
 * backward compatibility
 * @since 1.31.15
 * @since 1.35.3
 * define( 'DB_PRIMARY', ILoadBalancer::DB_PRIMARY )
 * DB_PRIMARY remains undefined in MediaWiki before v1.31.15/v1.35.3
 * @since 1.28.0
 * define( 'DB_REPLICA', ILoadBalancer::DB_REPLICA )
 * DB_REPLICA remains undefined in MediaWiki before v1.28
 */
defined( 'DB_PRIMARY' ) or define( 'DB_PRIMARY', DB_MASTER );
defined( 'DB_REPLICA' ) or define( 'DB_REPLICA', DB_SLAVE );

$res = WrapperClass::getQueryFoo();

class WrapperClass {

	public static function getReadingConnect() {
		return wfGetDB( DB_REPLICA );
	}

	public static function getWritingConnect() {
		return wfGetDB( DB_PRIMARY );
	}

	public static function getQueryFoo() {
		global $wgVersion;

		// Pick the query description that matches the running MediaWiki version
		if ( version_compare( $wgVersion, '1.33', '<' ) ) {
			$param = self::getQueryInfoFooBefore_v1_33();
		} else {
			$param = self::getQueryInfoFoo();
		}

		// This is a read query, so it uses the replica connection
		$dbr = self::getReadingConnect();
		return $dbr->select(
			$param['tables'],
			$param['fields'],
			$param['conds'],
			__METHOD__,
			$param['options'],
			$param['join_conds'] );
	}

	private static function getQueryInfoFoo() {
		return [
			'tables' => [
				't1' => 'table1',
				't2' => 'table2',
				't3' => 'table3'
			],
			'fields' => [
				'field_name1' => 't1.field1',
				'field_name2' => 't2.field2',
				/* ... */
			],
			'conds' => [ /* ... */ ],
			'join_conds' => [
				't2' => [
					'INNER JOIN',
					'field_name1 = field_name2'
				],
				't3' => [
					'LEFT JOIN',
					/* ... */
				]
			],
			'options' => [ /* ... */ ]
		];
	}

	private static function getQueryInfoFooBefore_v1_33() {
		return [
			'tables' => [
				't1' => 'table1',
				't2' => 'table2',
				't3' => 'table3_before'
			],
			'fields' => [
				'field_name1' => 't1.field1',
				'field_name2' => 't2.field2_before',
				/* ... */
			],
			'conds' => [ /* ... */ ],
			'join_conds' => [
				't2' => [
					'INNER JOIN',
					/* ... */
				],
				't3' => [
					'LEFT JOIN',
					/* ... */
				]
			],
			'options' => [ /* ... */ ]
		];
	}
}

With the MW_VERSION constant, getQueryFoo() no longer needs the global variable:

	public static function getQueryFoo() {
		if ( version_compare( MW_VERSION, '1.39', '<' ) ) {
			$param = self::getQueryInfoFooBefore_v1_39();
		} else {
			$param = self::getQueryInfoFoo();
		}

		$dbr = self::getReadingConnect();
		return $dbr->select(
			$param['tables'],
			$param['fields'],
			$param['conds'],
			__METHOD__,
			$param['options'],
			$param['join_conds'] );
	}


Large installations of MediaWiki, such as Wikipedia, use a large set of replica MySQL servers that replicate the writes made to a primary MySQL server. It is important to understand the complexities associated with large distributed systems if you want to write code destined for Wikipedia.

The best algorithm for a given task often depends on whether or not replication is in use. Due to our unabashed Wikipedia-centrism, we often just use the replication-friendly version, but if you like, you can use wfGetLB()->getServerCount() > 1 to check whether replication is in use.


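Lag primarily occurs when large write queries are sent to the primary server. Writes on the primary server are executed in parallel, but they are executed serially when they are replicated to the replicas. The primary server writes the query to the binlog when the transaction is committed. The replicas poll the binlog and start executing the query as soon as it appears. They can service reads while they are performing a write query, but they will not read anything more from the binlog and thus will perform no more writes. This means that if a write query runs for a long time, the replicas will lag behind the primary server for as long as that query takes to complete.

Lag can be exacerbated by high read load. MediaWiki's load balancer stops sending reads to a replica once it is lagged by more than 30 seconds. If the load ratios are set incorrectly, or if there is too much load in general, this may lead to a replica permanently hovering around 30 seconds of lag.

If all replicas are lagged by more than 30 seconds (according to $wgDBservers), MediaWiki will stop writing to the database. This means that heavy load could lead to all edits and other write operations being refused, with an error returned to the user. This gives the replicas a chance to catch up.


Beyond this, MediaWiki attempts to ensure that the user sees events on the wiki in chronological order. A few seconds of lag can be tolerated, as long as the user sees a consistent picture across subsequent requests. This is done by saving the primary binlog position in the session, and then at the start of each request, waiting for the replica to catch up to that position before doing any reads from it. If this wait times out, reads are allowed anyway, but the request is considered to be in "lagged replica mode". Lagged replica mode can be checked by calling LoadBalancer::getLaggedReplicaMode(). The only practical consequence at present is a warning displayed in the page footer.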

Shell users can check replication lag with getLagTimes.php; other users can check using the siteinfo API.

Databases often have their own monitoring systems in place as well, see for instance wikitech:MariaDB#Replication lag (Wikimedia) and wikitech:Help:Toolforge/Database#Identifying lag (Wikimedia Cloud VPS).


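To avoid excessive lag, queries that write large numbers of rows should be split up, generally to write one row at a time. Multi-row INSERT ... SELECT queries are the worst offenders and should be avoided altogether. Instead do the select first and then the insert.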
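
The select-then-insert pattern can be sketched roughly as follows. This is an illustration under stated assumptions, not code from MediaWiki: the function copyRowsInBatches and the tables src_table and dst_table with their fields are all hypothetical, and expr() requires MediaWiki 1.42 or later. Setting $batchSize to 1 gives the one-row-at-a-time behaviour.

```php
<?php
use Wikimedia\Rdbms\IDatabase;

/**
 * Hypothetical example: copy rows between two made-up tables by selecting
 * a small batch first and then inserting it, instead of INSERT ... SELECT.
 *
 * @param IDatabase $dbw Primary connection, used for both reads and writes
 * @param int $batchSize Maximum number of rows per write
 * @return int Number of rows copied
 */
function copyRowsInBatches( IDatabase $dbw, int $batchSize = 100 ): int {
	$copied = 0;
	$lastId = 0;
	do {
		// Step 1: select one batch, paging by primary key.
		$res = $dbw->newSelectQueryBuilder()
			->select( [ 'src_id', 'src_value' ] )
			->from( 'src_table' )
			->where( $dbw->expr( 'src_id', '>', $lastId ) )
			->orderBy( 'src_id' )
			->limit( $batchSize )
			->caller( __METHOD__ )
			->fetchResultSet();

		$rows = [];
		foreach ( $res as $row ) {
			$rows[] = [ 'dst_id' => $row->src_id, 'dst_value' => $row->src_value ];
			$lastId = (int)$row->src_id;
		}

		// Step 2: insert the batch as a separate, short write query.
		if ( $rows ) {
			$dbw->newInsertQueryBuilder()
				->insertInto( 'dst_table' )
				->rows( $rows )
				->caller( __METHOD__ )
				->execute();
			$copied += count( $rows );
		}
		// A short batch means the source table is exhausted.
	} while ( count( $rows ) === $batchSize );

	return $copied;
}
```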

Even small writes can cause lag if they are done at a very high speed and replication is unable to keep up. This most commonly happens in maintenance scripts. To prevent it, you should call Maintenance::waitForReplication() after every few hundred writes. Most scripts make the exact number configurable:

class MyMaintenanceScript extends Maintenance {
    public function __construct() {
        // ...
        $this->setBatchSize( 100 );
    }

    public function execute() {
        $limit = $this->getBatchSize();
        while ( true ) {
             // ...select up to $limit rows to write, break the loop if there are no more rows...
             // ...do the writes...
             // Let the replicas catch up before starting the next batch
             $this->waitForReplication();
        }
    }
}

Working with lag

Despite our best efforts, it's not practical to guarantee a low-lag environment. Replication lag will usually be less than one second, but may occasionally be up to 5 seconds. For scalability, it's very important to keep load on the primary server low, so simply sending all your queries to the primary server is not the answer. So when you have a genuine need for up-to-date data, the following approach is advised:

  1. Do a quick query to the primary server for a sequence number or timestamp
  2. Run the full query on the replica and check if it matches the data you got from the primary server
  3. If it doesn't, run the full query on the primary server

To avoid swamping the primary server every time the replicas lag, use of this approach should be kept to a minimum. In most cases you should just read from the replica and let the user deal with the delay.
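
The three steps above can be sketched in plain PHP, independent of any particular query. The function readWithFreshnessCheck and the shape of its callbacks are assumptions made for this illustration; in real code the callbacks would wrap queries against the primary and replica connections.

```php
<?php
/**
 * Hypothetical helper: read data that must be current while keeping
 * load on the primary server low.
 *
 * @param callable $getPrimaryVersion Cheap primary query returning a sequence number or timestamp
 * @param callable $queryReplica Full query on the replica, returning [ 'version' => ..., 'data' => ... ]
 * @param callable $queryPrimary The same full query, run on the primary
 * @return array Result with 'version' and 'data' keys
 */
function readWithFreshnessCheck(
	callable $getPrimaryVersion,
	callable $queryReplica,
	callable $queryPrimary
): array {
	// 1. Quick query to the primary for a sequence number or timestamp.
	$primaryVersion = $getPrimaryVersion();

	// 2. Full query on the replica, compared against the primary's version.
	$result = $queryReplica();
	if ( $result['version'] === $primaryVersion ) {
		return $result;
	}

	// 3. The replica is behind: run the full query on the primary.
	return $queryPrimary();
}

// Usage with stand-in closures simulating a lagged replica.
$stale = readWithFreshnessCheck(
	fn () => 42,
	fn () => [ 'version' => 41, 'data' => 'old text' ],
	fn () => [ 'version' => 42, 'data' => 'new text' ]
);
echo $stale['data'], "\n"; // prints "new text"
```

The expensive primary query only runs when the replica's version differs, so in the common low-lag case the primary serves just the cheap version lookup.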

Lock contention

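Due to the high write rate on Wikipedia (and some other wikis), MediaWiki developers need to be very careful to structure their writes to avoid long-lasting locks. By default, MediaWiki opens a transaction at the first query and commits it before the output is sent. Locks are held from the time the query is done until the commit. You can therefore reduce lock time by doing as much processing as possible before you do your write queries. Update operations that do not require database access can be delayed until after the commit by adding an object to $wgPostCommitUpdateList.

Often this approach is not good enough, and it becomes necessary to enclose small groups of queries in their own transaction. Use the following syntax: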

$factory = \MediaWiki\MediaWikiServices::getInstance()->getDBLoadBalancerFactory();
$factory->beginPrimaryChanges( __METHOD__ );
/* Do queries */
$factory->commitPrimaryChanges( __METHOD__ );

Use of locking reads (e.g. the FOR UPDATE clause) is not advised. They are poorly implemented in InnoDB and will cause regular deadlock errors. It's also surprisingly easy to cripple the wiki with lock contention.

Instead of locking reads, combine your existence checks into your write queries, by using an appropriate condition in the WHERE clause of an UPDATE, or by using unique indexes in combination with INSERT IGNORE. Then use the affected row count to see if the query succeeded.
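
As a minimal sketch of folding the existence check into the write, the function below claims a row only if it is still unclaimed and then inspects the affected row count. The function tryClaimJob and the table my_job with its fields job_id and job_claimed_by are hypothetical.

```php
<?php
use Wikimedia\Rdbms\IDatabase;

/**
 * Hypothetical example: claim a job row only if nobody else has claimed it.
 *
 * @param IDatabase $dbw Primary connection
 * @param int $jobId ID of the row to claim
 * @param int $workerId ID to store as the claimant
 * @return bool True if this caller claimed the row
 */
function tryClaimJob( IDatabase $dbw, int $jobId, int $workerId ): bool {
	$dbw->newUpdateQueryBuilder()
		->update( 'my_job' )
		->set( [ 'job_claimed_by' => $workerId ] )
		->where( [
			'job_id' => $jobId,
			// The check is part of the write: only unclaimed rows match.
			// A null value produces an IS NULL condition.
			'job_claimed_by' => null,
		] )
		->caller( __METHOD__ )
		->execute();

	// One affected row means this caller won; zero means the row was
	// missing or had already been claimed.
	return $dbw->affectedRows() === 1;
}
```

Because the check and the write happen in one statement, no SELECT ... FOR UPDATE is needed and the lock is held only for the duration of the UPDATE itself.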


Don't forget about indexes when designing databases. Things may work smoothly on your test wiki with a dozen pages, but the same queries will bring a real wiki to a halt. See above for details.

For naming conventions, see Manual:Coding conventions/Database.
