Requests for comment/DataStore

Request for comment (RFC): DataStore
Component: General
Creation date:
Author(s): Max Semenik
Document status: accepted

Current state

An implementation attempt was made in August 2013, but there has been no further activity on it since May 2014 (https://gerrit.wikimedia.org/r/#/c/79029/), so implementation of this RFC is stalled.

MediaWiki has no generic persistent key-value store. As a result, new database tables get created even when the underlying data needs neither joins with other tables nor lookups by multiple fields.

Examples of such tables that are never going to have more than 1 row:

Another example is external storage, which could use a generic key-value store instead of implementing its own version of one.

Proposal

A few design points:

  • Key-value storage.
  • Generic, allowing numerous implementations with different backends, including NoSQL in the future.
  • Ability to have more than one store, using different engines/servers.
  • Only simple, atomic operations.
  • Records are searchable by key prefix.
  • Keys shouldn't be mangled, so that prefix search and migration from one storage engine to another remain possible.
  • To facilitate migration, maximum key length should be enforced across all implementations.
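The design points above can be illustrated with a minimal in-memory sketch. It is not the proposed implementation: the class name SketchArrayStore, the ':' separator and the 255-byte key limit are all assumptions made for illustration, since the RFC specifies none of them. It shows keys stored verbatim (no hashing), a key length limit enforced when the key is built, and prefix search done by plain string comparison.

```php
<?php
// Hypothetical in-memory store. The class name, separator and
// length limit are assumptions, not part of the RFC.
class SketchArrayStore {
    const MAX_KEY_LENGTH = 255; // assumed limit
    const SEPARATOR = ':';      // assumed separator

    private $data = array();

    /** Build a key by joining the components verbatim (no hashing or mangling). */
    public function key( /* components... */ ) {
        $key = implode( self::SEPARATOR, func_get_args() );
        if ( strlen( $key ) > self::MAX_KEY_LENGTH ) {
            throw new InvalidArgumentException(
                'Key is longer than ' . self::MAX_KEY_LENGTH . ' bytes'
            );
        }
        return $key;
    }

    public function set( $key, $value ) {
        $this->data[$key] = $value;
    }

    /** Return the stored value, or null if the key is absent. */
    public function get( $key ) {
        return array_key_exists( $key, $this->data ) ? $this->data[$key] : null;
    }

    public function delete( $key ) {
        unset( $this->data[$key] );
    }

    /** Call $callback( $key, $value ) for every record whose key starts with $prefix. */
    public function getByPrefix( $prefix, $callback ) {
        foreach ( $this->data as $key => $value ) {
            // PHP turns numeric-string array keys into integers, so cast back.
            $key = (string)$key;
            if ( strncmp( $key, $prefix, strlen( $prefix ) ) === 0 ) {
                call_user_func( $callback, $key, $value );
            }
        }
    }
}

$store = new SketchArrayStore();
$store->set( $store->key( 'my', 'cool', 'value' ), 42 );
$store->set( $store->key( 'my', 'other' ), 'x' );

$found = array();
$store->getByPrefix(
    $store->key( 'my', 'cool' ),
    function ( $key, $value ) use ( &$found ) {
        $found[$key] = $value;
    }
);
// $found now contains only the record stored under 'my:cool:value'.
```

Because keys are kept as plain strings, the prefix 'my:cool' would also match a key such as 'my:coolness'; a real implementation would have to decide whether a prefix should end at a component boundary.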

A proposed implementation skeleton is at <https://gerrit.wikimedia.org/r/79029>. It is not functional code, just a demo of how the interface could look.

Example usage:

$store = DataStore::getStore( 'default' );
$key = $store->key( 'my', 'cool', 'value' );
$store->set( $key, $value );

...

$value = $store->get( $key );
$prefix = $store->key( 'my', 'cool' );
$store->getByPrefix( $prefix, function ( $key, $value ) {
    echo "$key => $value\n";
} );
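Since getByPrefix() passes each record to a callback, migrating records from one store to another could be sketched roughly as follows. SketchMiniStore and sketchCopyByPrefix() are hypothetical names introduced only to make the example runnable, and the colon-separated key strings are an assumed format; the skeleton on Gerrit may look different.

```php
<?php
// Hypothetical minimal store, just enough to run the sketch.
class SketchMiniStore {
    private $data = array();

    public function set( $key, $value ) {
        $this->data[$key] = $value;
    }

    /** Return the stored value, or null if the key is absent. */
    public function get( $key ) {
        return array_key_exists( $key, $this->data ) ? $this->data[$key] : null;
    }

    /** Call $callback( $key, $value ) for every key starting with $prefix. */
    public function getByPrefix( $prefix, $callback ) {
        foreach ( $this->data as $key => $value ) {
            $key = (string)$key;
            if ( strncmp( $key, $prefix, strlen( $prefix ) ) === 0 ) {
                call_user_func( $callback, $key, $value );
            }
        }
    }
}

/**
 * Copy every record under $prefix from $source to $target.
 * Keys are copied unchanged, which only works if neither store mangles them.
 * Returns the number of records copied.
 */
function sketchCopyByPrefix( $source, $target, $prefix ) {
    $copied = 0;
    $source->getByPrefix(
        $prefix,
        function ( $key, $value ) use ( $target, &$copied ) {
            $target->set( $key, $value );
            $copied++;
        }
    );
    return $copied;
}

$old = new SketchMiniStore();
$old->set( 'my:cool:value', 1 );
$old->set( 'my:cool:other', 2 );
$old->set( 'unrelated', 3 );

$new = new SketchMiniStore();
$count = sketchCopyByPrefix( $old, $new, 'my:cool' );
// Two records are copied; 'unrelated' stays only in $old.
```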

Example new store definition:

$wgDataStores['my store'] = array(
    'class' => 'MongoDataStore',
    'server' => 'mongodb://localhost:27017',
    'options' => array(
        'db' => 'mediawiki',
        'username' => '...',
        'password' => '...',
    ),
);
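One way a lookup such as getStore() might turn a definition like the one above into an object is sketched below. This is an assumption about how such a registry could work, not a description of the proposed code: SketchStoreRegistry and SketchNullStore are hypothetical names, and the constructor signature ( $server, $options ) is invented for the example.

```php
<?php
// Hypothetical backend that only records its configuration.
class SketchNullStore {
    public $server;
    public $options;

    public function __construct( $server, array $options ) {
        $this->server = $server;
        $this->options = $options;
    }
}

// Hypothetical registry that lazily creates one instance per store name.
class SketchStoreRegistry {
    private $definitions;
    private $instances = array();

    public function __construct( array $definitions ) {
        $this->definitions = $definitions;
    }

    public function getStore( $name ) {
        if ( !isset( $this->instances[$name] ) ) {
            if ( !isset( $this->definitions[$name] ) ) {
                throw new InvalidArgumentException( "No data store named '$name'" );
            }
            $def = $this->definitions[$name];
            $class = $def['class'];
            $server = isset( $def['server'] ) ? $def['server'] : null;
            $options = isset( $def['options'] ) ? $def['options'] : array();
            // Instantiate the configured class by name.
            $this->instances[$name] = new $class( $server, $options );
        }
        return $this->instances[$name];
    }
}

$registry = new SketchStoreRegistry( array(
    'my store' => array(
        'class' => 'SketchNullStore',
        'server' => 'mongodb://localhost:27017',
        'options' => array( 'db' => 'mediawiki' ),
    ),
) );
$store = $registry->getStore( 'my store' );
```

Caching the instance per name means repeated lookups share one connection, which matters once several stores with different engines or servers are configured.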

Possible uses

In addition to what's mentioned above:

  • File storage. Not needed for basic installations, but could be useful for Windows installations that want to use international file names (bugzilla:1780) or installations that can't secure their uploads from execution because they don't have access to Apache configuration.