Offline content generator/Installation/en

From mediawiki.org

These instructions are out of date. The most recent installation instructions can be found at wikitech:OCG#Installing_a_development_instance and in the README of the OCG service itself.

OCG (Offline Content Generator) is a collection of tools, still under construction, that provides offline content rendering. It can be used together with the Collection extension to render books.

Installation

Usually, public rendering servers can be used: their address is given to the Collection extension, which then sends its render requests there. Sometimes, however, the rendering stack has to be installed on your own machines. The instructions below were written for systems running Debian 7 (Wheezy); installation on Ubuntu should be similar or even easier.

Prerequisites

OCG is written in JavaScript and runs on a Node.js rendering server. Wikitext is parsed by the Parsoid service, and jobs are managed via Redis. The PDFs are created with XeTeX, which is provided by the TeX Live distribution.

Everything for PDF rendering
sudo apt-get install texlive-xetex texlive-latex-recommended texlive-latex-extra  \
  texlive-generic-extra texlive-fonts-recommended texlive-fonts-extra             \
  fonts-hosny-amiri ttf-devanagari-fonts fonts-nafees ttf-indic-fonts             \
  ttf-malayalam-fonts fonts-arphic-uming fonts-arphic-ukai fonts-droid            \
  fonts-baekmuk texlive-lang-all latex-xcolor lmodern imagemagick librsvg2-bin    \
  unzip zip

(See mw-ocg-latexer/README.md and mw-ocg-latexer/.travis.yml for the most up-to-date package list.)

nodejs

Unfortunately, the package provided by Wheezy does not include npm, so we have to compile Node.js ourselves.

wget http://nodejs.org/dist/v0.10.25/node-v0.10.25.tar.gz
tar xfvz node-v0.10.25.tar.gz
cd node-v0.10.25
./configure
make
sudo make install
curl https://npmjs.org/install.sh | sudo sh


Redis server

Here we can use Debian's package:

sudo apt-get install redis-server


Parsoid

Parsoid is developed by the Wikimedia Foundation and available via git:

git clone https://gerrit.wikimedia.org/r/mediawiki/services/parsoid

Some configuration is also necessary. Copy the provided example configuration and modify it:

cd parsoid/api
cp localsettings.js.example localsettings.js

The configuration file is written in JavaScript. Wikis are defined as follows:

exports.setup = function( parsoidConfig ) {
  parsoidConfig.setInterwiki( 'wiki', 'http://wiki.example.com/api.php' );
};

Multiple wikis can be defined this way, with one setInterwiki call per wiki.
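As an illustration of a setup with more than one wiki, the following sketch of a localsettings.js registers two wikis. The prefixes 'enwiki' and 'dewiki' and both API URLs are placeholders chosen for this example, not values shipped with Parsoid.

```javascript
// localsettings.js sketch: one setInterwiki call per wiki.
// The prefixes and API URLs below are placeholders; replace them with your own wikis.
function setup( parsoidConfig ) {
  parsoidConfig.setInterwiki( 'enwiki', 'http://en.wiki.example.com/api.php' );
  parsoidConfig.setInterwiki( 'dewiki', 'http://de.wiki.example.com/api.php' );
}

exports.setup = setup;
```

Each prefix should be unique, since it is the name under which the wiki is registered.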

OCG Service

Download

OCG is also developed by the Wikimedia Foundation and can be downloaded via git:

git clone https://gerrit.wikimedia.org/r/mediawiki/services/ocg-collection
cd ocg-collection
git submodule update --init --recursive
./make.sh

Note that this repository contains prebuilt binary modules and is not the recommended way to download or install OCG.


Configuration

The configuration file is best placed at /etc/mw-ocg-service.js:

module.exports = function (config) {
  config.coordinator.frontend_threads = 2;
  config.coordinator.backend_threads = "auto";
  config.backend.bundler.parsoid_api = "http://<server with parsoid>:8000";
  config.backend.bundler.bin = "mw-ocg-bundler/bin/mw-ocg-bundler";
  config.frontend.address = "<IP address the server should listen on>";
  config.backend.writers = {
    "rl": {
      "bin": "mw-ocg-latexer/bin/mw-ocg-latexer",
      "extension": ".pdf"
    }
  };
  return config;
}

Because this project is still under development, some configuration parameters may change quickly.
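Since parameter names may change, it can help to check a configuration function for typos before starting the service. The sketch below assumes only what the example above shows: the file exports a function that receives a config object and returns it. The stub object and the applyConfig helper are hypothetical and mirror just the keys used on this page; they are not OCG's real defaults or its loading mechanism.

```javascript
// Sanity-check sketch for a config function shaped like the one above.
// Assumption: the function takes a config object, modifies it, and returns it.
// The stub mirrors only the keys used on this page; it is NOT OCG's real default config.
function applyConfig(configure) {
  const stub = {
    coordinator: {},
    frontend: {},
    backend: { bundler: {}, writers: {} }
  };
  return configure(stub);
}

// Example function with placeholder values. In practice this could be
// replaced by require('/etc/mw-ocg-service.js').
const configure = function (config) {
  config.coordinator.frontend_threads = 2;
  config.backend.bundler.parsoid_api = "http://localhost:8000";
  config.frontend.address = "127.0.0.1";
  return config;
};

const result = applyConfig(configure);
console.log(JSON.stringify(result, null, 2));
```

A misspelled top-level key (for example config.frontent.address) makes the call throw a TypeError, which is the kind of mistake this check is meant to surface early.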

Wiki

The Collection extension is used as the frontend. Install it into the wiki's extensions directory:

cd extensions
git clone https://gerrit.wikimedia.org/r/mediawiki/extensions/Collection.git

Configuration

require_once("$IP/extensions/Collection/Collection.php");
$wgCollectionMWServeURL = "http://<rendering server>:17080";

Starting

Redis is already running; it only needs a restart when its configuration is changed. Parsoid and ocg-collection, however, need to be started manually:

cd parsoid
node api/server.js &
cd ../ocg-collection
node mw-ocg-service/mw-ocg-service.js &

After starting, netstat -tuln should produce output that includes these lines:

Proto Recv-Q Send-Q Local Address           Foreign Address         State      
tcp        0      0 0.0.0.0:8000            0.0.0.0:*               LISTEN     
tcp        0      0 127.0.0.1:6379          0.0.0.0:*               LISTEN     
tcp        0      0 0.0.0.0:17080           0.0.0.0:*               LISTEN     
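The check can also be scripted. The following is a minimal sketch, assuming the six-column netstat -tuln layout shown above; the function name missingPorts and the sample text are made up for this example. It reports which of the three expected ports (8000 for Parsoid, 6379 for Redis, 17080 for the OCG service) have no LISTEN line.

```javascript
// Returns the ports from `expected` that have no LISTEN line in `netstat -tuln` output.
function missingPorts(netstatOutput, expected) {
  const listening = new Set();
  for (const line of netstatOutput.split("\n")) {
    const fields = line.trim().split(/\s+/);
    // Columns: Proto, Recv-Q, Send-Q, Local Address, Foreign Address, State
    if (fields.length >= 6 && fields[5] === "LISTEN") {
      const local = fields[3];
      // The port is whatever follows the last colon (also works for IPv6 such as :::8000).
      listening.add(Number(local.slice(local.lastIndexOf(":") + 1)));
    }
  }
  return expected.filter((port) => !listening.has(port));
}

// Sample output in which the OCG service (port 17080) has not been started.
const sample = [
  "Proto Recv-Q Send-Q Local Address           Foreign Address         State",
  "tcp        0      0 0.0.0.0:8000            0.0.0.0:*               LISTEN",
  "tcp        0      0 127.0.0.1:6379          0.0.0.0:*               LISTEN"
].join("\n");

console.log(missingPorts(sample, [8000, 6379, 17080])); // prints [ 17080 ]
```

An empty result means all three services are listening.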

Troubleshooting

When a rendering job fails and is then restarted, it will fail again because temporary files from the first attempt are still present on the rendering server's filesystem. In this case, those files must be deleted from /tmp/.