Phlogiston/Installation
This page is obsolete. It is being retained for archival purposes. It may document extensions or features that are obsolete and/or no longer supported. Do not rely on the information here being up-to-date.
This page is currently a draft.
Install Prerequisites
Operating System
These instructions assume installation of Phlogiston on a Debian GNU/Linux Stretch (9.5) system.
In Labs, to enable the bigger hard drive, go to https://horizon.wikimedia.org/project/puppet/ and activate the Puppet class "profile::labs::lvm::srv*".
Move Postgres's working directories to the new folder.
System-wide Installation
Nginx
sudo apt install nginx
PostgreSQL
sudo apt install postgresql postgresql-contrib
Python Modules
sudo apt install python3-venv
R
Add the CRAN repository to get the newest version of R (from DigitalOcean instructions):
sudo apt install software-properties-common
sudo apt-key adv --keyserver keys.gnupg.net --recv-key 'E19F5F87128899B192B1A2C2AD5F960A256A04AF'
sudo add-apt-repository 'deb https://cloud.r-project.org/bin/linux/debian stretch-cran35/'
sudo apt update
sudo apt install r-base
sudo apt install build-essential
Set up Accounts
A shell account for Phlogiston
This account is used to run Phlogiston, store data, and publish for the webserver. By convention it is called phlogiston. Create it and apply whatever login rules, ssh configuration, and security settings are appropriate.
Set up Python
As user phlogiston:
python3 -m venv phlog_env
source phlog_env/bin/activate
pip install python-dateutil psycopg2-binary pytz jinja2
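After the pip step, it can be useful to confirm that the modules actually import inside the virtual environment. The following is a minimal sketch, not part of Phlogiston: the function name check_python_modules and the PYTHON override variable are hypothetical, introduced only for this illustration. Note that the import names differ from two of the pip package names (python-dateutil imports as dateutil, psycopg2-binary as psycopg2).

```shell
# Report which Python modules cannot be imported.
# PYTHON is a hypothetical override for the interpreter (default: python3).
check_python_modules() {
    local missing=0
    local mod
    for mod in "$@"; do
        # Try the import quietly; print a line for each module that fails.
        if ! "${PYTHON:-python3}" -c "import ${mod}" 2>/dev/null; then
            echo "missing: ${mod}"
            missing=1
        fi
    done
    # Non-zero status when at least one module is missing.
    return "${missing}"
}
```

With phlog_env activated, the call would be: check_python_modules dateutil psycopg2 pytz jinja2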
Set up R
As phlogiston, type R to enter the R command line. In R:
install.packages('RColorBrewer', dep=TRUE)
install.packages('ggplot2', dep=TRUE)
install.packages('ggthemes', dep=TRUE)
install.packages('argparse', dep=TRUE)
install.packages('reshape', dep=TRUE)
install.packages('fivethirtyeight', dep=TRUE)
If prompted where to install, install locally.
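The same packages could in principle be installed without an interactive R session. The sketch below is an untested alternative, not the documented procedure: the helper name r_install_expr is hypothetical, and the CRAN mirror is assumed to be https://cloud.r-project.org (the host already used for the apt repository above). It only builds the R expression; Rscript -e then runs it.

```shell
# Build an R expression that installs the given packages with dependencies.
# The repos value is an assumption; any CRAN mirror should work.
r_install_expr() {
    local pkgs=""
    local p
    for p in "$@"; do
        # Append each package as a quoted, comma-separated element.
        pkgs="${pkgs:+${pkgs}, }'${p}'"
    done
    printf "install.packages(c(%s), dependencies=TRUE, repos='https://cloud.r-project.org')\n" "${pkgs}"
}
```

Usage would be: Rscript -e "$(r_install_expr RColorBrewer ggplot2 ggthemes argparse reshape fivethirtyeight)". In a non-interactive run R does not prompt for an install location, so the install may fail if no writable library directory exists.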
A Postgres account for Phlogiston
The local account must have access to a PostgreSQL database for data storage and reporting. As root:
sudo su - postgres
createuser -s phlogiston
createdb -O phlogiston phlogiston
Superuser access is required because load_tables.sql installs the intarray PostgreSQL extension. This also allows the script to create or reset its own data tables. This is probably not appropriate on a shared server.
Access to Phlogiston directories for postgresql
The Phlogiston scripts run some commands on the PostgreSQL server, which runs as the postgres user. That user therefore needs access to the phlogiston directories, which it gets through membership in the phlogiston group.
sudo usermod -a -G phlogiston postgres
sudo service postgresql restart
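To confirm that the group change took effect, the membership can be checked with id. This is a small sketch under the assumption that the account and group are both named phlogiston as above; the function name user_in_group is hypothetical.

```shell
# Succeed (exit status 0) if user $1 is a member of group $2.
user_in_group() {
    # id -nG lists the user's group names; match the group as a whole line.
    id -nG "$1" 2>/dev/null | tr ' ' '\n' | grep -qxF -- "$2"
}
```

For example: user_in_group postgres phlogiston && echo "postgres can use the phlogiston group"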
Install Phlogiston
Get the Phlogiston code by cloning it from GitHub. As the phlogiston user:
sudo su - phlogiston
git clone https://github.com/wikimedia/phlogiston.git
exit
Set up web publishing of results
Configure Nginx
Configure Nginx to publish from the phlogiston html output directory. Create the following file as /etc/nginx/sites-available/phlogiston:
server {
    server_name localhost;
    listen 80 default_server;
    listen [::]:80 default_server ipv6only=on;

    root /home/phlogiston/html;
    index index.html index.htm;
    ssi on;

    location / {
        autoindex on;
        # First attempt to serve request as file, then
        # as directory, then fall back to displaying a 404.
        try_files $uri $uri/ =404;
        # Uncomment to enable naxsi on this location
        # include /etc/nginx/naxsi.rules
    }
}
Then run these commands to make Nginx use it:
sudo rm /etc/nginx/sites-enabled/default
sudo ln -s /etc/nginx/sites-available/phlogiston /etc/nginx/sites-enabled/
sudo service nginx restart
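The enable step can also be written so that it is safe to run more than once. The sketch below is an illustration only: the function name enable_site is hypothetical, it uses a symbolic link, and it takes the Nginx configuration directory as an argument so it can be tried on a scratch directory first.

```shell
# Enable a site under an Nginx-style config directory.
# $1 = config directory (e.g. /etc/nginx), $2 = site name.
enable_site() {
    local nginx_dir="$1"
    local site="$2"
    # Refuse to continue if the site definition does not exist.
    [ -f "${nginx_dir}/sites-available/${site}" ] || {
        echo "no such site: ${site}" >&2
        return 1
    }
    # Remove the default site if present, then (re)create the symlink.
    rm -f "${nginx_dir}/sites-enabled/default"
    ln -sf "${nginx_dir}/sites-available/${site}" "${nginx_dir}/sites-enabled/${site}"
}
```

On the real server this would need root privileges and would still be followed by the Nginx restart.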
Set up the reports home page
mkdir /home/phlogiston/html
cp /home/phlogiston/phlogiston/html/index.html /home/phlogiston/html/
cp /home/phlogiston/phlogiston/html/style.css /home/phlogiston/html/
And edit index.html to reflect the scopes being reported.
First run
sudo su - phlogiston
cd phlogiston
createdb phlogiston
python3 phlogiston.py --initialize
bash batch_phlog.bash -m reconstruct -l true -s xxx
where xxx is a correctly configured scope. Validate the results.
Automate daily reports
- Run a complete reconstruction for all scopes:
bash ~/phlogiston/batch_phlog.bash -m reconstruct -l true -s xxx -s yyy
- where xxx and yyy are reporting scopes. Append as many -s zzz as needed.
- Create a cron job for the phlogiston user, of the form
15 4 * * * bash ~/phlogiston/batch_phlog.bash -m incremental -l true -s xxx -s yyy >>~/phlog.log 2>&1
- This runs daily at 4:15 am UTC. Set this to be right after the dump is generated from Phabricator, and to run as often as the dump is updated. (Phlogiston can take hours to run, so anything more than daily may not be practical without optimization.)
- The file ~/phlog.log can be inspected for status.
- In particular, grep Done ~/phlog.log will show one line per scope per reconstruction and/or report.
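When the list of scopes grows, the repeated -s flags and the matching crontab line can be generated rather than typed by hand. This is a minimal sketch: the function names scope_flags and cron_entry are hypothetical, and the command line is copied from the cron example above.

```shell
# Print "-s scope" for each scope given, joined by spaces on one line.
scope_flags() {
    local flags=""
    local scope
    for scope in "$@"; do
        flags="${flags:+${flags} }-s ${scope}"
    done
    printf '%s\n' "${flags}"
}

# Print a crontab entry for a daily incremental run.
# $1 = minute, $2 = hour, remaining arguments = scopes.
cron_entry() {
    local minute="$1"
    local hour="$2"
    shift 2
    printf '%s %s * * * bash ~/phlogiston/batch_phlog.bash -m incremental -l true %s >>~/phlog.log 2>&1\n' \
        "${minute}" "${hour}" "$(scope_flags "$@")"
}
```

For example, cron_entry 15 4 xxx yyy prints the crontab line shown above, and the output of scope_flags can be passed unquoted to batch_phlog.bash for a full reconstruction.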
How to use on other Phabricator instances besides Wikimedia Foundation
Untested:
1) Set up a dump script on Phabricator, like this one, to generate dumps like this one.
2) Customize batch_phlog.bash to point to the new dump file.