Deployment tooling/Notes/What does scap do
This page is obsolete. It is being retained for archival purposes. It may document extensions or features that are obsolete and/or no longer supported. Do not rely on the information here being up-to-date.
This documentation describes scap as it was before being ported to Python.
Scap ("sync-common-all-php") is a collection of shell scripts used to publish code and configuration to the WMF production web servers.
scap
scap is the driver script for syncing the MW versions and configuration files currently staged on tin.eqiad.wmnet to the rest of the MW servers in the production cluster.
- Usage
  - scap [--versions=<versions>] [<message>]
- Acquire lock on /var/lock/scap
- Record start timestamp
- Ensure that SSH_AUTH_SOCK is available (needed for dsh to remote hosts)
- Check for command line flag to limit activities to a particular MW version
- Export MW_VERSIONS_SYNC variable describing software versions to push with sync scripts. Either:
  - A specific version given with the --versions command line argument (eg 1.23wmf12)
  - The output of mwversionsinuse --home
- Lint files in $MW_COMMON_SOURCE/wmf-config and $MW_COMMON_SOURCE/multiversion
- Runs sync-common
  - copies files from tin.eqiad.wmnet:/usr/local/apache/common-local to tin.eqiad.wmnet:/a/common via rsync
- Runs mw-update-l10n
- Runs dologmsg to announce that scap is starting
- Runs scap-1 via dsh on scap-proxies group
- Randomizes list of hosts to update (All hosts listed in /etc/dsh/group/mediawiki-installation)
- Runs scap-1 via dsh
- Runs scap-rebuild-cdbs via dsh
- Runs sync-wikiversions
- Compute elapsed runtime
- Runs dologmsg to log runtime
- Runs deploy2graphite to log scap run completion
- Deletes temp files
- Releases lock on /var/lock/scap
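The version selection step above can be sketched in shell as follows. This is a minimal illustration, not the original script: the helper names parse_versions_flag and select_versions are hypothetical, and the output of mwversionsinuse --home is passed in as a plain string so the sketch runs without the deployment tooling installed.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the value of a --versions=<versions> flag,
# or nothing at all when the flag is absent.
parse_versions_flag() {
    local arg
    for arg in "$@"; do
        case "$arg" in
            --versions=*)
                printf '%s\n' "${arg#--versions=}"
                return 0
                ;;
        esac
    done
    return 0
}

# Hypothetical helper: prefer the explicitly requested version, otherwise
# fall back to the list of versions currently in use.
# $1: value from --versions (may be empty)
# $2: stand-in for the output of `mwversionsinuse --home`
select_versions() {
    local requested="$1" in_use="$2"
    if [ -n "$requested" ]; then
        printf '%s\n' "$requested"
    else
        printf '%s\n' "$in_use"
    fi
}

requested="$(parse_versions_flag --versions=1.23wmf12 'hypothetical log message')"
MW_VERSIONS_SYNC="$(select_versions "$requested" '1.23wmf11 1.23wmf12')"
export MW_VERSIONS_SYNC
echo "MW_VERSIONS_SYNC=$MW_VERSIONS_SYNC"
```

Exporting the variable is what lets the later sync scripts see the same version list without re-parsing the command line.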
sync-common
sync-common is really just an alias for scap-1 in shell script form.
- Runs scap-1
scap-1
scap-1 sets up the local host to receive files via rsync, chooses an rsync server to fetch files from, and delegates to scap-2 to actually fetch the files.
- Sources /usr/local/lib/mw-deployment-vars.sh
- If $MW_COMMON directory is not found:
  - Creates $MW_COMMON via install -d -o mwdeploy -g mwdeploy "${MW_COMMON}"
- If /usr/local/apache/uncommon directory is not found:
  - Creates /usr/local/apache/uncommon via install -d -o mwdeploy -g mwdeploy /usr/local/apache/uncommon
- Initialize RSYNC_SERVERS variable to first command line argument (could be empty string)
- Initialize SERVER as an empty variable
- If $RSYNC_SERVERS is not an empty string:
  - Set SERVER via sudo /usr/local/bin/find-nearest-rsync $RSYNC_SERVERS
- If $SERVER is still empty:
  - Set SERVER to $MW_RSYNC_HOST
- Run scap-2 "$SERVER" as the user mwdeploy
  - MW_VERSIONS_SYNC and MW_SCAP_BETA from the current execution context are forwarded to the environment of the scap-2 invocation
- Echo "Done"
- Exit 0
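The server selection with its fallback to $MW_RSYNC_HOST might look roughly like the sketch below. The function choose_server and the stand-in fake_nearest are hypothetical names, and the host names proxy1 and proxy2 are invented for the example; the real script calls find-nearest-rsync through sudo.

```shell
#!/usr/bin/env bash
# Hypothetical helper.
# $1: space-separated candidate rsync servers (may be empty)
# $2: default host, i.e. the value of $MW_RSYNC_HOST
# remaining arguments: the command used to pick the nearest server
choose_server() {
    local rsync_servers="$1" default_host="$2" server=""
    shift 2
    if [ -n "$rsync_servers" ]; then
        # Word splitting is intentional: one argument per candidate host.
        # If the finder fails, SERVER is left empty so the fallback applies.
        # shellcheck disable=SC2086
        server="$("$@" $rsync_servers 2>/dev/null)" || server=""
    fi
    if [ -z "$server" ]; then
        server="$default_host"
    fi
    printf '%s\n' "$server"
}

# Stand-in for find-nearest-rsync that simply prints its first argument.
fake_nearest() { printf '%s\n' "$1"; }

choose_server "proxy1 proxy2" "tin.eqiad.wmnet" fake_nearest
choose_server "" "tin.eqiad.wmnet" fake_nearest
```

Passing the finder command as arguments keeps the fallback logic testable without network access.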
scap-2
scap-2 copies files from the common module of an rsync server to the MW_COMMON directory on the local host.
- Usage
  - scap-2 [<host>]
- Sources /usr/local/lib/mw-deployment-vars.sh
- Initialize SERVER as $1
- If $SERVER is still empty:
  - Set SERVER to $MW_RSYNC_HOST
- Initialize RSYNC_ARGS as an array containing MW_RSYNC_ARGS[@]
- If $MW_VERSIONS_SYNC is not an empty string:
  - Add --include=php-$v/ to RSYNC_ARGS for each $v in $MW_VERSIONS_SYNC[@]
  - Add --exclude=php-*/ to RSYNC_ARGS
- Echo that hostname -s is copying from $SERVER
- Run rsync "${RSYNC_ARGS[@]}" "$SERVER"::common/ "${MW_COMMON}"
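A sketch of how the RSYNC_ARGS array could be assembled from the steps above. The function name build_rsync_args is hypothetical, the version list is passed as a space-separated string for simplicity, and the final rsync command is only echoed so the example runs without an rsync server.

```shell
#!/usr/bin/env bash
# Base arguments as listed for MW_RSYNC_ARGS in mw-deployment-vars.sh.
MW_RSYNC_ARGS=('-a' '--delete-delay' '--delay-updates' '--compress' '--delete'
    '--exclude=**/.svn/lock' '--exclude=**/.git/objects'
    '--exclude=**/.git/**/objects' '--exclude=**/cache/l10n/*.cdb' '--no-perms')

# Hypothetical helper: fill the global RSYNC_ARGS array.
# $1: space-separated versions to sync (empty means "sync everything")
build_rsync_args() {
    local versions="$1" v
    RSYNC_ARGS=("${MW_RSYNC_ARGS[@]}")
    if [[ -n "$versions" ]]; then
        for v in $versions; do
            RSYNC_ARGS+=("--include=php-$v/")
        done
        # Must come after the includes: rsync applies the first matching rule.
        RSYNC_ARGS+=('--exclude=php-*/')
    fi
}

build_rsync_args "1.23wmf11 1.23wmf12"
# Echo instead of running, so no rsync server is needed for the sketch.
echo rsync "${RSYNC_ARGS[@]}" "tin.eqiad.wmnet::common/" "/usr/local/apache/common-local"
```

Because rsync stops at the first filter rule that matches, the per-version includes have to precede the blanket php-*/ exclude; otherwise every version directory would be skipped.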
mw-update-l10n
mw-update-l10n generates l10n CDB files and exports their contents as a series of JSON files that have better rsync compression properties for transfer to cluster hosts.
- Usage
  - mw-update-l10n [--verbose]
- Sources /usr/local/lib/mw-deployment-vars.sh
- Asserts that the local host is running some variant of Linux
- Checks for a --verbose command line argument and toggles off the QUIET setting if present
- Sets CPUS to the number of cores on the local host (includes hyperthreading cores)
- Sets THREADS to CPUS - 2
- Sets mwExtVerDbSets to the output of mwversionsinuse --extended --withdb
  - (eg 1.23wmf11=aawikibooks 1.23wmf12=mediawikiwiki)
- For each version in $mwExtVerDbSets:
  - Split version string into mwVerNum (eg 1.23wmf11) and mwDbName (eg aawikibooks)
  - If MW_VERSIONS_SYNC is set and mwVerNum isn't a version being synced: continue
  - Make a new temp file and track as mwTempDest
  - Run mergeMessageFileList.php for the wiki mwDbName outputting to mwTempDest
  - Copy mwTempDest to $MW_COMMON_SOURCE/wmf-config/ExtensionMessages-"$mwVerNum".php
  - Copy $MW_COMMON_SOURCE/wmf-config/ExtensionMessages-"$mwVerNum".php to $MW_COMMON/wmf-config/ unless they are the same location
  - Run rebuildLocalisationCache.php using THREADS threads
  - Run refreshCdbJsonFiles using THREADS threads
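The per-version loop, in particular the splitting of each version=dbname pair and the MW_VERSIONS_SYNC filter, can be illustrated with the sketch below. The helpers split_ver_db and is_synced_version are hypothetical, and the actual maintenance script calls are replaced by an echo.

```shell
#!/usr/bin/env bash
# Hypothetical helper: split "1.23wmf11=aawikibooks" into the globals
# mwVerNum and mwDbName using parameter expansion.
split_ver_db() {
    mwVerNum="${1%%=*}"
    mwDbName="${1#*=}"
}

# Hypothetical helper: succeed when $1 should be processed.
# $2 is a space-separated MW_VERSIONS_SYNC; empty means "all versions".
is_synced_version() {
    local ver="$1" sync="$2" v
    if [ -z "$sync" ]; then
        return 0
    fi
    for v in $sync; do
        if [ "$v" = "$ver" ]; then
            return 0
        fi
    done
    return 1
}

# Sample value in the format shown for `mwversionsinuse --extended --withdb`.
mwExtVerDbSets='1.23wmf11=aawikibooks 1.23wmf12=mediawikiwiki'
MW_VERSIONS_SYNC='1.23wmf12'

for pair in $mwExtVerDbSets; do
    split_ver_db "$pair"
    is_synced_version "$mwVerNum" "$MW_VERSIONS_SYNC" || continue
    # The real script runs the merge and rebuild steps here.
    echo "would rebuild l10n for $mwVerNum using wiki $mwDbName"
done
```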
refreshCdbJsonFiles
refreshCdbJsonFiles generates JSON data files and MD5 checksums from CDB databases.
- Usage
  - refreshCdbJsonFiles --directory <DIR> [--threads <N>]
- Validate command line arguments
- Create list of .cdb files in target directory
- Split list in N parts (N == number of parallel threads requested)
- For each sublist of CDB files:
  - Fork a child process
  - For each file:
    - Compute md5 checksum of file
    - If md5(file) === last md5 recorded: continue
    - Generate JSON file of key:value pairs found in CDB file to temporary file
    - Write md5(file) to $file.MD5
    - Move JSON temp file to $file.json
- Wait for children to finish
- Echo status message if any files were updated
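The checksum gate in the per-file loop can be sketched in shell as follows. This only illustrates the skip-if-unchanged idea: the helper names are hypothetical, the .MD5 file is assumed to hold a bare hex digest, the sample file name is invented, and the CDB-to-JSON export itself is left as a comment. It relies on md5sum from GNU coreutils.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the hex md5 digest of a file.
file_md5() {
    md5sum < "$1" | cut -d' ' -f1
}

# Hypothetical helper: succeed when the file's current digest differs from
# the one recorded in FILE.MD5 (or when nothing has been recorded yet).
needs_refresh() {
    local file="$1" recorded=""
    if [ -f "$file.MD5" ]; then
        recorded="$(cat "$file.MD5")"
    fi
    [ "$(file_md5 "$file")" != "$recorded" ]
}

workdir="$(mktemp -d)"
printf 'stand-in for CDB contents\n' > "$workdir/sample.cdb"

for f in "$workdir"/*.cdb; do
    if needs_refresh "$f"; then
        # The real script writes the CDB's key:value pairs to a temporary
        # JSON file here and then moves it to $f.json.
        file_md5 "$f" > "$f.MD5"
        echo "updated $(basename "$f")"
    fi
done

rm -rf "$workdir"
```

Skipping unchanged files keeps the JSON output stable between runs, which is what allows rsync to transfer only the files that actually changed.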
scap-rebuild-cdbs
scap-rebuild-cdbs rebuilds the l10n cache CDB databases from JSON files.
- Sources /usr/local/lib/mw-deployment-vars.sh
- Sets CPUS to the number of cores on the local host (includes hyperthreading cores)
- Sets THREADS to CPUS / 2
- Sets mwVersions to either MW_VERSIONS_SYNC or the output of mwversionsinuse
- For each version in mwVersions:
mergeCdbFileUpdates
mergeCdbFileUpdates updates l10n CDB files from JSON data files.
- Usage
  - mergeCdbFileUpdates --directory <DIRECTORY> [--threads <N>] [--trustmtime]
- Validate command line arguments
- Create list of .json files in target directory
- Split list in N parts (N == number of parallel threads requested)
- For each sublist of JSON files:
  - Fork a child process
  - For each file:
    - Continue unless JSON newer than CDB / md5 checksums don't match
    - Load JSON data from file
    - Create a new CDB file with JSON key:value data
    - Rename temporary CDB file over .cdb file
- Wait for children to finish
- Echo status message if any files were updated
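The "split list in N parts, fork a child per part, wait" pattern shared by refreshCdbJsonFiles and mergeCdbFileUpdates could be expressed in bash as below. This is a sketch only: the helper names and file names are hypothetical, the round-robin distribution is an assumption (the source does not say how the list is divided), and file names containing spaces are not handled.

```shell
#!/usr/bin/env bash
# Hypothetical helper: distribute items over N sublists round-robin.
# Prints N lines, each holding one space-separated sublist.
split_round_robin() {
    local n="$1" i=0 idx item
    local -a parts=()
    shift
    for item in "$@"; do
        idx=$((i % n))
        parts[idx]="${parts[idx]:+${parts[idx]} }$item"
        i=$((i + 1))
    done
    for ((i = 0; i < n; i++)); do
        printf '%s\n' "${parts[i]:-}"
    done
}

# Stand-in for the per-file work done by each child process.
process_sublist() {
    local f
    for f in "$@"; do
        echo "child $BASHPID handled $f"
    done
}

mapfile -t sublists < <(split_round_robin 2 a.json b.json c.json)
for sub in "${sublists[@]}"; do
    [[ -z "$sub" ]] && continue
    # Word splitting is intentional: one argument per file name.
    # shellcheck disable=SC2086
    process_sublist $sub &
done
wait  # block until every child has finished
```

The order of the output lines depends on scheduling, since the children run concurrently.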
sync-wikiversions
sync-wikiversions copies wikiversions files to hosts in the mediawiki-installation dsh group.
- Sources /usr/local/lib/mw-deployment-vars.sh
- Ensure that SSH_AUTH_SOCK is available (needed for dsh to remote hosts)
- Run multiversion/refreshWikiversionsCDB
- Ensure that dsh is available locally
- Run rsync $MW_RSYNC_HOST::common/wikiversions.{dat,cdb} $MW_COMMON via dsh on mediawiki-installation hosts
- Runs dologmsg to log completion
- Runs deploy2graphite to log sync-wikiversions completion
mw-deployment-vars.sh
mw-deployment-vars.sh is a Puppet-generated shell script that sets several MW-related environment variables.
The values of these variables change based on the deployment system in use and the realm of the server. For the sake of this analysis we are only concerned with the values configured for the scap deployment system in the production realm.
- MW_COMMON
  - varies by deployment system
  - scap: /usr/local/apache/common-local
- MW_COMMON_SOURCE
  - varies by deployment system
  - scap: /a/common
- MW_DBLISTS
  - varies by deployment system
  - scap: /usr/local/apache/common-local
- MW_DBLISTS_SOURCE
  - varies by deployment system
  - scap: /a/common
- MW_CRON_LOGS
  - /home/wikipedia/logs/norotate
- MW_RSYNC_HOST
  - varies by realm
  - production: tin.eqiad.wmnet
- MW_DSH_ARGS
  - ('-cM' '-g' 'mediawiki-installation' '-o' '-oSetupTimeout=30' '-F30')
- MW_RSYNC_ARGS
  - ('-a' '--delete-delay' '--delay-updates' '--compress' '--delete' '--exclude=**/.svn/lock' '--exclude=**/.git/objects' '--exclude=**/.git/**/objects' '--exclude=**/cache/l10n/*.cdb' '--no-perms')
- MW_CARBON_HOST
  - varies by realm
  - production: statsd.eqiad.wmnet
- MW_CARBON_PORT
  - 2003
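For orientation, the values above for the scap deployment system in the production realm would correspond to a sourced file roughly like the following. This is a reconstruction from the list, not the generated file itself; whether the real script uses plain assignments or export is an assumption.

```shell
#!/usr/bin/env bash
# Sketch of mw-deployment-vars.sh for scap in production, using the values
# listed above. Plain assignments are assumed; they are sufficient for
# scripts that source this file.
MW_COMMON=/usr/local/apache/common-local
MW_COMMON_SOURCE=/a/common
MW_DBLISTS=/usr/local/apache/common-local
MW_DBLISTS_SOURCE=/a/common
MW_CRON_LOGS=/home/wikipedia/logs/norotate
MW_RSYNC_HOST=tin.eqiad.wmnet
MW_DSH_ARGS=('-cM' '-g' 'mediawiki-installation' '-o' '-oSetupTimeout=30' '-F30')
MW_RSYNC_ARGS=('-a' '--delete-delay' '--delay-updates' '--compress' '--delete'
    '--exclude=**/.svn/lock' '--exclude=**/.git/objects'
    '--exclude=**/.git/**/objects' '--exclude=**/cache/l10n/*.cdb' '--no-perms')
MW_CARBON_HOST=statsd.eqiad.wmnet
MW_CARBON_PORT=2003

# Example of a consumer reading the values after sourcing the file.
echo "sync source: ${MW_RSYNC_HOST}::common/ -> ${MW_COMMON}"
echo "metrics go to ${MW_CARBON_HOST}:${MW_CARBON_PORT}"
```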
find-nearest-rsync
find-nearest-rsync is a Perl script that attempts to determine the host with the lowest ICMP ping round trip time (rtt) from a given list of hosts.
- Usage
  - find-nearest-rsync [--verbose] <host> [<host> ...]
The host with the lowest rtt will be printed to stdout.
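The selection step can be illustrated in shell, leaving out the ICMP measurement itself. In this sketch the hypothetical function nearest_host reads lines of the form "host rtt" (rtt in milliseconds) from stdin; the host names and timings are invented for the example, and the original is a Perl script that pings the hosts directly.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read "host rtt_ms" lines on stdin and print the host
# with the lowest round trip time. Prints nothing for empty input.
nearest_host() {
    awk '
        NR == 1 || ($2 + 0) < min { min = $2 + 0; host = $1 }
        END { if (NR > 0) print host }
    '
}

# Invented measurements for three hypothetical proxies.
printf '%s\n' 'proxy1 2.41' 'proxy2 0.73' 'proxy3 11.2' | nearest_host
```

Adding 0 to the field forces awk to compare the timings numerically rather than as strings.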
mwversionsinuse
mwversionsinuse is a shell script that calls the local version of multiversion/activeMWVersions.
- Sources /usr/local/lib/mw-deployment-vars.sh
- Runs "${MW_COMMON}/multiversion/activeMWVersions" "$@"
dologmsg
dologmsg appends a message to an IRC buffer.
- Usage
  - dologmsg [MESSAGE]