Requests for comment/Composer managed libraries for use on WMF cluster
Composer managed libraries for use on WMF cluster | |
---|---|
Component | General |
Creation date | |
Author(s) | BDavis (WMF) |
Document status | implemented (see meetbot IRC summary). Accepted. brion (talk) 23:00, 23 July 2014 (UTC) |
This is a request for comment about using Composer to manage library dependencies for MediaWiki and MediaWiki extensions on the WMF production and beta clusters and Jenkins jobs.
Background
The structured logging and bug 63483 [Swift Mailer] patches introduce the concept of specifying external library dependencies, both required and suggested, to mediawiki/core.git via composer.json. Composer can be used by people directly consuming the git repository to install and manage these dependencies. While we eventually decided not to use Swift_Mailer due to other issues, the underlying question of how to incorporate external libraries in a reasonable way still stands.
Problem
On the WMF production and beta clusters and the Jenkins job runners we want to avoid using Composer directly to manage library dependencies. Composer allows specifying versions of libraries that should be imported, but those versions are typically only enforced as git tags, which are mutable, and no checksumming or cryptographic signing mechanisms are provided. This would provide a potential exploit vector for attacking the WMF infrastructure, as the composer.json files would be public and would advertise the external dependencies to attack in order to slipstream code into the WMF environment. Even if this attack vector is unlikely to be exploited, its risk should be minimized. For the Jenkins environment, speed of setup for tests is also a concern. Uncached downloads/clones of external resources would increase the time needed to set up each test.
Proposal
Both security and speed concerns can be mitigated by creating a locally hosted git repository containing a composer.json file along with a Composer generated autoload.php and the desired libraries. This composer.json file would be used to tell Composer the exact versions of libraries to download. Developers would manually run Composer in a checkout of this repository and then commit the downloaded content, the composer.lock file and the generated autoload.php to the repository for review. We would then be able to branch and use this repository as a git submodule in the wmf/1.2XwmfY branches that are deployed to production and ensure that it is checked out along with mw-core on the Jenkins nodes. By placing this submodule at $IP/vendor in mw-core we would be mimicking the configuration that direct users of Composer will experience. WebStart.php already includes $IP/vendor/autoload.php when present, so integration with the rest of mw-core should follow from that.
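As a rough illustration of the pinning described above, the following Python sketch writes out the kind of composer.json the vendor repository might contain. The package names and versions are taken from the Gerrit changes listed under See also; the helper name build_vendor_manifest and the choice to reject anything other than a bare version string are assumptions made for illustration, not the contents of the actual repository.

```python
import json


def build_vendor_manifest(pins):
    """Return a composer.json-style dict that requires exact versions.

    `pins` maps a package name to an exact version string. Range-style
    constraints such as "~1.9" are rejected so that every upgrade is an
    explicit, reviewable edit to the manifest.
    """
    for name, version in pins.items():
        if any(ch in version for ch in "~^*<>|, "):
            raise ValueError(f"{name}: {version!r} is not an exact version")
    return {"require": dict(sorted(pins.items()))}


# Versions come from the changes listed in this RFC; the overall structure
# is an assumption about what the vendor manifest could look like.
manifest = build_vendor_manifest({
    "psr/log": "1.0.0",
    "monolog/monolog": "1.9.1",
})
print(json.dumps(manifest, indent=4))
```

Rejecting ranges up front keeps the manifest aligned with the goal stated above: the repository, not a remote tag, decides which code is deployed.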
It would also be possible to add this repo to the tarballs for distribution. That process will probably need some adjustments, however, and the final result may be that release branches update the mediawiki/core composer.json and provide a composer.lock along with a pre-populated vendor directory. This is an important use case, but a fully formed solution for it is not presented in this RFC.
There are several use cases to consider for the general solution:
Adding/updating a library
- Update composer.json in mediawiki/vendor.git
- Run composer update locally to download library (and dependencies)
- Run composer dump-autoload --optimize to make an optimized autoload.php
- Commit changes
- Push changes for review in gerrit
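The steps above could be scripted roughly as follows, once composer.json has been edited by hand. This is a sketch only: it assumes composer, git and the git-review tool are on the PATH, the commit message is a placeholder, and by default it only prints the commands rather than running them.

```python
import shlex
import subprocess

# Commands mirroring the steps listed above. `git review` (from the
# git-review tool) and the commit message are assumptions, not part of the RFC.
UPDATE_STEPS = [
    ["composer", "update"],
    ["composer", "dump-autoload", "--optimize"],
    ["git", "add", "--all"],
    ["git", "commit", "-m", "Update libraries"],
    ["git", "review"],
]


def run_steps(steps, cwd=".", dry_run=True):
    """Print each command and, unless dry_run is set, execute it in `cwd`.

    Execution stops at the first failing command because check=True makes
    subprocess.run raise CalledProcessError on a non-zero exit status.
    """
    shown = []
    for argv in steps:
        line = shlex.join(argv)
        print(line)
        shown.append(line)
        if not dry_run:
            subprocess.run(argv, cwd=cwd, check=True)
    return shown


shown = run_steps(UPDATE_STEPS)  # dry run: nothing is executed
```

Calling run_steps(UPDATE_STEPS, cwd="path/to/vendor", dry_run=False) would execute the sequence in a checkout; the path here is a placeholder.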
Hotfix for an external library
At some point we will run into a bug or missing feature in a Composer managed library that we need to work around with a patch. Obviously we will attempt to upstream any such fixes (otherwise what's the point of this whole exercise?). To keep from blocking things for our production cluster we would want to fork the upstream, add our patch for local use and upstream the patch. While the patch is pending review upstream, we would want to use our locally patched version in production and Jenkins.
Composer provides a solution for this with its repository package source. The Composer documentation actually gives this exact example in their discussion of the vcs repository type [1]. We would:
- Create a git repository tracking the external library
- Add our patch(es)
- Tag with a semantic version number indicating that we have amended the library (eg upstream "v1.0.1" becomes local "v1.0.1+wmf1")
- Adjust the composer.json file in mediawiki/vendor.git to reference our fork and version
- Run Composer in mediawiki/vendor.git to pull in our patched version
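A sketch of how the manifest change in the last two steps might look, generated from Python. The fork URL and the helper names are hypothetical, and using the local tag verbatim as the version constraint is an assumption; how Composer interprets the "+wmf1" suffix should be checked against its version documentation before relying on it.

```python
import json


def local_tag(upstream_tag, patch_level=1):
    """Derive the local tag for a patched fork, e.g. v1.0.1 -> v1.0.1+wmf1."""
    if patch_level < 1:
        raise ValueError("patch_level must be at least 1")
    return f"{upstream_tag}+wmf{patch_level}"


def use_fork(manifest, package, fork_url, version):
    """Return a copy of `manifest` that pulls `package` from a vcs fork."""
    updated = dict(manifest)
    repos = list(manifest.get("repositories", []))
    repos.append({"type": "vcs", "url": fork_url})
    updated["repositories"] = repos
    updated["require"] = {**manifest.get("require", {}), package: version}
    return updated


# Hypothetical fork location; the real repository name is not given in the RFC.
patched = use_fork(
    {"require": {"monolog/monolog": "1.9.1"}},
    "monolog/monolog",
    "https://example.org/forks/monolog.git",
    local_tag("v1.9.1"),
)
print(json.dumps(patched, indent=4))
```

Once the upstream accepts the patch and tags a release, the "repositories" entry can be dropped and the requirement pointed back at the upstream version.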
Adding a locally developed library
The Platform Core team has been talking about extracting libraries from mw-core and/or extensions to be published externally. This may be done for any and all of the current $IP/includes/libs classes and possibly other content from core such as FormatJson.
For this use case, we would create a new gerrit repository for each exported project. The project repo would contain a composer.json manifest describing the project so that it can be published at packagist.org like most Composer installable libraries. In the mediawiki/vendor.git composer.json file we would pull in these libraries just like any third-party developed library. This isn't functionally much different from the way that we use git submodules today. There is one extra level of indirection when a library is changed: the mediawiki/vendor.git repo will have to be updated with the new library version before the hash for the git submodule of mediawiki/vendor.git is updated in a deploy or release branch.
wmf/1.XwmfY branches
The make-wmf-branch script (found in mediawiki/tools/release.git) is used to create the weekly release branches that are deployed by the "train" each Thursday. This script has been updated to branch the new mediawiki/vendor.git repository and add the version-appropriate branch as a submodule of mediawiki/core.git on the wmf/* branch. This is functionally exactly what we do for extensions today.
Updating a deployment branch
SWAT deploys often deploy bug fixes for extensions and core that can't wait for the next train release. It is a near certainty that mediawiki/vendor.git will have the same need. The process for updating mediawiki/vendor.git will be almost the same as updating an extension.
- Follow the adding/updating library or hotfix instructions to get the changes merged into the mediawiki/vendor.git master branch
- Cherry-pick the change into the proper deployment branch
- Merge the cherry-pick
- Update the git submodule for mediawiki/vendor.git in the appropriate deployed branch
- Pull update to deployment server
- sync-dir to deploy to cluster
Security fixes
This is a special case of upstreaming a patch. A security patch would be applied directly on the deployed branch of mediawiki/vendor.git as we would do for any extension. The vulnerability and patch must then be submitted upstream in a responsible manner and tracked for resolution.
CSteipp has asked that we invent some means to ensure that a person is responsible for tracking security vulnerabilities in each upstream library we import. More discussion is needed on this topic.
Jenkins
The Jenkins jobs that check out and run tests involving mediawiki/core.git would need to be amended to also check out mediawiki/vendor.git in the appropriate location before running tests.
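One way a job could guard against a missing checkout is to verify that the autoloader is in place before the tests start. The function name and the idea of failing fast are assumptions for illustration; the RFC itself only requires that the repository be checked out at the right location.

```python
import tempfile
from pathlib import Path


def vendor_ready(install_path):
    """Return True if <install_path>/vendor/autoload.php exists as a file.

    `install_path` plays the role of MediaWiki's $IP.
    """
    return (Path(install_path) / "vendor" / "autoload.php").is_file()


# Self-check against a throwaway directory standing in for $IP.
with tempfile.TemporaryDirectory() as ip:
    before = vendor_ready(ip)  # no vendor directory yet
    (Path(ip) / "vendor").mkdir()
    (Path(ip) / "vendor" / "autoload.php").write_text("<?php\n")
    after = vendor_ready(ip)  # autoloader now present

print(before, after)
```

A job wrapper could call vendor_ready on the workspace and abort with a clear message when it returns False, rather than letting the test run fail later on a missing class.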
See also
- RFC Third-party components
- Requests for comment/Extension management with Composer
- Done Gerrit change 136472 Add README and skeletal composer.json
- Done Gerrit change 136620 Add mediawiki/core/vendor as submodule
- Done Gerrit change 136473 Add psr/log 1.0.0
- Done Gerrit change 136474 Add monolog/monolog 1.9.1
- Not done Gerrit change 136475 Add swiftmailer/swiftmailer 5.2.0
- Done OpenStack change 70373 Cloner to easily clone dependent repositories
- Done bug 68485 Rename mediawiki/core/vendor to mediawiki/vendor