Wikimedia Release Engineering Team/Quarterly review, April 2014
Date: April 30th | Time: 18:00 UTC | Slides: pdf | Notes: on-wiki
Who:
- Leads: Greg G, Chris M
- Team: Greg G, Chris M, Antoine, Sam, Bryan, Chris S, Chad, Zeljko, Andre, Rummana
- Other review participants (invited): Robla, Sumana, Quim, Maryana, James F, Terry, Tomasz, Alolita, Erik
Topics: Deploy process/pipeline, release process, bug fixing, code review, code management, security deploy/release, automation prioritization
Big picture
Release Engineering and QA are where our efforts in Platform can be amplified. When we do things well, we start to see more responsive development with higher quality code. That is our focus. What we want to accomplish:
- More appreciation of, response to, and creation of tests in development
- Better monitoring and reporting out of our development and deployment processes, especially test environments and pre-deployment
- Reduce time between code being merged and being deployed
- Provide information about software quality in a way that informs release decisions
- Help WMF Engineering learn and adapt from experience
...All in an effort to pave the path to a more reliable continuous deployment environment.
Team roles
Many people outside of the virtual team play an important role in releases, but this review will focus on the work of the following people in the following roles:
- Release engineering: Greg G, Sam, Chris S (security), Bryan Davis
- QA and Test Automation: Chris M, Zeljko, Rummana
- Bug escalation: Andre, Greg G., Chris M, Chris S (security)
- Beta cluster development/maintenance: Antoine, Sam, Bryan Davis
- Development tools (e.g. Gerrit, Jenkins): Antoine, Zeljko
Last Quarter Review
Goals
vis a vis the WMF Engineering 2013-14 goals.
Deployment Tooling
- Status: in-progress - Process through all (useful) pain points from the Dev/Deploy review session (Greg)
- some done, not all
- Scap incremental improvements
- step 1:
- mostly Status: Done - Refactor existing scap scripts to enhance maintainability and reveal hidden complexity of current solution (Bryan)
- "Easy" parts are done. Remaining work was blocked on getting scap running in beta so that changes could be tested somewhere larger than a Vagrant VM and less potentially catastrophic than production.
- step 2:
- Status: Done - create matrix of tool requirements per software stack (MW, Parsoid, ElasticSearch) (Greg)
- Status: in-progress - Use above matrix to add/fix functionality in scap (or related) tooling for ONE software stack, prioritized by cross stack use (Bryan)
Beta cluster
Goal: continue to have beta labs emulate production more closely (Antoine, all)
- Status: Done - Make database in beta emulate production (set up db slaves) (Antoine)
- partly Status: Done - Use beta labs as a testing ground for the above Deployment Tooling work (Greg, Bryan, all)
- Infra work in place, so far working out.
- Status: Done - Migrate Beta cluster from pmtpa to eqiad
- Not from last QR but was a big priority
- Much (most?) of the beta cluster configuration was puppetized during the migration. This is a great improvement over the prior cluster in pmtpa, which included many hand-built instances.
- Beta now includes a local puppet master which allows cherry-picking work-in-progress puppet changes and applying them across the cluster. This unblocks Antoine and others from getting +2 approval in operations/puppet.git for each desired change. It also provides a testing platform for changes prior to usage in production.
- Beta now includes a salt master which allows the use of Trebuchet and general experimentation with salt by non-roots.
Browser tests
Goal: use the API to create test data for given tests at run time. (Jeff, Chris, Željko) Status: Done; in heavy use in MobileFrontend tests, queued for VisualEditor and others
- target dev environments with bare wikis/one-off instances/vagrant/"hermetic" test environments
- in support of teams who requested this, for example Mobile and public MediaWiki release (Chris)
- in support of browser tests on WMF Jenkins (Jeff, Željko)
- requires thoughtful use of the API
- first pass: create articles with particular title and content. Create users with particular names and passwords.
- Status: Done, although vagrant languishes. One focus for the new hire is to bring vagrant back to current.
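As a rough illustration of the "first pass" above (creating an article with a particular title and content at run time), the following minimal sketch builds the POST parameters for a page-creation request against the MediaWiki action API. It is not the team's actual implementation: the function names are hypothetical, the token is assumed to have been fetched beforehand, and no network call is made. Creating users would follow the same pattern with the API's account-creation module.

```python
from urllib.parse import urlencode


def build_create_article_params(title, content, csrf_token):
    """Return POST parameters for creating a page via action=edit.

    Hypothetical helper for illustration; csrf_token is assumed to have
    been obtained from the wiki in an earlier request.
    """
    return {
        "action": "edit",
        "format": "json",
        "title": title,
        "text": content,
        "createonly": "1",  # fail rather than overwrite an existing page
        "token": csrf_token,
    }


def encode_body(params):
    """URL-encode the parameters as a form-encoded request body."""
    return urlencode(params)
```

A test would send the encoded body to the target wiki's api.php before the scenario starts, so that it does not depend on pre-existing content in the environment.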
Goal: create the ability to test headless (Željko, Jeff, Chris) Status: Done, but so much more to come now that we have the basic operation working
- targets build systems (Antoine, all)
Goal: run versions of tests compatible with target test environments (Chris, all) Status: Not done; tracking this at https://bugzilla.wikimedia.org/show_bug.cgi?id=62509 but have not implemented anything from it
- today we always run the master branch of browser tests. This is inconvenient, as target environments such as test2wiki lag beta labs by at least one week.
- create the ability in Jenkins builds to run the versions of tests appropriate to the versions of extensions in the target wiki.
- discussion has only begun, but this would be worthwhile.
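The idea of matching test versions to the target environment could be sketched roughly as below. This is only an illustration of the selection logic, not anything that was implemented: the function name, the mapping, and the branch name in the usage note are all hypothetical.

```python
def select_test_ref(target, deployed_refs, default="master"):
    """Pick the git ref of the browser tests to run against a target wiki.

    deployed_refs is a hypothetical mapping from target environment name
    to the branch or tag deployed there. Targets without an entry (for
    example beta labs, which tracks master) fall back to the default.
    """
    return deployed_refs.get(target, default)
```

For example, with a hypothetical mapping `{"test2wiki": "example-deploy-branch"}`, a build targeting test2wiki would check out `example-deploy-branch`, while a build targeting beta labs would keep running master.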
Ongoing:
- Continue to move shared code to shared repo; e.g. Login
- Status: in-progress; current status: https://www.mediawiki.org/wiki/Quality_Assurance/Browser_testing/Shared_features
- Continue to maintain tests and keep them green, e.g. connection issues
- Status: in-progress
- builds WMF-Jenkins -> beta labs in place
- builds WMF-Jenkins -> SauceLabs coming
Hiring
- Status: in-progress - Complete hiring and train new Release Engineer (formerly listed as Test Infrastructure Engineer) (Greg, all)
- Status: in-progress - Complete hiring and train new Automation Engineer (Ruby) (formerly listed as QA Automation Engineer) (Chris, all)
Dependencies
Ops dependency:
- Deployment Tooling (see above)
MW Core dependency:
- Deployment Tooling (see above)
- Vagrant
Last quarter actions
- Status: Done - Bryan (originally assigned to Greg) to send periodic updates about scap refactoring
- Status: Not done - Greg to convene conversation with labs folks post migration re labs-vagrant (including OpenStack API etc)
- Status: in-progress: Have a plan for Vagrant
- determine fit within test infra explicitly
- Status: in-progress: add MW release tarball as goal in next quarterly review
- Status: Not done: figure out whether a central developer is needed to generate metrics on unit tests, maintain the framework, etc.
Next Quarter
Goals
vis a vis the WMF Engineering 2013-14 goals.
- (continued from last quarter) Process through all (useful) pain points from the Dev/Deploy review session - (Greg)
- Integrate HHVM support into our deployment systems - (Bryan, Greg, yet-to-be-hired (ytbh) Release Engineer, others from Platform)
- start the scap(py) & trebuchet integration conversation
- dependent upon beta cluster work below
- Support HHVM deployment tooling and puppet configuration testing - (Bryan, Antoine, ytbh RelEngineer)
- Swift cluster in beta
- RFC support
MediaWiki Release
- Successfully support the release of MediaWiki 1.23 - (Antoine, Greg)
- Kickoff/complete second RFP
- Investigate and create useful release/deployment metrics visualizations - (Greg)
- e.g. # of builds per day, # of commits/day, # of deploys/day, etc.
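A per-day metric of the kind listed above (builds, commits, or deploys per day) could be derived from raw event timestamps along these lines. This is a minimal sketch under assumptions: the function name is hypothetical, and the events are assumed to be available as ISO 8601 timestamp strings, which may not match how the real build or deploy logs are stored.

```python
from collections import Counter
from datetime import datetime


def events_per_day(timestamps):
    """Count events per calendar day.

    timestamps: iterable of ISO 8601 strings such as "2014-04-30T18:00:00"
    (an assumed input format). Returns a dict keyed by "YYYY-MM-DD",
    in chronological order.
    """
    counts = Counter(
        datetime.fromisoformat(ts).date().isoformat() for ts in timestamps
    )
    return dict(sorted(counts.items()))
```

The resulting day-to-count mapping can be fed directly into whatever charting tool is chosen for the visualizations.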
Browser tests
- (From last quarter) Use tags to run builds appropriate to released versions (e.g. don't run master build on test2wiki)
- Retire Cloudbees Jenkins instance
- Integrate WMF Jenkins with new WMF SauceLabs account
- Execute tests in parallel
- Use API to create test data at runtime more widely (not just for MobileFrontend but also VisualEditor, Flow, local dev env etc.)
- Add browsertests to new repos e.g. GettingStarted
Hiring
- Complete hiring and train new Release Engineer (Greg, all)
- Complete hiring and train new Automation Engineer (Ruby) (Chris, all)
Questions
- Placeholder for questions during review
- Zuul vs Jenkins: What is the future?
- conversation didn't happen, and that's ok
- Phabricator - maybe a real possibility