Wikimedia Release Engineering Team/Checkin archive/20180625
2018-06-25
Vacations/Important dates
- June 26 (Tuesday): Željko vacation
- June 26 (Tuesday): Greg half day (afternoon)
- June 29th (Friday): Antoine morning
- July 2 (Monday) Željko vacation
- July 4: US Holiday
- July 16: Mukunda's bday.....funtimes
- August 15: WMF Monthly Holiday
- August 15 (Wednesday): Željko holiday (Assumption of Mary)
- August 23-24 (Thursday-Friday): Željko vacation
- August ~: Antoine
- September a week or so - Antoine
- Middle of August...a few days somewhere - thcipriani
Rotating positions
Train
- Maniphest query for deployment blocker tasks: https://phabricator.wikimedia.org/maniphest/?project=PHID-PROJ-fmcvjrkfvvzz3gxavs3a&statuses=open%28%29&group=none&order=newest#R
- June 11 - wmf.8 - Dan (with Tyler doing Thursday)
- June 18 - wmf.9 - Dan (no train, SRE summit)
- June 25 - wmf.10 - Dan <----
- July 02 - wmf.11 - Zeljko - no train, Fourth of July
- July 09 - wmf.12 - Zeljko
- July 16 - wmf.13 - Zeljko
- July 23 - wmf.14 - Antoine
- July 30 - wmf.15 - Antoine
SoS
- June 11 - Tyler
- June 18 - Tyler
- June 25 - Tyler <----
- July 02 - Dan
- July 09 - Dan
- July 16 - Dan
- July 23 - Zeljko
- July 30 - Zeljko
- August 06 - Antoine
- August 13 - Antoine
Team Business
Updates
- Jenkins plugin security release today, status?
- releases-jenkins: up-to-date
- ci-jenkins: need restart window
- Train/SWAT changes
- Greg emailed mark/faidon on Tuesday
- email: https://etherpad.wikimedia.org/p/eu-train-swat
- Train: Be sure to ping JR when the train experiences anything that would be post-mortem worthy :)
- Skill matrix ready!
- https://www.mediawiki.org/wiki/Wikimedia_Release_Engineering_Team/Skill_matrix
- [X] Antoine - We should revisit it, ElasticSearch doesn't ring any bell to me
- thcipriani: +1
Q1 Goals time!
Remember what we said back in January:
- https://office.wikimedia.org/wiki/Release_Engineering/FY1819-Planning/Continuous_Deployment_pipeline
- [JR] The Code Health Group will share a prioritized list of metrics for use in risk assessments of deployments
- [Tyler, Dan, Antoine] Remove minikube from the pipeline and move verify stage to CI "staging"
- Move build stage of production image
- Promote production-context image through pipeline
- [JR, Greg, Antoine, Mukunda] Investigate tooling for better incident response management, make a proposal
- NB: there's going to be discussion with SRE after regarding adoption, most likely
- https://office.wikimedia.org/wiki/Release_Engineering/FY1819-Planning/Code_Health
- [JR] Create a simple Effective Code Stewardship guide for Code Stewards
- [JR, Greg, Zeljko] Review incidents from the past year to determine how many had a testable regression as the cause.
- [Zeljko] From review identify the top 15 target projects.
- [JR, CHG] Define Code Health Metrics (for use in Pipeline Program and anywhere else appropriate)
Drafting onwiki at:
Staging (ohai)
SRE talked about it at their offsite, read up on the changes at https://docs.google.com/document/d/1CT_pKjwiDmFhZZ9LW9mz0z434-wgr3NFdapUPWUvMNA/edit?ts=5b040955#heading=h.j5ulvrixnnxf
Scrum of Scrums
- Greg to copy to etherpad after meeting: https://etherpad.wikimedia.org/p/Scrum-of-Scrums
This week
Release Engineering
- Blocking
- Blocked
- Updates
- FYI: Release Engineering will start including Europeans in our train rotation, meaning that the MW Train will now include European-appropriate windows for those weeks. Exact schedule TBA to wikitech-l@, ops@, engineering@, and wikitech-ambassadors@
- Quarterly cross-dependencies
Last week
Release Engineering
- Blocking
- Working on https://phabricator.wikimedia.org/T190710 for Readers
- Blocked
- Updates
- wmf.999 is running on group0 wikis for testing MCR related changes, see https://phabricator.wikimedia.org/T196585
- Quarterly cross-dependencies
Train status and happenings
Past week status updates
- All of it in table form: https://www.mediawiki.org/wiki/Wikimedia_Release_Engineering_Team/Goals/201718Q4
Quarterly Goals for Q4
Program 1: Outcome 5: Objective 1: Maintain existing shared Continuous Integration infrastructure
- Migrate away from Nodepool - task T190097
- Migrate MediaWiki PHPUnit tests to Shipyard (docker-based CI) (~40% of Nodepool usage) - task T183512
- Add Composer support to Blubber - task T186547
- Add Python/Tox support to Blubber - task T186545
- Add Ruby/Gem/Bundler/Rake support to Blubber - task T188950
Program 3: Outcome 1: Objective 2: Identify and find stewards for high-priority/high use code segment orphans
- Broad rollout of Code Stewardship model.
- Update Maintainers/Developers page with currently known Code Stewards.
- Use Code Stewardship review process to address gaps.
- Deploy dashboard of Code Stewardship Coverage
Completed Code Stewardship Coverage dashboard. Followed up with RelatedSites sunsetting activities.
Goals Complete
Program 3: Outcome 2: Objective 2: Define and implement a process to regularly address technical debt across the Foundation
- Roll out of technical debt reduction approaches.
- Identify early adopter engineering teams to test approaches.
Goals Complete
Program 6: Outcome 2: Objective 2: Prove viability of testing staged service containers alongside MediaWiki extension containers
- Add Composer support to Blubber - task T186547
- Small, standalone, MediaWiki containers built using Blubber
- Limited scope with: Debian Stretch, php7, composer dependencies, MariaDB, Apache (or last two with standalone containers)
- Build MW base containers upon branch cut at master branch point from core
- Build ext specific containers using MW image as base
- In progress:
- Base container image for mediawiki
- Groovy library for pipeline
- Got a CI namespace on the k8s cluster thanks to _Joe_!!!
Quarterly non-goal "Work"
Program 1: Outcome 1: Objective 1: Scap (Tech Debt Sprint FY201718-Q2)
Program 1: Outcome 5: Objective 1: Maintain existing shared Continuous Integration infrastructure
Program 1: Outcome 6: Milestone 1: Maintain Gerrit
Program 1: Outcome 6: Milestone 2: Maintain Phabricator
- Streamline logspam workflows by adding some integration with phabricator
- Store git-lfs (and other phab uploads) in swift: task T182085
- This got more review from Filippo and should be nearly ready to merge.
- Spent much of last week responding to phabricator abuse
Other work
Beta Cluster Survey closed. Data to be sliced and diced this week.
Standup!
Antoine
- What I plan to do this week
- What I'm blocked on
- Other?
Dan
- What I plan to do this week
- Breaking out service pipeline groovy code into libraries (integration/pipelinelib) https://phabricator.wikimedia.org/T196940
- Train
- What I'm blocked on
- Other?
Greg
- What I plan to do this week
- My annual review w/ Victoria is today
- Make sure all the contractors/conversions are handled correctly this week
- Make Spark project slides :)
- ping mark/faidon, announce train changes
- Q1 goals posting
- going to try to remove non-Q1 or high priority things from kanban board
- What I'm blocked on
- T&C/Recruiting
- Other?
Jean-Rene
- What I plan to do this week
- Beta Cluster Survey data analysis
- Q1 goal planning
- Continue work on Tech Debt review of Search Platform and Platform teams
- Set up PM for Phab incident
- What I'm blocked on
- Other?
Mukunda
- What I plan to do this week
- Phabricator abuse response https://phabricator.wikimedia.org/T162026
- Phabricator + Swift https://phabricator.wikimedia.org/T182085
- Merge and deploy upstream changes
- Tasks can now change "type" after creation which unblocks https://phabricator.wikimedia.org/T93499
- yay!
- Create incident report for phabricator vandalism incident.
- What I'm blocked on
- Other?
Tyler
- What I plan to do this week
- Q1 planning
- schedule small gerrit downtime with Mukunda for gerrit duplicate account fix (upstream seems to have blessed the plan)
- Finish fixing scap clean
- What I'm blocked on
- Math containers are hard (restbase ugh)
- Other?
Zeljko
- What I plan to do this week
- T190994 Q4 Selenium framework improvements
- T179190 Run Selenium Cucumber tests in CI
- T190710 Minerva Ruby and Node.js browser tests running side by side
- T194252 Configure the CI job that runs WikibaseLexeme's browser tests against beta wikidata
- What I'm blocked on
- Need help from Antoine to figure out Docker jobs, scheduled for tomorrow
- Other?
Grooming
Team Kanban Board Review and Triage
- Closed and touched in the last 7 days
- No update for 4 weeks
- No update for 3 weeks
- No update for 2 weeks
- No update for 1 week
- All Open
- Review To Triage column of #releng
Once / month-ish review of backlog(s)
- releng Review To Triage column of #releng
- releng-kanban Review unassigned in kanban
- releng-kanban Review 'backlog' column of -kanban
- releng-next - Review for things we need to put on our kanban backlog
- releng-backlog - oh my, the huge backlog of things...