Wikimedia Release Engineering Team/Checkin archive/20180827
2018-08-27
Vacations/Important dates
- August 29-31: Dan vacation
- August 31: Greg driving to Kentucky
- September 3 (Monday): US Holiday (Labor Day)
- September 7 (Friday): Željko at a conference
- Mid-September to mid-October: Antoine to take off some weeks/days/part time
Rotating positions
Train
- Maniphest query for deployment blocker tasks: https://phabricator.wikimedia.org/maniphest/?project=PHID-PROJ-fmcvjrkfvvzz3gxavs3a&statuses=open%28%29&group=none&order=newest#R
- July 02 - wmf.11 - Zeljko - no train, Fourth of July
- July 09 - wmf.12 - Zeljko
- July 16 - wmf.13 - Zeljko
- July 23 - wmf.14 - Zeljko
- July 30 - wmf.15 - Mukunda
- Aug 06 - wmf.16 - Mukunda
- Aug 13 - wmf.17 - Mukunda (No train - Wednesday is a holiday)
- Aug 20 - wmf.18 - Tyler
- Aug 27 - wmf.19 - Dan && Antoine lurking over the shoulders
- Sep 03 - wmf.20 - Antoine
- Sep 10 - wmf.21 - Antoine -- No train due to DC switchover
- Sep 17 - wmf.22 - Antoine
- Sep 24 - wmf.23 - Zeljko
- Oct 01 - wmf.24 - Dan
- Oct 08 - wmf.25 - Dan -- No train due to DC switchover
- Oct 15 - wmf.26 - Mukunda (last 1.32 wmf.XX release, 1.33 starts the next week)
- Oct 22 - wmf.1 - Mukunda
SoS
- July 04 - Dan
- July 11 - Antoine
- July 18 - Antoine
- July 25 - Tyler
- Aug 01 - Tyler
- Aug 08 - Zeljko
- Aug 15 - Dan (No SoS this week)
- Aug 22 - Zeljko
- Aug 29 - Zeljko
- Sep 05 - Tyler
- Sep 12 - Tyler
- Sep 19 - Dan
- Sep 26 - Dan
- Oct 03 - Zeljko
- Oct 10 - Zeljko
- Oct 17 - Antoine
- Oct 24 - Antoine
- Oct 31 - Mukunda
Team Business
Hiring
First Offsite
- Waiting to hear back confirmation from Travel, but I was told that no more offsites can be scheduled next to TechConf in Portland in October, so the week of Nov 5th it is. Monday - Thursday.
Needs attention
- Create a production test wiki in group0 to parallel Wikimedia Commons - https://phabricator.wikimedia.org/T197616
- Status: Mark H and Amanda reached out to me, I asked for a meeting with Mark H.
- conclusion: use beta cluster
- Re-evaluate use of "Dependent Pipeline" in Zuul for gate-and-submit - https://phabricator.wikimedia.org/T94322
- 2018-08-24 TODO: antoine and dan had followups IIRC
Scrum of Scrums
- Greg to copy to etherpad after meeting: https://etherpad.wikimedia.org/p/Scrum-of-Scrums
This week
Release Engineering
- Blocked by:
- Noise from https://phabricator.wikimedia.org/T201082 during Train deployment (not really blocked but distracted)
- Blocking:
- Updates:
- Train: no major problems, 1.32.0-wmf.19 at group 0 https://phabricator.wikimedia.org/T191065 https://tools.wmflabs.org/versions/
- Log spam: Unknown modifier 'R': [/^page\-User\:BeneBot.+/RfD\-open/text$/] in /srv/mediawiki/php-1.32.0-wmf.16/extensions/Translate/stringmangler/StringMatcher.php https://phabricator.wikimedia.org/T202058
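The "Unknown modifier 'R'" message in the log spam item above is consistent with an unescaped `/` inside a `/`-delimited pattern: PHP's preg functions treat the first unescaped delimiter as the end of the pattern and read whatever follows as modifiers. The following is a minimal Python sketch of that splitting rule, not the Translate extension's code; `split_php_regex` and `escape_delimiter` are hypothetical helper names introduced only for illustration, and the actual fix is tracked in T202058.

```python
from typing import Tuple


def split_php_regex(delimited: str) -> Tuple[str, str]:
    """Split a delimited PCRE string into (pattern, modifiers).

    Assumption: a non-bracket delimiter such as '/'. The first character
    is the delimiter, the pattern runs up to the next unescaped delimiter,
    and everything after that is taken as modifiers.
    """
    delimiter = delimited[0]
    i = 1
    while i < len(delimited):
        ch = delimited[i]
        if ch == "\\":
            i += 2  # a backslash escapes the next character, skip both
            continue
        if ch == delimiter:
            return delimited[1:i], delimited[i + 1:]
        i += 1
    raise ValueError("no ending delimiter found")


def escape_delimiter(body: str, delimiter: str = "/") -> str:
    """Naively escape every delimiter occurrence in a pattern body.

    Illustration only: it does not check for delimiters that are
    already escaped.
    """
    return body.replace(delimiter, "\\" + delimiter)


# The pattern from the log line: the '/' before 'RfD' ends the pattern early,
# so 'R' is the first character read as a modifier.
logged = r"/^page\-User\:BeneBot.+/RfD\-open/text$/"
pattern, modifiers = split_php_regex(logged)
print(pattern)
print(modifiers[0])

# With the inner slashes escaped, the whole body stays in the pattern.
body = escape_delimiter(r"^page\-User\:BeneBot.+/RfD\-open/text$")
print(split_php_regex("/" + body + "/"))
```

In PHP itself, the usual way to escape a delimiter in user-supplied text is `preg_quote()` with the delimiter passed as its second argument; whether that is the right fix here depends on how the pattern is built.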
Last week
Release Engineering
- Blocked by:
- Feedback needed (on how problems could have been prevented) from many people/teams on a recent MediaWiki train related incident report.
- 1.32.0-wmf.13, 9 blockers, feedback needed for 8 of them: https://wikitech.wikimedia.org/wiki/Incident_documentation/20180717-Train
- Aaron Schulz (Performance), Adam Wight (Scoring Platform), Bartosz Dziewoński (Contributors), Brad Jorsch (MediaWiki Platform), C. Scott Ananian (Contributors), Daniel Kinzler (Wikimedia Deutschland), Timo Tijhof (Performance), Prateek Saxena (Audiences Design)
- Blocking:
- MediaWiki 1.29 final release and EOL; was due in June: https://phabricator.wikimedia.org/T197669 (w/ Security)
- Updates:
- New general purpose CI job that builds and runs test containers via Blubber/Docker based on config provided in each project (think `.travis.yml` file)
- Read more about Blubber here: https://wikitech.wikimedia.org/wiki/Blubber
- See recent builds at https://integration.wikimedia.org/ci/blue/organizations/jenkins/blubber-test/activity
- Gives developers one major benefit of the CD pipeline work now, having control over their pre-merge and gating tests without having to mess with integration/config
- Only scheduled to run for a few repos at the moment, but will eventually be expanded to many more projects (we need to tune CI infra around it first)
- Looking for more participants to join the Code Health Metrics working group. This group's purpose is to define and later implement a set of core metrics that we will use to assess the health of our code base. More info: https://www.mediawiki.org/wiki/Code_Health_Group/projects/Code_Health_Metrics
Train status and happenings
- Removed:
- 1.32.0-wmf.13
- 1.32.0-wmf.12
- 1.32.0-wmf.10
- two rollbacks:
- Filed task for tools.wikimedia.org/versions caching: https://phabricator.wikimedia.org/T202734
- Sent log health email:
Past week status updates
- All of it in table form: https://www.mediawiki.org/wiki/Wikimedia_Release_Engineering_Team/Goals/201718Q4
Quarterly Goals for Q1
Pipeline: Move verify stage from Minikube to CI k8s namespace in production context
- Done
- Moving this to integration/pipelinelib this week(?)
Code Health
- T199253 - Investigate and propose record of origin (ROO) for deployed code (currently Developers/Maintainers page)
- Started straw man on this.
- Perform existing Stewardship review process for Q1 cycle.
- T199254 - Add test evaluation to post mortem review process.
- Review existing e2e test coverage.
- Define prioritization scheme.
- Prioritize e2e testing gaps.
- T199257 - make current unit testing coverage more visible by reporting out to Engineering Management.
- On track to publish report to EMs this week. Mostly manual at this point.
- T199259 - Platform and Search Platform teams are using TDM PoC
- T199262 - Identify key Tech Debt areas
- T199263 - Put in place Tech Debt management process for PEP
- T199261 - Define base Code Health metric set.
- WG is formed and kickoff meeting will be either this week or next (waiting on availability response from Kunal)
https://www.mediawiki.org/wiki/Code_Health_Group/projects/Code_Health_Metrics
Developer Productivity
- Make a hire to create the capacity needed for this program.
- Write and share a survey to measure developer satisfaction and areas for investment. - task T197635
Other work
Selenium
- Q1 goals task: T198389 Q1 Selenium framework improvements
- T193157 Quibble does not install ffmpeg
Gerrit
Phabricator
- Deployed new antivandalism code now with moar zero false positives.
Jenkins
QA
- Draft QA strategy shared with Greg, Ryan, Corey, Ryan, and Erika.
Standup!
Antoine
- What I plan to do this week
- Selenium daily jobs using wdio, in progress with Zeljko
- Review / merge / deploy pending Quibble patches from WMDE & others
- Reply to https://phabricator.wikimedia.org/T94322
- Level up on train deployment with Dan
- What I'm blocked on
- 3 extensions left to migrate to Quibble. Need to raise attention to them:
- ArticlePlaceholder (QUnit is screwed up, I know nothing about JavaScript) - https://phabricator.wikimedia.org/T180171
- ReadingLists , related to extension registry not loading config - https://phabricator.wikimedia.org/T196567
- TrustedXFF, no reviewers - https://phabricator.wikimedia.org/T198120
- Legacy ruby / mediawiki-selenium jobs
- Other?
Dan
- What I plan to do this week
- Train w/ Antoine
- Statsd publisher that sends job/node metrics to statsd.eqiad.wmnet
- KUBECONFIG support in integration/pipelinelib
- What I'm blocked on
- Nothing
- Other?
- Addressing Tyler's feedback on blubberoid
Greg
- What I plan to do this week
- Get caught up
- What I'm blocked on
- Inbox: 97
- Other?
Jean-Rene
- What I plan to do this week
- Post mortems
- T199253 - Investigate and propose record of origin (ROO) for deployed code (currently Developers/Maintainers page)
- Publish first monthly Code Coverage report to EMs
- Stewardship review
- What I'm blocked on
- Other?
Mukunda
- What I plan to do this week
- Finish prioritizing tasks for the new organized mukunda 2.0beta
- Finish giving feedback on tech dept phabricator pipe-dream/wishlist document
- Work on elastic 6 support in Phabricator search
- Review paladox's gerrit patches
- What I'm blocked on
- Other?
Tyler
- What I plan to do this week
- Incident report for 2018-08-20 train week
- Ensure scap sync-wikiversions canary work works in beta
- deploy update to scap canary
- releng.team https
- Review paladox work on gerrit avatars
- Deploy notify for CI nodes where disk is > 95% full
- Update https://www.mediawiki.org/w/index.php?title=Review_queue#Deploy_to_Beta_Cluster
- Figure out how to sign/what to sign cla@wikimedia.org
- What I'm blocked on
- Other?
Zeljko
- What I plan to do this week
- T179188 Video recording for Selenium tests in Node.js
- T188742 Run tests daily targeting beta cluster for all repositories with Selenium tests
- T202787 Document how to implement browser acceptance tests involving OOUI
- Post mortems for wmf.13 and wmf.14
- A volunteer asked to talk with somebody about CI
- What I'm blocked on
- Other?
- Timo added me to wdio-mediawiki NPM package collaborators
- My garage is so clean that I go there and just enjoy the sight from time to time
Grooming
Team Kanban Board Review and Triage
- Closed and touched in the last 7 days
- No update for 4 weeks
- No update for 3 weeks
- No update for 2 weeks
- No update for 1 week
- All Open
- Review To Triage column of #releng
Once / month-ish review of backlog(s)
- releng Review To Triage column of #releng
- releng-kanban Review unassigned in kanban
- releng-kanban Review 'backlog' column of -kanban
- releng-next - Review for things we need to put on our kanban backlog
- releng-backlog - oh my, the huge backlog of things...