Wikimedia Release Engineering Team/Checkin archive/20190617
2019-06-17
Vacations/Important dates
- June 10 – July 21 - Dan leave (6 weeks, then additional leave later)
- June 19 (Juneteenth) - US Staff
- June 20 - Željko, Corpus Christi
- June 21 - Željko, vacation
- June 24 - Željko, vacation
- June 25 - Željko, Statehood Day
- July 2 - Greg's birthday, unsure if taking off, already have one meeting
- July 4 (US Independence Day) - US Staff
- July 10 - Lars off (swapping with weekend)
- July 22 - August 9 - Željko vacation
- August 7–19 - James off (inc. Wikimania)
- August 12 - September 8 - Dan leave
- August 12 (Glorious Twelfth) - US Staff
- August ??? - ??? - Antoine
- August 14–18 - Wikimania
- Attending: James, Lars, Jean-Rene
- August 15 - Željko, Assumption of Mary
- August 25 - September 4 - Brennen vacation
- September 2 (Labor Day) - US Staff
- October 14 (Indigenous Peoples' Day) - US Staff
- November 11 (Veterans' Day) - US Staff
- November 28–29 (Thanksgiving) - US Staff
- December 6 - Lars, Finnish Independence Day
- December 25–31 (Christmas) - US Staff
- December 25–26 - Lars, Christmas
- 2020 January 1 (New Year's Day) - US Staff, Lars
Rotating positions
Train
- Maniphest query for deployment blocker tasks: https://phabricator.wikimedia.org/maniphest/query/s3KW8bpsXhYF/#R
- June 10 - wmf.9 - No Train (SRE Summit)
- June 17 - wmf.10 - Mukunda (but Juneteenth on the Wednesday? Yes. Do group0 and group1 an hour apart on Tuesday)
- June 24 - wmf.11 - Jeena (with Mukunda)
- July 1 - wmf.12 - No train (Fourth of July)
- July 8 - wmf.13 - Jeena
- July 15 - wmf.14 - Lars (with Antoine)
- July 22 - wmf.15 - Lars
- July 29 - wmf.16 - Brennen (with Tyler)
- Aug 5 - wmf.17 - Brennen
- Aug 12 - wmf.18 - No Train (Wikimania)
- Aug 19 - wmf.19 - Zeljko 😱
- Aug 26 - wmf.20 - Zeljko 😭
SoS
- Zeljko 4eva! :)
Team Business
Timespent spreadsheet
- For the avoidance of doubt: fill out the sheet week number for the previous week
- link to week starting June 10: https://docs.google.com/spreadsheets/d/1urCLNQXeEi1DOR8Iu0qW0yPt-glxX1laqlMovbGyCW0/edit#gid=256442765
Book club
- https://www.mediawiki.org/wiki/Wikimedia_Release_Engineering_Team/Book_club
- Notes: https://www.mediawiki.org/wiki/Wikimedia_Release_Engineering_Team/Book_club/Continuous_Delivery
- Next: June 28th, chapters 12+13 (9am Pacific)
AUP exceptions
- In response to the Acceptable Use Policy, please file any personal computer usage here: https://docs.google.com/forms/d/e/1FAIpQLSeKcnBqlPuXppIjXw9RxCLtq7IfydXA181A3R0V4UFjY9wfgQ/viewform
Fall Offsite + TechConf19
- Decided: 1 long trip, offsite after TechConf
- dates? 2019-11-1{6,7}--2019-11-21..ish
TechConf19
- Dates Tuesday 2019-11-12 – Friday 2019-11-15
- update....
- Vision: https://www.mediawiki.org/wiki/Wikimedia_Technical_Conference/2019
- nomination form: https://lists.wikimedia.org/pipermail/wikitech-l/2019-May/092131.html
Annual Planning
- https://docs.google.com/spreadsheets/d/1TrkGTfPLR0C74va3XyY6faYplSh6UggGiPdmxIVm1uo/edit#gid=0
- All of our Outcomes/Key Deliverables and projects for next year
- We need to determine Q1 goals this week/next week.
Monthly reflection on accomplishments - May '19 edition
- https://www.mediawiki.org/wiki/Wikimedia_Release_Engineering_Team/Monthly_notable_accomplishments
- Add as you have them!
- Phabricator vandalism rollback tool completed 🎉 (blog post? 😉)
- Upgrade Zuul to 2.5.1-wmf6 (which unblocks the Gerrit upgrade to 2.16) - https://phabricator.wikimedia.org/T208426
- Team offsite in Chicago
- Repository-hosted CI/CD pipeline configurations now supported (.pipeline/config.yaml) - https://phabricator.wikimedia.org/T210267
- Train notes published on branch cut
- Codehealth pipeline beta - https://phabricator.wikimedia.org/phame/live/1/post/160/introducing_the_codehealth_pipeline_beta/
- Some baseline local development images published
- Speculative CI meta-architecture published within WMF for feedback
- Old image versions automatically removed from jenkins agents when /var/lib/docker space > 80%
- scap 3.10.0 cut
- Jenkins build timings reports: https://people.wikimedia.org/~dduvall/jenkins/
- Helped Kask team sketch an outline of its architecture (https://www.mediawiki.org/wiki/Kask)
- Fatal Monitor with marker lines for deployments: https://logstash.wikimedia.org/app/kibana#/dashboard/77cc3e90-aa27-11e7-9109-51bd3197f7a9?_g=()
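The image-cleanup item above triggers when /var/lib/docker usage goes above 80%. A minimal sketch of that kind of threshold check follows. The notes do not describe the actual implementation, so the function names here are hypothetical and the removal of old images is left out entirely.

```python
import shutil

# Threshold taken from the notes: act when usage exceeds 80%.
THRESHOLD = 0.80


def exceeds_threshold(total: int, used: int, threshold: float = THRESHOLD) -> bool:
    """Return True when used/total is strictly above the threshold."""
    if total <= 0:
        return False
    return used / total > threshold


def needs_cleanup(path: str = "/var/lib/docker", threshold: float = THRESHOLD) -> bool:
    """Check the filesystem holding `path` (hypothetical helper, not the team's code)."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return exceeds_threshold(usage.total, usage.used, threshold)
```

A periodic job on each agent could call `needs_cleanup()` and only then decide which old image versions to remove; that selection logic is outside what the notes describe.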
Incoming/Needs attention
- REL1_33 branching for extensions: https://phabricator.wikimedia.org/T220653
- Reedy said he'll move forward with rc0 announcement soon.
- Mukunda tried to run the script but it ran into trouble. Will re-try, manually.
- Switching on HTTP Auth again still seems blocked. Barricade should help with this; review when Tyler gets back.
- Update 2019-06-03: Fighting fires last week; should be able to do this this week.
- 2019-06-10: Done with a quick hack by Reedy; do we need to fix the script for next time?
- http auth patches merged in upstream, next week is the earliest it'll be released
- 2019-06-17: Gerrit 2.15.14 is out, need to build and release, hopefully this week
- Documentation!
- Zuul and force merge: https://www.mediawiki.org/wiki/Topic:V14dlv7nt5ne7gsd
- Antoine to file task and reply
Scrum of Scrums
Incoming from last week
https://www.mediawiki.org/wiki/Scrum_of_scrums/2019-06-12#Release_Engineering
Outgoing this week (wrong section heading level is on purpose for copy/pasting into Scrum of Scrums etherpad)
Release Engineering
- Blocked by:
- Core Platform Team (low priority): https://phabricator.wikimedia.org/T205361 is blocking undeployment of CodeReview.
- SRE:
- Traffic Team (low priority): https://phabricator.wikimedia.org/T213769 is blocking undeployment of Wikipedia Zero.
- ServiceOps Team: Scap 3.10.0: https://phabricator.wikimedia.org/T224915
- Wikidata: We need to update wikiba.se hosting to PHP7 so we can drop php56 from CI. https://phabricator.wikimedia.org/T224905
- Blocking:
- Updates:
- Train Health
- Last week: 1.34.0-wmf.9 - NO TRAIN OR ANY OTHER DEPLOYS due to SRE Off-site
- This week: 1.34.0-wmf.10 - https://phabricator.wikimedia.org/T220735
- Next week: 1.34.0-wmf.11 - https://phabricator.wikimedia.org/T220736
- Code Health
- Log Health
- Train Health
Callouts
- Release Engineering
Train status and happenings
- New filtered fatal monitor dashboard including markers for scap deployments: https://logstash.wikimedia.org/app/kibana#/dashboard/77cc3e90-aa27-11e7-9109-51bd3197f7a9?_g=()
- Need to fix scap clean :\
- thcipriani has a crappy fix in mind until http tokens in gerrit are back
- Any idea when HTTP tokens will come back? Weeks? Months? Never? :-(
- ~Weeks
- 2019-05-06: cleaned up stuff last week on deploy hosts, just not the gerrit branches
- 2019-05-13: …
- 2019-06-03: upstream issues/patches we want resolved before doing this
- cf: https://phabricator.wikimedia.org/T218750#5128424
- looks like these patches merged -- I'll check what release they're going out with
- 2019-06-10: upstream cutting new version with security fixes (hopefully) end of week, ETA early next week
- 2019-06-17: gerrit 2.15.14 is out, need to build and release, hopefully this week
Quarterly Goals for Q4
https://www.mediawiki.org/wiki/Wikimedia_Technology/Goals/2018-19_Q4
TEC1 (Maint): Outcome 1 / Output 1.1
- GOAL: Undeploy the CodeReview extension.
- WHO: James, need help from CPT
- James will ping CPT about this this week (April 8th)
- … and again w/c 15 April.
- … and again w/c 6 May (in SoS).
- … and again w/c 27 May (in SoS).
- [Recurring item]
TEC1 (Maint): Outcome 1 / Output 1.1
- GOAL: Setup 1-3 of the CI WG options (Zuul v3, Argo, GitLab)
- WHO: Lars
- Gitlab:
- https://wmf-gitlab3.vm.liw.fi/ is up and accepts registrations with wikimedia.org (and liw.fi) email addresses
- Please play with it and tell Lars anything that seems iffy
TEC3 (Pipeline): Outcome 1 / Output 1.2
- GOAL: Instrument Quibble for data collection
- WHO: Mukunda, Antoine
- Blocked
TEC3 (Pipeline): Outcome 1 / Output 1.2
- GOAL: Create a graph of where time is spent and make a prioritized list for improvements.
- WHO: Mukunda, Antoine
- Blocked
TEC3 (Pipeline): Outcome 1 / Output 1.2
- GOAL: Prepare the Deployment Pipeline for changes to our CI tooling.
- WHO: ???, ???
- Blocked by not having new CI tooling yet
TEC3 (Pipeline): Outcome 3 / Output 3.1
- GOAL: Create a .pipeline/config.yaml standard to give users more control over how their tests are run in the pipeline and allow the easy saving of artifacts at pipeline completion. (RelEng)
- WHO: Dan, Tyler, ???
- Done
TEC3 (Pipeline): Outcome 3 / Output 3.1
- GOALS:
- Adopt more services into Deployment pipeline - task T212801
- Wikidata Termbox SSR, Kask for Session Storage Service, cpjobqueue (stretch), ORES (stretch)
- WHO: Dan, Tyler, Lars
There are tasks: https://phabricator.wikimedia.org/T220403
- Wikidata Termbox SSR
- In progress
- Kask for Session Storage Service
- Done
- cpjobqueue (stretch)
- Not done; pushing to next quarter
- ORES
- cf: Dan's comments
- Not done; pushing to next quarter
TEC12 (DevProd): Outcome 1 / Output 1.1
- GOAL: Provide an "Official" Docker base image for local development of MediaWiki based on the production tooling.
- WHO: Jeena, Brennen
- https://phabricator.wikimedia.org/T212449
- Done for MediaWiki, for some values of "done" and "MediaWiki". Production-likeness needs considerable work.
TEC13 (Code Health): Outcome 1 / Output 3
- GOALs: Presentation/session(s) at the Wikimedia Hackathon on the current state of Code Health projects (technical debt and code stewardship)
- WHO: JR
- Done
TEC13 (Code Health): Outcome 1 / Output 1.1
- GOAL:
- Publish a re-imagination of the Review Queue process.
- Develop and implement metrics around task and code-review responsiveness
- WHO: Greg, JR (and Andre)
- Review Queue
- Blocked on Greg time
- Task and code-review responsiveness metrics
- No progress last week.
TEC13 (Code Health): Outcome 4 / Output 4.2
- GOALs:
- Expand SonarQube reporting into CI infrastructure
- Perform SonarQube analysis on all extensions
- Engage user communities in direct feedback solicitation
- WHO: JR, Zeljko, Code Health Metrics
- Continued working towards expanding coverage to all extensions in the Code Health Pipeline. Going to do so in smaller chunks to see how the infrastructure reacts.
Other non-goal work
Release MW 1.33
- Handed off to Reedy along with security releases.
Selenium
Gerrit
- 2.15.14
- Cannot assign user name "XXX" to account ####; name already in use. https://phabricator.wikimedia.org/T216605
Phabricator
Jenkins
QA/Code Health
SCAP
- Enhance MediaWiki deployments for support of php7.x
- may need to do work here in the near term
Standup!
Antoine
- What I plan to do this week
- Finish reviews of Awight Quibble patches. Hopefully cut a new Quibble release
- Polish up zuul/layout.yaml a bit more, especially the jobs for MediaWiki release branches
- Cleanup CI puppet manifests -- https://gerrit.wikimedia.org/r/#/q/bug:T225735
- What I'm blocked on
- Kosta has sent a patch for Quibble to spawn Apache since `php -S` is way too slow (single threaded, probably lacks opcache etc) -- https://gerrit.wikimedia.org/r/#/c/integration/quibble/+/516729/
- Other?
- A lot of effort has gone into speeding up MediaWiki tests. More is needed, especially as the Selenium tests seem unreasonably slow (too many login actions).
Brennen
- What I plan to do this week
- Figure out where to go with 508392: Add .pipeline/blubber.yaml with dev variant for local-charts
- Any needed followup for 517111: CI: Create lightweight agent role for Jenkins
- Follow up on local dev testing (once deployment-charts exists)
- Experiment more with Lars's GitLab
- What I'm blocked on
- Other?
- Out for a few minutes this morning to deal with lingering tax nonsense.
Dan
- What I plan to do this week
- What I'm blocked on
- Other?
Greg
- What I plan to do this week
- Security Council Meeting (already done!) ;)
- Get and respond to final budget/plan
- Q1 Goals
- TechConf nominations review
- TechConf topics review (post office hours last week)
- including reviewing of things not covered from last techconf
- Phabricator board triaging/reconfiguring again/still (sorry!)
- Read chapter 12 of book
- What I'm blocked on
- Other?
James
- What I plan to do this week
- Dropping php56 CI testing https://phabricator.wikimedia.org/T224906
- [More] Helping with unit vs. Integration test split https://phabricator.wikimedia.org/T87781 and https://phabricator.wikimedia.org/T225068 and https://phabricator.wikimedia.org/T221434
- [Still] Fixing quibble node10 follow-up for MobileFrontend https://phabricator.wikimedia.org/T224997
- [More] Migrating CI phan jobs over to php72 https://phabricator.wikimedia.org/T223847
- [Again] Building a proof of concept of shims in WikimediaMessages so we can undeploy things better: https://phabricator.wikimedia.org/T222918
- [Again] Discussing further defining variant Wikimedia production config in compiled, static files https://phabricator.wikimedia.org/T223602
- What I'm blocked on
- Other?
Jean-Rene
- What I plan to do this week
- schedule Core Review WG kickoff
- work on goal planning
- work on Tec13 1.1 (responsiveness metrics)
- continue working on Code Stewardship tasks
- What I'm blocked on
- Other?
Jeena
- What I plan to do this week
- Finally make patchset for deployment-charts
- work on other charts for deployment-charts
- fix some bugs I found in local-charts
- fill out planning chart
- Read CD book
- Prepare for train next week
- Look into beta needs for https://phabricator.wikimedia.org/T222820
- What I'm blocked on
- Other?
Lars
- What I plan to do this week
- read chapter 12 of CD book
- make v2 of CI arch doc based on feedback so far
- write up a CI implementation around GitLab, using Gerrit
- plan next FY work for my own part
- What I'm blocked on
- Other?
Mukunda
- What I plan to do this week
- Train
- Rebuild phabricator sshd with a patch for T224677
- Phabricator deployment if possible
- What I'm blocked on
- Other?
Tyler
- What I plan to do this week
- Remove barricade v2 lucene dependency (with dcausse)
- Gerrit 2.15.14 released, will deploy
- Finish Blubberoid policy file work
- get up-to-date on the lib/extension dependency work
- What I'm blocked on
- scap 3.10.0-1 to poke SRE
- Other?
Zeljko
- What I plan to do this week
- T199113 All repositories with Selenium tests should use wdio-mediawiki - TwoColConflict is the only one left from the original list, have to cross check with new repos added to https://www.mediawiki.org/wiki/Selenium/Node.js#write-tests
- T223774 The first Selenium test for WikibaseCirrusSearch - could not get the extension working on vagrant :P will be able to do most of the stuff anyway
- What I'm blocked on
- Other?
- holiday on Thursday (20th), vacation on Friday (21st) and Monday (24th), holiday on Tuesday (25th)
Grooming
Team Kanban Board Review and Triage
- closed and touched in the last 7 days
- No update for 4 weeks
- No update for 3 weeks
- No update for 2 weeks
- No update for 1 week
- All Open
- Review To Triage column of #releng
Once / month-ish review of backlog(s)
- releng Review To Triage column of #releng
- releng-kanban Review unassigned in kanban
- releng-kanban Review 'backlog' column of -kanban
- releng-next - Review for things we need to put on our kanban backlog
- releng-backlog - oh my, the huge backlog of things...