Wikimedia Release Engineering Team/Checkin archive/20190318
2019-03-18
Vacations/Important dates
- March 29–April 1: James out (New Hampshire)
- April 9-12: Greg at tech-mgt F2F in Portland
- April 17-19 (Wednesday - Friday) - Željko vacation
- April 22 (WMF Holiday) - US Staff
- April 22-27: Team offsite in Chicago
- April 29: Moved WMF Holiday for US staff at offsite
- May 1st - Lars, Antoine and Željko, Labor Day / May Day
- May 8th - Antoine, 1945 victory
- May 15 (Wednesday) - Željko vacation
- May 16-20 - Wikimedia Hackathon 2019 (Prague, Czechia)
- Attending: Greg, JR, Zeljko, James, and Jeena
- May 30th-31st - Antoine, Feast of the Ascension
- June 10th - Antoine, Pentecost -- see https://en.wikipedia.org/wiki/Eastertide for Antoine/France Easter holidays
- May 27 (Memorial Day) - US Staff
- June 6-7 - Brennen, Apogaea
- June 19 (Juneteenth) - US Staff
- July 22 - August 9 - Željko vacation
- August 25 - September 4 - Brennen vacation
Rotating positions
Train
- Maniphest query for deployment blocker tasks: https://phabricator.wikimedia.org/maniphest/query/s3KW8bpsXhYF/#R
- Jan 07 - wmf.12 - Dan
- Jan 14 - wmf.13 - Dan
- Jan 21 - wmf.14 - Mukunda
- Jan 28 - wmf.15 - No Train (All Hands)
- Feb 04 - wmf.16 - Mukunda
- Feb 11 - wmf.17 - Tyler
- Feb 18 - wmf.18 - Tyler
- Feb 25 - wmf.19 - Antoine
- Mar 04 - wmf.20 - Antoine
- Mar 11 - wmf.21 - Zeljko
- Mar 18 - wmf.22 - Zeljko
- Mar 25 - wmf.23 - Dan
- Apr 01 - wmf.24 - Dan
- Apr 08 - wmf.25 - Mukunda
- Apr 15 - 1.34.0-wmf.1 - Mukunda
- Apr 22 - wmf.2 - NO TRAIN, team offsite
- Apr 29 - wmf.3 - Tyler
- May 06 - wmf.4 - Tyler
- May 13 - wmf.5 - Antoine
- May 20 - wmf.6 - Antoine
- May 27 - wmf.7 - Zeljko
- June 03 - wmf.8 - Zeljko
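The list above appears to follow a simple rule: the branch number advances every Monday, including weeks without a train, while conductors rotate in two-train turns that skip no-train weeks. A minimal Python sketch of that rule, assuming the rotation order and the two no-train dates exactly as listed here (the function and constant names are hypothetical and not part of any team tooling):

```python
from datetime import date, timedelta

# Assumptions taken from the list above: rotation order and no-train weeks.
CONDUCTORS = ["Dan", "Mukunda", "Tyler", "Antoine", "Zeljko"]
NO_TRAIN = {date(2019, 1, 28), date(2019, 4, 22)}  # All Hands, team offsite
LAST_1_33 = 25  # wmf.25 is the last 1.33.0 branch in the list


def train_schedule(start=date(2019, 1, 7), first_version=12, weeks=22):
    """Yield (monday, version label, conductor or None) for each week."""
    slot = 0  # counts only the weeks in which a train actually runs
    for week in range(weeks):
        monday = start + timedelta(weeks=week)
        number = first_version + week  # branch numbers advance every week
        if number <= LAST_1_33:
            version = f"1.33.0-wmf.{number}"
        else:
            version = f"1.34.0-wmf.{number - LAST_1_33}"
        if monday in NO_TRAIN:
            yield monday, version, None
            continue
        # Each conductor takes two consecutive trains before handing over.
        yield monday, version, CONDUCTORS[(slot // 2) % len(CONDUCTORS)]
        slot += 1


for monday, version, conductor in train_schedule():
    print(monday.strftime("%b %d"), version, conductor or "no train")
```

If a future no-train week is added, only `NO_TRAIN` needs to change; the conductor turns after it shift by one week while the branch numbers stay put.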
SoS
- Zeljko 4eva! :)
Team Business
Book club
- https://www.mediawiki.org/wiki/Wikimedia_Release_Engineering_Team/Book_club
- Notes: https://www.mediawiki.org/wiki/Wikimedia_Release_Engineering_Team/Book_club/Continuous_Delivery
- Next: March 21st at the "same" time (9am Pacific/16:00 UTC)
Spring Offsite
- Location: Chicago, IL (Central timezone, UTC-5 while we're there)
- Dates: Arrive Monday 4/22, Depart Saturday 4/27.
- BOOK YOUR FLIGHTS BY: March 21
- Activity day
- Fill out the spreadsheet: https://docs.google.com/spreadsheets/d/1zqO8Mk1wUU2ZtyAM9xU68CQTpJFEOPALfDKCj7aMNo4/edit
- Program:
- start listing your topics! https://etherpad.wikimedia.org/p/releng-offsite-201904-topics
Monthly reflection on accomplishments - March '19 edition
- https://www.mediawiki.org/wiki/Wikimedia_Release_Engineering_Team/Monthly_notable_accomplishments
- Add as you have them!
- CI tooling future WG started, blogged
- GerritBot comments on patches going through the pipeline (with fancy badges and the like)
- Train deploy notes are now automatically generated on branch push
- Scap 3.9.2-1 released in production
- Phabricator upgrade: https://phabricator.wikimedia.org/phame/post/view/147/projects_forms_and_subtypes_oh_my/
- Published the ISOSTWG results and recommendation on officewiki and announced: https://office.wikimedia.org/wiki/Internal_Support_for_Open_Source_Tools_Working_Group
- swat tags now show up in the deployment schedule (via lua magic)
Q4 Goals planning
- etherpad: https://etherpad.wikimedia.org/p/releng-1819Q4-goals
- Due: Monday March 18th, aka this Friday
Posted online at their respective locations:
- https://www.mediawiki.org/wiki/Wikimedia_Technology/Annual_Plans/FY2019/TEC12:_Developer_Productivity/Goals#Q4_Goals
- https://www.mediawiki.org/wiki/Wikimedia_Technology/Annual_Plans/FY2019/TEC3:_Deployment_Pipeline/Goals#Q4_Goals
- https://www.mediawiki.org/wiki/Wikimedia_Technology/Annual_Plans/FY2019/TEC13:_Code_Health/Goals#Q4_Goals
- https://www.mediawiki.org/wiki/Wikimedia_Technology/Annual_Plans/FY2019/TEC1:_Reliability,_Performance,_and_Maintenance/Goals#Q4_Goals
Annual Planning is coming up
- 2019-03-13: I emailed mark re future testing/"evaluation" environments
- See notes here: https://docs.google.com/document/d/1QU_6Svn4iduK0TPLSOghYP4g1lK-byCv-0ZKoHfIAVY/edit#heading=h.6gq2j7lm5pz8
- 2019-03-18: updates....
Incoming/Needs attention
Pywikibot CI
- https://phabricator.wikimedia.org/T132138
- Antoine to take a time boxed look into this, this week
- 2019-03-18: Antoine was blocked last week
Merge blocker: The table 'l10n_cache' is full in quibble-vendor-mysql-hhvm-docker
- https://phabricator.wikimedia.org/T217654
- "The bump from 256M to 320M must be good enough and I have updated the Jenkins jobs. Lowering priority to High." -- https://phabricator.wikimedia.org/T217654#5020364
- closed
Merge blocker: quibble-vendor-mysql-hhvm-docker in gate fails for most merges (exit status -11)
- https://phabricator.wikimedia.org/T216689
- "I have rollbacked the jobs container:" -- https://phabricator.wikimedia.org/T216689#5020757
- See T218209 though. :-(
- closed
Merge blocker: Failed to create /nonexistent/.pki/nssdb directory
- https://phabricator.wikimedia.org/T218209
- Caused by revert for T216689
- closed
FYI: Wikimedia-production-error (Shared Build Failure)
Cannot access beta cluster db
- https://phabricator.wikimedia.org/T217938
- Mukunda to take a look
- joe claimed the task, has some patches
Deploy Extension:WikimediaEditorTasks to Beta
- https://phabricator.wikimedia.org/T218137
- needed today
- James can and will deal
branch cutting
- Our current branch cut method is broken because HTTP tokens on Gerrit have been disabled for security reasons.
- TODO: create a task about this, add to train as a blocker
- Tyler and Mukunda and $OTHERS to chat after this meeting
Scrum of Scrums
- Greg to copy to etherpad after meeting: https://etherpad.wikimedia.org/p/Scrum-of-Scrums
Incoming from last week
- Blocking:
Outgoing this week (wrong section heading is on purpose for copy/pasting into Scrum of Scrums etherpad)
Release Engineering
- Blocked by:
- Blocking:
- Updates:
- Help my CI job fails with exit status -11 https://phabricator.wikimedia.org/phame/post/view/152/help_my_ci_job_fails_with_exit_status_-11/
- Train Health:
- Last week: 1.33.0-wmf.21 - https://phabricator.wikimedia.org/T206675
- This week: 1.33.0-wmf.22 - https://phabricator.wikimedia.org/T206676
- Next week: 1.33.0-wmf.23 - https://phabricator.wikimedia.org/T206677
- Code Health:
Callouts
- Release Engineering
Train status and happenings
Quarterly Goals for Q3
https://www.mediawiki.org/wiki/Wikimedia_Technology/Goals/2018-19_Q3
TEC1 (Maint): Outcome 1 / Output 1.1
- GOAL: Automate the generation of change log notes
- WHO: Mukunda, (Tyler on backup)
- In progress: should now run on branch cut - https://integration.wikimedia.org/ci/job/train-deploy-notes/
TEC1 (Maint): Outcome 1 / Output 1.1
- GOAL: Investigate notification methods for developers with changes that are riding any given train
- WHO: Mukunda, Tyler
TEC3 (Pipeline): Outcome 1 / Output 1.2
- GOAL: Instrument Quibble for data collection
- WHO: Mukunda, Antoine
- I haven't gotten any responses about where to put the data. Hopefully Graphite & Prometheus will work. Otherwise I guess Logstash?
TEC3 (Pipeline): Outcome 1 / Output 1.2
- GOAL: Create a graph where time is spent and make a prioritized list for improvements.
- WHO: Mukunda, Antoine
TEC3 (Pipeline): Outcome 2 / Output 2.1
- GOAL: Select and integrate a code health metric solution into our tooling.
- WHO: JR, ...
TEC3 (Pipeline): Outcome 3 / Output 3.1
- GOALS:
- Adopt more services into Deployment pipeline - task T212801
- cxserver, ORES (partially), citoid, changeprop, cpjobqueue (stretch)
- Deploy eventgate
- WHO: Dan, Tyler, Lars
- In progress: cxserver
- Images built via deployment pipeline
- Namespaces created for k8s eqiad/codfw
- helm charts created
- Done: citoid
- Images built via deployment pipeline
- Deployed
- Traffic switched
- changeprop
- Done: eventgate
- In progress: ORES
- cf: Dan's comments
TEC12 (DevProd): Outcome 1 / Output 1.1
- GOAL: Conduct interviews with development stakeholders and compile a report that informs future work creation of a rubric.
- WHO: Jeena, Mukunda
- Done: Results are posted: https://www.mediawiki.org/wiki/Developer_Satisfaction
TEC13 (Code Health): Outcome 1 / Output 1.1
- GOALs:
- Develop and communicate guidelines and best practices for successful Code Stewardship.
- (Continued from Q2) Update/refresh review queue (review process for initial code deployment)
- WHO: JR
- Relocated Code Stewardship page and created base structure for Resources/Best practices.
TEC13 (Code Health): Outcome 2 / Output 2.2
- GOAL: 5 of the 15 prioritized repositories have at least 1 end-to-end test - task T206621
- WHO: Zeljko
TEC13 (Code Health): Outcome 2 / Output 2.3
- GOALs:
- Evolve/develop tools and processes to support the PE refactoring effort to improve code health.
- Develop a common test strategy that enables teams to engage in more effective and efficient testing practices. (maybe should be output 2.4?)
- WHO: JR, Core Platform Team
TEC13 (Code Health): Outcome 3 / Output 3.2
- GOALs:
- Speak at All Hands on the status of Technical Debt
- Engage and coach development teams on their approach to managing technical debt.
- WHO: JR, Core Platform Team
TEC13 (Code Health): Outcome 4 / Output 4.1
- GOALs: Code Health Dashboard with 50% of repositories covered.
- WHO: JR, Core Platform Team
- Waiting on patch review/merge from RelEng. Upon merge, all extensions will have the ability to run experimental to perform code analysis.
Other non-goal work
Selenium
Gerrit
Phabricator
- Vandalism revert tool should have been finished last week but that didn't happen, should be done this week.
Jenkins
QA/Code Health
SCAP
Standup!
Antoine
- What I plan to do this week
- pywikibot tests run - https://phabricator.wikimedia.org/T186208
- Some Quibble improvements?!
- Catch up on CI working group
- What I'm blocked on
- Please proofread/improve https://phabricator.wikimedia.org/phame/post/view/152/help_my_ci_job_fails_with_exit_status_-11/
- And sign at the bottom if you did any fix :-]
- Other?
- Strike at school on Tuesday so kids will be at home some part of the day
Brennen
- What I plan to do this week
- CI WG
- Run through Zuul v3 quick start
- Finish local-charts sshfs scripting - https://gerrit.wikimedia.org/r/c/releng/local-charts/+/497013
- If longma's not already on it, tackle mediawiki blubber.yaml definition - https://phabricator.wikimedia.org/T218360
- Read next 2 chapters of book
- What I'm blocked on
- Nothing
- Other?
- Fiddling with YubiKey 5 and SSH keys
Dan
- What I plan to do this week
- Implementing .pipeline/config.yaml https://phabricator.wikimedia.org/T210267
- Drafting email to Analytics re: long-term event log storage
- Evaluating Jenkins X
- What I'm blocked on
- Nada
- Other?
Greg
- What I plan to do this week
- Quality discussion
- Schedule a meeting for us before offsite to talk annual planning kickoff
- TechConf planning meeting and follow-up with Deb/etc
- "Wikimedia Foundation's Health and WellBeing Benefits Survey", due Friday March 22nd
- will email reminder to team list
- Write down some more notes about CD book
- What I'm blocked on
- A bit sick as well :/
- Other?
James
- What I plan to do this week
- Most SDC stuff (potentially bumpy train deployment, as there's a DOM change for Commons File pages)
- Train blocker fun
- More CD reading.
- What I'm blocked on
- Other?
Jean-Rene
- What I plan to do this week
- Continue work on test strategy
- Continue work on Code Stewardship best practices
- Q4 Code Health Metrics WG goals.
- Start work on DevEd Unit testing work with Guillaume
- What I'm blocked on
- Other?
Jeena
- What I plan to do this week
- Figure out helm charts issue turning number strings into floats. Then finish and test mediawiki automated install for local-charts
- Update my computer to try and stop it from frrreezing and shutting down
- Read book
- Work on documentation for local-charts
- What I'm blocked on
- Other?
Lars
- What I plan to do this week
- Read CD book chapter 7, prepare for and participate in book club meeting on Thursday.
- Finish the CI WG work as much as possible (deadline on Monday next week).
- What I'm blocked on
- Other?
Mukunda
- What I plan to do this week
- Figure out storage for quibble instrumentation
- Finish deploying vandalism revert tool in phabricator
- Document branch cut via ssh / pushInsteadOf
- What I'm blocked on
- No storage for metrics from Quibble. I'm hoping to use Prometheus to collect the metrics if I can figure it out.
- Other?
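The "branch cut via ssh / pushInsteadOf" item above refers to Git's `url.<base>.pushInsteadOf` setting, which rewrites the push URL of a remote while leaving fetches on the original URL. A rough Python model of that rewrite follows; it is a sketch only, not the team's branch cut script. The two Gerrit URLs are assumptions for illustration, `rewrite_push_url` and `RULES` are hypothetical names, and the longest-prefix behaviour mirrors what Git documents for `insteadOf`.

```python
def rewrite_push_url(url, push_instead_of):
    """Model Git's url.<base>.pushInsteadOf: the longest matching prefix wins."""
    best_prefix, best_base = "", None
    for base, prefixes in push_instead_of.items():
        for prefix in prefixes:
            if url.startswith(prefix) and len(prefix) > len(best_prefix):
                best_prefix, best_base = prefix, base
    if best_base is None:
        return url  # no rule matches: push to the URL unchanged
    return best_base + url[len(best_prefix):]


# Assumed rule, roughly what this command would configure:
#   git config --global url."ssh://gerrit.wikimedia.org:29418/".pushInsteadOf \
#       "https://gerrit.wikimedia.org/r/"
RULES = {"ssh://gerrit.wikimedia.org:29418/": ["https://gerrit.wikimedia.org/r/"]}

print(rewrite_push_url("https://gerrit.wikimedia.org/r/mediawiki/core", RULES))
# prints: ssh://gerrit.wikimedia.org:29418/mediawiki/core
```

With a rule like this, clones made over HTTPS keep fetching anonymously, and only pushes go through SSH authentication.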
Tyler
- What I plan to do this week
- fix wikimedia branch cut docs
- blubber policy; made upstream patch
- kosta and paladox review. My review queue is backed up :(
- What I'm blocked on
- Other?
- sick :(
- brain scatter
Zeljko
- What I plan to do this week
- T206676 1.33.0-wmf.22 deployment blockers
- T217325 Consider and evaluate possible new CI tooling
- What I'm blocked on
- Other?
- code health metrics blocked on releng (Antoine/Tyler):
- https://gerrit.wikimedia.org/r/c/integration/config/+/494548 integration/config Remove requirement for properties file, import coverage if present
- https://gerrit.wikimedia.org/r/c/integration/quibble/+/497222 integration/quibble Add Parsoid to docker image and run for Selenium tests
- https://phabricator.wikimedia.org/T218598 Generate code coverage and make it available to wmf-sonar-scanner
Grooming
Team Kanban Board Review and Triage
- Closed and touched in the last 7 days
- No update for 4 weeks
- No update for 3 weeks
- No update for 2 weeks
- No update for 1 week
- All Open
- Review To Triage column of #releng
Once / month-ish review of backlog(s)
- releng Review To Triage column of #releng
- releng-kanban Review unassigned in kanban
- releng-kanban Review 'backlog' column of -kanban
- releng-next - Review for things we need to put on our kanban backlog
- releng-backlog - oh my, the huge backlog of things...