Wikimedia Release Engineering Team/Checkin archive/20181105
2018-11-05
Vacations/Important dates
- November 8-9 - Dan vacation in Mexico City 🇲🇽🌮🎉
- November 12th - Holiday (Veterans Day, observed)
- November 22+23 - Holidays (Thanksgiving)
- November 25-December 2nd: Mukunda vacation (in California ahead of the offsite)
- Week of December 3rd - Team offsite
- December 24-28 - Holidays (Christmas)
Rotating positions
Train
- Maniphest query for deployment blocker tasks: https://phabricator.wikimedia.org/maniphest/?project=PHID-PROJ-fmcvjrkfvvzz3gxavs3a&statuses=open%28%29&group=none&order=newest#R
- Oct 08 - wmf.25 - Dan (No train due to DC switchover)
- Oct 15 - wmf.26 - Mukunda (last 1.32 wmf.XX release, 1.33 starts the next week)
- Oct 22 - wmf.1 - Mukunda (warning, TechConf happening, ping Greg if you need responses from anyone there...)
- Oct 29 - wmf.2 - Tyler
- Nov 05 - wmf.3 - Tyler <----
- Nov 12 - wmf.4 - Antoine
- Nov 19 - wmf.5 - No Train (Thanksgiving)
- Nov 26 - wmf.6 - Antoine
- Dec 03 - wmf.7 - No Train (Offsite)
- Dec 10 - wmf.8 - Zeljko
- Dec 17 - wmf.9 - Zeljko
- Dec 24 - wmf.10 - No Train (Holiday break)
- Dec 31 - wmf.11 - No Train (Holiday break)
- Jan 07 - wmf.12 - Dan
- Jan 14 - wmf.13 - Dan
- Jan 21 - wmf.14 - Mukunda
- Jan 28 - wmf.15 - No Train (All Hands)
- Feb 04 - wmf.16 - Mukunda
- Feb 11 - wmf.17 - Tyler
- Feb 18 - wmf.18 - Tyler
- Feb 25 - wmf.19 - Antoine
SoS
- Oct 10 - Zeljko
- Oct 17 - Zeljko
- Oct 24 - Zeljko
- Oct 31 - Zeljko
- Nov 07 - Zeljko <----
- Nov 14 - Zeljko
- Nov 21 - Zeljko
- Nov 28 - Zeljko
- Dec 05 - Zeljko
- Dec 12 - Zeljko
- Dec 19 - Zeljko
- Dec 26 - Zeljko
- Jan 02 - Zeljko
- Jan 09 - Zeljko
- Jan 16 - Zeljko
- Jan 23 - Zeljko
- Jan 30 - Zeljko
- Feb 06 - Zeljko
- Feb 13 - Zeljko
- Feb 20 - Zeljko
- Feb 27 - Zeljko
Team Business
Hiring
- Software Engineer position is open; reviewing/hiring for now
- update....
December Offsite
Details:
- Week of December 3rd
- At the Queen Mary hotel in Long Beach
- Deb T will be facilitating
Topics!
- https://etherpad.wikimedia.org/p/RelEng-Offsite-201811-Topics
- Deb and I talked on Friday, she is starting to get the schedule in place.
REMINDER: Deadline to book travel is Nov 8th!
All Hands
- Registration: https://office.wikimedia.org/wiki/All_hands/2019/Registration
- Needed for everyone
- NOTE: There's a way to request a hotel room for semi-local people (commutes longer than 1.5 hours)
Needs attention
- gerrit security release 2018-10-08
- https://groups.google.com/forum/m/#!topic/repo-discuss/eH0iLt2XawU
- jGit update, we are unaffected
- may want to hold off until next week: https://bugs.chromium.org/p/gerrit/issues/detail?id=9836
- 2018-10-15 -- paladox tells me they're working on a fix and should have a 2.15.6 tagged Soon™
- 2018-10-22 -- jGit updated to fix leaks https://gerrit-review.googlesource.com/c/gerrit/+/201273
- 2018-10-29 -- 2.15.6 released: https://groups.google.com/forum/?hl=en#!topic/repo-discuss/9EUYI2eyIZM
- thcipriani: Will send email today to update on...Wednesday? Anyone wanna work on this with me?
- Antoine to pair, and be point next time
- 2018-11-05: built and testing https://gerrit.wikimedia.org/r/#/c/operations/software/gerrit/+/471758/-1..1
- deploy1001:/srv/mediawiki out of date?
- https://phabricator.wikimedia.org/T207602
- Found because the Security team noticed that a previously deployed security patch was no longer deployed, should sync up with them this week about that (Reedy or Brian)
- See: https://phabricator.wikimedia.org/T207600
- 2018-10-22: no idea, thcipriani will look, I guess
- 2018-10-29: scap updated, needs release this week
- 2018-11-05:
- Need to poke Reedy re: T207600
- scap still needs release - mukunda will take care of it
- deployment-prep region migration
- See email with same subject on releng@lists
- Question: incrementally or not?
- looks like "however Andrew wants to do it"
- REMINDER: send an email update to wikitech-l@/qa@ with the planned timeline/outage
- 2018-10-29: ACTION: Tyler to reply saying "take it away, andrew, and when are you going to do it?"
- 2018-11-05: Email response Done -- blocking task from Krenair https://phabricator.wikimedia.org/T208101 -- Dan and Mukunda graciously volunteered ;)
Scrum of Scrums
- Greg to copy to etherpad after meeting: https://etherpad.wikimedia.org/p/Scrum-of-Scrums
Incoming from last week
- Blocking:
Outgoing this week (wrong section heading is on purpose for copy/pasting into Scrum of Scrums etherpad)
Release Engineering
- Blocked by:
- Blocking:
- Updates:
- Train Health:
- Last week: 1.33.0-wmf.2 deployment blockers https://phabricator.wikimedia.org/T206656
- wmf.2 was late last week due to an odd HHVM issue: https://phabricator.wikimedia.org/T208549
- This week: 1.33.0-wmf.3 deployment blockers https://phabricator.wikimedia.org/T206657
- Next week:
- Log Health:
- Code Health:
Callouts
- Release Engineering
Train status and happenings
- OMG discussion on how to do incident reporting and analysis better
- https://phabricator.wikimedia.org/T208632
- mukunda to make some comments
Quarterly Goals for Q2
TEC1 (Maint): Outcome 1 / Output 1.1
- GOAL: Release MediaWiki 1.32
- WHO: Mukunda, (Tyler on backup)
TEC1 (Maint): Outcome 1 / Output 1.1
- GOAL: Determine the procedure and requirements for an automated MediaWiki branch cut.
- WHO: Mukunda, Tyler, Antoine
- Created a bunch of subtasks of https://phabricator.wikimedia.org/T156445 for automating release
- most are needed for MW Branch cut as well as release automation
TEC3 (Pipeline): Outcome 1 / Output 1.2
- GOAL: Formalize the collection of CI infrastructure and tooling metrics
- WHO: Dan, Antoine
TEC3 (Pipeline): Outcome 2 / Output 2.3
- GOAL: Develop set of metrics to assess incident reports/post mortems - task T206622
- WHO: Greg, Zeljko
TEC3 (Pipeline): Outcome 3 / Output 3.1
- GOALS:
- Adopt more services into Deployment pipeline - task T205919
- Migrate graphoid to the Deployment pipeline
- Deploy zotero v2 to the Deployment pipeline
- Deploy blubberoid
- WHO: Dan, Tyler, Lars
- Lars, Dan, and thcipriani had a pairing session Friday to move Blubberoid forward
TEC12 (DevProd): Outcome 2 / Output 2.1
- GOAL: The Annual Developer Productivity Survey results are synthesized and shared, creating a first year baseline.
- WHO: Mukunda, Greg
- This is finally sent out and we've already gotten a lot of (IMO useful) responses.
TEC13 (Code Health): Outcome 1 / Output 1.1
- GOAL: Update/refresh review queue (review process for initial code deployment)
- WHO: JR
TEC13 (Code Health): Outcome 2 / Output 2.2
- GOAL: 5 of the 15 prioritized repositories have at least 1 end-to-end test - task T206621
- WHO: Zeljko
TEC13 (Code Health): Outcome 2 / Output 2.3
- GOAL: Assess Platform unit test practices and define improvement plan
- WHO: JR, Core Platform Team
TEC13 (Code Health): Outcome 3 / Output 3.2
- GOAL: Core Platform and Search Platform teams are using TDM PoC
- WHO: JR, Core Platform Team
TEC13 (Code Health): Outcome 3 / Output 3.4
- GOALS:
- Identify key Tech Debt areas
- Put in place Tech Debt management process for PEP
- WHO: JR, Core Platform Team
TEC13 (Code Health): Outcome 4 / Output 4.1
- GOAL: Metrics defined and deployed for all 4 Code Health areas.
- WHO: JR, Code Health Metrics Working Group
Other work
Selenium
Gerrit
Phabricator
Jenkins
QA
SCAP
Standup!
Antoine
Relocated Wikibase client job is ready to migrate to Docker. The repo one has to wait until we see why its scope is so different
- What I plan to do this week
- Look at Wikibase repo
- Java 8 security update fallout? Probably want to upgrade CI container
- What I'm blocked on
- DonationInterface migration pending on fundraising
- Other?
Dan
- What I plan to do this week
- Write a blog post about October 2018 CI build data analysis
- Working title: "It's a zombie party: bring in 'da noise, bring in defunct"
- Analysis: https://docs.google.com/spreadsheets/d/1-HLTy8Z4OqatLnufFEszbqkS141MBXJNEPZQScDD1hQ/edit?usp=sharing
- Still prometheus-ing
- What I'm blocked on
- Other?
Greg
- What I plan to do this week
- catch up on l10nupdate follow-ups
- follow-up from TechConf program committee (cleaning/sanitizing notes and posting to wiki mostly)
- a quick pass through any remaining updates to the onboarding process/task structure (incorporate learnings from Lars')
- What I'm blocked on
- dunno?
- Other?
- dunno?
Jean-Rene
- What I plan to do this week
- What I'm blocked on
- Other?
Lars
- What I plan to do this week
- Delivery pipeline architecture diagram to understand what the goal and status quo are.
- Find and read existing delivery pipeline code. (thcipriani: in integration/config)
- Study Kanban boards.
- What I'm blocked on
- Lack of superbrain
- Other?
- Nada
Mukunda
- What I plan to do this week
- Get the latest scap deb released
- keyholder review
- I didn't get the MW 1.32.0-rc1 tarball done last week; will get that done this week for sure
- (with Dan) Fix beta cluster static IPs for transition to the new cloud region
- Outline proposal for incident report forms
- What I'm blocked on
- Other?
Tyler
- What I plan to do this week
- Train
- Gerrit
- Fundraising CI job
- What I'm blocked on
- Other?
Zeljko
- What I plan to do this week
- T199133 Find top 15 target projects that could use Selenium tests to prevent incidents
- What I'm blocked on
- Other?
Grooming
Team Kanban Board Review and Triage
- closed and touched in the last 7 days
- No update for 4 weeks
- No update for 3 weeks
- No update for 2 weeks
- No update for 1 week
- All Open
- Review To Triage column of #releng
Once / month-ish review of backlog(s)
- releng Review To Triage column of #releng
- releng-kanban Review unassigned in kanban
- releng-kanban Review 'backlog' column of -kanban
- releng-next - Review for things we need to put on our kanban backlog
- releng-backlog - oh my, the huge backlog of things...