Wikimedia Release Engineering Team/Checkin archive/20190513
2019-04-15
Vacations/Important dates
- May 13–15: James working from London
- May 15 (Wednesday) - Željko vacation
- May 16-20 - Wikimedia Hackathon 2019 (Prague, Czechia)
- Attending: Greg, JR, Zeljko, James, and Jeena
- May 17th: Mukunda day off - Concert.
- May 17th: thcipriani - half day - airport run
- May 20-31 - Jeena Vacation
- May 21: James still travelling back to SF
- May 27 (Memorial Day) - US Staff
- May 28th-31st - thcipriani - family in town
- May 30th - Lars, Ascension
- May 30th-31st - Antoine, Feast of the Ascension
- June 6-7 - Brennen, Apogaea
- June 10th - Antoine, Pentecost -- see https://en.wikipedia.org/wiki/Eastertide for Antoine/France Easter holidays
- June 10-? - Dan leave (4-6 weeks, then additional leave later)
- June 19 (Juneteenth) - US Staff
- July 4 (US Independence Day) - US Staff
- July 22 - August 9 - Željko vacation
- July 22 - Lars, Midsummer
- August 7–9 - James off
- August 12 (Glorious Twelfth) - US Staff
- August 14–18 - Wikimania
- Attending: James, ? …
- August 25 - September 4 - Brennen vacation
- September 2 (Labor Day) - US Staff
- October 14 (Indigenous Peoples' Day) - US Staff
- November 11 (Veterans' Day) - US Staff
- November 28–29 (Thanksgiving) - US Staff
- December 6 - Lars, Finnish Independence Day
- December 25–31 (Christmas) - US Staff
- December 25-26 - Lars, Christmas
- 2020 January 1 (New Year's Day) - US Staff, Lars
Rotating positions
Train
- Maniphest query for deployment blocker tasks: https://phabricator.wikimedia.org/maniphest/query/s3KW8bpsXhYF/#R
- Apr 29 - wmf.3 - Tyler
- May 06 - wmf.4 - Tyler
- May 13 - wmf.5 - Antoine
- May 20 - wmf.6 - Antoine
- May 27 - wmf.7 - Zeljko
- June 03 - wmf.8 - Zeljko
- June 10 - wmf.9 - Mukunda
- June 17 - wmf.10 - No Train (Juneteenth)
- June 24 - wmf.11 - Mukunda
- July 1 - wmf.12 - No train (Fourth of July)
- July 8 - wmf.13 - Tyler
- July 15 - wmf.14 - Tyler
- July 22 - wmf.15 - Antoine
- July 29 - wmf.16 - Antoine
- Aug 5 - wmf.17 - one of Mukunda/Antoine/Tyler (Antoine and Zeljko on vacation)
- Aug 12 - wmf.18 - Zeljko (during Wikimania)
- Aug 19 - wmf.19 - Zeljko (after Wikimania)
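The schedule above advances one wmf number per week, and "No Train" weeks still consume a number (wmf.10, wmf.12). A minimal sketch of that arithmetic follows. The function name and the anchor constants are hypothetical, the anchor (Apr 29 = wmf.3) is taken from the list above, and the 1.34.0 prefix from the Train Health entries in these notes. It is not part of any team tooling.

```python
from datetime import date

# Assumption: anchor week taken from the schedule above (Apr 29, 2019 -> wmf.3).
ANCHOR_MONDAY = date(2019, 4, 29)
ANCHOR_NUMBER = 3


def train_version(day: date, series: str = "1.34.0") -> str:
    """Return the wmf version for the train week containing `day`.

    Weeks run Monday to Sunday and every week consumes one number,
    including weeks marked "No Train" in the schedule.
    """
    weeks_since_anchor = (day - ANCHOR_MONDAY).days // 7
    number = ANCHOR_NUMBER + weeks_since_anchor
    if number < 1:
        raise ValueError(f"{day} is before the first branch of {series}")
    return f"{series}-wmf.{number}"


print(train_version(date(2019, 5, 13)))  # this week's train
print(train_version(date(2019, 6, 17)))  # a "No Train" week still has a number
```

The sketch only covers the numbering. Who runs the train each week is assigned by hand, as in the list above.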
SoS
- Zeljko 4eva! :)
Team Business
Timespent spreadsheet
- For the avoidance of doubt: fill out the sheet week number for the previous week
- link to week starting May 13: https://docs.google.com/spreadsheets/d/1urCLNQXeEi1DOR8Iu0qW0yPt-glxX1laqlMovbGyCW0/edit#gid=571269651
Book club
- https://www.mediawiki.org/wiki/Wikimedia_Release_Engineering_Team/Book_club
- Notes: https://www.mediawiki.org/wiki/Wikimedia_Release_Engineering_Team/Book_club/Continuous_Delivery
- Next: June 14th, chapters 10+11
Spring Offsite
Follow-ups:
- Greg: Email mark re gerrit/phab hosting discussion
- DONE: scheduled for Wed May 8th.
- Attending: Mark, mutante, Alex, Joe, Lars, Antoine, Dan, Mukunda, Tyler, Me
- Not just about gerrit/phab, also "the pipeline transition period and what that means for Beta Cluster" (hence all the people)
- https://etherpad.wikimedia.org/p/ep-sre-ap-sync
- Greg: email mark about capex request for next year for pipeline
- I'm actually not sure what this is about/what the ask is, help?!
- "staging" pipeline?
- Production access?
- Tyler: write out a justifiable ask for hardware resources for Gerrit
- ????: re Integration environments: establish SLAs between the teams for what is their responsibility and ours, what is the working relationship
- I think there's something more here that needs to be fleshed out, see the relevant section here: https://docs.google.com/document/d/1Y-cYrPKT0dvN2oj0hScIjRjkM2zWL5NY9xMYfMuC2Do/edit?ts=5c9cd50b#heading=h.vbm26ktfhprv
- Greg: flesh out/say more on this
- 2019-05-13: not yet...
- Mukunda: talk with Timo and Filippo about our prioritized list of feature requests for LMM
- Note: Gergo confirmed that SRE is going to work on Sentry in Q1/Q2 (from a conversation with Faidon and Filippo)
- Greg: announce that RelEng is backup only for SWAT (removal of people's names from getting pinged every time on IRC) and we’ll start working on automating the train
- Still need to do Q4 goals...table this “doing” until Q1?
- Greg will send a signed email if someone writes it up ;)
- Željko will write the e-mail this week - done
- Greg: setup the new project/task management process in Phab based on feedback
- taskified: https://phabricator.wikimedia.org/T222496
- Demo time!
- kanban: https://phab.wmflabs.org/project/board/37/query/all/
- TODO: https://phab.wmflabs.org/project/board/36/
- RelEng (categories): https://phab.wmflabs.org/project/board/35/
- Greg: collect mission/scope output in a central living place
Monthly reflection on accomplishments - May '19 edition
- https://www.mediawiki.org/wiki/Wikimedia_Release_Engineering_Team/Monthly_notable_accomplishments
- Add as you have them!
- Phabricator vandalism rollback tool completed 🎉 (blog post? 😉)
- Upgrade Zuul to 2.5.1-wmf6 (which unblocks the Gerrit upgrade to 2.16) - https://phabricator.wikimedia.org/T208426
- Team offsite in Chicago
- Repository-hosted CI/CD pipeline configurations now supported (.pipeline/config.yaml) - https://phabricator.wikimedia.org/T210267
- Train notes published on branch cut
Annual Planning
- Metrics are in for Core/Operational work. Waiting on c-level announcement on the Priority work.
- https://docs.google.com/document/d/1GueI1JhQkWjnZXKUmN8T7SS3s3alJQY2fYFfCRZp7So/edit#heading=h.v9gue7m4adr5
FYI from SRE: "Percentage of services in the Deployment Pipeline having SLOs defined and agreed upon together with their service owner" 50% by end of FY2019/20, 100% in 3-5 years.
Annual Reviews
Overview: https://office.wikimedia.org/wiki/FY_2018-19_Annual_Review_and_Retrospective
- Note: there is a workshop you can attend to get advice: https://office.wikimedia.org/wiki/FY_2018-19_Annual_Review_and_Retrospective#Sprints_&_trainings_-_support_from_T&C
Deadlines
Everyone:
- Starting now: You and I discuss who your peer reviewers should be
- April 26th: Enter your peer reviewers into Namely (please run them by me first)
- May 17th: Deadline to complete self-reviews, peer reviews, and reviews of your manager.
- May 20th: I start reviewing the peer reviews and writing my feedback on you.
Non SafeGuard (aka US Employees):
- June 14th: Deadline for managers to complete all 1:1 meetings with direct reports and provide written feedback in Namely.
SafeGuard:
- June 14th - Managers of those employed by Safeguard submit their reviews to HR for submission to Safeguard
- July 12th - Deadline to have a 1:1 and share final manager review with direct report in Namely
Incoming/Needs attention
- node6-node10 migration in CI: https://phabricator.wikimedia.org/T211784
- James needs a CI root to push new config/images
- TODO: Give James access (to contint-admins in puppet + integration/config +2)
- FYI: Java logging session from gehel last week https://drive.google.com/file/d/1gkA1CUrBkiN6XkUNWSChK-XQMZSbxGQT/view
Scrum of Scrums
Incoming from last week
- Blocking:
- Callouts: Changing a WikibaseCirrusSearch config default (to activate its functionality by default when installed) appears to break several browser tests in CI. Guidance requested on what to do about this: https://gerrit.wikimedia.org/r/#/c/mediawiki/extensions/WikibaseCirrusSearch/+/507597/
- I took a look. It looks to me that the tests fail because search is not working (properly) in CI. Is it similar to this? https://phabricator.wikimedia.org/T188507
- I guess Antoine needs to take a look at this.
- Callouts: some tests are really long: https://phabricator.wikimedia.org/T222757
- Greg already left a comment. I think Antoine is already working on making Quibble faster. I've looked at the phab board, but couldn't find a task: https://phabricator.wikimedia.org/project/view/2772/
- Do we need to leave a comment saying we're working on it? Antoine again? :)
- Language: Add abi to l10n-watchers group in Gerrit (https://phabricator.wikimedia.org/T222015)
- This seems a policy problem. Greg or Tyler? thcipriani: what policy blocks this? Not sure. People seem confused. Maybe it just needs to be made explicit what are the rules.
Outgoing this week (wrong section heading level is on purpose for copy/pasting into Scrum of Scrums etherpad)
Release Engineering
- Blocked by:
- Blocking:
- Updates:
- Train Health
- Last week: 1.34.0-wmf.4 - https://phabricator.wikimedia.org/T220729
- This week: 1.34.0-wmf.5 - https://phabricator.wikimedia.org/T220730
- Next week: 1.34.0-wmf.6 - https://phabricator.wikimedia.org/T220731
- Code Health
- Log Health
Callouts
- Release Engineering
Train status and happenings
- Need to fix scap clean :\
- thcipriani has a crappy fix in mind until http tokens in gerrit are back
- Any idea when HTTP tokens will come back? Weeks? Months? Never? :-(
- ~Weeks
- 2019-05-06: cleaned up stuff last week on deploy hosts, just not the gerrit branches
- 2019-05-13: …
- 1.33 branch cut for extensions is blocked (except tarball ones, which James did manually)
- 2019-05-06: Mukunda to do it this week
- Greg: email Cindy re process of this release
- 2019-05-13: We talked on Thursday. Mukunda will review hexmode's work, Cindy will email Greg with plan of action re timeline.
Quarterly Goals for Q4
https://www.mediawiki.org/wiki/Wikimedia_Technology/Goals/2018-19_Q4
TEC1 (Maint): Outcome 1 / Output 1.1
- GOAL: Undeploy the CodeReview extension.
- WHO: James, need help from CPT
- James will ping CPT about this this week (April 8th)
- … and again w/c 15 April.
- … and again w/c 6 May (in SoS).
TEC1 (Maint): Outcome 1 / Output 1.1
- GOAL: Setup 1-3 of the CI WG options (Zuul v3, Argo, GitLab)
- WHO:
- Focus on a couple noteworthy repos: e.g.,
- core
- extensions
- ops/puppet
- Maybe setup in serial, i.e., a week per evaluation
- Questions:
- RelEng/Extended working group?
- At least in the WG eval it was good to have non-familiar people
- But maybe for the setup of options it might be beneficial to have people experienced with the current setup.
- Folks outside the original working group to join-in to setup options; people TBD
- Do we need a rubric before we do this prototyping? (yes)
- DONE lars to work on rubric week of 2019-04-01
- See email 2019-04-08
- 2019-05-06: Feedback from Android. Working on an arch document. Do in Q1?
TEC3 (Pipeline): Outcome 1 / Output 1.2
- GOAL: Instrument Quibble for data collection
- WHO: Mukunda, Antoine
- Still no progress / nowhere to store this data and other tasks taking priority
TEC3 (Pipeline): Outcome 1 / Output 1.2
- GOAL: Create a graph of where time is spent and make a prioritized list for improvements.
- WHO: Mukunda, Antoine
TEC3 (Pipeline): Outcome 1 / Output 1.2
- GOAL: Prepare the Deployment Pipeline for changes to our CI tooling.
- WHO: ???, ???
- Blocked by not having new CI tooling yet
TEC3 (Pipeline): Outcome 3 / Output 3.1
- GOAL: Create a .pipeline/config.yaml standard to give users more control over how their tests are run in the pipeline and allow the easy saving of artifacts at pipeline completion. (RelEng)
- WHO: Dan, Tyler, ???
- Dan's pipeline work merged
- Followup to update the pipeline to *use* that new fancy code
- General problem of shared resources on staging and in helm test stage (ci namespace on staging)
- What does the annual plan *actually say* for this?
TEC3 (Pipeline): Outcome 3 / Output 3.1
- GOALS:
- Adopt more services into Deployment pipeline - task T212801
- Wikidata Termbox SSR, Kask for Session Storage Service, cpjobqueue (stretch), ORES (stretch)
- WHO: Dan, Tyler, Lars
There are tasks: https://phabricator.wikimedia.org/T220403
- changeprop
- In progress ORES
- cf: Dan's comments
- Wikidata Termbox SSR
- Kask for Session Storage Service
- cpjobqueue (stretch)
TEC12 (DevProd): Outcome 1 / Output 1.1
- GOAL: Provide an "Official" Docker base image for local development of MediaWiki based on the production tooling.
- WHO: Jeena, Brennen
- https://phabricator.wikimedia.org/T212449
TEC13 (Code Health): Outcome 1 / Outcome 3
- GOALs: Presentation/session(s) at the Wikimedia Hackathon on the current state of Code Health projects (technical debt and code stewardship)
- WHO: JR
Met and discussed Hackathon session with Code Health Metrics WG. Daniel will also be having a related session on Cycle Dependencies.
- T216630 Present Code Health Metrics at the Hackathon
- image updated
- jobs publish
- need to actually use jobs in zuul, waiting on T222210
TEC13 (Code Health): Outcome 1 / Output 1.1
- GOAL:
- Publish a re-imagination of the Review Queue process.
- Develop and implement metrics around task and code-review responsiveness
- WHO: Greg, JR (and Andre)
- No activity
TEC13 (Code Health): Outcome 4 / Output 4.2
- GOALs:
- Expand SonarQube reporting into CI infrastructure
- Perform SonarQube analysis on all extensions
- Engage user communities in direct feedback solicitation
- WHO: JR, Zeljko, Code Health Metrics
- continued work towards integrating SonarQube into CI
Other non-goal work
Release MW 1.33
- ETA: end of May
- "just" producing the tarballs, Cindy is on point for what's in/what's out.
- Can I get a volunteer?
- Mukunda can build tarballs
- Greg: email Cindy asking for status
- See previous discussion above
Selenium
- Progress on various Phabricator tickets and/or Gerrit patches
Gerrit
Phabricator
- Merged and deployed upstream changes
- Fixed the calendar default view, no more fatal error.
- Reviewed and deployed several weeks of new translations from translatewiki
Jenkins
QA/Code Health
- Sent out participation request for Code Review Workgroup, 19 interested respondents so far.
- Discussions with existing TEs moving over to Q&T Engineering started last week. New team is being received well.
SCAP
Standup!
Antoine
- What I plan to do this week
- Train!
- tox upgrade from 2.9 to 3.10.0
- What I'm blocked on
- Zuul dos - our zuul is too old. I give up eventually.
- Bring up new zuul-merger instance, but the provisioning is subtly broken
- Reorganize CI projects in Phabricator https://phabricator.wikimedia.org/T223134
- Other?
- For those heading to hackathon, Kosta is willing to work on: https://phabricator.wikimedia.org/T87781 //Split mediawiki core tests into unit and integration tests//
- if you build a Docker container, it might not be immediately available due to replication delay between codfw (active) and eqiad (replica, serving docker-registry.wikimedia.org) https://phabricator.wikimedia.org/T222210#5176863
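
One way to cope with the replication delay noted above is to poll for the image with a bounded retry before giving up. This is only a sketch: `wait_until_available` and its parameters are hypothetical names, and the availability check is passed in as a callable (for example, something wrapping a `docker pull` attempt), so nothing here assumes a particular registry API.

```python
import time
from typing import Callable


def wait_until_available(
    check: Callable[[], bool],
    attempts: int = 5,
    delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call `check` until it returns True or `attempts` runs out.

    The wait doubles after each failed attempt (delay, 2*delay, 4*delay, ...).
    `sleep` is injectable so the function can be exercised without waiting.
    """
    for attempt in range(attempts):
        if check():
            return True
        # No point sleeping after the final failed attempt.
        if attempt < attempts - 1:
            sleep(delay * (2 ** attempt))
    return False


if __name__ == "__main__":
    # Hypothetical check: pretend the image shows up on the third poll.
    polls = iter([False, False, True])
    found = wait_until_available(lambda: next(polls), delay=0.01)
    print("image available" if found else "gave up waiting")
```

Doubling the wait keeps early polls cheap while still covering a delay of tens of seconds. The right attempt count depends on how long replication actually takes, which these notes do not state.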
Brennen
- What I plan to do this week
- Work on publishing of starter dev images before hackathon
- Whatever else might be useful for tweaking local-charts prior to showing it to a bunch of people
- Review writing
- What I'm blocked on
- Other?
Dan
- What I plan to do this week
- Get a .pipeline/config.yaml working for... blubber(oid)?
- Needs integration/config
- What to do? Seed job?
- Reviews
- What I'm blocked on
- Other?
Greg
- What I plan to do this week
- Annual Reviews
- Annual Planning
- Hackathon
- What I'm blocked on
- Other?
James
- What I plan to do this week
- CI/npm stuff, as above.
- More MW static configuration concept work
- Hackathon
- Pipeline documentation work with Martyav.
- Helping more ServiceOps with the HHVM -> PHP72 migration
- What I'm blocked on
- Other?
Jean-Rene
- What I plan to do this week
- Prep for Hackathon
- Travel/Hackathon
- Reviews
- What I'm blocked on
- Other?
Jeena
- What I plan to do this week
- Finish reviews
- merge parsoid patch for blubberfile
- get Xdebug in mediawiki dockerfile
- Hackathon prep/Hackathon
- What I'm blocked on
- Other?
Lars
- What I plan to do this week
- help Antoine with the train group 0
- write reviews of self, Greg, peers
- improve CI arch document, reach out for more feedback
- start looking at what it takes to get one of the CI candidates running
- What I'm blocked on
- nada
- Other?
- nada
Mukunda
- What I plan to do this week
- Write reviews
- T222638 Talk with Timo and Filippo about Grafana and Sentry
- T222829 merge branch.py and make-wmf-branch
- Try to get some movement on "T200392 release notes automation"
- Read a book
- What I'm blocked on
- Other?
Tyler
- What I plan to do this week
- gerrit 2.15.13
- working on code health stuff
- pipeline policy file
- gerrit logging
- What I'm blocked on
- Other?
- annual review
- book
Zeljko
- What I plan to do this week
- Prepare for Wikimedia hackathon (submit sessions and projects)
- Attend the hackathon
- What I'm blocked on
- Other?
Grooming
Team Kanban Board Review and Triage
- closed and touched in the last 7 days
- No update for 4 weeks
- No update for 3 weeks
- No update for 2 weeks
- No update for 1 week
- All Open
- Review To Triage column of #releng
Once / month-ish review of backlog(s)
- releng Review To Triage column of #releng
- releng-kanban Review unassigned in kanban
- releng-kanban Review 'backlog' column of -kanban
- releng-next - Review for things we need to put on our kanban backlog
- releng-backlog - oh my, the huge backlog of things...