Wikimedia Release Engineering Team/Checkin archive/2024-05-22

From mediawiki.org

2024-05-22

πŸ† Wins/winterrogation

https://www.mediawiki.org/wiki/Wikimedia_Release_Engineering_Team/Monthly_notable_accomplishments
May 2024
  • scap k8s deployment progress reporting
  • scap release-scripts/perform-release rewritten in Python, with an added wait for the tag pipeline.
  • Jaime is a PHP expert now, successfully running patchdemo
  • Patches upstream for Phorge viewing reports while not logged in (https://we.phorge.it/D25608) etc
  • buildkitd upgraded to v13.2
  • scap clean improvements
  • First changes to Catalyst Patchdemo
  • Scap3 broken symlink up for review
  • Skins available in the catalyst environment
  • Upstream buildkit mod merged: https://github.com/moby/buildkit/pull/4899
  • Moved wmf buildkit helm chart to its own repo for easier maintenance: https://gitlab.wikimedia.org/repos/releng/buildkit-chart/
  • integration/config: jjb-diff improvement (don't assume stdout wants ansi)
  • Docker gc config for CI: https://gerrit.wikimedia.org/r/c/operations/puppet/+/1031045
  • docker-hub-mirror upstream bug workarounds
  • Phab: made good progress removing tech debt in Phabricator, all deployed thanks to Brennen: https://phabricator.wikimedia.org/maniphest/query/cGaRtbNWQSd1/#R . Disabled ~12 ancient Herald rules. Hackathon. Phorge upstream stuff. etc.
  • Hackathon a good time generally
  • Wikibugs got initial gitlab integration during the hackathon and has a couple of improvements since. Next step is wiring up a bot to configure more webhooks so the bot can see CI runs and code review comments. https://www.mediawiki.org/wiki/Wikibugs
  • Contint1002 is now running on bullseye along with python2 zuul, but this is the LAST TIME! (thanks dzahn)
  • Hack found for the Wikibugs network instability issue when talking to https://gitlab-webhooks.toolforge.org. Bypassing the Kubernetes ingress by talking directly to the service makes things much more stable for long-lived connections.
  • Bryan working with Eoghan to get secrets provisioned for adding GitLab account block/unblock to Wikitech block cascade.
  • Upgrading SyntaxHighlight to work with the newest Pygments is stalled because the new version needs Python 3.8+. Prod, test, and default dev environments are all currently Buster with Python 3.7. <https://phabricator.wikimedia.org/T364249>
  • Jelto unblocked Wikibugs tests from calling Phabricator by creating a GitLab shared runner that can be used by projects in /toolforge-repos/
  • Deployed protection for https://phabricator.wikimedia.org/T282893 (Various CI jobs failing after "mkdir: cannot create directory ‘log’: Permission denied"). That revealed a few places where a root:root cache or log directory was previously being auto-created by docker. Added fixes for that. Plus a fix for codehealth checks from Tyler
  • Blubber Python builder: Always use a virtualenv https://phabricator.wikimedia.org/T357548 . blubber/buildkit 0.23.0 released
  • docker-gc resiliency improvement deployed.
  • Simplified gitlab-trusted-runner projects.json (removed project-ids)
  • Fixed a recently discovered problem with gitlab-mentions-bot: it started getting email notifications for MRs that it had made a note on, and those emails went to releng.
  • Andre's 1st TRAINNNNN \o/
  • Backfilled bugzilla tickets in phabricator to fix stats after 10 years - https://phabricator.wikimedia.org/T107254
  • Phabricator OGP previews upstream patch - https://we.phorge.it/D25668
  • SRE Collaboration Services has a dedicated IRC channel now irc://irc.libera.chat/wikimedia-sre-collab
  • https://phabricator.wikimedia.org/T313624
    • Reproduced the issue locally and identified that it occurs when the keyholder key is either not specified in scap.cfg or is missing from /etc/keyholder.d. According to OpenSSH behavior, if no specific key is provided, it tries all authentication methods up to the MaxAuthTries limit. Since these configurations are on the target and not modifiable, increasing MaxAuthTries is not a viable solution.
    • To resolve this, I updated the code to abort the program and prompt the user for a rollback if the key is missing.
  • Patchdemo checkbox with ooui https://patchdemo.catalyst-qte.wmcloud.org/
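
A minimal sketch of the T313624 preflight idea described above (abort and offer a rollback when the keyholder key is missing), not the actual scap patch. The function names, the exception class, and the assumption that each key is a file named after the configured key inside the keyholder directory are all hypothetical, for illustration only.

```python
from pathlib import Path
from typing import Callable, Optional


class MissingKeyholderKey(Exception):
    """Raised when no usable keyholder key can be identified (hypothetical)."""


def resolve_keyholder_key(key_name: Optional[str], keyholder_dir: Path) -> Path:
    """Return the key path, or raise if the key is unset or absent.

    Assumption: the key is stored as a file named `key_name` in
    `keyholder_dir` (e.g. /etc/keyholder.d). The real layout may differ.
    """
    if not key_name:
        raise MissingKeyholderKey("no keyholder key specified in configuration")
    key_path = keyholder_dir / key_name
    if not key_path.is_file():
        raise MissingKeyholderKey(f"key {key_name!r} not found in {keyholder_dir}")
    return key_path


def preflight(
    key_name: Optional[str],
    keyholder_dir: Path,
    ask: Callable[[str], str] = input,
) -> str:
    """Return 'proceed', 'rollback' or 'abort'.

    Failing here avoids letting ssh try every available key until the
    target's MaxAuthTries limit is hit.
    """
    try:
        resolve_keyholder_key(key_name, keyholder_dir)
    except MissingKeyholderKey as exc:
        answer = ask(f"{exc}. Roll back? [y/N] ")
        return "rollback" if answer.strip().lower() in ("y", "yes") else "abort"
    return "proceed"
```

Checking up front keeps the failure on the deploy host, where it can be acted on, instead of surfacing as an authentication error from a target whose sshd settings cannot be changed.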

Stuff from last time

📅 Vacations/Important dates

https://office.wikimedia.org/wiki/HR_Corner/Holiday_List#2024
https://wikitech.wikimedia.org/wiki/Deployments/Yearly_calendar
https://www.mediawiki.org/wiki/Wikimedia_Release_Engineering_Team/Time_off (page needs updating for Dayforce)
  • Apr 29 - May 02: Andre less available before Hackathon
  • Apr 29 - May 3 Antoine
  • Weds May 1 - May 9: Brennen (Hackathon, working but expect limited comms)
  • [FYI: May03-05 Hackathon]
  • Wed May 8th - May 10th: Antoine (Victory Day, Ascension Day, + 1 vacation)
  • Wed May 8th - Andre b/c Liberation Day = CZ vacation day
  • Thurs 9th (holiday), Fri 10th (holiday moved from the 1st): Jaime
  • Mon May 20th - Pentecost Day: Antoine
  • Fri 24 May: Dancy PTO
  • Mon 27 May: Memorial Day (US staff with reqs)
  • Fri May 31: Brennen PTO (tentative)
  • Wed Jun 5- Sun 9: Brennen PTO
  • Mon Jun 10–Fri 14: Tyler
  • Thu Jun 13-Sat 15: Andre less available (DevConf.cz)
  • Wed Jun 19: Juneteenth (US staff with reqs)
  • Mon Jun 17-Fri 21: Bryan PTO
  • July 1-9: Jaime
  • July 4-5: US staff holiday
  • July 5: Andre (CZ Holiday)
  • July 12–15: Ahmon
  • Fri 09 Aug – Global holiday: International Day of the World’s Indigenous Peoples
  • Sun 25 Aug - 03 Sep: Brennen

Future

🔥🚂 Train

https://versions.toolforge.org/
https://train-blockers.toolforge.org/
https://wikitech.wikimedia.org/wiki/Deployments/Yearly_calendar

Rotation

  • 3 Dec – 1.42.0-wmf.8 – No Train offsite
  • 11 Dec – 1.42.0-wmf.9 – Brennen + Antoine (Jaime out)
  • 18 Dec – 1.42.0-wmf.10 – Ahmon + Brennen (Jaime out)
  • 25 Dec – 1.42.0-wmf.11 – No Train
  • 1 Jan – 1.42.0-wmf.12 – Dan + Ahmon (Jaime out)
  • 8 Jan – 1.42.0-wmf.13 – Jeena + Dan (Jaime out)
  • 15 Jan – 1.42.0-wmf.14 – Jaime + Jeena
  • 22 Jan – 1.42.0-wmf.15 – Antoine + Jaime
  • 29 Jan – 1.42.0-wmf.16 – Ahmon + Antoine (Brennen out Wed–Fri)
  • 05 Feb – 1.42.0-wmf.17 – Brennen + Ahmon
  • 12 Feb – 1.42.0-wmf.18 – Brennen + Antoine (Friday)
  • 19 Feb – 1.42.0-wmf.19 – Jeena + Brennen
  • 26 Feb – 1.42.0-wmf.20 – Dan + Jeena
  • 04 Mar – 1.42.0-wmf.21 – Jaime + Dan (Antoine out)
  • 11 Mar – 1.42.0-wmf.22 – Antoine + Jaime (Dan out)
  • 18 Mar – 1.42.0-wmf.23 – Ahmon + Antoine
  • 25 Mar – 1.42.0-wmf.24 – Jeena + Ahmon
  • 1 Apr – 1.42.0-wmf.25 – Jaime + Jeena
  • 8 Apr – 1.42.0-wmf.26 – Antoine + Jaime (Tyler out)
  • 15 Apr – 1.43.0-wmf.1 – Ahmon + Antoine
  • 22 Apr – 1.43.0-wmf.2 – Brennen + Ahmon (Global holiday Monday; Brennen out Friday)
  • 29 Apr – 1.43.0-wmf.3 – Jaime + Brennen (Antoine out Wednesday; Jaime floating holiday to Friday; Hackathon over the weekend)
  • 6 May (6-10) – 1.43.0-wmf.4 – Jeena + Jaime (Jaime out Thursday; Brennen out; Antoine out; Ahmon backup Thu)
  • 13 May (13-17) - 1.43.0-wmf.5 – Antoine + Andre (Jeena as backup)
  • 20 May (20-24) - 1.43.0-wmf.6 – Andre + Antoine (you can do it Andre!) (Antoine out Mon, Ahmon out Fri)
  • 27 May (27-31) - 1.43.0-wmf.7 – Ahmon + Andre (Memorial day Monday)
  • 03 Jun (03-07) - 1.43.0-wmf.8 – Dduvall + Ahmon (Brennen out)
  • 10 Jun (10-14) - 1.43.0-wmf.9 – Brennen + Ahmon
  • 17 Jun (17-21) – 1.43.0-wmf.10 – Jaime + Brennen
  • 24 Jun (24-28) – 1.43.0-wmf.11 –

Team Discussions

  • Team demos: this is a good spot for 'em
  • Hypothesis WE6.2.....
    • Feedback: Individual deploy speedups would be nice
    • Question: Will this work for scap3? (Not yet, maybe someday.)
    • Feedback: Community configuration may change some of this (fewer backport deployments - allowing people to use the wiki itself to make config changes).
    • Feedback: [redacted]

Let's do some inbox triage: https://phabricator.wikimedia.org/maniphest/query/7vRDrcVnt8OI/#R

🌻 Open source/Upstream contributions

https://www.mediawiki.org/wiki/Wikimedia_Release_Engineering_Team/Upstream