Wikimedia Release Engineering Team/Deployment pipeline/20170830-planning
2017-08-30
General note: Tentative timeline will ask for draft goals in about 1.5 weeks
Status
TechOps
- Have buy-in on network policy approaches for how to deploy Kubernetes pods
- Implementation working
- Left: upload to Puppet repo, document it, and keep synced with Calico API
- Kubernetes 1.7 (latest) provides dynamic admission controllers
- Finishing Puppetization
- Intent is NOT to break toollabs :P
- No progress on ingress solutions
- And finally, Giuseppe and Alex looking at the standard pod structure.
RelEng
- Goal 1
- Goal 2
- Status:
- blocking tasks
- building mathoid via Blubber
- finding a build location accessible by Jenkins (the ci-admin LDAP group, now done this week)
- "Not optimistic" about completing those goals this quarter
Services
- No official service delivery goals this quarter; not much progress last month.
Upcoming quarter goals (by team)
TechOps
- Pending draft goal would be infrastructure work
- Main idea long-term is to get all services running
- We have been waiting for Kubernetes 1.7 to do [?], so we want to finally fix that next quarter
- Open ingress point
- Monitoring, since we currently have some Icinga checks but no good checks for node down or heapster, and no good graphs
- We are used to host-based monitoring, but this will be different; perhaps look into skipping Icinga and going straight to Prometheus
- Proposal from last Q for container security upgrades, so that will consume some time https://phabricator.wikimedia.org/T167269
- That work should be coordinated with Moritz on general updates, so tentative for next Q
- After that, we would be blocked on a trial service running, so (ideally) we should aim for that by end of Q2
- We should be at a point where we could run a real trial (non-production) service, maybe halfway through Q2
RelEng
- No draft goals yet
- If we miss Q1 goals, those will carry into Q2
- Monitoring for services is the main thing we want to get in place
- Be able to deploy to staging cluster and get feedback to [?] seems reasonable for Q2
- Trial service? Should be ready after Blubber is working, so by early Q2. Enough to test infrastructure.
- Services needs a way to control what goes to production (Helm), so that could be Q2. Depends on integrating with Services
- Could other teams help? We have work to do.
- https://phabricator.wikimedia.org/T157469 (flow of things that need to be completed before this service is "done")
Services
- No official service delivery Q2 goal so far; drafts at https://www.mediawiki.org/wiki/Wikimedia_Services/2017-18_Q2_Goals
- Considering continuing dev environment / mwctl work, but resources are tight. Open to re-prioritizing other goals based on feedback.
- Would like more clarity about long-term milestones and goals of this program. Where should we be at end of Q2 or Q3?
- Focus for Q2 is on storage
- Also want to do something on delivery side, but struggling with time available, so would like to shift priorities to make time
- Uncertainty about coordination with other teams
- We should aim to be ready by end of Q2
Discussion
- Seems like RelEng may be a bottleneck by late Q2 into Q3
- A lot of the work will be RelEng + Services
- As we think about Q3 goals, think about other teams, not just our own; we should work even closer than we are today
- Milestones
- Able to run and test a service for infrastructure
- We can get that going with a lot of workarounds
- Full pipeline with self-serve (but that's a big step up)
- I don't think we have a proper design for that yet
- Once we start running a service, we can design that
- Should we kick off the design work sooner? Would prefer not to wait until Q3 planning before we start
- Design artifact as a possible milestone in Q3
- There is a weekly meeting
- Currently building the pipeline to create the containers
- Next step would be a deployment tool, with Helm as the de facto choice
- Create a wiki page design doc
- We should have that as an early Q2 goal. Let's make it explicit as a goal
- ACTION: Alexandros can write 80% of the design doc by the next meeting
- Next meeting is at end of quarter, so too early for design
- ACTION: Mark will create phab tasks
- ACTION: Tyler will document RelEng draft goals before next week (but really that's true for all teams)
- Services might be conservative this quarter and only make soft commitments on design work & dev environment
- Go-around of concerns
- Dan: Good to externalize our vision
- Gabriel: Looking forward to more clarity on design
- Giuseppe: Main concern is unknowns, since we haven't actually run something yet (e.g. managing configuration)
- Gabriel: +1, configuration already came up as part of dev environment work
- Greg: I'm noodling about implications of program-centric goalsetting