Kubernetes SIG/Meetings/2023-10-24
Agenda:
- Misc:
- Please review (and edit/comment): https://wikitech.wikimedia.org/wiki/Kubernetes/Resource_requests_and_limits
- Topic: Local Kubernetes development, how should it work? Should you be able to run a WMF cluster locally in minikube?
- Two different POVs, probably: people deploying services to k8s clusters, and SREs building/managing k8s clusters
- [EB] Setting up minikube (helm chart repo etc.) felt tedious, and then there are the requirements of the actual application to take care of (e.g. setting up Kafka)
- [DC] Relied on a smaller version of the application to test just the features I wanted; used minikube as well, which worked okay for me. The difficulty was setting up something like MinIO because we lacked storage space. PersistentVolumes and the like were very difficult to learn.
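- [meta notetaker note: for reference, a minimal PersistentVolumeClaim as it might look on minikube; the claim name and size below are hypothetical, and minikube's default `standard` StorageClass backs it with hostPath storage]
```yaml
# Hypothetical claim for local object-store data (e.g. MinIO) on minikube.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: minio-data            # illustrative name
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi            # illustrative size
  storageClassName: standard  # minikube's default dynamic provisioner
```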
- [DC] Struggled with where to place test values. I’ve been using an uncommitted values.yaml file. Perhaps there could be a place within the helmfile.d folders to set these
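- [meta notetaker note: a sketch of what such an uncommitted test-values file might contain; all keys here are hypothetical and depend on the chart in question]
```yaml
# values-local.yaml (hypothetical, currently kept out of version control)
resources:
  requests:
    cpu: 100m
    memory: 128Mi
config:
  kafka_brokers: kafka.default.svc.cluster.local:9092  # local stand-in broker
```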
- [DC] Guidance around stateful services, object stores etc would be helpful
- [BK] Are there test Swift/MinIO clusters for local testing?
- [JM] It's quite a hurdle if you have to get a couple of clusters and maps infrastructure running. Still trying to understand the difference between issues with k8s and issues with running the actual application in k8s
- [EB] I know in a lot of computing we like to separate the layers, but at the end of the day if I have to operate on 3 layers then it doesn't seem to make things easier for me. But of course [service-ops] can't be responsible for all the application layer.
- [EB] Some of the issues are because we’re not talking to the infra in the right way, or because we don’t have the right network stuff set up. The ideal would be if I could somehow test those things locally and be able to push a thing and know that it works. So that’s my end goal: to know that something is going to work before I deploy it. And that requires talking to enough things that are similar enough to be valid.
- [JM] Broad scope here, hard to solve everything. For talking to other services via service mesh for example, this will be very cumbersome. Finding a generic solution for that is probably very hard or even impossible. I fear this goal of having a complete environment is not going to be feasible.
- [EB] Would it be useful to have tooling for teams who aren't super familiar with k8s to build that env for themselves? For example, in MediaWiki dev land there are tools that start jobs etc., and you don't have to worry about whether the thing is already there; you can test your application locally. For our use cases, the combination of [our application] being a streaming application with all the k8s stuff is difficult [meta notetaker note: I may have transcribed this last sentence wrong]
- [DC] On a slightly different approach: when I was working on this application I was also working on YARN (the other compute cluster we have, in Hadoop), which is much more free-form, so I can ship my app on YARN; I don't need to deploy to deployment-charts. It's a "production" cluster but it's not the "wiki production" cluster, so it's more like a sandbox for me, and the feedback loop was quick because I just grab my jar, copy it to a stat machine, run my job and get feedback. I don't have to be "official" (making a release etc.). Something similar would allow going with kubectl directly or manually changing helm charts etc., without having to go through deployment-charts patches.
- [BT] Some of the work I did at the beginning of this year was to make a Spark operator available on the DSE cluster, and that is intended to support this kind of workflow much more: you'd be on a stat box and you'd be able to tell the application "these are the jar files I want to provide, this is how my Spark driver is going to run, this is how many executors to run", so the DSE cluster would run the generated workload. We got 95% of the way there, but haven't had time to fix up the permissions required to allow a Spark client, all the Hadoop delegation and that kind of thing. And that still only supports Spark; the Flink operator is in the same kind of boat, in that you should be able to spin up a Flink cluster on DSE without necessarily going through deployment-charts, but at the moment the custom permissions that would enable that are not set up, which is what we'd intended for the Spark operator. Ideally you want a REPL-type thing where you just fire up a Spark SQL session and you've got distributed computing at your fingertips.
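- [meta notetaker note: for context, a rough sketch of the kind of user-submitted job spec the upstream spark-on-k8s operator accepts; the names, image and jar path are hypothetical and this is not the actual DSE configuration]
```yaml
apiVersion: sparkoperator.k8s.io/v1beta2
kind: SparkApplication
metadata:
  name: my-spark-job          # hypothetical
  namespace: spark-sandbox    # hypothetical
spec:
  type: Scala
  mode: cluster
  image: example.registry/spark:3.4.0                  # hypothetical image
  mainClass: org.example.MyJob                         # hypothetical class
  mainApplicationFile: local:///opt/jobs/my-job.jar    # hypothetical jar
  sparkVersion: "3.4.0"
  driver:
    cores: 1
    memory: 1g
  executor:
    instances: 2
    cores: 1
    memory: 2g
```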
- [BT] Wrt minikube, I did have some success with it while working on the DataHub deployment. I was able to separate the application from the set of resources that are its dependencies, and I could very easily deploy those dependencies to my local minikube instance; I didn't have to build my own MariaDB instance, I could use whatever containers I wanted from third parties on my local minikube. But what I actually found was that as soon as I'd gone from there to deploying the application to staging and then to production, all the local prerequisites started to bitrot, because I never went back and re-ran things on my local dev machine.
- [JM] I’d take a look at this problem from the other side as well. I know that JH did some work on this as part of the tracing stuff in jaeger (??).
- [JH] My memory's a bit rusty, but I was one of the people that originally proposed this topic so it definitely interests me. I set up the aux cluster. I was initially surprised that a local test story was not a first-class part of our k8s env. That wasn't the case for standing up the aux cluster, so when I was working on…I found the combination of helm + helmfile + deployment-charts pretty hard to wrangle. I thought the documentation for helmfile was rough; the levels of templating gave me a lot of difficulty. When I wanted to test the pieces for distributed tracing I was able to get some stuff set up in minikube, but it required a lot of effort and in particular a lot of duplicated effort. I did find it very useful for initial testing, but I had similar problems to what BT said: once I got some pieces into production, my minikube testing stuff started to bitrot because the mismatch was too large. The repo I set up can be found here: https://gitlab.wikimedia.org/repos/sre/jaeger-minikube.
- [FG] For the development environment, I chose to go the non-local way, because what I want is Puppet. Puppet is the source of truth for SRE, and I want a copy of production that is as close as feasible. In Puppet we have one role per host, so the current implementation is based on Cloud VPS and VMs: one role per VM, and everything is close enough to production for my use case.
- [FG] I haven't run k8s roles in Cloud VPS, only observability roles. Those are all VMs in principle.
- [JM] What FG is talking about is called Pontoon, a thing he invented to transport the production Puppet setup into cloud VMs. I did that for the k8s roles during the last k8s update, because I had to refactor all the stuff we had in Puppet for k8s and it didn't feel safe to do that in prod. That of course has the same problems with bitrot, because I didn't update the stuff after I did the refactoring. But it is a way to produce a cluster that at least seems similar enough from the outside to what we have in prod. The downside is that a lot of the insides of the cluster (the stuff deployed from the deployment-charts repo) won't work in Cloud VPS, so discrepancies start at a very low level. Ultimately it just moves the requirement for components and dependencies into that cluster rather than into minikube; it doesn't remove the problem, it just changes where those things live.
- [FG] Disclaimer, it’s not a walk in the park. The other thing is that my development environment *is* puppet. I’m dealing with it anyway so I’d rather have my dependencies also be managed by puppet and not in a docker container which then is different from prod and then I’m back to square 1.
- [JM] Did you do anything in regards to the tracing stuff? Or did you then refrain from doing it in deployment-charts and test the deployment on the ops cluster directly?
- [FG] The latter. Testing in prod for lack of a better word.
- [JM] Still not sure where the core trouble is. From what I’ve heard it’s more or less easy to set up minikube itself but it’s not easy to get the stuff into minikube that is required to deploy your service and it’s potentially very complicated to deploy with the same tools that we currently have. So using helmfile and the stuff in the repo to deploy to the local cluster is potentially hazardous I guess.
- [JH] Yeah I did try, although I didn’t know helmfile before, but I did spend a fair bit of time trying to get it capable of deploying directly to minikube and I ended up giving up and just using helm to deploy the dependencies that look something like it. I think for my development it’d have been very beneficial to be able to take that repo, take deployment-charts and have it work in a minikube without a huge amount of effort.
- [DC] On a slightly different approach, I don't know if it's feasible, but we wrote some test resources that we wanted to deploy in minikube at some point, and they became outdated very rapidly because we don't seem to have a place to keep them in the repo, so we leave them out. Could we have a `-minikube.yml` values file that would be tested in CI whenever you change your chart? CI would take care of running minikube, loading the values file, running the app and making a small test case, so through CI we'd ensure the test env is never completely outdated.
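- [meta notetaker note: a minimal sketch of what such a CI check could look like, assuming a GitLab CI job that simply renders every chart against its minikube values file so it at least cannot silently rot; the job name, image and file layout are hypothetical]
```yaml
# .gitlab-ci.yml fragment (hypothetical)
lint-minikube-values:
  image:
    name: alpine/helm:3.12.3     # any image that provides helm
    entrypoint: [""]
  script:
    # Render each chart that ships a minikube values file; a template
    # failure fails the job, so the file is checked on every chart change.
    - |
      for values in charts/*/values-minikube.yml; do
        chart="$(dirname "$values")"
        helm template "$chart" -f "$values" > /dev/null
      done
```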
- [JM] It does increase the complexity by quite a lot. I see the option of service-owners just being able to run `helmfile -e minikube`. But I’m a bit anxious that it increases the already-high complexity of the deployment-charts repo even more. But we could at least look into that, to have the same tooling. Not sure about doing it in CI because that’s quite a lot of work and CI is already slow as is. All the basic linting stuff should be caught already so it’s really about just trying to not bitrot the test environments.
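- [meta notetaker note: a sketch of what a `minikube` environment could look like in a helmfile.yaml, illustrating upstream helmfile syntax rather than the actual deployment-charts layout; release and file names are hypothetical. Deploying locally would then be `helmfile -e minikube apply` while the kube context points at minikube]
```yaml
# helmfile.yaml fragment (hypothetical)
environments:
  staging: {}
  minikube: {}

releases:
  - name: my-service                             # hypothetical release
    chart: ./charts/my-service
    values:
      - values.yaml                              # defaults shared everywhere
      - values-{{ .Environment.Name }}.yaml      # e.g. values-minikube.yaml with local overrides
```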
- [BK] Is there any value in having something like MinIO, i.e. a way to have fake/stub versions of some of the services we might want to use for testing locally? Maybe some of the SDN stuff, maybe Prometheus. Any ideas if there'd be value in having a unified way to mock those services locally?
- [BT] I think we had examples of setting up kafka clusters already, in eventgate or something. I’m sure I’ve seen a couple of zip files that would unzip in order to deploy some pre-reqs similar to what we have.
- [JM] Some charts do have dependencies defined with a certain switch to flip to have them deployed. I guess that’s more or less the way to go but it still relies on running helm for example and it won’t go through the complete chain that deployment-charts does. Which will ultimately be hard to get right due to value inheritance and the like.
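- [meta notetaker note: the switch mentioned here is presumably the standard Helm dependency `condition` mechanism, where the dependency is only deployed if a values flag like `kafka.enabled` is true; a generic illustration below, with chart name, version and repository purely illustrative]
```yaml
# Chart.yaml (hypothetical chart)
dependencies:
  - name: kafka
    version: "26.x.x"                                 # illustrative version
    repository: https://charts.bitnami.com/bitnami    # illustrative repository
    condition: kafka.enabled                          # set kafka.enabled: true in local values
```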
- [JM] service-ops doesn't have the bandwidth to do anything concrete about this in the short term. But there is a "knowledge platform reduce fragmentation" bit in the annual plan that could be used to justify some work in this area. So happy to help, but I can't drive something like this in the medium term.
- [BK] With regard to the search platform, maybe we can make a drive to look at this stuff next year. The fragmentation issue is something we should probably take on as a team.
- [BT] I’m already keen on getting user-generated workloads, like what we spoke about w/ spark jobs. We’ve done quite a bit of work on the DSE cluster on being able to do persistent volumes. It’s not the same as the prod infrastructure and the larger-scale kafka clusters and mariadb and all the rest, but there’s certain use cases for persistent volumes that could be useful in a lot of this stuff. Where as a deployer you have the rights to create the resources that you need.
- [JM] One last question: if we removed restrictions on the wikikube staging cluster for example, giving everybody that has deploy rights the permissions, or granting everybody a personal namespace in that cluster and allowing them to `kubectl` whatever they like in that namespace, would that already solve problems from the service deployers' perspective?
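- [meta notetaker note: for illustration, the personal-namespace idea would roughly amount to a namespace plus a RoleBinding per deployer, along these lines; the user and namespace names are hypothetical]
```yaml
# Hypothetical personal sandbox namespace on the staging cluster
apiVersion: v1
kind: Namespace
metadata:
  name: sandbox-jdoe
---
# Give that deployer full rights within (and only within) the namespace
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: sandbox-admin
  namespace: sandbox-jdoe
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: admin                 # built-in namespaced admin role
subjects:
  - kind: User
    name: jdoe                # hypothetical deployer
    apiGroup: rbac.authorization.k8s.io
```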
- [EB] Maybe? It potentially makes it easier for what DC talked about earlier, trying to solve an issue that you can't reproduce anywhere except prod: building an updated version of the thing and deploying it there. But I suspect still having to publish Docker images [will still have problems] without special tooling around it.
- [JM] If the search people are interested in taking a stab: supportive work is not a problem from the service-ops side, but driving it would be. So we're very happy to help.
- Action Items:
- [BK] Will make a ticket for the search team to look into this further, with supportive help from service-ops.