Currently Wikimedia creates a JSON cache file under /tmp/ that it subsequently reads from until the mtime of InitialiseSettings.php changes; I don't know how that works in a container. We could carry that forward, but unless the cache lives on a shared volume of some kind, each pod would pay a startup cost regenerating this configuration. Alternatively, regenerating it in the pipeline makes an already long process even longer.
My vision is that we stop making the temporary JSON files on-demand on each server, and instead pre-generate them (per-wiki or a single mega file, not sure) on the deployment server and sync the compiled config out in the scap process instead of InitialiseSettings.php. Then, in the container universe, this JSON blob gets written into pods through the k8s ConfigMap system, rather than as a file tied to a particular pod tag.
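To make the ConfigMap idea concrete, here's a minimal sketch, assuming a hypothetical mediawiki-config ConfigMap holding the compiled JSON and a container that mounts it read-only; all names, paths, and the image are illustrative, not anything we've decided on:

```yaml
# Hypothetical sketch: compiled config shipped as a ConfigMap and mounted read-only.
apiVersion: v1
kind: ConfigMap
metadata:
  name: mediawiki-config              # illustrative name
data:
  wikis.json: |
    {"enwiki": {"wgServer": "https://en.wikipedia.org"}}
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mediawiki
spec:
  replicas: 1
  selector:
    matchLabels: {app: mediawiki}
  template:
    metadata:
      labels: {app: mediawiki}
    spec:
      containers:
        - name: mediawiki
          image: example/mediawiki:latest        # placeholder image
          volumeMounts:
            - name: config
              mountPath: /srv/mediawiki/config   # path MediaWiki would read the JSON from
              readOnly: true
      volumes:
        - name: config
          configMap:
            name: mediawiki-config
```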
+1 -- makes sense to me, that's what I'd like as well. Some investigation needs to happen -- a random server shows 941 JSON files totaling 74MB of config for each wiki version.
Note that ConfigMaps have a limit of 1MB (actually it's a bit more than that, but it's best to stick to a 1MB mental model). That stems from etcd having a max object size of 1MB (again a bit more, like 1.2MB, but I digress). So we aren't going to be able to use that approach to inject that content into the pods (unless we split it into many, many ConfigMaps).
We could alternatively populate the directory on the Kubernetes hosts and bind-mount it into all pods (especially easy if it's read-only from the container's perspective). But then we would have to figure out how to populate it on the Kubernetes nodes, which is starting to imply scap.
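For comparison, the bind-mount variant would look roughly like this sketch: the directory gets populated on each Kubernetes node (presumably by scap or something like it) and exposed to pods as a read-only hostPath volume. All names and paths here are assumptions:

```yaml
# Hypothetical sketch: per-node config directory bind-mounted read-only into each pod.
apiVersion: v1
kind: Pod
metadata:
  name: mediawiki
spec:
  containers:
    - name: mediawiki
      image: example/mediawiki:latest        # placeholder image
      volumeMounts:
        - name: wiki-config
          mountPath: /srv/mediawiki/config   # read-only inside the container
          readOnly: true
  volumes:
    - name: wiki-config
      hostPath:
        path: /srv/mediawiki-config          # populated on the node, e.g. by scap
        type: Directory
```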
Yeah. :-( Theoretically we could do one ConfigMap per wiki, but that means a new default setting would need 1000 ConfigMaps to be updated, which suggests a race condition/etc. as it rolls out.
Does any one JSON file approach 1M in size? K8s has a "projected" volume feature that allows multiple volume sources (including ConfigMaps) to be mounted under the same mount point, so ostensibly we could have one ConfigMap per wiki but still have them under the same directory on a pod serving traffic for all wikis. Still a bit cumbersome from a maintenance perspective perhaps but it might work around etcd's technical limitation.
- Does any one JSON file approach 1M in size?
Nope. The biggest one currently is:
108K /tmp/mw-cache-1.36.0-wmf.4/conf2-commonswiki.json
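At roughly 108K per wiki, per-wiki ConfigMaps would sit well under the 1MB limit, so the projected-volume idea above could look something like this sketch, with two illustrative per-wiki ConfigMaps combined under a single mount point (every name here is made up):

```yaml
# Hypothetical sketch: several per-wiki ConfigMaps projected into one directory.
apiVersion: v1
kind: Pod
metadata:
  name: mediawiki
spec:
  containers:
    - name: mediawiki
      image: example/mediawiki:latest   # placeholder image
      volumeMounts:
        - name: all-wiki-config
          mountPath: /srv/mediawiki/config
          readOnly: true
  volumes:
    - name: all-wiki-config
      projected:
        sources:
          - configMap:
              name: config-enwiki       # one ConfigMap per wiki (illustrative)
              items:
                - key: conf.json
                  path: enwiki.json
          - configMap:
              name: config-commonswiki
              items:
                - key: conf.json
                  path: commonswiki.json
```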
What about doing the compilation in CI, and not during scap deployments? Would that be feasible? Maybe later?
Doable, but it makes merges a mess unless we write a very special git merge driver, and it bloats the git repo with these files, which can get relatively big, as Tyler points out. 🤷🏽‍♂️
One thing I would like, regardless of CI compilation, is a way to roll these back quickly: that is the advantage of having them generated on demand currently, and something that generating them at deploy time would (maybe) slow down.
I had some ideas on configuration provided each pod is dedicated to a single wiki (which might be nice if we wanted to scale based on traffic per wiki).
I thought it would be ideal if we could inject the configuration (overrides) into the pod at deploy time, but it seems like there's too much configuration to do that.
We could add the configuration with a sidecar container. That could assist with the size issue of the configuration and the rolling back issue as well, I think.
I don't think we will go with one wiki per pod; that would be extremely impractical, as we would need 900 separate deployments.
When we start the migration, we will probably have one single deployment pod, and then at some point we might separate group0/1/2, but I don't see us going beyond that.
There are other practical reasons for this, but just imagine how long the train would take :)
Yeah, 900 is a lot, but I don't think that is such a problem for Kubernetes. I thought we could have an umbrella chart with the 900 wikis, and then update the image tags and install with Helm once. I have never tested Helm with such a large chart, but since the release info is stored in ConfigMaps/Secrets (Helm 3), I guess we could run into a size limit issue there, so... maybe you are right :P
I think the sidecar container with configuration is feasible for a single deployment as well, though.
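One way to read the sidecar idea, sketched under the assumption that the compiled config ships in its own image and gets copied into a shared volume at pod startup (an init container, shown here, works much the same way; all names and tags are illustrative). Rolling back would then mean changing just the config image tag:

```yaml
# Hypothetical sketch: config shipped in its own image, shared with MediaWiki via emptyDir.
apiVersion: v1
kind: Pod
metadata:
  name: mediawiki
spec:
  initContainers:
    - name: config
      image: example/mediawiki-config:2020-06-01   # roll back by changing this tag
      command: ["cp", "-r", "/config/.", "/shared/"]
      volumeMounts:
        - name: shared-config
          mountPath: /shared
  containers:
    - name: mediawiki
      image: example/mediawiki:latest              # placeholder image
      volumeMounts:
        - name: shared-config
          mountPath: /srv/mediawiki/config
          readOnly: true
  volumes:
    - name: shared-config
      emptyDir: {}
```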
My current thinking is:
- Give MediaWiki a way to load settings from multiple JSON files (YAML could be supported for convenience, but we probably don't want to use it in prod, for performance reasons). This needs to cover init settings and extensions to load as well as the actual configuration.
- Pre-generate config for each of the 900 wikis when building containers (maybe by capturing the result of running CommonSettings.php). Mount the directory that contains the per-wiki config files (from a sidecar?). Let MediaWiki pick and load the correct one at runtime.
- Pre-generate config files for each data center and for each server group (with all the service addresses, limit adjustments, etc.). Deploy them via ConfigMap (one chart per data center and server group). Let MediaWiki load these at runtime.
- Let MediaWiki load and merge in secrets from a file managed by Kubernetes.
- Use a hook or callback to tell MediaWiki to merge live overrides from etcd at runtime.
- Extract all the hooks currently defined in CommonSettings.php into a "WmfQuirks" extension.
How does that sound?
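For what it's worth, the Kubernetes side of that plan might look roughly like the following sketch: per-wiki config baked into the image at build time, per-datacenter/server-group settings from a ConfigMap, and secrets from a Secret. Every name, tag, and path here is an assumption, not a proposal:

```yaml
# Hypothetical sketch combining the pieces of the plan above.
apiVersion: v1
kind: Pod
metadata:
  name: mediawiki
spec:
  containers:
    - name: mediawiki
      # Per-wiki JSON config pre-generated at image build time lives in the image itself.
      image: example/mediawiki:1.36.0-wmf.4          # placeholder tag
      volumeMounts:
        - name: dc-config
          mountPath: /srv/mediawiki/config/dc        # per-datacenter / server-group settings
          readOnly: true
        - name: secrets
          mountPath: /srv/mediawiki/config/secrets   # secrets file managed by Kubernetes
          readOnly: true
  volumes:
    - name: dc-config
      configMap:
        name: mediawiki-eqiad-appserver              # illustrative per-DC/group ConfigMap
    - name: secrets
      secret:
        secretName: mediawiki-secrets                # illustrative Secret
# Live overrides from etcd would still be merged by MediaWiki itself at runtime,
# via the hook/callback mentioned above, not through Kubernetes.
```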