I have a couple of points to make. Most stem from the diagram but are related to the threat model.
- There are no fine-grained distinctions between developers (e.g. volunteer vs trusted volunteer/staff vs repo owner). Each of those should have different rights and be able to do different things, e.g. we certainly don't want every single change to be able to create a test environment.
- Tangential to the above, both code fetched from the internet and code uploaded to Gerrit are inherently unsafe. We should be very clear about communicating that, because otherwise people might make unsafe assumptions. For example, it means that an LGTM in Gerrit isn't enough: it might very well be that some 3rd party repository is compromised and we end up with cryptominers (in either CI or production).
- There will need to be a way for code to enter that cycle in an embargoed way. Whether that is Gerrit private repos or some way to inject code directly into production, bypassing the entire CI/CD, is debatable, but we definitely need something. Depending on how we do it, we might have to make artifacts non-public, which in principle is already a leak.
- It's quite possible that we need the CI web UI and Gerrit to interact somehow, either directly or via an agent/bot/something that provides useful information and links to developers. This adds attack vectors of course, as that agent/bot/something could be compromised.
- The distinction between the persistent and the temporary object store is unclear. I get the feeling that they are distinct just because of attack vectors on the temporary one, but it's unclear what those are. We should add them.
- It's unclear if the deployment node is automated or user accessible/triggered. Depending on the answer, it might mean different attack vectors and perhaps a need to split it into multiple deployment nodes. For example, if the deployment node that creates the test environments is user accessible, it could be compromised, but that would have no adverse effects on production. Of course, in that paradigm the promotion should be done by a different deployment node.
- The list is nice as a start, but it should probably be unified a bit. For example, the following two entries are currently essentially the same thing:
  - Elevate privilege by impersonating SRE/admin on Gerrit, over ssh.
  - Elevate privilege by impersonating SRE/admin on Gerrit, over HTTP.
- We need severities. I can start by saying that most denial of service attacks are of low severity (excluding the production node capacity one).
- I find the labels on the arrows not very explanatory. Perhaps action verbs would be more useful?
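
To make the point about developer roles a bit more concrete, here is a rough sketch of what a per-role rights table could look like. Everything in it is an assumption for illustration: the role names are taken from the example above, but the action names and who gets which right are made up and would need to be decided as part of the threat model.

```python
from enum import Enum, auto


class Role(Enum):
    VOLUNTEER = auto()
    TRUSTED = auto()      # trusted volunteer / staff
    REPO_OWNER = auto()


class Action(Enum):
    # Hypothetical action names, not taken from the diagram.
    UPLOAD_CHANGE = auto()
    CREATE_TEST_ENV = auto()
    PROMOTE_TO_PRODUCTION = auto()


# Assumed policy, for illustration only: the real mapping is an open question.
PERMISSIONS = {
    Role.VOLUNTEER: {Action.UPLOAD_CHANGE},
    Role.TRUSTED: {Action.UPLOAD_CHANGE, Action.CREATE_TEST_ENV},
    Role.REPO_OWNER: {
        Action.UPLOAD_CHANGE,
        Action.CREATE_TEST_ENV,
        Action.PROMOTE_TO_PRODUCTION,
    },
}


def is_allowed(role: Role, action: Action) -> bool:
    """Return True if the given role may perform the action (deny by default)."""
    return action in PERMISSIONS.get(role, set())
```

With a table like this, a change uploaded by a plain volunteer would not be able to create a test environment, while one from a trusted volunteer or staff member would.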
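
For the points about unifying the list and adding severities, one possible shape is to keep a single entry per threat, record the transports as attributes of it, and attach a severity. The sketch below is only an illustration: the `Threat` and `Severity` names and the `unify` helper are hypothetical, and no severity is assigned to the Gerrit entry because that still needs to be discussed.

```python
from dataclasses import dataclass, field
from enum import Enum


class Severity(Enum):
    UNRATED = 0   # placeholder until a severity is agreed on
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass
class Threat:
    description: str
    vectors: set = field(default_factory=set)
    severity: Severity = Severity.UNRATED


def unify(entries):
    """Merge (description, vector) pairs that share the same description."""
    merged = {}
    for description, vector in entries:
        threat = merged.setdefault(description, Threat(description))
        threat.vectors.add(vector)
    return list(merged.values())


# The two near-duplicate entries from the current list.
entries = [
    ("Elevate privilege by impersonating SRE/admin on Gerrit", "ssh"),
    ("Elevate privilege by impersonating SRE/admin on Gerrit", "HTTP"),
]
threats = unify(entries)
```

Here `threats` ends up with a single entry whose `vectors` are ssh and HTTP, which is roughly what I mean by unifying the list.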