Here's an idea, possibly too crazy: we build new container versions in some sequential order: v1, v2, ...
We deploy each new version, and after it has run acceptably with production traffic for time T, we label it golden. Any non-golden version can be rolled back (possibly automatically, based on error rate or UBN; possibly manually by RelEng/SRE/CPT/...). If rolling back one version isn't enough, we roll back further, stopping at the newest golden version at the latest.
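To make the rollback rule concrete, here is a rough Python sketch. Everything in it is an assumption for illustration: the `Version` record, the `rollback_target` function, and the `healthy` callback (standing in for whatever error-rate or UBN check we end up using) are hypothetical names, not an existing tool.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class Version:
    number: int           # sequential build number: 1 for v1, 2 for v2, ...
    golden: bool = False  # set once the version has run acceptably for time T


def rollback_target(
    history: Sequence[Version],
    current: Version,
    healthy: Callable[[Version], bool],
) -> Optional[Version]:
    """Pick the version to roll back to, or None if no rollback applies.

    history: all built versions, sorted by ascending number.
    healthy: hypothetical check of whether a candidate behaves acceptably.
    """
    if current.golden:
        return None  # only non-golden versions are rolled back
    older = [v for v in history if v.number < current.number]
    # Step back one version at a time, newest first.
    for candidate in reversed(older):
        # Stop at the first acceptable version, and never go past
        # the newest golden version.
        if candidate.golden or healthy(candidate):
            return candidate
    return None  # no older version exists
```

With v1 and v2 golden and v5 live, a `healthy` check that rejects everything walks back through v4 and v3 and stops at v2.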
Every rollback results in an alert to RelEng, SRE, CPT, and anyone with changes in the versions after the newest golden version, up to and including the rolled-back version.
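The alert audience could be computed along these lines. This is only a sketch: `rollback_alert_recipients` and the `authors_by_version` mapping (version number to the people with changes in that build) are made-up names, and it assumes the rolled-back version itself counts as part of the range.

```python
from typing import Iterable, Mapping, Set

# Teams that are always alerted on a rollback.
ALWAYS_ALERTED = frozenset({"RelEng", "SRE", "CPT"})


def rollback_alert_recipients(
    newest_golden: int,
    rolled_back: int,
    authors_by_version: Mapping[int, Iterable[str]],
) -> Set[str]:
    """Return everyone to alert when version `rolled_back` is rolled back.

    Covers versions newest_golden + 1 through rolled_back, inclusive
    (assumed reading of "since the newest golden version").
    """
    recipients = set(ALWAYS_ALERTED)
    for number in range(newest_golden + 1, rolled_back + 1):
        recipients.update(authors_by_version.get(number, ()))
    return recipients
```

For example, if v2 is the newest golden version and v4 gets rolled back, the authors of v3 and v4 are alerted along with the three teams, while authors of v5 and of v2 or earlier are not.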