User story 13, Anonymous rate limit, refers to the use of a "global pool". This has been clarified (elsewhere) to mean that requests from all anonymous agents will contribute to a single rate, with the limit to be applied against that rate. I believe this should be reconsidered.
Assuming some limit L and concurrency C (and assuming for the sake of argument that all agents make requests at the same rate), the effective per-agent rate becomes L/C. L is constant, but C is not, so the effective per-agent limit is unknowable (without visibility of C). How would we communicate this to users? How do we set expectations?
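To make the arithmetic concrete, here is a minimal sketch of how the per-agent share of a global pool shrinks as concurrency grows. The limit of 1000 requests per hour and the function name are hypothetical, chosen only for illustration, and the equal-rate assumption from above still applies.

```python
def effective_per_agent_limit(global_limit: float, concurrency: int) -> float:
    """Per-agent share of a single shared limit, assuming all agents
    make requests at the same rate (the simplifying assumption above)."""
    if concurrency < 1:
        raise ValueError("concurrency must be at least 1")
    return global_limit / concurrency


# Hypothetical global limit L; not a value taken from the user story.
GLOBAL_LIMIT = 1000.0

if __name__ == "__main__":
    for c in (1, 10, 100, 1000):
        share = effective_per_agent_limit(GLOBAL_LIMIT, c)
        print(f"C={c:>4}: each agent effectively gets {share:g} requests")
```

The same agent, behaving identically, sees a different limit depending on how many strangers happen to be active, which is the part we could not document for users.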
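I believe the more conventional approach would be to apply limits on a per-host/IP basis. These limits would be communicated directly to users, and their interpretation would be straightforward: each client gets a fixed, documented budget that does not depend on what other clients are doing.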
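As a rough illustration of per-host/IP limiting, the sketch below keeps an independent fixed-window counter per client address. The class name, the fixed-window strategy, and the parameters are all assumptions made for the example; this is not a proposal for how the gateway should implement it, and a real deployment would need shared state and eviction of idle entries.

```python
import time
from typing import Callable, Dict, Tuple


class PerClientRateLimiter:
    """Fixed-window limiter keyed by client IP (hypothetical sketch)."""

    def __init__(
        self,
        limit: int,
        window_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock
        # Maps client IP -> (window start time, requests seen in window).
        self._windows: Dict[str, Tuple[float, int]] = {}

    def allow(self, client_ip: str) -> bool:
        now = self.clock()
        start, count = self._windows.get(client_ip, (now, 0))
        # Start a fresh window once the current one has expired.
        if now - start >= self.window_seconds:
            start, count = now, 0
        if count >= self.limit:
            self._windows[client_ip] = (start, count)
            return False
        self._windows[client_ip] = (start, count + 1)
        return True
```

With this shape, one client exhausting its budget has no effect on the budget of any other client, so the published limit is the limit each user actually experiences.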
If the concern is that for any given limit, a sufficiently large concurrency will push the system over capacity, then limiting concurrency is probably the appropriate response (and a sufficiently large concurrency could take us down anyway, regardless of rate).
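For comparison, limiting concurrency directly could look something like the sketch below, which caps the number of in-flight requests and rejects the excess instead of queueing it. The class name and the reject-on-overflow policy are assumptions for illustration only.

```python
import threading


class ConcurrencyLimiter:
    """Caps the number of in-flight requests (hypothetical sketch)."""

    def __init__(self, max_in_flight: int) -> None:
        self._sem = threading.BoundedSemaphore(max_in_flight)

    def try_acquire(self) -> bool:
        # Non-blocking: returns False immediately when the cap is reached.
        return self._sem.acquire(blocking=False)

    def release(self) -> None:
        # Must be called exactly once for each successful try_acquire().
        self._sem.release()
```

A caller would wrap request handling in `try_acquire()` / `release()` and return an overload response when `try_acquire()` is false. Unlike the global rate pool, this bounds load on the infrastructure without changing the rate any individual client is promised.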
Finally, since (to the best of my knowledge) this requirement isn't about implementing certain semantics for the sake of the product (the API), but about safeguarding the infrastructure that hosts it, we should really wait for SRE to weigh in. I believe they already have ideas (plans?) with regard to this; it may not even be necessary for us to rate-limit anonymous users (if, for example, this is occurring upstream of the gateway).