Marking this as a draft because some of these points need a bit of review before we can rely on this document for general usage.
E.g. I have just removed the advice to "Work on the assumption that at 2 million records a database table becomes unusable, so if you want 6 months work of data 5 events per second would get you there". Large tables can become a problem, but in that form the statement is plain wrong. Currently about 140 EventLogging tables have more than 2 million rows, and 16 have more than 100 million (per SELECT table_name, TABLE_ROWS FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_SCHEMA = 'log' ORDER BY TABLE_ROWS DESC;). Many of these large tables are enjoying a happy, productive life. Of course 100 million rows may be too many to query at once, but as I understand it, all EL tables are indexed by timestamp, so you can reduce computational effort by restricting queries to certain timespans (see the sketch below).
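For instance, something like the following should only touch one month of data rather than the full table, assuming the timestamp index is used (the table name SomeSchema_12345678 is a made-up example, and I'm assuming the usual EventLogging timestamp format of YYYYMMDDHHMMSS):

  -- Hypothetical schema table; restrict the scan to January 2016 via the timestamp index
  SELECT COUNT(*)
  FROM log.SomeSchema_12345678
  WHERE timestamp BETWEEN '20160101000000' AND '20160131235959';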