Analytics/Techdebt
Appearance
Tech Debt
[edit]Fixed Date
[edit]Hardware
[edit]- Stat1 migration to eqiad (see my email regarding stat1003) -- deadline January 31st 2014.
User Metrics
[edit]Wikimetrics
[edit]* Migrate to prod (https://mingle.corp.wikimedia.org/projects/analytics/cards/1185)
Unprioritized
[edit]Limn
[edit]- MediaWiki OAuth support (Diederik: not sure depends on the decision whether or not we are going to integrate Limn into Mediawiki)
- Migrate to production (particularly deploying 3rd party dependencies) (https://mingle.corp.wikimedia.org/projects/analytics/cards/119)
- LimnPy (Diederik: what needs to happen? solve bugs -- if yes can we file those in bugzilla?
- make it easy to use for everyone
- Saving Ad-hoc created graphs and datasources to a permanent datastore (Diederik: not sure depends on the decision whether or not we are going to integrate Limn into Mediawiki)
- Decoupling query.co from data transformations (to make things like indexed scales much easier) https://mingle.corp.wikimedia.org/projects/analytics/cards/1218
- Indexed charts (https://mingle.corp.wikimedia.org/projects/analytics/cards/88)
- Annotations
- Synopsis page (Diederik: what needs to happen exactly?)
- Decoupling server/client (https://mingle.corp.wikimedia.org/projects/analytics/cards/621) (Diederik: not sure depends on the decision whether or not we are going to integrate Limn into Mediawiki)
Limn Dashboards
[edit]- Implement Private Read (Diederik: still necessary if geowiki is going to either Office wiki or a secure webserver?)
- Legacy dashboards (Diederik: what needs to happen exactly?)
Kraken
[edit]- implement JAR versioning and deployment standards (https://mingle.corp.wikimedia.org/projects/analytics/cards/423)
- make pageview UDFs handle json format log lines
- clean up the pageview logic code
- Nexus/Maven repo migrate out of labs to production (https://mingle.corp.wikimedia.org/projects/analytics/cards/1212)
- fix X-Forwarded-For handling for Ops/Wikipedia-Zero (https://mingle.corp.wikimedia.org/projects/analytics/cards/1191)
- clean up cruft from repo (Unused code paths, unused pig scripts, ...)
- fix Device detection (https://mingle.corp.wikimedia.org/projects/analytics/cards/317, https://mingle.corp.wikimedia.org/projects/analytics/cards/1070)
- clarify licensing
- get CI working (https://mingle.corp.wikimedia.org/projects/analytics/cards/1076)
Wikistats
[edit](EoL from Product POV)
- Running the reports is a SPOF
- Documentation e.g. of csv file formats
Wikimetrics
[edit](pick up and put in WM epics) * Migrate to prod (https://mingle.corp.wikimedia.org/projects/analytics/cards/1185) * MediaWiki OAuth (https://mingle.corp.wikimedia.org/projects/analytics/cards/765) * refactor "old style" tests to use create_test_cohort or similar methods (test code is disjointed between the two approaches) * fix configuration to enable overrides in production / staging (currently we git stash, deploy, git pop)
Wikipedia Zero (Epic)
[edit]- Agree on a pageview definition (Diederik: in progress with Zero team)
- Use ids instead of index numbers to reference series (That would allow users to create graphs that do not break over time)
- Add mobile traffic that does not stem from mobile varnishes to input (e.g.: api traffic) (https://mingle.corp.wikimedia.org/projects/analytics/cards/1190)
- Fix implemented metrics to meet our official definitions
Geowiki
[edit]- Grant access to Jessie's team
- Stop users from jumping countries upon updating GeoIP databases (Diederik: let's ask our customer if they care about this)
- Make sure users are not counted multiple times when aggregating regions (Diederik: let's ask our customer if they care about this)
- Update bot definition (Diederik: let's ask our customer if they care about this)
General
[edit]- Automatically update IPv6 MaxMind databases (Diederik: AFAIK there is not yet a monthly subscription available to maxmind ip6 databases, so no need to update automatically)
- Discuss how the definition of active editors for days (e.g.: geowiki) instead of months (Diederik: this is part of metric standardization)
- Move production services out of labs (geowiki, wikipedia zero) (Diederik: see line 5)