Analytics/Reportcard/2.0/Database design
Appearance
This page is obsolete. It is being retained for archival purposes. It may document extensions or features that are obsolete and/or no longer supported. Do not rely on the information here being up-to-date. |
Database design | |
---|---|
Component | General |
Creation date | |
Author(s) | Erik Zachte, Diederik van Liere |
Document status | draft withdrawn |
Database design Report Card 2.0
[edit]This page documents the database design for the new reportcard ([http://reportcard.wmflabs.org/reportCard/ reportcard.wmflabs.org/reportCard/). The document consists of two sections:
- the current database design consists of 10 tables. Most of these tables have the same structure.
- a new simplified proposed design. Please chime in on the Talk page with your thoughts.
The high level goals are documented here Analytics/Reportcard/Requirements.
Current Table Designs
[edit]CREATE TABLE `binaries` (
`date` date NOT NULL,
`project_code` char(2) NOT NULL DEFAULT '',
`language_code` char(15) NOT NULL DEFAULT '',
`extension` varchar(10) NOT NULL DEFAULT '',
`binaries` bigint(15) DEFAULT NULL,
PRIMARY KEY (`date`,`project_code`,`language_code`,`extension`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
CREATE TABLE `page_views` (
`date` date NOT NULL,
`project_code` char(2) NOT NULL DEFAULT '',
`language_code` char(15) NOT NULL DEFAULT '',
`views_non_mobile_raw` bigint(15) DEFAULT NULL,
`views_mobile_raw` bigint(15) DEFAULT NULL,
`views_non_mobile_normalized` bigint(15) DEFAULT NULL,
`views_mobile_normalized` bigint(15) DEFAULT NULL,
`views_raw` bigint(15) DEFAULT NULL,
`views_normalized` bigint(15) DEFAULT NULL,
PRIMARY KEY (`date`,`project_code`,`language_code`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
CREATE TABLE `offline` (
`date` date NOT NULL,
`source` varchar(255) NOT NULL DEFAULT '',
`readers` bigint(12) NOT NULL DEFAULT '0',
PRIMARY KEY (`date`,`source`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
CREATE TABLE `mobilepageviews` (
`date` date NOT NULL,
`language_code` char(15) NOT NULL DEFAULT '',
`project_code` varchar(10) NOT NULL DEFAULT '',
`country_code` varchar(3) NOT NULL DEFAULT '',
`value` bigint(12) NOT NULL DEFAULT '0',
PRIMARY KEY (`date`,`language_code`,`project_code`,`country_code`,`value`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
CREATE TABLE `editortargets` (
`date` date NOT NULL,
`language_code` char(15) NOT NULL DEFAULT '',
`project_code` varchar(10) NOT NULL DEFAULT '',
`country_code` varchar(3) NOT NULL DEFAULT '',
`value` bigint(12) NOT NULL DEFAULT '0',
PRIMARY KEY (`date`,`language_code`,`project_code`,`country_code`,`value`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
CREATE TABLE `editorsbygeography` (
`date` date NOT NULL,
`language_code` char(15) NOT NULL DEFAULT '',
`project_code` varchar(10) NOT NULL DEFAULT '',
`country_code` varchar(3) NOT NULL DEFAULT '',
`value` bigint(12) NOT NULL DEFAULT '0',
PRIMARY KEY (`date`,`language_code`,`project_code`,`country_code`,`value`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
CREATE TABLE `wikistats` (
`date` date NOT NULL,
`project_code` varchar(10) NOT NULL DEFAULT '',
`language_code` varchar(15) NOT NULL DEFAULT '',
`editors_all_time` int(10) DEFAULT NULL,
`editors_new` int(7) DEFAULT NULL,
`editors_ge_5` int(7) DEFAULT NULL,
`editors_ge_25` int(7) DEFAULT NULL,
`editors_ge_100` int(7) DEFAULT NULL,
`articles` int(12) DEFAULT NULL,
`articles_new_per_day` int(9) DEFAULT NULL,
`articles_over_bytes_500` int(12) DEFAULT NULL,
`articles_over_bytes_2000` int(12) DEFAULT NULL,
`edits_per_article` decimal(9,1) DEFAULT NULL,
`bytes_per_article` decimal(9,1) DEFAULT NULL,
`edits` int(12) DEFAULT NULL,
`size_in_bytes` int(15) DEFAULT NULL,
`size_in_words` int(15) DEFAULT NULL,
`links_internal` int(15) DEFAULT NULL,
`links_interwiki` int(15) DEFAULT NULL,
`links_image` int(15) DEFAULT NULL,
`links_external` int(15) DEFAULT NULL,
`redirects` int(15) DEFAULT NULL,
PRIMARY KEY (`date`,`project_code`,`language_code`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
Suggested Table Design
[edit]Summary of changes:
- Merge different tables in one single table
- change CHARSET to UTF8
- change column type date to timestamp
- set engine to Innodb instead of MyISAM.
- add view column which is of type ENUM
- change size of extension field to 4
- change size of language_code field to 4
- change bigint to bigint unsigned as 'value' will always be 0 or large
- rename binaries to value
- rename table name to aggregated_data
CREATE TABLE `analytics_aggregation` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`hashed_key` integer NOT NULL,
`trait_instance_key` varchar(512) DEFAULT NULL COMMENT 'hashed_key is generated from this more descriptive key',
`count` bigint(9) unsigned DEFAULT NULL,
`project` char(32) NOT NULL COMMENT 'Wikimedia project code',
`language_code` char(4) NOT NULL COMMENT 'Wikimedia language code, usually derived from project URL',
`aggregate_timestamp` timestamp NULL DEFAULT NULL COMMENT 'timestamp for which this aggregate counts data',
`granularity` integer NULL DEFAULT NULL COMMENT 'time granularity in seconds for which this aggregate counts data ',
`theme` enum('readers','editors','devices','articles','diversity','ecosystem','files','context') DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `hashed_key` (`hashed_key`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;