Readers/Web/Metrics Platform Adoption/Sampling Rates

From mediawiki.org

Overview

This document provides an overview of configuration settings and sampling rates for event tracking across different wikis on both the Event Platform and Metrics Platform. It outlines how these settings are implemented in configuration files and how the sampling rates are structured for desktop and mobile tracking.

Key Changes in Sampling Rates

This section highlights the significant changes in sampling rates made during the transition from the Event Platform to the Metrics Platform. These changes affect how much data is collected on each wiki and for each device type (desktop or mobile).

  1. Consistency and Adjustment in Default Sampling Rates:
    • The default sampling rates for both desktop and mobile events are now consistent at 0.2 in the Metrics Platform.
    • Previously in the Event Platform, the default sampling rate for mobile was 0.1. This increase aligns mobile tracking with desktop rates, facilitating uniform data collection and analysis.
  2. Explicit Setting for OfficeWiki:
    • In the Metrics Platform, OfficeWiki's sampling rate has been explicitly set to 0, which now covers mobile as well as desktop. This is a change from the Event Platform, where the mobile configuration defined no explicit rate for OfficeWiki, so the mobile default applied. This adjustment ensures that no mobile data is unintentionally collected from OfficeWiki.

Configuration Files

Event Platform

Metrics Platform

Sampling Rates

EventPlatform DesktopWebUIActions Sampling Rates

Sampling rates for desktop events are defined in InitialiseSettings.php under the variable wgWMEDesktopWebUIActionsTracking. The variable holds a default rate plus per-wiki overrides, including special values for testing and other scenarios:

'wgWMEDesktopWebUIActionsTracking' => [
    'default' => 0.2,
    'legacy-vector' => 0,
    'officewiki' => 0,
    'testwiki' => 1,
    'enwiki' => 0.01,
    // Additional specific wiki configurations
    ...
],

DISCLAIMER: The sampling rates differ between logged-in users and anonymous users. For detailed configuration, refer to the DesktopWebUIActionsTracking configuration.

EventPlatform MobileWebUIActions Sampling Rates

MobileWebUIActionsTracking is configured in the same file, under the variable wgWMEMobileWebUIActionsTracking, with similar specific settings per wiki:

'wgWMEMobileWebUIActionsTracking' => [
    'default' => 0.1,
    'enwiki' => 0.01,
    'testwiki' => 1,
    // Additional specific wiki configurations
    ...
],
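
To make the per-wiki lookup concrete, here is a minimal sketch of how a rate could be resolved from arrays shaped like the two above. The helper name resolveSamplingRate and the wiki name examplewiki are hypothetical, and the sketch assumes a plain "explicit entry, else default" rule. The real configuration loader also handles group keys such as legacy-vector, which this sketch ignores.

```php
<?php
// Hypothetical helper (not part of the real configuration code): return the
// explicit rate for a wiki if one exists, otherwise the 'default' rate.
function resolveSamplingRate(array $rates, string $wiki): float {
    return (float)($rates[$wiki] ?? $rates['default']);
}

// Values copied from the snippets above; the elided entries are left out.
$desktop = [ 'default' => 0.2, 'officewiki' => 0, 'testwiki' => 1, 'enwiki' => 0.01 ];
$mobile  = [ 'default' => 0.1, 'enwiki' => 0.01, 'testwiki' => 1 ];

// 'examplewiki' is a made-up name standing in for any wiki with no explicit entry.
echo resolveSamplingRate($desktop, 'examplewiki'), "\n"; // falls back to the desktop default
echo resolveSamplingRate($mobile, 'examplewiki'), "\n";  // falls back to the mobile default
echo resolveSamplingRate($mobile, 'officewiki'), "\n";   // no mobile entry, so the mobile default applies
```

Under this reading, officewiki is excluded on desktop by its explicit 0 but falls through to the default on mobile, because the mobile array has no officewiki entry.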

Metrics Platform Combined Desktop and Mobile Sampling Rates

In the Metrics Platform, desktop and mobile events are combined into a single event stream. Configuration is located starting at line 942 in ext-EventLogging.php. The default sampling rate for both desktop and mobile is set to 0.2. The configuration is detailed as follows:

'mediawiki.web_ui_actions' => [
    'schema_title' => 'analytics/mediawiki/product_metrics/web_ui_actions',
    'destination_event_service' => 'eventgate-analytics-external',
    'producers' => [
        'metrics_platform_client' => [
            'provide_values' => [
                // Various properties collected for events
                ...
            ],
        ],
    ],
    // Sampling is a stream-level setting, so it sits beside 'producers'
    // (the same level the per-wiki overrides use)
    'sample' => [
        'unit' => 'session',
        'rate' => 0.2,
    ],
],
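
As an illustration of what a 'session' unit with a rate of 0.2 means in practice, the following is a minimal sketch of deterministic session-based sampling. It is not the Metrics Platform's actual implementation. The function name isSessionInSample is hypothetical, and the sketch assumes the session ID is a hexadecimal string whose first 8 characters can be read as a 32-bit number.

```php
<?php
// Hypothetical sketch: decide once per session whether it is in the sample.
// Assumption: $sessionId is a hex string of at least 8 characters.
function isSessionInSample(string $sessionId, float $rate): bool {
    // Read the first 8 hex characters as an integer in 0 .. 2^32 - 1.
    $value = hexdec(substr($sessionId, 0, 8));
    // Scale into [0, 1) and compare with the rate, so rate 0 never
    // matches and rate 1 always matches.
    return ($value / 4294967296.0) < $rate;
}

// The session IDs below are made up for illustration.
var_dump(isSessionInSample('20000000aaaaaaaaaaaa', 0.2)); // 0x20000000 / 2^32 = 0.125, in sample
var_dump(isSessionInSample('80000000aaaaaaaaaaaa', 0.2)); // 0x80000000 / 2^32 = 0.5, not in sample
```

Because the decision depends only on the session ID, every event in a given session is either kept or dropped together, which is the point of sampling by session rather than by individual event.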

Special Wiki Configurations

Certain wikis and wiki groups have special configurations. legacy-vector and officewiki are explicitly set to a sampling rate of 0 in the Metrics Platform. testwiki retains a sampling rate of 1 and enwiki a sampling rate of 0.01, matching their settings in the Event Platform. These overrides have been placed at the end of the file, per the guidance of the Data Engineering team.

'+legacy-vector' => [
    'mediawiki.web_ui_actions' => [
        'sample' => [
            'rate' => 0,
        ],
    ],
],
'+officewiki' => [
    'mediawiki.web_ui_actions' => [
        'sample' => [
            'rate' => 0,
        ],
    ],
],
'+testwiki' => [
    'mediawiki.web_ui_actions' => [
        'sample' => [
            'rate' => 1,
        ],
    ],
],
'+enwiki' => [
    'mediawiki.web_ui_actions' => [
        'sample' => [
            'rate' => 0.01,
        ],
    ],
],
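
The overrides above only restate the rate, so the remaining stream settings presumably come from the base definition. The sketch below shows one way such a merge could work, using PHP's array_replace_recursive. It assumes the '+' prefix means "merge into the default rather than replace it", drops the prefix for simplicity, and uses a hypothetical helper name, effectiveStreamConfig. The real loader may behave differently.

```php
<?php
// Base stream settings, trimmed to the sampling block.
$base = [
    'mediawiki.web_ui_actions' => [
        'sample' => [ 'unit' => 'session', 'rate' => 0.2 ],
    ],
];

// Per-wiki overrides from the snippet above ('+' prefix dropped in this sketch).
$overrides = [
    'legacy-vector' => [ 'mediawiki.web_ui_actions' => [ 'sample' => [ 'rate' => 0 ] ] ],
    'officewiki'    => [ 'mediawiki.web_ui_actions' => [ 'sample' => [ 'rate' => 0 ] ] ],
    'testwiki'      => [ 'mediawiki.web_ui_actions' => [ 'sample' => [ 'rate' => 1 ] ] ],
    'enwiki'        => [ 'mediawiki.web_ui_actions' => [ 'sample' => [ 'rate' => 0.01 ] ] ],
];

// Hypothetical helper: overlay a wiki's override on the base settings.
// Keys the override does not mention (such as 'unit') keep their base values.
function effectiveStreamConfig(array $base, array $overrides, string $wiki): array {
    return array_replace_recursive($base, $overrides[$wiki] ?? []);
}

$enwiki = effectiveStreamConfig($base, $overrides, 'enwiki');
echo $enwiki['mediawiki.web_ui_actions']['sample']['rate'], "\n"; // the enwiki override
echo $enwiki['mediawiki.web_ui_actions']['sample']['unit'], "\n"; // inherited from the base
```

A wiki with no override entry simply receives the base settings unchanged, which corresponds to the 0.2 default.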

Notes

  • These configurations are intended to be consistent across both platforms. The main structural change is the move from separate mobile and desktop events in the Event Platform to a single combined stream in the Metrics Platform, which aims to streamline data collection and analysis.