Product Analytics/Dashboarding Guidelines
Publishing/sharing
Before publishing or sharing your Superset dashboard, please double-check that you have:
- contact information
- correct access and permissions for your data and your audience
Refer to the sections below for details.
Contact Info
Add the contact information as Markdown at the top or bottom of the dashboard, using the following template:
This dashboard is maintained by {NAME}, [Product Analytics](https://www.mediawiki.org/wiki/Product_Analytics). If you have questions or feedback, please email {name}@wikimedia.org or product-analytics@wikimedia.org.
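If you maintain several dashboards, filling in the template by hand can get repetitive. The following is a minimal sketch of a helper that renders the Markdown text; the function name `contact_footer` and its parameters are hypothetical and not part of any Superset or wmfdata API.

```python
def contact_footer(name, email_user):
    """Fill in the contact template with a maintainer name and email username."""
    return (
        f"This dashboard is maintained by {name}, "
        "[Product Analytics](https://www.mediawiki.org/wiki/Product_Analytics). "
        f"If you have questions or feedback, please email {email_user}@wikimedia.org "
        "or product-analytics@wikimedia.org."
    )


# Example with a made-up maintainer; paste the printed text into a Markdown component.
print(contact_footer("Jane Doe", "jdoe"))
```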
Permissions
Virtual datasets
For Presto-based charts that rely on virtual datasets derived from event data, make sure the stakeholder has been added to the analytics-privatedata-access group.
If they have not, ask them to request access through Phabricator. Refer to T286746 as an example.
Physical datasets
For charts that rely on Hive tables added as physical datasets, make sure that users outside of your group have read access to the files in Hadoop:
hdfs dfs -chmod -R o+r <path to your table>
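Before or after running the command, it can help to confirm whether the read bit for "others" is actually set. The following is a rough sketch, assuming the `hdfs` CLI is on your PATH and that `hdfs dfs -ls -d` prints a single listing line whose first field is the permission string; the helper names are hypothetical. It inspects only the top-level path, not every file beneath it.

```python
import subprocess


def others_can_read(permissions):
    """Return True if a permission string such as 'drwxr-xr-x' grants read to others."""
    # Index 0 is the file type; 1-3 are owner, 4-6 group, 7-9 others.
    return len(permissions) >= 10 and permissions[7] == "r"


def top_level_permissions(path):
    """Return the permission string that `hdfs dfs -ls -d` reports for path."""
    result = subprocess.run(
        ["hdfs", "dfs", "-ls", "-d", path],
        capture_output=True,
        text=True,
        check=True,
    )
    # The first whitespace-separated field of the listing line is the permission string.
    return result.stdout.strip().splitlines()[-1].split()[0]
```

On a host with HDFS access, `others_can_read(top_level_permissions("<path to your table>"))` would then indicate whether the chmod is still needed for that directory.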
Example
Suppose you ran your ETL and created a countries.csv file that you then made available in Hive via:
import wmfdata as wmf

# Load the CSV into Hive as canonical_data.countries
wmf.hive.load_csv(
    "countries.csv",
    field_spec="name string, iso_code string, economic_region string, maxmind_continent string",
    db_name="canonical_data",
    table_name="countries"
)
You then add it as a physical dataset within Superset and create a chart that relies on it. To make sure that everyone can view that chart (and its dashboard), you would update permissions with:
hdfs dfs -chmod -R o+r /user/hive/warehouse/canonical_data.db/countries
If you loaded the data into Hive manually and it is stored elsewhere, change the path accordingly.
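The path in the example follows the pattern /user/hive/warehouse/<database>.db/<table>. As a minimal sketch, assuming your table lives under that same warehouse root, the permission update could be wrapped in Python as shown here. The helper names and the `dry_run` flag are hypothetical; with `dry_run=True` the command is only printed, not executed.

```python
import subprocess

# Warehouse root taken from the example path above; adjust if your data lives elsewhere.
WAREHOUSE_ROOT = "/user/hive/warehouse"


def warehouse_path(db_name, table_name):
    """Build the HDFS path of a table stored under the warehouse root."""
    return f"{WAREHOUSE_ROOT}/{db_name}.db/{table_name}"


def chmod_args(path):
    """Argument list for granting read access to others, recursively."""
    return ["hdfs", "dfs", "-chmod", "-R", "o+r", path]


def grant_read(db_name, table_name, dry_run=True):
    """Print the chmod command, or run it when dry_run is False."""
    args = chmod_args(warehouse_path(db_name, table_name))
    if dry_run:
        print(" ".join(args))
    else:
        # Requires the hdfs CLI and sufficient rights on the path.
        subprocess.run(args, check=True)
    return args


grant_read("canonical_data", "countries")
```

Running the command for real requires calling `grant_read(..., dry_run=False)` on a host where the `hdfs` CLI is available.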