Wikimedia Research/Showcase/Archive/2022/05
May 2022
May 18, 2022 Video: YouTube
- Ms. Categorized: Gender, notability, and inequality on Wikipedia
- By Francesca Tripodi (University of North Carolina at Chapel Hill)
- For the last five decades, sociologists have argued that gender is one of the most pervasive and insidious forms of inequality. Research demonstrates how these inequalities persist on Wikipedia, arguably the largest encyclopedic reference in existence. Roughly eighty percent of Wikipedia's editors are men, and pages about women and women's interests are underrepresented. English-language Wikipedia contains more than 1.5 million biographies of notable writers, inventors, and academics, but fewer than nineteen percent of these biographies are about women. To improve these statistics, activists host "edit-a-thons" to increase the visibility of notable women. While this strategy helps create biographies that previously did not exist, it fails to address a more inconspicuous form of gender exclusion. Drawing on ethnographic observations, interviews, and quantitative analysis of web-scraped metadata, this talk demonstrates that women's biographies are more frequently considered non-notable and nominated for deletion than men's biographies (a minimal sketch of this kind of rate comparison follows the paper link below). This disproportionate rate is another dimension of gender inequality on Wikipedia previously unexplored by social scientists and provides broader insight into how women's achievements are (under)valued in society.
Relevant paper: Ms. Categorized: Gender, notability, and inequality on Wikipedia - Francesca Tripodi, 2021 (sagepub.com)
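The deletion-rate disparity described in the abstract boils down to comparing two proportions: the share of women's biographies nominated for deletion versus the share of men's. The sketch below illustrates that comparison with a two-proportion z-test; the counts are invented placeholders and the choice of test is an assumption made for illustration, not the paper's actual data or method.

```python
# A minimal sketch (not from the talk) of the kind of quantitative
# comparison the abstract describes: are biographies of women nominated
# for deletion at a higher rate than biographies of men?
# All counts below are hypothetical placeholders, not real Wikipedia data.
from statsmodels.stats.proportion import proportions_ztest

# (articles nominated for deletion, total articles) per group -- hypothetical
women_nominated, women_total = 1_200, 40_000
men_nominated, men_total = 3_000, 160_000

print(f"nomination rate, women's bios: {women_nominated / women_total:.2%}")
print(f"nomination rate, men's bios:   {men_nominated / men_total:.2%}")

# One-sided two-proportion z-test: is the women's rate higher?
stat, p_value = proportions_ztest(
    count=[women_nominated, men_nominated],
    nobs=[women_total, men_total],
    alternative="larger",
)
print(f"z = {stat:.2f}, p = {p_value:.4f}")
```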
- Controlled Analyses of Social Biases in Wikipedia Bios
- By Yulia Tsvetkov (University of Washington)
- Social biases on Wikipedia could greatly influence public opinion. Wikipedia is also a popular source of training data for NLP models, and subtle biases in Wikipedia narratives are liable to be amplified in downstream NLP models. In this talk I'll present two approaches to unveiling social biases in how people are described on Wikipedia, across demographic attributes and across languages. First, I'll present a methodology that isolates dimensions of interest (e.g., gender) from other attributes (e.g., occupation). This methodology allows us to quantify systemic differences in coverage of different genders and races while controlling for confounding factors (a toy sketch of this matched-comparison idea follows the paper links below). Next, I'll show an NLP case study that uses this methodology in combination with people-centric sentiment analysis to identify disparities in Wikipedia bios of members of the LGBTQIA+ community across three languages: English, Russian, and Spanish. Our results surface cultural differences in narratives and signs of social biases. Practically, these methods can be used to automatically identify Wikipedia articles for further manual analysis: articles that might contain content gaps or an imbalanced representation of particular social groups.
Relevant papers: TheWebConf'22, ICWSM'21
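As a rough illustration of the controlled-comparison idea in the abstract, the sketch below computes a sentiment gap between groups while matching on a potential confounder (occupation). The records, labels, and precomputed sentiment scores are hypothetical, and the stratified averaging is a simplification for illustration, not the authors' released code or exact methodology.

```python
# A toy sketch of a controlled comparison: measure a sentiment gap across
# a target attribute (gender) while matching on a confounder (occupation),
# so that differences are not driven by a different occupation mix.
# Sentiment scores are assumed to come from some upstream sentiment model.
from collections import defaultdict
from statistics import mean

# Hypothetical records: one per biography, with a precomputed sentiment
# score in [-1, 1], a gender label, and an occupation for matching.
bios = [
    {"gender": "female", "occupation": "physicist", "sentiment": 0.31},
    {"gender": "male",   "occupation": "physicist", "sentiment": 0.45},
    {"gender": "female", "occupation": "novelist",  "sentiment": 0.52},
    {"gender": "male",   "occupation": "novelist",  "sentiment": 0.48},
    # ... many more records in a real analysis
]

def matched_gap(records, attr="gender", match_on="occupation"):
    """Average within-stratum sentiment gap (female minus male),
    computed only over strata containing both groups."""
    strata = defaultdict(lambda: defaultdict(list))
    for r in records:
        strata[r[match_on]][r[attr]].append(r["sentiment"])
    gaps = [
        mean(groups["female"]) - mean(groups["male"])
        for groups in strata.values()
        if "female" in groups and "male" in groups
    ]
    return mean(gaps) if gaps else float("nan")

print(f"occupation-matched sentiment gap: {matched_gap(bios):+.3f}")
```

Averaging within occupation strata before averaging across strata keeps an occupation-heavy group from dominating the comparison, which is the essence of controlling for a confounding factor.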