Wikimedia Discovery/Meetings/Checkin/2017-10-03
Topics from the past
- Wikimedia Developer Summit 2018 (Jan 22-23): Call for Position Statements, due Friday 9/29
- Chris: We currently have fewer position statements than spaces for the developer summit by a significant number and the deadline for applying is in 3 days. You are encouraged to apply! The odds are ever in your favor.
Announcements, Information, Questions
- Quarterly Review for Search Platform (Technology) scheduled for 10/16
- Quarterly Review for Audiences:Readers scheduled for 10/18
- Deployment schedule (train and SWAT) for upcoming (end of year) holiday period:
- No MW train the week of Thanksgiving (Nov 20th), SWATs open for high priority things
- No deploys weeks of Dec 18th and 25th (last two weeks)
- Normal week the week of Jan 1st (except no deploys that Monday)
- The Dev Summit and WMF All Hands are the week of January 22nd, so that will be a "No Train but SWATs OK on Mon/Tues/Wed" week.
- The following week (week of January 29th) the Release Engineering team will be on an offsite, so a week of "No Train, but SWATs and service deploys OK".
- More information: https://lists.wikimedia.org/pipermail/engineering/2017-September/000475.html
Scrum of Scrums
Are we blocked?
- None
Are we blocking?
- None
Other dependencies (in either direction) which don’t need to be called out as “blocked” (e.g. are progressing smoothly, have no urgency, etc.)
- None
Discovery News
Quick Quarterly Goals/KPI Update (if needed)
Discovery Roadmap FY 2017/18: https://docs.google.com/a/wikimedia.org/presentation/d/1N41eNrz0vFHJamLkhQjSFCOSiDg-bKr5H3ROchqwVJU/edit?usp=sharing
FY 2017-18 Q2 (Oct-Dec) goals: https://www.mediawiki.org/wiki/Wikimedia_Discovery#Projects
This status was last updated 2017-10-03. Completed/dropped goals may not be shown.
Tech:Search Platform
Search:
1. Implement advanced methodologies such as “learning to rank” machine learning techniques and signals to improve search result relevance across language Wikipedias.
- Begin to automate the machine learning pipeline, starting by targeting eight to ten languages, other than English, that match (at a minimum) current performance and then deploy those models.
2. Improve support for multiple languages by researching and deploying new language analyzers as they make sense to individual language wikis.
- Investigate open source language software that is available and see if it can be converted into ElasticSearch plugins.
- Investigate usage of fall-back languages and fuzzy (phonetic) matching.
- Continue general language support.
3. Investigate how to expand and scale Wikidata Query Service to improve its ability to power features on-wiki for readers
- Work on sub-category filtering and searching within the Wikidata Query Service.
4. Address technical debt:
- Convert existing Selenium tests to Node.js
- Investigate ownership and maintenance of Logstash
Structured Data on Commons:
1. Commons search will be extended via CirrusSearch, ElasticSearch, and Wikidata Query Service to support searching based on structured data elements describing media.
- Determine advanced search requirements and measures for structured data on commons.
2. Advanced search capabilities (e.g., Wikidata Query Service, SPARQL queries) will be updated to support the more specific media search filters and the relationships to the topics they represent
- Begin work on prefix- and full-text search in ElasticSearch on Wikidata in preparation for the Structured Data on Commons project.
WDQS: The Wikidata Query Service goal for this quarter is to work on sub-category filtering and searching within the service. Stas and Guillaume will maintain it to support its continued growth and use, and the Analysis team will help with statistics.
Audiences:Readers:Discovery
https://www.mediawiki.org/wiki/Wikimedia_Audiences/2017-18_Q2_Goals#Readers
Portal: Update the Wikipedia.org portal codebase to be completely automated for ease of ongoing maintenance.
- Automate portal project updates: statistics and translations
Maps: Support the move to be more operationally centralized and roll out a new map style that has numerous updates and enhancements.
- Finalize and deploy new map style; replicate maps test cluster in Wikimedia Cloud Service; monitor for critical bugs
Analysis: The team will continue to work closely with the Search Platform team to analyze A/B tests and other assorted data; they will also begin working on determining a baseline set of metrics for Structured Data on Commons.
FYI
- Oct 9: US Holiday
- Nov 11: US Holiday
- Nov 23-24: US Holiday
- Dec 25-29: WMF Holiday week
- OOO
- Erika out 10/3-10/4 for Agile Open NorCal conference
- Stas on PTO 10/5-10/6, nothing strange or loopy
- Erika out 10/13 for Transformative Technology Conference (strange AND loopy)
- Paul off October 19 - 22 + travel for State of the Map US in Boulder, CO
- Deb off October 20 for State of the Map US in Boulder, CO
- Chelsy off October 20 for vacation (instead of Columbus on Oct 9)
- Chris at Readers Apps off-site Oct 16-20 in Philadelphia
- Chris at Readers Web off-site Oct 23-27 in Queens, NY
- Mikhail at Readers Apps off-site Oct 16-20 in Philadelphia
- Stas at WikidataCon 28-29 Oct in Berlin (out on 26th, back on Oct 31)
- Guillaume on vacation Nov 20-24