Wikimedia Engineering/Report/2013/March
Engineering metrics in March:
- Approximately 113 unique committers contributed patchsets of code to MediaWiki.
- The total number of unresolved commits went from about 830 to about 816.
- About 48 shell requests were processed.
- Wikimedia Labs now hosts 154 projects and 1,103 users; to date 1,641 instances have been created.
Major news in March includes:
- Open registration for the Amsterdam hackathon, including travel sponsorship;
- Lua scripting launched on all WMF wikis, with implications for performance and structured data use;
- A redesign of the Translate interface and other progress on translation and language-related tools;
- Fresh, friendly instructions on reporting a technical problem and an invitation to help prioritize problems to fix;
- Greater ability to upload images to Commons from mobile phones, allowing you to directly add a photo to a Wikipedia article that has no image;
- Collaboration with the Noun Project towards creating an "Encyclopedia Collection" of free icons;
- A deeper look at Parsoid and the challenge of a better editing interface;
- Wikipedia Zero winning a SXSW Interactive award for activism and gaining a new partner, Axiata.
Note: We're also providing a shorter, simpler and translatable version of this report that does not assume specialized technical knowledge.
Upcoming events
There are many opportunities for you to get involved and contribute to MediaWiki and technical activities to improve Wikimedia sites, both for coders and contributors with other talents.
For a more complete and up-to-date list, check out the Project:Calendar.
Personnel
Are you looking to work for Wikimedia? We are hiring for many positions, and we really love talking to active community members about these roles.
In Engineering:
- Director of Analytics
- Software Engineer - Editor Engagement
- Software Engineer - Parser
- Software Engineer - Language Engineering
- Software Engineer - Mobile
- Software Engineer - Multimedia Systems
- Software Engineer - Multimedia User Interfaces
- Software Engineer - Search
- Product Manager - Mobile
- Director of User Experience
- Visual Designer
- Dev-Ops Engineer (SRE)
- Operations Engineer - Database Administrator
We have seven additional openings in other Foundation departments.
Announcements
Two new full-time employees started in WMF engineering in March:
- Yuri Astrakhan, Senior Software Engineer in the Mobile group (announcement).
- Adam Baso, Senior Software Engineer, Mobile (Engineering) (announcement).
Technical Operations
Site infrastructure
- This month we saw a few short site glitches that lasted from about a minute to ten minutes each. The outages did not noticeably affect readers, but editors and contributors experienced intermittent problems.
- The first incident was triggered by a deployment of Article Feedback Tool v5, and once the code was reverted, the site outage ended. The incident lasted for about 10 minutes (incident documentation).
- The other two were jobqueue-related, according to Asher Feldman. The current MySQL jobqueue implementation is far too costly. Analyzing the data from that 24-hour period, we see that 75% of all queries that take over 450ms to run on the English Wikipedia master are related to the jobqueue, and all major actions result in replicated writes. In fact, the jobqueue takes 58% of all query execution time when the analysis is not limited to queries over the slow threshold. If 1 million refresh-links jobs are queued as quickly as possible without paying attention to replication lag, the Apache servers time out because of the resulting lag. MediaWiki depends on reading from slaves to scale, and avoids lagged ones. If all slaves are lagged, the master is used for everything, and if this happens to English Wikipedia, the site falls over. The MySQL jobqueue was identified as a scaling bottleneck a while ago, and thus we will be switching to Redis very soon. We're currently aiming for that switch to coincide with the release of 1.22wmf1, but we may be able to backport to 1.21wmf12 and get this done in early April.
- On March 12, we experienced an Esams site outage which was probably caused by packet loss between Esams and Eqiad. Leslie changed routes from Esams to Eqiad to fix the packet loss, which caused Esams to recover. While we still don't clearly understand what caused the outage, we did notice that it coincided with the news of the new Pope's election, which triggered a surge in traffic to our web properties.
- In March, we had a short security sprint led by Leslie Carr. We patched servers that needed security upgrades. In addition, we continued to work on MariaDB migration, Ceph deployment and fixing Varnish bugs.
- TechOps has initiated a fortnightly meeting with the engineering teams to drive alignment amongst the various engineering projects and TechOps regarding requirements and expectations. This is also the process to surface potential deployment issues (such as capacity demand, new infrastructure and performance). Meeting minutes are documented on the meeting Etherpad.
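The failure mode in the jobqueue incident above (reads avoid lagged slaves, and fall back to the master once every slave is lagged) can be illustrated with a minimal sketch. This is not MediaWiki's load balancer; the `Replica` class, the `pick_read_server` function, the server names and the 5-second threshold are all hypothetical, chosen only to show the selection logic.

```python
import random
from dataclasses import dataclass


@dataclass
class Replica:
    name: str
    lag_seconds: float  # replication lag as last measured (hypothetical field)


def pick_read_server(replicas, master, max_lag_seconds=5.0, rng=random):
    """Return a replica whose lag is acceptable, else fall back to the master."""
    healthy = [r.name for r in replicas if r.lag_seconds <= max_lag_seconds]
    if healthy:
        return rng.choice(healthy)
    # Every replica is lagged: all reads now land on the master,
    # which is the overload scenario described in the incident above.
    return master


replicas = [Replica("db-a", 0.4), Replica("db-b", 42.0)]
print(pick_read_server(replicas, master="db-master"))  # db-a is the only healthy replica
```

Under these assumptions, a burst of queued jobs that pushes every replica past the threshold shifts all read traffic to the master at once, which is why queueing without regard to replication lag is dangerous.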
Fundraising
- Added logging to fundraising deployment scripts.
- Work is continuing on tools for import. Setting up a local copy of a wiki which includes only a subset of the page content has always been problematic, since this requires use of the notoriously slow and finicky importDump.php maintenance script. Under development is a tool to filter the currently produced SQL table dumps against a list of page IDs of a content subset; these tables could then be imported into a MySQL database, along with tables produced from the content subset, bypassing the need for importDump.php. Additionally, these SQL files could be shared with other users who are interested in the same content subset. We hope to be able to launch this in April.
- A schedule has been posted for LabsDB. Some glusterfs stability work has been done, and we've begun work on a replacement for project storage. A new feature has been added in support of tool labs: service users and groups that can be managed per project. A number of interface fixes were made, such as adding the admin list to the project page and fixing the layout of the instance pages. Network changes were made to the instances and network hosts: the network node has 3 bonded 1GB NICs and all instances were changed to use the virtio network driver, which increases their speed to the speed of the host. Work on tool labs progressed well this month. Most of the necessary infrastructure is available for many tools and bots.
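The dump-filtering idea described above (keeping only the rows of an SQL table dump that belong to a list of page IDs) might look roughly like the following sketch. It is not the tool under development; the function names are hypothetical, and it assumes that each `INSERT` statement sits on a single line and that the page ID is the first column of every row, which does not hold for every table.

```python
def split_rows(values_sql):
    """Split "(1,'a'),(2,'b')" into ["(1,'a')", "(2,'b')"], respecting quotes."""
    rows, start, depth, in_quote, escaped = [], None, 0, False, False
    for i, ch in enumerate(values_sql):
        if in_quote:
            if escaped:
                escaped = False          # character after a backslash is literal
            elif ch == "\\":
                escaped = True
            elif ch == "'":
                in_quote = False
        elif ch == "'":
            in_quote = True
        elif ch == "(":
            if depth == 0:
                start = i
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth == 0:
                rows.append(values_sql[start:i + 1])
    return rows


def filter_insert(line, wanted_ids):
    """Keep only rows whose first column (assumed to be the page ID) is wanted."""
    prefix, sep, values = line.partition(" VALUES ")
    if not sep or not prefix.startswith("INSERT INTO"):
        return line  # pass through DDL, comments, etc. unchanged
    kept = [
        row for row in split_rows(values.rstrip().rstrip(";"))
        if int(row[1:-1].split(",", 1)[0]) in wanted_ids
    ]
    if not kept:
        return ""  # no wanted rows in this statement
    return prefix + sep + ",".join(kept) + ";"


line = "INSERT INTO `page` VALUES (1,0,'Main_Page'),(2,0,'It\\'s (a) test'),(3,0,'Other');"
print(filter_insert(line, {1, 3}))
```

The small state machine in `split_rows` exists because titles can contain commas, parentheses and escaped quotes, so splitting on `),(` alone would be unsafe.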
Editor retention: Editing tools
The parser test framework now supports language-specific tests, which required support for loading language-specific default settings in Parsoid.
The serializer is now fully DOM-based and uses constraint-based newline / white-space separator handling, which will make the serializer less sensitive to newlines and whitespace in HTML. Round-trip test results of 82% (pages without any diffs) and 98% (pages without semantic diffs) indicate that the new serializer is on par with the old serializer currently deployed on production.
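To make the two round-trip figures concrete, here is a minimal sketch of how such percentages could be computed from pairs of original and round-tripped wikitext. It is not Parsoid's test harness: the function names are hypothetical, whitespace normalization is only a crude stand-in for whatever notion of semantic equivalence the real tests use, and the sketch assumes the "no semantic diffs" figure includes the pages with no diffs at all.

```python
import re


def normalize(wikitext):
    """Collapse whitespace runs; a crude stand-in for 'semantically equal'."""
    return re.sub(r"\s+", " ", wikitext).strip()


def classify(original, round_tripped):
    if original == round_tripped:
        return "no diffs"
    if normalize(original) == normalize(round_tripped):
        return "no semantic diffs"
    return "semantic diffs"


def summarize(pairs):
    """Return the percentage of pages that are clean / at least semantically clean."""
    results = [classify(a, b) for a, b in pairs]
    total = len(results)
    clean = results.count("no diffs")
    syntactic = results.count("no semantic diffs")
    return {
        "no diffs": 100.0 * clean / total,
        "no semantic diffs": 100.0 * (clean + syntactic) / total,
    }


pairs = [
    ("''a''", "''a''"),            # identical
    ("[[b]]", "[[b]]"),            # identical
    ("* item\n\n", "* item\n"),    # whitespace-only difference
    ("{{t|x}}", "{{t|y}}"),        # content changed
]
print(summarize(pairs))
```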
Extension content is now parsed all the way to DOM, which enforces proper nesting. The generic support for balanced fragment parsing will later also be applied to templates. Parsing of transclusion directives (includeonly and friends) has also been improved and simplified.
The DOM specification for images and templated / extension content was fleshed out in preparation for full editing support.
Late in March, C. Scott Ananian joined us as a contractor. Welcome!
Editor engagement features
Flow Portal/Project information
Editor engagement experiments
For the Getting Started project, the team launched a new version on English Wikipedia, which included a new landing page with additional types of tasks suggested for brand new editors to try. The list of tasks is now generated by a basic recommender system built by Ori Livneh, which gathers, filters, and delivers a fresh list of tasks automatically for every editor. This new backend paves the way for releasing the "getting started" feature on other projects, after we've completed data analysis and testing to understand which kinds of tasks are ideal for first-time editors. Additionally, Matt Flaschen collaborated with the Editor Engagement Features team to build notifications to welcome new editors and invite them to contribute via Getting Started.
For the account creation and login work, S Page, Munaf Assaf, and the rest of the team rebuilt our design to work with MediaWiki core, and solicited reviews from outside the team. We currently plan to launch both interface redesigns on an opt-in basis in April, to have editors test the localization and other functional aspects of the forms via a URL parameter, before we enable them as default.
Support
- The latest translation interface and translation editor improvements for the Translate extension were completed and deployed. The new interface was made default with MediaWiki 1.22wmf1, later enabled on Meta on 2013-04-03; the old translation editor is planned to be removed completely in a few weeks. The user documentation has already been updated.
- The Search improvements were completed as well, with a pilot of the new Special:SearchTranslations (based on a Solr backend) that will be extended to Wikimedia projects in a few days.
- Publication of the language coverage matrix.
- Development progress for the translatewiki.net homepage redesign.
Language community outreach:
- The Language Engineering team kickstarted its Language Support Maven plan for getting language tools feedback from Wikimedian community members who are using internationalisation and localisation tools developed by the team. The team also held its regular monthly office hours in March. The team's outreach coordinator also reported team progress with multiple blog posts on the technology blog. The team plans to restore its bug triage sessions, starting in April 2013.
MediaWiki Core
frame:callParserFunction() and frame:extensionTag(), improved CPU time accounting, and allowed argument expansion to be excluded. We have patches outstanding for "text" module including unstrip functionality, as well as improved debug output. We've also made significant improvements to templates since the launch.
Site performance and architecture
Security auditing and response
Quality assurance
The continuous integration site has been moved from integration.mediawiki.org to integration.wikimedia.org and is now always on HTTPS. The index page has been rewritten based on Twitter Bootstrap (see integration.wikimedia.org).
Antoine Musso has given our Zuul status page an overhaul. It features live reloading through Ajax and contains direct links to the Gerrit changesets and Jenkins jobs. This is a big improvement over the plain-text version.
Antoine Musso and Timo Tijhof set up the new doc.wikimedia.org portal. The MediaWiki core (Doxygen-generated) PHP documentation has been moved here (svn.wikimedia.org/doc is now a redirect). We're currently working on packaging JSDuck and writing Jenkins jobs to generate JavaScript documentation with it.
We've packaged various Python modules for the Debian project, which will in turn let us simplify deployment. Meanwhile, we're experimenting with having our Debian/Ubuntu packages built by Jenkins directly.
This month we've continued to extend Jenkins coverage for Gerrit repositories. We're happy to announce that almost all repositories for MediaWiki extensions in Gerrit now have Jenkins integration.
Analytics
Visualization, Reporting & Applications
- To support mobile initiatives, including the Mobile Website, Mobile Apps, and Wikipedia Zero, we focused on providing data extracts and visualizations for them. New visualizations include the Mobile app dashboard.
- In addition, we updated the report card for the March Metrics Meeting, improved the robustness of the reportcard infrastructure, added target bars and added links to the metric definitions.
Wikistats
- We are currently working on a new mobile pageview report.
Services & Access Points
- In March, we saw the launch of the User Metrics API, a service that allows researchers to perform cohort analysis on various data sets, making it easier to measure the effects of programs and platform experiments among discrete sets of users. We are currently working on improving the web-based user interface to make it available for use outside of Wikimedia Foundation staff in the coming months.
Analytics Infrastructure
- Our big-data cluster, known as Kraken, has undergone no major changes in capability, but we have been working to make it more robust and improve security. Our udp2log monitoring has become more accurate, and Limn can be installed on both production and Labs instances.
Misc: Defects Closed
- Fixed the "Space characters in pagecounts-raw titles" bug.
Misc: Management & Communication
- The Analytics team has started to use Mingle to manage its work more effectively day-to-day. Bugzilla remains our primary interface for managing defects with respect to communicating their priority and status.
- Finally, we had our Analytics Reboot meeting, where all internal WMF Analytics stakeholders convened and we surveyed what customer opportunities were out there, what Analytics models are currently available, and how to improve inter-team communication.
Engineering community team
Two bugdays took place as part of the QA Weekly Goals: cleaning up and retesting General MediaWiki reports and a bugday concentrating on the LiquidThreads extension. For the latter, 76 out of 218 open reports received updates. Valerie analyzed which important Wikimedia feedback channels link to each other and Bugzilla, and created a diagram of the current situation. Valerie also published two blogposts explaining how to create a good first bug report and how to help Wikimedia squash software bugs. Andre improved the Bugzilla Weekly Report email to the wikitech mailing list. On most open bug reports with a target milestone set to future MediaWiki version 1.21.0, reminder comments were added for developers. Andre and Valerie also held the first IRC Office Hour on Bugzilla and Bug management for those interested in discussing problems and improvements with Wikimedia's bug management. In Bugzilla's internal product and component taxonomy, several Mobile application products were merged into a single "Wikipedia App" product and two Search components were merged, to simplify finding information for developers and reporters.
Also, the bug management task list received a major cleanup, making it clearer what is being worked on and what you can help with.
- Completing Round 5 of Outreach Program for Women with casual evaluation meetings with the 6 intern/mentors teams. To be summarized with a blog post in early April.
- Wikimedia's application for Google Summer of Code 2013 and a new round of Outreach Program for Women including 17 common project ideas with mentors.
- Making Possible projects the reference list of big tasks to potential contributors. Featured project ideas must go through a reality check considering project feasibility for newcomers, availability of mentors and community/maintainers buy-in, which we check through related feature requests filed in Bugzilla.
- Announcing a GSoC and other open source internship programs meetup in San Francisco on April 11.
Volunteer coordination and outreach
- Drafting Wikitech contributors, a proposal to attract technical volunteers and connect them with interesting people and activities in a single site: wikitech.wikimedia.org. Helping define the proposal for distribution of content between Wikitech and mediawiki.org.
- (Re)defining factors for measuring success in QA activities.
- QA weekly goals: supporting and promoting our first browser automation activity: Browser automation testing for Wikipedia Search. Also the general MediaWiki reports Bug Triage and the LiquidThreads Bug Triage.
- San Francisco meetups: organized Lua meets Wikipedia meetup.
- Helping the Security for developers training meeting and creating a wiki page to be recycled for future editions.
- Drafting criteria to become an official Wikimedia mobile app in sync with Mobile Programs & Engineering teams.
The Kiwix project is funded and executed by Wikimedia CH.
- Work on the 0.rc3 release of Kiwix is ongoing, mostly consisting of bug fixing and a few UI improvements. The release is expected in about one month. For the first time, a ZIM file of Wikisource (in French) was produced, within the scope of the Afripedia project.
The Wikidata project is funded and executed by Wikimedia Deutschland.
- Denny Vrandečić and Lydia Pintscher gave a short update on Wikidata's status at the metrics and activities meeting. A more detailed analysis can be found in our blog post. In addition, Wikidata phase 1 (language links) has been activated on the remaining 282 Wikipedias. This means that all Wikipedias now get their language links from Wikidata. Not too long after that, phase 2 (infoboxes) was activated on the first 11 Wikipedias. They can now make use of shared structured data from Wikidata in their articles. On Wikidata itself we introduced a new data type (string), extended references in statements (they can now have multiple values), and improved the search box.
- We have written down how we envision queries on Wikidata and would appreciate your feedback.
- As a nice demonstration of the potential of Wikidata we've seen two new projects this month: Wiri and a tree of life.
Future
- The engineering management team continues to update the Deployments page weekly, providing up-to-date information on the upcoming deployments to Wikimedia sites, as well as the engineering roadmap, listing ongoing and future Wikimedia engineering efforts.