Wikimedia Technical Conference/2018/Session notes/Architecting Core: stand-alone services
Slides that were used to guide this session are available on Commons.
Goals for the session (see slide for full text)
A specific set of criteria for determining whether functionality goes in MW or a standalone service. In essence, the outcome of this session will be an RFC, comprising the criteria, requirements, and expectations for MediaWiki functionality that is provided in the form of a standalone service.
Definition of a standalone service (see slide for full text)
For purposes of this session, a standalone service has the following properties.
- Business logic in separate runtime from MW
- Interacts with MW via some remote mechanism
  - API
  - Queue
  - XHR
  - ?
- Does not directly access MediaWiki's data store
- May utilize MW extension(s) to call an external service, provided that the business logic is in the external service and not the extension
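As an illustration of the "remote mechanism" property above, here is a minimal sketch of a service-side helper that talks to MediaWiki through the Action API (`api.php`) over HTTP instead of reading MediaWiki's data store. The endpoint URL, the function name, and the User-Agent string are hypothetical; only the `action=query&meta=siteinfo&format=json` parameters are standard Action API parameters. The request is built but not sent, so the sketch runs without network access.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint; replace with the api.php URL of a real wiki.
API_ENDPOINT = "https://wiki.example.org/w/api.php"


def build_siteinfo_request(endpoint: str = API_ENDPOINT) -> Request:
    """Build (but do not send) an Action API request for general site info."""
    params = {"action": "query", "meta": "siteinfo", "format": "json"}
    url = f"{endpoint}?{urlencode(params)}"
    # The User-Agent value is illustrative, not a required format.
    return Request(url, headers={"User-Agent": "example-standalone-service/0.1"})


if __name__ == "__main__":
    request = build_siteinfo_request()
    print(request.full_url)
```

Passing the returned object to `urllib.request.urlopen` would perform the call. The point of the sketch is only that the service's business logic stays in its own runtime and reaches MediaWiki through a public remote interface.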
Exercise 1
Question 1: What properties make functionality a candidate for separation into a separate service?
- Async
- Elevated security need
- State context independency
- 3rd party library exists (potentially in another language or in another form that makes integration into MediaWiki difficult)
- Excessive resource needs
- Independently useful and/or can be replaced with something off the shelf
- Better language or framework exists for solving the problem
- Independent scalability concerns
- Different ownership models/autonomy/rate of change
- Used to triage MW or fix it
- Need to ship quickly
Question 2: What properties disqualify functionality from separation into a separate service?
- Require direct MediaWiki DB access
- Easy to do in the context of MediaWiki (using existing classes, for example), difficult to do outside of the context of MediaWiki.
- Too small to justify separation overhead
- Chattiness with the MediaWiki API
- Synchronous
- Needs extensibility by MW features/extensions
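The two lists from Exercise 1 could be encoded as a simple checklist, sketched below. This is only one hypothetical encoding: the session did not define weights or a decision rule, so the function merely sorts a feature's properties into arguments for and against separation. All identifiers are made-up labels that paraphrase the bullet points.

```python
# Labels paraphrase the session's bullet points; the names are hypothetical.
CANDIDATE_PROPERTIES = frozenset({
    "async",
    "elevated_security_need",
    "state_context_independent",
    "third_party_library_exists",
    "excessive_resource_needs",
    "independently_useful",
    "better_language_or_framework",
    "independent_scalability",
    "different_ownership_or_rate_of_change",
    "used_to_triage_or_fix_mw",
    "need_to_ship_quickly",
})

DISQUALIFYING_PROPERTIES = frozenset({
    "requires_direct_db_access",
    "easier_inside_mediawiki",
    "too_small",
    "chatty_with_mw_api",
    "synchronous",
    "needs_extensibility_by_mw",
})


def assess(properties):
    """Split a feature's properties into arguments for and against separation."""
    props = set(properties)
    unknown = props - CANDIDATE_PROPERTIES - DISQUALIFYING_PROPERTIES
    if unknown:
        raise ValueError(f"unknown properties: {sorted(unknown)}")
    return {
        "for_separation": sorted(props & CANDIDATE_PROPERTIES),
        "against_separation": sorted(props & DISQUALIFYING_PROPERTIES),
    }


if __name__ == "__main__":
    print(assess({"async", "independent_scalability", "chatty_with_mw_api"}))
```

The sketch deliberately returns both lists rather than a verdict, since weighing them is left to the people reviewing the proposal.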
Exercise 2
Question 1: What existing MediaWiki functionality is provided by standalone services?
- Parsing (Potentially quick/large wins available through re-integration into MediaWiki.)
- Thumbnailing
- ORES
- cp-jobqueue
- Eventstreams
- Map tiles
- Recommendation
- Search
- MCS (Mobile content service)
- Restbase (caching / routing)
- Citations
- Mathoid
- Graph rendering
- Translations
- WDQS
- Analytics
- Routing (Potentially quick/large wins available through re-integration into MediaWiki.)
- CDN
Question 2: What existing MediaWiki functionality could be provided by standalone services?
- A/B Testing
- Job Queue
- Server Side Rendering
- Maps
- Inter-Service Discovery/Routing
- Users and Auth
- Echo Notification
- URL Routing
- l10n/i18n
- (Some) special pages
- Edge purger
- Media handling & transcoding
- URL shortening
- Reading lists
- Watchlists
- Revision service
- Parser
Exercise 3
Question 1: What technical/architecture requirements should apply to all standalone services?
- Minimize data collection
- OSI licensed
- Respect GDPR and other applicable data privacy frameworks
- Must do a thing
- Should not be redundant with other services
Question 2: What additional requirements should apply to standalone services in Wikimedia production?
- SLIs/SLOs
- WMF-compatible monitoring
- Has a privacy policy and policy practices that are compatible with WMF privacy policy
- Uses Wikimedia deploy tooling
- Has passed WMF Security review
- Uses a language and toolset that have been approved by TechCom
- Has an owner
- Has Runbooks
- Is licensed under an OSI-approved Open Source license
- Has WMF compatible structured logging
- Swagger specs
- Fault tolerant
- Multi data center
- Backups
- Pinned/Pinnable dependencies
- Horizontal scalability
- Documentation
- Trusted upstream asset chain
- Performs sufficiently for Wikipedia use cases
- Has users (or a plan to get users)
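To make the structured logging requirement above more concrete, the following sketch emits one JSON object per log line using only the Python standard library. The field names (`timestamp`, `level`, `logger`, `message`) are assumptions chosen for illustration; the notes do not specify which format counts as WMF compatible.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object (illustrative fields)."""

    def format(self, record):
        payload = {
            "timestamp": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        return json.dumps(payload)


def make_logger(name, stream=None):
    """Return a logger writing JSON lines to `stream` (stderr when None)."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger = logging.getLogger(name)
    logger.handlers = [handler]  # replace any handlers left from earlier calls
    logger.setLevel(logging.INFO)
    logger.propagate = False  # avoid duplicate output via the root logger
    return logger


if __name__ == "__main__":
    buffer = io.StringIO()
    log = make_logger("example-service", buffer)
    log.info("service started on port %d", 8080)
    print(buffer.getvalue(), end="")
```

Machine-readable lines like these are easier for a central logging pipeline to index than free-form text, which is presumably the motivation behind the requirement.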
Question 3: What additional requirements should apply to standalone services distributed for 3rd party use?
- Easy to install
- Versioned (semver)
- Compatible with supported MW releases (LTS)
- Easy to upgrade and to extend
- Public docs on install, upgrade
- Config outside of code
- Operationally independent of wikis
- Open source - usable, accepts patches, etc
- Small footprint
- Public security advisories
- Support channel
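One common way to keep config outside of code, as listed above, is to read settings from environment variables at startup. The sketch below assumes this approach; the variable names (`EXAMPLE_MW_API_URL`, `EXAMPLE_LISTEN_PORT`, `EXAMPLE_LOG_LEVEL`) and the `ServiceConfig` fields are hypothetical and do not come from the session.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class ServiceConfig:
    mw_api_url: str   # where the service reaches MediaWiki
    listen_port: int  # port the service itself listens on
    log_level: str


def load_config(env: Mapping[str, str] = os.environ) -> ServiceConfig:
    """Build the service configuration from environment-style settings."""
    return ServiceConfig(
        # Required: raises KeyError if the variable is not set.
        mw_api_url=env["EXAMPLE_MW_API_URL"],
        # Optional settings fall back to illustrative defaults.
        listen_port=int(env.get("EXAMPLE_LISTEN_PORT", "8080")),
        log_level=env.get("EXAMPLE_LOG_LEVEL", "INFO"),
    )


if __name__ == "__main__":
    demo_env = {"EXAMPLE_MW_API_URL": "https://wiki.example.org/w/api.php"}
    print(load_config(demo_env))
```

Because `load_config` accepts any mapping, a third-party installer can point the same code at a different wiki without patching it, and tests can pass a plain dictionary instead of modifying the process environment.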