User:DKinzler (WMF)/Modular REST API

From mediawiki.org

This is a proposal to introduce the concept of component APIs [name TBD] into MediaWiki's REST framework. The goal is to enable teams to clearly define and easily evolve their APIs. This would be achieved by three mechanisms:

  1. Explicit specifications for each component API based on OpenAPI
  2. Support for independent versioning for each component API
  3. Support for zoning component APIs

This proposal aims to make MediaWiki more opinionated about the structure of the API, providing more guidance and guarantees to maintainers and consumers. This is done to improve maintainability and usability, at the expense of flexibility.

Impact for Clients

Per this proposal, endpoints exposed by MediaWiki core would eventually no longer all share the /v1/ prefix. Instead, they would be grouped into component APIs [name TBD], and each component API would be exposed under a path prefix that includes a version identifier, such as /content.v2/ or /search.v1/ [or /content/v2/ and /search/v1/ respectively, TBD]. The old paths would remain available for some time, to allow clients to migrate without disruption.

By convention, extensions already use component prefixes with version numbers for their API endpoints: for example, CheckUser uses /checkuser/v0/, OAuth uses /oauth2/, etc. For backwards compatibility, core could define /v1/ as one "component" (and /coredev/v0/ as another component).

Explicitly defining specifications for component APIs has the following advantages for API consumers:

  • When an API is updated to a new version due to a breaking change, only the paths of the affected component will change. If all endpoints are versioned together (as is currently the case for MW core), either the paths of all endpoints would have to change, or the breaking change would be made without updating the version number, potentially leading to confusion and disruption.
  • Internal and unstable APIs can be marked as such by including a zone prefix. A prefix such as /internal:components.v1/ or /beta:content.v3/ would make it clear that the endpoints under it are not stable for public use. With the current system, internal endpoints are mixed in with public ones, so developers need to read the specification of each endpoint to know which ones are considered stable.
  • The set of endpoints in each component would be well defined, and the spec for that component would be a cohesive and stable document. Currently, specs of all APIs are aggregated into a single specification which is different for each wiki, depending on configuration and enabled extensions.
  • The team responsible for a given endpoint could easily be determined based on the prefix, and would be documented in the spec file. Currently, this information could only be maintained separately for each endpoint, which is tedious and prone to growing stale.
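To make the prefix structure described above concrete, here is a minimal sketch of how a client or tool might read the zone, component, and version out of a request path. It assumes the dot variant of the version syntax (/content.v2/) and the colon-separated zone prefix (/beta:content.v3/); both are still marked TBD in this proposal, and the names ComponentPrefix, parse_prefix, and is_public are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class ComponentPrefix(NamedTuple):
    zone: Optional[str]  # e.g. "internal" or "beta"; None when no zone prefix is present
    component: str       # e.g. "content"
    version: int         # e.g. 2


# Assumed syntax: /[zone:]component.vN/ (the dot variant; the slash variant is still TBD).
_PREFIX_RE = re.compile(
    r"^/(?:(?P<zone>[a-z]+):)?(?P<component>[a-z][a-z0-9]*)\.v(?P<version>\d+)/"
)


def parse_prefix(path: str) -> Optional[ComponentPrefix]:
    """Return the zone, component and version encoded in a path, or None if absent."""
    match = _PREFIX_RE.match(path)
    if match is None:
        return None
    return ComponentPrefix(match["zone"], match["component"], int(match["version"]))


def is_public(path: str) -> bool:
    """Treat a path as public only if it parses and carries no zone prefix."""
    prefix = parse_prefix(path)
    return prefix is not None and prefix.zone is None
```

Under these assumptions, a legacy path such as /v1/page/Foo does not parse as a component prefix at all, which is one reason /v1/ would need to be treated as a special case for backwards compatibility.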

Impact for Maintainers

Core Maintainers

Currently, all REST routes for MediaWiki core are defined in the coreRoutes.json file (plus the coreDevelopmentRoutes.json file, for experimental routes not enabled in production). MediaWiki enforces no structure on the paths, so extensions can add endpoints anywhere.

Per this proposal, there would be one JSON file defining each component API [name TBD], containing metadata about the component along with a mapping of paths to handler classes (similar to the files that RESTBase uses to define API modules). Each component would have a unique path prefix, and the set of endpoints under that prefix would be exactly the one defined in the JSON file. Extensions would not be able to add endpoints under the same prefix.

Conceptually, a component API groups together a set of related REST endpoints that belong to the same business domain (or bounded context). Each component API has its own OpenAPI spec, is owned by a single team, and is versioned independently of other component APIs. The JSON file describing the component API can contain JSON schemas for data structures used to interact with the endpoints, which will be integrated with the OpenAPI spec. Data structures should at least be consistent among the endpoints of one component, but some will probably be shared among all core components (and possibly with extensions as well).
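As an illustration of how such a file could feed into an OpenAPI spec, the sketch below loads a made-up component definition and derives a minimal OpenAPI document from it. The file format is not decided by this proposal, so every key name (component, version, owner, schemas, routes), the handler name, and the team name are assumptions. The output is deliberately incomplete: for example, it omits the parameter declarations a full OpenAPI document would need for {title}.

```python
import json

# Hypothetical component definition; all key names and values are assumptions.
COMPONENT_JSON = """
{
  "component": "content",
  "version": 2,
  "owner": "Example Team",
  "schemas": {
    "Page": {
      "type": "object",
      "properties": {"title": {"type": "string"}, "id": {"type": "integer"}}
    }
  },
  "routes": [
    {"path": "/page/{title}", "method": "GET",
     "handler": "ExamplePageHandler", "response": "Page"}
  ]
}
"""


def to_openapi(definition: dict) -> dict:
    """Build a minimal OpenAPI 3 document from a component definition."""
    paths: dict = {}
    for route in definition["routes"]:
        schema_ref = "#/components/schemas/" + route["response"]
        operation = {
            "responses": {
                "200": {
                    "description": "OK",
                    "content": {"application/json": {"schema": {"$ref": schema_ref}}},
                }
            }
        }
        # One path item per path, one operation per HTTP method.
        paths.setdefault(route["path"], {})[route["method"].lower()] = operation
    return {
        "openapi": "3.0.3",
        "info": {
            "title": definition["component"] + " API",
            "version": str(definition["version"]),
            "contact": {"name": definition["owner"]},
        },
        "paths": paths,
        # Schemas from the component file are shared by all of its endpoints.
        "components": {"schemas": definition["schemas"]},
    }


spec = to_openapi(json.loads(COMPONENT_JSON))
```

This sketch generates the spec from the route file. The opposite direction, where the spec is the source of truth and is used to validate parameters, is one of the open questions listed at the end of this page.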

This approach will have the following advantages for teams maintaining REST endpoints:

  • Teams are in control of the life cycle of the APIs they own, so they can evolve them over time, including breaking changes, without disrupting other teams.
  • The specification and implementation of all endpoints in a bounded context can be understood by looking at a single file.
  • Ownership of endpoints is clearly documented based on which component they belong to.
  • Internal APIs can be clearly marked as non-public by including a zone prefix in the component's name. This allows teams to evolve these APIs more quickly.
  • Component APIs can be migrated out of MediaWiki into a standalone service more easily, since network routing can be done based on path prefixes.

Extension Maintainers

Extension maintainers would gain a clear way to maintain the spec for the extension's REST API. On the other hand, it would eventually become impossible for extensions to define endpoints under component prefixes defined by core. This restriction follows from the idea that API components are defined as self-contained, cohesive units of versioning and ownership. Allowing extensions to inject additional endpoints would undermine this idea.
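The restriction could be enforced when routes are registered. The following is a rough sketch under stated assumptions, not a description of how MediaWiki would implement it: PrefixRegistry, claim_prefix, check_route, and PrefixConflict are hypothetical names, and prefixes are written in the dot variant.

```python
class PrefixConflict(Exception):
    """Raised when a prefix or route collides with one owned by someone else."""


class PrefixRegistry:
    def __init__(self) -> None:
        # Component prefix -> owner, e.g. "/content.v2" -> "core".
        self._owners: dict = {}

    def claim_prefix(self, prefix: str, owner: str) -> None:
        """Reserve a component prefix; each prefix can be claimed only once."""
        if prefix in self._owners:
            raise PrefixConflict(f"{prefix} is already owned by {self._owners[prefix]}")
        self._owners[prefix] = owner

    def check_route(self, path: str, owner: str) -> None:
        """Reject a route that falls under a prefix claimed by a different owner."""
        for prefix, prefix_owner in self._owners.items():
            if path.startswith(prefix + "/") and prefix_owner != owner:
                raise PrefixConflict(f"{owner} may not add {path} under {prefix}")
```

With this check, core can add /content.v2/page after claiming /content.v2, while an extension trying to add /content.v2/extra would be rejected and would have to claim its own prefix instead.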

Operations Engineers

This proposal provides a standard for encoding information about the component, version, and operational requirements (zone) in an endpoint's URL. This makes life easier for operations engineers in several ways:

  • Requests can be routed to different services or clusters based on the component name. Currently, endpoints handled by entirely different services may share a prefix, so knowledge of the entire path structure is needed in order to route a given request to the appropriate backend service.
  • Restrictions and optimizations for internal APIs can be enforced by the API gateway, based on zone prefix (or per component).
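A gateway rule based on these two points might look roughly like the sketch below. It is only an illustration: the backend URLs, the BACKENDS table, and the function names are made up, the prefix syntax assumes the dot and colon variants, and a real API gateway would express this in its own configuration language rather than in Python.

```python
from typing import Optional, Tuple

# Hypothetical backends keyed by component name; the URLs are placeholders.
BACKENDS = {
    "content": "http://mediawiki.internal.example",
    "search": "http://search.internal.example",
}


def split_prefix(path: str) -> Tuple[Optional[str], str]:
    """Split the first path segment into (zone, component), assuming /[zone:]component.vN/."""
    first_segment = path.lstrip("/").split("/", 1)[0]
    if ":" in first_segment:
        zone, rest = first_segment.split(":", 1)
    else:
        zone, rest = None, first_segment
    component = rest.split(".", 1)[0]  # drop the ".vN" version suffix
    return zone, component


def route(path: str, from_internal_network: bool) -> Tuple[int, Optional[str]]:
    """Return (status, backend): 403 for blocked zones, 404 for unknown components."""
    zone, component = split_prefix(path)
    if zone == "internal" and not from_internal_network:
        return 403, None
    backend = BACKENDS.get(component)
    if backend is None:
        return 404, None
    return 200, backend
```

Note that the decision only needs the first path segment, which is the point of encoding zone and component in the prefix: the gateway does not need to know the routes inside each component.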

Implementation Notes

Current

Currently, there is a single Router object in MediaWiki's REST framework that determines which Handler to use to respond to an incoming request, based on the requested path (aka the request URI). The mapping from paths to Handlers is maintained in the coreRoutes.json file (as well as the coreDevelopmentRoutes.json file, for experimental routes not enabled in production). Extensions can add their own routes using a RestRoutes entry in extension.json, or by adding to the RestAPIAdditionalRouteFiles configuration variable. Route files contain a list of route definition objects, which include the path pattern and the handler class, among other things. All route files are loaded and processed, and the resulting data structure is cached so that these files do not have to be loaded on every request. The Router object operates on the cached data structure.
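The behaviour described above can be approximated by the following sketch, which merges route lists from several sources into one flat table and matches request paths against it. It is a simplified model written in Python for illustration, not MediaWiki's actual Router code; the route entries and handler names are invented, and caching is left out.

```python
import re

# Route definitions as they might look after all route files are merged.
# Paths and handler names are illustrative only.
CORE_ROUTES = [{"path": "/v1/page/{title}", "class": "ExamplePageHandler"}]
EXTENSION_ROUTES = [{"path": "/checkuser/v0/example/{id}", "class": "ExampleCheckUserHandler"}]


def compile_pattern(pattern: str):
    """Turn a path pattern with {placeholders} into an anchored regular expression."""
    # re.split with a capture group alternates literal text and parameter names.
    parts = re.split(r"\{(\w+)\}", pattern)
    regex = ""
    for index, part in enumerate(parts):
        if index % 2 == 0:
            regex += re.escape(part)           # literal path text
        else:
            regex += f"(?P<{part}>[^/]+)"      # one path segment per parameter
    return re.compile("^" + regex + "$")


class FlatRouter:
    """A single router over all routes, regardless of who defined them."""

    def __init__(self, *route_lists):
        self._routes = [
            (compile_pattern(route["path"]), route["class"])
            for routes in route_lists
            for route in routes
        ]

    def match(self, path: str):
        """Return (handler class name, path parameters) or None if nothing matches."""
        for regex, handler in self._routes:
            found = regex.match(path)
            if found:
                return handler, found.groupdict()
        return None


router = FlatRouter(CORE_ROUTES, EXTENSION_ROUTES)
```

Nothing in this model distinguishes a core route from an extension route once the lists are merged, so any source can add a route under any prefix.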

This way, MediaWiki core and extensions are free to define routes with any path pattern. Shared prefixes are used for related endpoints by convention, but are not enforced. Similarly, version identifiers can be included in the paths, but are not required or enforced.

Proposed

With the proposed new system, the RootRouter would first determine which component should handle a given request, based on the component prefix in the requested path. It would then hand control to the ComponentRouter for that component, which knows all the routes defined within the component and their associated handlers.

The information used by each ComponentRouter is loaded from the respective component's spec file (and cached). The format of the spec file remains to be determined, but it will contain a mapping of paths to handler classes, and it will incorporate schemas for the data structures that can be used to interact with the endpoints.

The information used by the RootRouter is derived from the content of all component spec files, and also cached.
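The two-stage dispatch could be sketched as follows. RootRouter and ComponentRouter are the names used in this proposal, but the method name find_handler, the constructor arguments, and the handler names are assumptions. To stay short, the sketch uses exact path matches inside a component and assumes the dot variant, where the component prefix is exactly the first path segment.

```python
from typing import Dict, Optional


class ComponentRouter:
    """Knows the routes of one component; paths are relative to the component prefix."""

    def __init__(self, routes: Dict[str, str]) -> None:
        # Relative path -> handler class name. Exact matches only in this sketch.
        self._routes = routes

    def find_handler(self, relative_path: str) -> Optional[str]:
        return self._routes.get(relative_path)


class RootRouter:
    """Picks the component from the prefix, then delegates to its ComponentRouter."""

    def __init__(self, components: Dict[str, ComponentRouter]) -> None:
        # Component prefix such as "/content.v2" -> router for that component.
        self._components = components

    def find_handler(self, path: str) -> Optional[str]:
        # With the dot variant, the prefix is the first path segment.
        segment, _, rest = path.lstrip("/").partition("/")
        component = self._components.get("/" + segment)
        if component is None:
            return None
        return component.find_handler("/" + rest)


root = RootRouter({
    "/content.v2": ComponentRouter({"/page": "ExamplePageHandler"}),
    "/search.v1": ComponentRouter({"/title": "ExampleSearchHandler"}),
})
```

With the slash variant (/content/v2/), the prefix would span two segments, so the lookup in RootRouter would need a different prefix match. This is the implementation question raised under "versions" below.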

Questions

  • name (component? module?)
  • versions (slash or dot?) -> how to implement the prefix match?
  • Should we configure a map of prefixes to files? Or a list of files, each containing its prefix?
  • zones (good idea?)
  • spec files: more like openapi/restbase? Should we use the spec to validate parameters, or should the param validator generate the spec (as is currently the case)?
  • how should extension.json reference module files? Can there be more than one per extension?
  • should extensions be able to define endpoints under prefixes used by core (or by other extensions)?