Technical debt/Map

From mediawiki.org

Here's a quick review of the major pain points and areas of ongoing migration and change in the MediaWiki technical ecosystem.

Possibly we can try to put this in a table, with the columns: Theme, Action needed, Task, Criticality, Challenge, Status.
Key
  • 🟢 – Actively being worked on by staff as part of the Annual Plan
  • 🟡 – Actively being worked on by staff, but not as part of the Annual Plan

Work that is planned/hoped-for but currently on hiatus waiting for other things is not listed as "being worked on".

Technical debt by conceptual theme

Fix things that are significant risks to production

Beyond the well-understood short-term reactive work, there are areas where our systems are currently unsustainable and need to be addressed, or at least ameliorated, to reduce risk. The risk horizon of these areas varies from months to years.
  • Reduce load and growth on the production systems
    • Improve use of the MediaWiki databases to grow less quickly
      • 🟢 (T337013) Split out the citations from the WDQS graph so it doesn't fall over
      • 🟡 (T343131) Pagelinks/etc. re-work so Commons doesn't fall over

Reduction in complexity and tightly-coupled code

MediaWiki is a labyrinthine monolith that has developed over more than two decades, which makes it hard to learn, hard to reason about, and hard to alter: changes in one area can often break things or have other unexpected results elsewhere. A lot of work has been done in this area, especially by a handful of volunteers and staff, and yet there remains much more to do.
  • (T376615) Migrate away from heavy classes to lighter-weight, smaller ones
    • Language
      • (T376565) Split out retrieving language code from Language object
    • Title
      • Switch uses of Title to PageIdentity etc., e.g. (T278459) Switch uses of Title to PageIdentity where simple
    • User — (T231930) Replacement of User with Authority
      • (T271463) Migrate uses of PermissionManager to Authority
    • WikiPage — Switch uses of WikiPage to PageStore
  • Make code more isolated and easy to test
    • Drop global configuration reading/writing
      • (T71084) Convert everything to read from Config
      • (T212739) Drop use of, then drop, $wgConf
    • (T159283) Drop global non-configuration objects
      • (T160814) Drop use of, then drop, $wgLang
      • (T160812) Drop use of, then drop, $wgOut
      • (T160810) Drop use of, then drop, $wgRequest
      • (T159284) Drop use of, then drop, $wgTitle
      • (T159299) Drop use of, then drop, $wgUser
    • (#dependency_injection/) Migrate all code where possible to use dependency injection for better isolation and testability
    • (T20654) Refactor EditPage.php to split logic from UX concerns
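
The contrast between reading global state and receiving dependencies through injection can be sketched in a few lines. This is a minimal illustration in Python rather than MediaWiki's PHP, and every name in it (`GLOBAL_CONFIG`, `Config`, `DictConfig`, `TitleFormatter`) is hypothetical, chosen for the example and not taken from MediaWiki's real classes.

```python
from typing import Mapping, Protocol

# Hypothetical module-level state, standing in for globals such as $wgConf.
GLOBAL_CONFIG = {"Sitename": "ExampleWiki"}


def page_title_global(page: str) -> str:
    """Tightly coupled: silently depends on module-level state."""
    return f"{page} - {GLOBAL_CONFIG['Sitename']}"


class Config(Protocol):
    """Minimal read-only config interface (an assumption for this sketch)."""

    def get(self, name: str) -> str: ...


class DictConfig:
    """Simple in-memory Config, convenient for tests."""

    def __init__(self, values: Mapping[str, str]) -> None:
        self._values = dict(values)

    def get(self, name: str) -> str:
        return self._values[name]


class TitleFormatter:
    """Isolated: everything it needs is passed to the constructor."""

    def __init__(self, config: Config) -> None:
        self._config = config

    def format(self, page: str) -> str:
        return f"{page} - {self._config.get('Sitename')}"


# A test can supply its own configuration without touching any global.
formatter = TitleFormatter(DictConfig({"Sitename": "TestWiki"}))
print(formatter.format("Main Page"))
```

The injected version can be exercised with any configuration a test chooses, whereas the global version can only be tested by mutating shared state, which is the kind of coupling the tasks above aim to remove.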

Address developer productivity issues

Connected to but distinct from the complexity/connectedness issues from tightly-coupled code, it's vital that we have high-quality tools and engineering processes that make it easy to do the right thing, to increase system stability, ecosystem depth, and developer productivity in general.
  • Test our code better and more consistently to give greater confidence
    • (T249674) Run all of production's tests, rather than just a sub-set
      • (T238492) De-conflict tests from production extensions and fix them
      • (T225730) Speed up CI so it's less of a delay and productivity drain
    • (T226869) Run browser tests in parallel
    • 🟡 (T50217) Run PHPUnit tests in parallel
  • Upgrade our CI tools to the current versions, and consolidate on a single tool for each purpose
    • 🟡 (T138401) Replace the use of jsduck with jsdoc3 in all codebases
    • Selenium/Cypress
    • PHPUnit

Remove use of two technical systems where one will suffice

There are many technical systems that have been overtaken by other systems but whose replacement has stalled (or in some cases, never been started). Maintaining both is an unnecessary burden and adds to on-boarding and complexity costs.
  • Roll out wider use of newer, better systems (and ultimately remove the old ones)
    • 🟢 Replace the old parser with Parsoid
      • (T55784) For article views
      • (T310512) For generating metadata post-edit
      • For UX messages
    • 🟢 Replace RESTBase with the REST API
    • (T341775) Discourage, deprecate and stop using Xml methods for building HTML markup
      • (T255586) Deprecate HTMLFormatter
      • (T297498) We should probably discourage the use of string methods for building HTML markup, too
    • Switch all Action API users we control to requesting JSON in formatversion=2
      • (T338439) etc.
      • Generally: come up with a deprecation policy for the Action API
        • to allow us to turn off formatversion=1 eventually
        • to allow "safe" changes to API results, notification of clients, etc.
    • (T167246) Complete the migration from "user" to "actor"
    • 🟡 (T28741) Migrate the file tables to a modern, scalable database structure
    • Drop the old dynamic extension registration system, rely on extension.json only
      • (T98668) Migrate all known extensions and skins to static extension registration
  • (T293710) Switch from having two JSON Schema libraries in PHP to only one
  • (T297498, T346829) Use a standard JSON codec
  • (T278278) Namespace all our PHP classes so we can use PSR-4 to load them more easily
    • 🟡 (T166010) Namespace all of MediaWiki core's PHP classes
  • (T198901) Migrate all production services from bare metal to Kubernetes
    • 🟢 (T290536) Migrate all production MediaWiki traffic to Kubernetes (aka "mw-on-k8s")
  • Switch front-end code to all use Codex, not various systems
    • (T281930) Switch mobile from bespoke code to Codex
    • Switch VisualEditor from OOUI to Codex
  • 🟢 Gerrit -> GitLab migration
    • (T335921) Migrate MediaWiki ecosystem code
    • (T349872) Depends-On for cross-repository dependencies
  • 🟢 (T343098) Migrate direct SQL queries to a proper API
  • (T237773) Make wikitech.wikimedia.org a regular production wiki
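
For the Action API item in the list above, the practical difference a client sees when moving to `formatversion=2` is the shape of the JSON. The sketch below is a minimal illustration: it builds a query URL that requests JSON in `formatversion=2`, and shows a helper that tolerates both shapes of `query.pages` (an object keyed by page ID in the older format, a plain list in the newer one). The endpoint `https://example.org/w/api.php`, the function names, and the sample responses are assumptions for illustration and are not taken from any real wiki.

```python
from typing import Any, Dict, List
from urllib.parse import urlencode


def build_query_url(api_url: str, title: str) -> str:
    """Build an Action API query URL that asks for JSON in formatversion=2."""
    params = {
        "action": "query",
        "format": "json",
        "formatversion": "2",
        "titles": title,
    }
    return f"{api_url}?{urlencode(params)}"


def pages_from_response(response: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return the pages of a query response as a list, whatever the format version."""
    pages = response.get("query", {}).get("pages", [])
    if isinstance(pages, dict):
        # Older shape: an object keyed by page ID.
        return list(pages.values())
    # formatversion=2 shape: already a list.
    return list(pages)


# Hand-written sample responses (illustrative only, not real API output).
old_shape = {"query": {"pages": {"1": {"pageid": 1, "title": "Main Page"}}}}
new_shape = {"query": {"pages": [{"pageid": 1, "title": "Main Page"}]}}

print(build_query_url("https://example.org/w/api.php", "Main Page"))
print(pages_from_response(old_shape) == pages_from_response(new_shape))
```

A shim like `pages_from_response` is only useful during a transition; once a client always requests `formatversion=2`, the dictionary branch can be deleted.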

Remove / reduce use of old user-facing features from production, ideally migrating their use to newer ones

There are many features that have been developed over the past quarter-century that we no longer wish to support, either because they never achieved their full vision, because they were overtaken by other work, or because they attracted too little usage to justify their upkeep. Maintaining each one is generally a minor burden, though collectively the burden is larger. Their true cost arises in emergencies, or when they finally become so obsolete that they block wider work, and that cost is borne mostly by ad hoc teams.
  • (T158181) Consolidate our mobile and desktop Web experiences
    • (T65504) Consolidate our mobile and desktop media overlay
  • (T49145) Drop jQuery UI
    • (T100270) Migrate code using jQuery UI to Codex
  • (T332022) Undeploy StructuredDiscussions (Flow)
    • 🟡 Migrate uses of StructuredDiscussions to regular discussion pages
  • (T350164) Undeploy LiquidThreads
    • 🟡 Migrate uses of LiquidThreads to regular discussion pages
  • Undeploy EasyTimeline
    • (T137291) Migrate uses of EasyTimeline to Graph extension or similar
      • 🟡 Fix the Graph extension
  • (T347972) 🟡 Undeploy MachineVision
  • (T161553) Undeploy OpenStackManager
  • (T290759) Undeploy VipsScaler
  • (T344534) Undeploy wikihiero
  • (T318522) Reduce complexity and brittleness of PageTriage
    • (T340117) Rewrite PageTriage toolbar in Vue
  • (T277883) Reduce complexity and brittleness of FlaggedRevisions
  • (T37704) Move use of inline style attributes to TemplateStyles
  • (T306043) Replace api.php as a user-facing front-end with the API Gateway
  • Migrate use of SpamBlacklist into AbuseFilter and undeploy
  • Migrate use of TitleBlacklist into AbuseFilter and undeploy
  • (T224922) Undeploy Collection extension and PDF render service
  • Remove the use of $wgRawHtml on donatewiki/internalwiki/thankyouwiki/collabwiki
  • [Controversial] Switch from having four-ish template systems (simple transclusion, ParserFunctions, Wikifunctions, Scribunto, plus a number of ad hoc partial re-implementations used for UX messages and in client-side JS) to fewer-than-that
  • (T125073) Be able to rename wikis where they use old or inaccurate language terms.
    • (T117845) "Serbian as spoken in Ecuador", (T36217) Emilian
    • Migrate content from Wikidata that uses old or inaccurate language terms
  • Migrate use of core parser functions and the ParserFunctions extension to Scribunto or Wikifunctions and undeploy

Keep our technology stack current to keep up deployability, avoid security issues, and retain pace

XYZ
  • Migrate production from Debian bullseye to bookworm
    • (T291916) Migrate from Debian buster to bullseye
  • Migrate production to PHP 8.2 or newer
    • (T319432) Migrate production from PHP 7.4 to PHP 8.1
    • 🟡 Fix critical tools to work in PHP 8.3+
      • 🟡 (T353362) Get CI to test and enforce PHP 8.3 compatibility
        • 🟡 Fix Wikimedia-production code to work and pass tests in PHP 8.3
    • (T314099) Remove the use of dynamic properties
  • (T189767) Migrate Scribunto to a more recent/supported Lua (5.4?)
  • (T364779) Migrate production Node services to Node 20
    • (T349118) Migrate production Node services to Node 18
      • (T308371) Migrate production Node services to Node 16
        • (T306995) Migrate production Node services to Node 14
          • (T290750) Migrate production Node services to Node 12
            • (T210704) Migrate RESTBase to Node 10+, or finally drop it from production
  • 🟡 Replace the Graph extension

Complete major feature development and land in a "good" state that is used and supported

There are several major, partially-completed technical changes which should be disposed of in some form. This might be to complete the original vision, park them in a more complete form, or decide they will never happen.

Work on new technical improvements for planned / desired future features / capabilities

There are a number of proposed technological changes that would enable significant user-facing features and/or capabilities, but are not currently resourced.
  • Provide a Server-Side Rendered form of Codex for server-built pages and non-JS clients
  • (T282585) Asynchronous Content Fragments
  • Switch wikitext parsing into "incremental" mode, significantly reducing load and increasing speed
    • (T114445) Provide for (and then require) templates to be balanced DOM nodes
      • (T114432) Provide for easier ways to use templates for complex contents e.g. discussion wrappers
  • Export MediaWiki's language system as a library in multiple stacks, to power language tools across the Web
  • Replace MediaWiki's language converter with industry-standard tools (reuse of tools known by academic linguists and/or libicu/OpenCC and/or an NLP/machine-translation-based approach, glossaries in MCR, etc.)
  • Progressively alter wikitext for consistency
    • Reduce/lint away corner cases to enable a reasonable spec
    • (T204370, T204283, T204371) Consistency between parser functions, magic words, and extensions
    • (T204366) Better varargs for templates
  • Provide a new wikitext syntax more suitable for discussions
  • [Controversial] (T90914) Provide a new wikitext syntax for semantically including illustrations, to replace the current directive-based one.
  • [Controversial] Global Templates
  • (T149667) Provide readers and/or editors the ability to author/use annotations on articles, enabling inline comments, pronunciation markup for text-to-speech (Wikispeech), translation correspondences, citation regions, etc.
  • Mark the type of MediaWiki messages (wikitext, plain text, raw HTML, …) to ensure they are authored and used appropriately.
    • Security-relevant with respect to stronger checks for double-escaping, use of raw HTML, etc.
  • [Controversial] Unify databases between projects to make cross-wiki initiatives easier, for example dependency tracking between updates in multiple language versions of the same article.
  • [Controversial] (T113004) More flexible revision mechanism in core to integrate flagged revisions, unsaved stashed edits, proposed edits in the Draft namespace, storing unresolved edit conflicts, session checkpointing, etc.