Parsoid/Deployments/2015

From mediawiki.org

Wednesday, Dec 16, 2015 around 1:25 pm PT: Yes Deployed 64029e12

  • T86271: Serialize <link>s on own line always
  • T121174: Carry over paramInfo when unpacking DOMFragment
  • Use Node.replaceChild() instead of delete/insert
  • Non-functional changes:
    • Fix stale paths to core-upgrade.js
    • T108140: Automate some of the tedium of manual regression testing

Monday, Dec 14, 2015 around 1:15 pm PT: Yes Deployed df3171e6

  • Use babybird as the underlying Promise implementation
    • This is faster than the currently used implementation provided by core-js.
    • We have seen a 30% slowdown in WTS performance since the async WTS version was deployed on Dec 9th.
  • Tweaks to the resource limits enforcing code

Friday, Dec 11, 2015 around 4:20pm PT: Yes Deployed ebd62ab5

  • T120972: Introduce configurable wt2html/html2wt resource limits

Thursday, Dec 10, 2015 around 2:25pm PT: Config change deployed

  • Config change: reduce request time out to 3 mins (from 4 mins earlier)

Wednesday, Dec 9, 2015 around 1:35pm PT: Yes Deployed a0c626e4

  • T107818: Record first wikitext node in multi-template-content-block scenario
  • T104032: Fix html2wt newline constraints for paragraphs
  • T115720: Refactor WTS to be async
  • Bunch of code cleanup:
    • Cleanup in WikiConfig and parser environment constructors
    • Cleanup list handling in the serializer

Monday, Dec 7, 2015 around 1:15pm PT: Yes Deployed 4a7df427 (cf0b9ef + cherry-pick of d65debd)

  • T115717: Strip trailing <nowiki />s
  • Update core-js to v1.2.6 and prfun to v2.1.2
  • Consolidate setting separators into a method to ensure consistent updates of SOL state

Wednesday, Nov 18, 2015 around 1:15 pm PT: Yes Deployed e0a4fc91

  • T115327: Log errors passed along in express
  • T118715: Improvements to broken attribute parsing in self-closing tags
  • Non-functional changes
    • T93974: Allocate native extension objects once per doc
    • Removed dead code (Remove unnecessary indent pre stripping for refs)

Monday, Nov 16, 2015 around 1:15 pm PT: Yes Deployed 3a6f3b9e

  • T118462: Support the newer scrub_wikitext form as well
  • T53444: Strip <br>s from headings via new HTML normalization routine

Thursday, Nov 12, 2015 around 9:25 am PT: Yes Deployed 392e25eb

  • T118367: Kill dead code + fix bad perf in pathological scenarios.

Wednesday, Nov 11, 2015 around 1:15 pm PT: Yes Deployed 7ca999c1

  • Remove api/server.js symlink to bin/server.js (no longer needed since the puppet patch updating paths has been merged and deployed)
  • T88827: Provide srcset attribute for images
  • T117566: Optimize insertion of transclusion shadow metas -- these metas are added for detecting fostered content from transclusions. This set of patches greatly reduces the volume of these meta tags and improves performance on a subset of pages that would previously take too long and cause timeouts.
  • When a template range is expanded to include a table, expand it to include fostered content from it.
  • Code cleanup in template wrapping + removal of some potentially edge case bug scenarios.

Wednesday, Nov 4, 2015 around 1:15 pm PT: Yes Deployed 04893a18

  • Reduce logging volume for empty/li entries + turn off logging for empty/tr entries
  • Put express in production mode by default (enables view caching)
  • Non-functional changes: Code cleanup of the wikitext serializer

Monday, Nov 2, 2015 around 1:25 pm PT: Yes Deployed f0d77afc

  • T115464: Add ability to sample log requests
  • Log template names that produce stripped empty elements
  • Fix sol handling in separators
  • Update DU.hasDiffMarkers helper
  • Non-functional changes: Reorganization of the Parsoid code repo + code cleanup.

Monday, Oct 26, 2015 around 1:50 pm PT: Yes Deployed 660c59a9

Wikitext -> HTML fixes

  • DSR: Fix bugs in LTR propagation + fix buggy tests in DOMUtil helpers. This fix eliminates O(n^2) behavior in some cases.
  • Fix OOM issue: our old favourite (exp*)+ (cherry-picked to production on Oct 19)
  • An inline_break is a fine way to end a list

HTML -> Wikitext fixes

  • nowiki escaping: Reduce use of fullWrap scenarios

Other fixes

  • Remove forked _http_agent.js
  • Move stack suppression to the logger
  • Remove some dead code from parser.defines
  • Improve ApiRequest logging
  • T115185: Graph worker exit code / signals (cherry-picked to production on Oct 19)

Monday, October 19, 2015 around 1:20 pm PT: cherrypicked b317f33f and 60a82ae0

  • T115072: Fix out-of-memory parse errors on some pages (regression since deploy of 44d657de on Wednesday, August 26, 2015)
  • T115185: Graph worker exit code / signals

These patches are being cherry-picked since master is not currently in deployable state.

Thursday, October 8, 2015 around 1pm PT: 998db843 to be deployed

  • T86271: Serialize <link>s on own line always (affects newly added categories, magic words, and <*include*> directives).

Continues to be postponed since this deploy is dependent on a patch that needs review and testing. We have been backlogged because of parsing team offsite, vacations, quarterly planning and reviews. This should get unblocked this week.

Thursday, October 1, 2015 around 1:30 pm PT: Yes Deployed 62971510

  • Set Main_Page as the default page name if none is provided in API requests.

This cuts down the errors showing up in Kibana: hundreds of thousands of errors arrived in 3-4 bursts over the last 2 days.

Wednesday, September 30, 2015 around 1pm PT: Yes Deployed 39c60c67

  • T114185: Support body_only parameter in v3 API.
  • Minor fix to WTS nowiki-ing of links whose hrefs could be magic links but whose content isn't appropriate.
  • T113666: Terminate autolinks on double or triple quotes
  • T84937: Terminate autolinks on &nbsp; and numeric entity encodings of <>

Tuesday, September 29, 2015 around 9:15 am PT: Turned on use of ParsoidBatchAPI in production

  • Expected to reduce Parsoid's load on the Mediawiki API cluster
  • Expected to improve parse latencies
  • Improves image handling in some scenarios (T112631, T112045)

Monday, September 28, 2015 around 1:45 pm PT: Yes Deployed b9e5244e

  • Update request to 2.63
  • T113206: Fix batch retries
  • T105413: Do not allow data-ooui attributes in wikitext
  • Turn on use of ParsoidBatchAPI in production
    • Expected to reduce Parsoid's load on the Mediawiki API cluster
    • Expected to improve parse latencies
    • Improves image handling in some scenarios (T112631, T112045)

Wednesday, September 23, 2015 around 1:45 pm PT: Yes Deployed 6619409e

  • Count non-200 http status codes in the API (will show up in grafana)
  • Log 4xx API responses in Kibana
  • T113044: Render default part of parameters at the top level
  • A bit of bonus cleanup in the tokenizer
  • T112631: Attempt to match tpl(arg) brace precedence
  • T111151: Drop <font> tags without attributes if scrubWikitext=true

Monday, September 21, 2015 around 1:25 pm PT: Yes Deployed 9984d221

  • T31919: Update parsoid sitematrix (et.wikimedia.org -> ee.wikimedia.org and other sitename updates)
  • T111213, T111225: Release version 0.4.1
  • T112686: Use a timer to ensure forward progress in batched dispatches (fixes bug in use of batching API which is not enabled in production)
  • T112668: Fix denial of client-side upscaling in thumb and frameless format (primarily related to batching API, but also some thumbnail scaling fixes in the non-batching API usecase)

Monday, September 14, 2015 around 1:15 pm PT: Yes Deployed 3d5f4359

Bunch of edge-case tweaks and fixes to parsing of attributes in tables (rows, cells, table) -- improves compatibility with PHP parser output:

  • Pop tableCellArg before parsing template args
  • T95131: Content on table start / row is all attributes
  • Remove single_cell_table_args
  • Match broken attribute parsing with the PHP parser
  • Handle broken_table_attribute_name_char in table_attributes (improves handling of broken table attributes T51839, T95131, T93769)

Other wikitext -> HTML fixes:

  • TSP: Retokenize tokens that get converted to strings
  • Handle [[[Foo]]] and [[[[Foo]]]] properly

HTML -> wikitext fixes:

  • Move popping EOFTk inside tokenizeStr
  • Nowiki escaping: Process multi-line text nodes line-by-line

Other fixes:

  • Log the signal, if available, when a Parsoid worker exits
  • T111092: Batching API use (not yet enabled in production): Fix totally broken interpretation of parse batch response

Wednesday, September 9, 2015 around 1:15pm PT: Yes Deployed ffd0b444

npm dependency tweaks to eliminate version variability in installed packages:

  • Shrinkwrap npm dependencies
  • Bump several dependencies to what's in production
  • Prefer tilde ranges in package.json

Logging and error reporting fixes:

  • Downgrade duplicate id warnings
  • DOMDiff: Use more descriptive error prefixes
  • Improved Mediawiki API error reporting for ease of debugging

Wikitext -> HTML fixes:

  • T93580: Handle <ref>s in inline image captions

HTML -> wikitext fixes (specifically nowiki escaping code):

  • Fix logic in hasWikitextTokens when asking for linksOnly

Other:

  • T111818: Update sitematrix.json for be-tarask and affcom wikis

Wednesday, September 2, 2015 around 1:15pm PT: Yes Deployed 5f2fae6c

  • T110692: Massage batching API imageinfo width/height to numbers
  • Tabs are preventing nowiki pre protection
  • Consolidate test to determine if separator introduced SOL
  • Implement Sanitizer's escapeId

Monday, August 31, 2015 around 1:20pm PT: Yes Deployed c3e4df5e

  • T110037: WTS support for localized ISBN magic links
  • Be careful about using tsr in tokens/x-mediawiki phase
  • Don't ignore errors in extension parsing
  • T23261: Support IPv6 addresses in URLs
  • Drop bad extension HTML and continue html2wt instead of returning HTTP 500
  • Let the OS randomize ports
  • Allow non-newline whitespace in RFC/PMID/ISBN autolinks

Wednesday, August 26, 2015 around 1:10 pm PT: Yes Deployed 44d657de

  • Fix the profile quoting in our content type strings (currently in production via a cherry-picked deploy on Tue, Aug 25)
  • T110206: Fix a couple of regexps in tokenizer
  • T110206: Fix html2wt crasher on eswiki:Usme
  • T110206: Fix pathological backtracking regexp
  • T100680: Implement Parsoid v3 API (and add test suite)
  • Several cleanups and improvements to attribute parsing in the tokenizer
    • Improve broken attribute heuristics
    • Cleanup _att_value rules
    • Remove resetting the parse position
    • Move location of tokenizing tags in attributes

Tuesday, August 25, 2015 around 3:00pm PT: Yes Deployed c3b037b0 (cherry-pick of 437cac80)

Tuesday, August 25, 2015 around 1:10 pm PT: Yes Deployed 759916fc

  • T64326 : Upgrade express to 4.x from 2.x, use connect-busboy and upgrade other dependencies
  • Finish up fixing profile values in all content-type strings

Monday, August 24, 2015 around 1:15pm PT: Yes Deployed 0b2fbae7

  • serializeChildrenToString shouldn't clobber sol state
  • Allow configuration of the "domain" separate from the MW API URL
  • Deprecate "prefix" parameter of setMwApi/removeMwApi
  • Match separator heuristic to its description
  • Quote the profile in our content type strings

Thursday, August 20, 2015 around 1:30pm PT: Yes Deployed db6e6404

  • T109686: Fix crasher in normalizer
  • Use rel="mw:WikiLink" for ISBN magic links
  • T109371: Protect RFC/PMID/ISBN magic links with <nowiki> during WTS
  • Bracketed links must have at least one valid character after protocol
  • T109358: Escape serialized nowiki DOM elements

Wednesday, August 19, 2015 around 1:15pm PT: Yes Deployed 8d617c99

  • Followup to T93580 fix: Save data-attribs in DOMs of nested refs (improves serialization and editability)
  • T93580: Fix buggy regexp in strip meta tags DOM pass
  • T106945: Bare protocols are not autolinks
  • T107474: Fix <nowiki> escape of | in image captions
  • T78425, T108563: Fix WTS of autolink-like text after [^W]
  • T45888: Batch MW parser and imageinfo API requests (batching disabled currently -- will be enabled once the batching extension is deployed and we test latency impacts).
  • Code cleanup:
    • Remove special case in nowiki serializing
    • WikiConfig: remove dead code for hasValidProtocol / findValidProtocol
    • Convert bugzilla references in source code to phabricator references.
    • Documentation updates

Monday, August 17, 2015 around 1:15pm PT: Yes Deployed 4b656b72

  • T108563: fix WTS of autolink-like text after [^W]
  • Allow ISBNs which end with a lowercase `x`
  • Support bitcoin:, redis:, urn:, xmpp:, etc protocols (part 2)
  • Newlines in html table attributes are valid
  • Normalizer: Tweaks to <td> escapable prefix normalization
  • Normalizer: Deal with "chameleon node" effect as in 7608aeab
  • WTS: Strip spans added for misnested a-tags
  • Other fixes: documentation, testing related code updates, code cleanup

Wednesday, August 12, 2015 around 1:40pm PT: Yes Deployed a271c205

  • T93116, T107774, T108137: Run normalization after dom-diff to handle edited content
  • Normalizer: Do not suppress numbered extlinks
  • T108776: Unbreak Parsoid on wikitech

Monday, August 10, 2015 around 1:10pm PT: Yes Deployed 7b554ce2

  • T50958, T107435: Parse non-block image caption all the way to Parsoid DOM
  • T95730: Scrub empty anchors
  • Support bitcoin:, redis:, urn:, xmpp:, etc protocols
  • DOMDiff: Get rid of 'modified' diff marker - reduces dirty diffs by improving reusability of original wikitext during serialization.
  • HTML pres should permit newline attributes
  • T108216: Disable pre_indent_in_tags rule for now
  • Check for null nodes in DOM helpers that test for node type

Wednesday, August 5, 2015 around 2:40pm PT: Yes Deployed cherry-picked hotfix ba49b80b

  • Check for null nodes in DOM helpers that test for node type -- should fix crashers on saves to VE edits that involved empty table cells.

Wednesday, August 5, 2015 around 1:25pm PT: Yes Deployed d5a5722c

  • T93116: Add a space after the | char in table cells if it contains +/- as the first char (fix for new table cells only)
  • Normalize links that end in spaces to prevent nowikis
  • T107652: Don't strip <ref> span tags in templated <td>-attr scenarios

Monday, August 3, 2015 around 1:25pm PT: Yes Deployed 38d0cdb1

  • Enforce single-line context for definition lists
  • T65642, T76377: Additional scenarios dealing with treebuilder fixup
  • T107622: <nowiki> tags don't properly protect table-related content
  • Remove smart nowikier
    • nowiki wrappers are now added around smallest string (instead of trying to minimize nowiki additions).
    • Addresses comments like this and others in the past.
  • Update sitematrix.json
    • Fetched latest changes in wiki configs - gom, lrc, azb wikis added + TLS added to most urls
  • Update domino to 1.0.19

Wednesday, July 29, 2015 around 1:30pm PT: Yes Deployed 6e095a92

  • Move sol transparent link hoisting behind scrubWikitext (since VE is now passing in that API flag)
  • Disable single-line wikitext mode in selser in the same places as in non-selser serialization
  • T104554: Prevent nowiki protection around leading whitespace in paragraphs by deleting that whitespace.

Monday, July 27, 2015 around 1:30 pm PT: Yes Deployed 92f1cd6d

  • Bug fix stripping indent-pre nowikis in scrubWikitext mode

Wednesday, July 22, 2015 around 1:15pm PT: Yes Deployed 6befc44e

  • T104502: Redirects no longer create categories
  • T104918: Fix redirects to non-local targets
  • T103364: Edited autolink-like text becomes an autolink
  • T105997: Fix crash on __proto__
  • Escape data-mw as well as data-parsoid in tokenizer
  • Refactor comment regexp into a constant and reuse everywhere
  • Use the new fork of PEG.js master
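
For context on the `__proto__` crash fix listed above, here is a minimal sketch of the general hazard, not Parsoid's actual code. The `countKeys` helper is hypothetical. It assumes keys come from arbitrary page content; a `Map` treats `__proto__` like any other key, whereas a plain object literal would route the assignment to the accessor inherited from `Object.prototype`.

```javascript
'use strict';

// Hypothetical helper: counts occurrences of arbitrary user-supplied keys.
// A Map is used because a plain {} inherits the `__proto__` accessor from
// Object.prototype, so obj['__proto__'] = 1 would not store an own property.
function countKeys(keys) {
  const counts = new Map();
  for (const key of keys) {
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  return counts;
}

const counts = countKeys(['title', '__proto__', '__proto__']);
console.log(counts.get('__proto__')); // 2
```

An object created with `Object.create(null)` would be another option, since it has no inherited accessor to collide with.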

No deployments week of July 13 - 17th

Parsoid deployments paused this week because of Wikimania. Only emergency cherry-picks, when required, this week.

Wednesday, July 8, 2015 around 1:25pm PT: Yes Deployed c4cfc527

  • Scrub empty styles tags (if scrubWikitext API param is enabled)
  • Scrub whitespace at the start of paragraphs (if scrubWikitext API param is enabled)
  • Disentangle versioned APIs
  • T102117: Improve validating dp in the api
  • Remove old-style url redirects
  • Tweak td-fixup dom pass to handle some unhandled scenarios
  • Generate <head> only for the final document

Monday, July 6, 2015 around 2:10pm PT: Yes Deployed 87a746e6

  • Bump HTML version because of cite html changes
  • T86782 Use CSS to style Cite references

Monday, June 29, 2015 around 1:20pm PT: Yes Deployed ea98be88

  • Suppress newlines before category links + Don't swallow newlines & categories into last <li> of a list (Fixes T95988, related to T2087).
  • T96673: Serialize new display space hacks.

Monday, June 22, 2015 around 1:25pm PT: Yes Deployed d488783e

  • T91411: Tokenizer incorrectly parses a "!!" inside a HTML <td> cell as a <th>
  • Newlines in comments shouldn't affect SOL state
  • Give nested blocks a chance to break on end delimiters
  • Only normalize new nodes
  • T94723: Fix serialization of `mw:WikiLink` which use absolute URLs
  • T69540: Include RL style modules from parser functions in <head>
  • Use DOMTraverser instead of DOMUtils.traverseWithTplOrExtInfo
  • Further tests and fixes to SOL behavior switches
  • Make tokenizer errors be more vague
  • Remove the last use of peg$FAILED from the PEG grammar
  • Eliminate the possibility of expansion reuse for private routes

Wednesday, June 17, 2015 around 1:21pm PT: Yes Deployed 402ddf66

  • Don't stop on "!!" in templates
  • More cleanup in the tokenizer
  • T97430: Ignore marker meta tags during nowiki escaping
  • Refine DSR algo to use end-tag width info in the right context
  • Fix bug in computation of end-tag widths of wikitext constructs
  • Update sitematrix to include cnwikimedia
  • T99802: Don't prevent fostering of meta tags in our DOM spec
  • T102117: Return 400 if the passed in data-parsoid is empty

This is a repeat of Monday's postponed deploy.

Monday, June 15, 2015 around 1:15pm PT: to be deployed 402ddf66 (cancelled)

  • Don't stop on "!!" in templates
  • More cleanup in the tokenizer
  • T97430: Ignore marker meta tags during nowiki escaping
  • Refine DSR algo to use end-tag width info in the right context
  • Fix bug in computation of end-tag widths of wikitext constructs
  • Update sitematrix to include cnwikimedia
  • T99802: Don't prevent fostering of meta tags in our DOM spec
  • T102117: Return 400 if the passed in data-parsoid is empty

We couldn't perform pre-deploy checks on the beta cluster since VisualEditor was broken there. Postponing deploy to Wednesday.

Monday, June 8, 2015 around 1:15pm PT: Yes Deployed 131554ba

  • More thorough job of stripping unneeded data-parsoid from templated content
  • Code cleanup and improvements in PEG tokenizer
  • Minor code refactoring in serializer and template encapsulation code

Saturday, June 6, 2015 around 4:40 PT: 5172a446 (cherry-pick of 719c736f) deployed as a hotfix

  • T101599: Don't hoist category links out of headings when they come from templates

Wednesday, June 3, 2015 around 1:15pm PST: Yes Deployed ab675400

  • Be more careful about which MW API warnings we suppress
  • T97386: Make behavior switches SOL transparent
  • API: If "wt" parameter is passed in, set it as the page source unconditionally
  • T100225: DOM normalization: Move meta-tag hoisting from core serializer to DOM normalization pass
  • DOM normalization: Merge adjacent <a> tags with identical attrs

Monday, June 1, 2015 around 1pm PST: Yes Deployed 73445bfd

This is the same as the previous deploy attempt:

  • T73161: Support subst: of transclusion blocks in the parseFragment API endpoint
  • DOMDiff: For <ref> id properties in data-mw, fetch HTML and compare DOMs to detect edits to <ref>s without requiring clients to dirty the <ref> nodes
  • DOMDiff: Improve robustness of data-mw diff testing
  • Suppress separators in single-line context (part of T52683)
  • T86882, T87513: Make hardcoded config values configurable
  • Blank template parameters should be preserved
  • Code cleanup in mediawiki wiki config
    • Use interwikiMap, not mwApiMap, to normalize titles
    • Store apiConf as an object in the mwApiMap
    • Make proxy_strip_https into a general proxy configuration option
  • Code cleanup and fixes in mediawiki API request handling
    • Refactor request default options into ApiRequest.prototype.request()
    • Strip UTF8 BOM so that JSON.parse() doesn't throw
    • Ignore modulemessages in api=parse result
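
The BOM item in the list above can be illustrated with a short sketch. `JSON.parse()` rejects input that starts with U+FEFF, which is what a UTF-8 byte order mark decodes to. The `parseJsonBody` helper name is hypothetical and this is not Parsoid's actual request-handling code.

```javascript
'use strict';

// Hypothetical helper: parse a JSON response body that may start with a
// UTF-8 byte order mark, which decodes to the single character U+FEFF.
function parseJsonBody(body) {
  const text = body.charCodeAt(0) === 0xFEFF ? body.slice(1) : body;
  return JSON.parse(text);
}

console.log(parseJsonBody('\uFEFF{"ok":true}').ok); // true
```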

Plus two new cherry-picked patches:

  • T100696: suppress modulemessages deprecation warnings in logs.
    • A new version of mediawiki core was deployed earlier in the day which caused a spike in these warning messages. With this patch, we are suppressing all warning/api messages.
  • Fix typo in config property used for sampling heap usage
    • This should fix the outgoing network spike seen in the previous attempt.

Thursday, May 28, 2015 around 12:40pm PST: 497da30e to be deployed (Reverted)

  • T73161: Support subst: of transclusion blocks in the parseFragment API endpoint
  • DOMDiff: For <ref> id properties in data-mw, fetch HTML and compare DOMs to detect edits to <ref>s without requiring clients to dirty the <ref> nodes
  • DOMDiff: Improve robustness of data-mw diff testing
  • Suppress separators in single-line context (part of T52683)
  • T86882, T87513: Make hardcoded config values configurable
  • Blank template parameters should be preserved
  • Code cleanup in mediawiki wiki config
    • Use interwikiMap, not mwApiMap, to normalize titles
    • Store apiConf as an object in the mwApiMap
    • Make proxy_strip_https into a general proxy configuration option
  • Code cleanup and fixes in mediawiki API request handling
    • Refactor request default options into ApiRequest.prototype.request()
    • Strip UTF8 BOM so that JSON.parse() doesn't throw
    • Ignore modulemessages in api=parse result

This is the same as yesterday's attempted deploy, which we had to defer due to T100439.

Reverted after observing an outgoing network traffic spike on our canary deploy machine (wtp1001), initially suspected to be due to a stats or logging misconfiguration. The cause was a typo in the parsoid-config property that sets the heap usage sample interval: instead of sending heap usage samples every 5 mins, Parsoid sent samples continuously, which produced the spike seen on wtp1001.
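
A minimal sketch of how a service could guard against this class of typo follows. It assumes, without confirmation from the source, that the misspelled property simply read back as `undefined`; Node.js timers treat a non-numeric delay as the minimum delay, which would explain near-continuous sampling. The `requireInterval` helper and the `heapUsageSampleInterval` property name are hypothetical, not Parsoid's actual config schema.

```javascript
'use strict';

// Hypothetical helper and property name; not Parsoid's actual config schema.
// Returns a validated interval in milliseconds, or throws if the property
// is missing (for example because of a misspelled key) or not positive.
function requireInterval(config, key) {
  const value = config[key];
  if (typeof value !== 'number' || !Number.isFinite(value) || value <= 0) {
    throw new Error('Invalid or missing interval for "' + key + '"');
  }
  return value;
}

const config = { heapUsageSampleInterval: 5 * 60 * 1000 };
const intervalMs = requireInterval(config, 'heapUsageSampleInterval');
console.log(intervalMs); // 300000
```

Failing at startup on a missing key turns a silent misconfiguration into an immediate, visible error.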

Wednesday, May 27, 2015 around 1pm PST: 497da30e to be deployed (Cancelled)

  • T73161: Support subst: of transclusion blocks in the parseFragment API endpoint
  • DOMDiff: For <ref> id properties in data-mw, fetch HTML and compare DOMs to detect edits to <ref>s without requiring clients to dirty the <ref> nodes
  • DOMDiff: Improve robustness of data-mw diff testing
  • Suppress separators in single-line context (part of T52683)
  • T86882, T87513: Make hardcoded config values configurable
  • Blank template parameters should be preserved
  • Code cleanup in mediawiki wiki config
    • Use interwikiMap, not mwApiMap, to normalize titles
    • Store apiConf as an object in the mwApiMap
    • Make `proxy_strip_https` into a general proxy configuration option
  • Code cleanup and fixes in mediawiki API request handling
    • Refactor request default options into ApiRequest.prototype.request()
    • Strip UTF8 BOM so that JSON.parse() doesn't throw
    • Ignore modulemessages in api=parse result

Because of T100439, we cannot currently verify the deploy by examining wikitext diffs of VE edits to confirm that nothing broke. Parsoid deploys are paused till that ticket is resolved and a patch is deployed to production.

Wednesday, May 20, 2015 around 1pm PST: Yes Deployed 8ed6fd0b

  • T94509: Add mw:DisplaySpace to typeof for nbsp before colon
  • T96279: Provide section-offsets for immediate children of <body> to support section editing in VE and other clients

Monday, May 18, 2015 around 1:10pm PST: Yes Deployed 8ed3e503

  • T93824: Put escaped HTML tags inside <nowiki>
  • T96923: html2wt should not need access to original source
  • Restore speedy non-selser serialization
  • Don't use selser if oldid is missing

Wednesday, May 13, 2015 around 1:25pm PST: Yes Deployed a8108fe6

  • T96090: Allow quotes as template targets
  • Normalize empty headings only if they are newly inserted content
  • A bunch of code cleanup patches (including some refactoring of server configuration)

Monday, May 4, 2015 11:44am PST: Yes Deployed b53a7272

  • Avoid deep freezing some parsoidConfig properties
    • This patch fixes the bug that prevented the Parsoid service from starting up in production, which caused the revert on Wednesday, April 29
  • Ensure that embedded Maps and Sets are properly deep-frozen
  • Freeze parsoidConfig to avoid shared mutable state
  • Remove uri fallback when switching wiki configs
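
The deep-freezing items above touch on a JavaScript subtlety: `Object.freeze()` locks an object's properties, but the entries of a `Map` or `Set` live in internal slots and stay mutable. The following is a minimal sketch of one way to handle that, assuming a hypothetical `deepFreeze` helper; it is not Parsoid's actual implementation.

```javascript
'use strict';

// Hypothetical helper for illustration; not Parsoid's actual implementation.
function deepFreeze(value, seen = new WeakSet()) {
  if (value === null || typeof value !== 'object' || seen.has(value)) {
    return value;
  }
  seen.add(value); // guards against cycles

  if (value instanceof Map || value instanceof Set) {
    // Object.freeze() alone does not stop set()/add()/delete()/clear(),
    // because entries live in internal slots rather than in properties.
    const reject = () => { throw new TypeError('collection is frozen'); };
    for (const name of ['set', 'add', 'delete', 'clear']) {
      if (typeof value[name] === 'function') {
        Object.defineProperty(value, name, { value: reject });
      }
    }
    for (const entry of value.values()) {
      deepFreeze(entry, seen);
    }
  } else {
    for (const key of Object.getOwnPropertyNames(value)) {
      deepFreeze(value[key], seen);
    }
  }
  return Object.freeze(value);
}

const frozen = deepFreeze({ limits: { maxDepth: 40 }, wikis: new Map([['enwiki', {}]]) });
console.log(Object.isFrozen(frozen.limits)); // true
```

Shadowing the mutator methods only blocks ordinary calls such as `frozen.wikis.set(...)`; code that invokes `Map.prototype.set.call(...)` directly can still mutate the collection, so this is a guard against accidental sharing rather than a hard guarantee.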

Wednesday, April 29, 2015 around 1pm PST: 45b54f63 to be deployed (Reverted)

  • Freeze parsoidConfig to avoid shared mutable state
  • Remove uri fallback when switching wiki configs

See outage report for more details.

Monday, April 27, 2015 around 1pm PST: Yes Deployed ebdac59b

  • T97207: Forward the X-Request-ID header
  • T97204: Exponentially increase the request timeout
  • Reduce API concurrency and retries (to deal with overload on API cluster)
  • Don't strip \r in API routes
  • Remove redundant \r handling
  • Upgrade to prfun 2.0.0 and smash the global Promise
  • Performance: Use core-js/shim instead of es6-shim
  • A lot of code cleanup
    • This includes bcea0ab0 which is a fix for the cleanup patch 915ea3f6 which was causing last week's corruptions.
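
As an illustration of the "exponentially increase the request timeout" item above, here is a minimal sketch. The source does not state the base timeout, growth factor, or cap, so the defaults below and the `timeoutForAttempt` name are assumptions.

```javascript
'use strict';

// Hypothetical values: the source does not state the base timeout,
// growth factor, or cap that Parsoid uses.
function timeoutForAttempt(attempt, baseMs = 1000, factor = 2, maxMs = 60000) {
  return Math.min(maxMs, baseMs * Math.pow(factor, attempt));
}

for (let attempt = 0; attempt < 4; attempt++) {
  console.log(attempt, timeoutForAttempt(attempt)); // 1000, 2000, 4000, 8000
}
```

Giving each retry more time than the last lets a slow upstream eventually answer; the same changelog entry also reduces concurrency and retries so that an overloaded API cluster is not hit harder.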

Saturday, April 25, 2015 around 8:25 am PST: Yes Deployed fca17070 (cherry-pick of d2135c6b on parsoid master)

Cherry-picked "Reduce API concurrency and retries" from parsoid master to reduce the number of retries and the concurrency level with which the Mediawiki API is hit.

Friday, April 24, 2015 around 12:50 pm PST: Reverted deploy to 3311936a

Thursday late night deploy reverted due to corruptions reported.

See outage report for more details

Thursday, April 23, 2015 around 11:45pm PST: Yes Deployed d2135c6b

This was meant to be an emergency deploy of one patch but unintentionally deployed all changes from master.

  • Reduce API concurrency and retries (to deal with overload on API cluster)
  • Don't strip \r in API routes
  • Remove redundant \r handling
  • Upgrade to prfun 2.0.0 and smash the global Promise
  • Performance: Use core-js/shim instead of es6-shim
  • A lot of code cleanup

Wednesday, April 22, 2015 around 1:05pm PST: Yes Deployed 3311936a

  • T95794: Enforce <pre> for all lines when escaping wikitext
  • Fix base href on _rt routes
  • Accept scrubWikitext as a query parameter

Monday, April 20, 2015 around 1pm PST: Yes Deployed 0cabb5b2

  • T94867: Suppress empty headings if scrubWikitext param is provided
  • Add a scrubWikitext param to the API to (optionally) apply normalizations that won't roundtrip
  • T93368: Fix crasher seen in production
  • T96197: <ref> marker metas should remain fosterable
  • Log uncaught exceptions in Parsoid service
  • Edge case bug fix in migrateTrailingNLs DOM pass (for example, in en:SM U-66)
  • Other code cleanup that doesn't affect functionality

Wednesday, April 15, 2015 around 1:20pm PST: Yes Deployed ac7a01b9

  • Bug fix serializing nested refs (would refuse to save because of missing <ref> content)
  • Bug fix in selser tests that sometimes normalized element attributes unnecessarily
  • Handle empty content string ("") returned by the API
  • Normalize DOM before running DOM-Diff
  • Fix findFirstEncapsulationWrapperNode -- eliminates dirty diffs in some edge case scenarios
  • Other code cleanup that doesn't affect functionality

Monday, April 13, 2015 around 1pm PST: 8f35374d (skipped)

  • Bug fix serializing nested refs (would refuse to save because of missing <ref> content)
  • Bug fix in selser tests that sometimes normalized element attributes unnecessarily
  • Handle empty content string ("") returned by the API
  • Normalize DOM before running DOM-Diff
  • Other code cleanup that doesn't affect functionality

Deploy postponed because beta cluster is down and it is not possible to verify this in beta cluster beforehand.

Wednesday, April 8, 2015 around 1pm PST: Yes Deployed a76bd8a3

  • T94599: <a> tags with invalid hrefs should serialize to text
  • T95039: use HTML entities to encode/decode arbitrary data in comments (see also T95040).
  • T94053: Switch from TXStatsD to statsite metrics.

Other changes:

  • Various code style tweaks and clean ups.

Monday, April 6, 2015 around 1pm PST: Yes Deployed d5aa726eb

  • T94055: Normalize comments so that Parsoid output is valid XML
  • Edge-case fix for hoisting embedded <link>s from headings
  • T94799: Preserve querystring params while redirecting
  • Don't serialize <a> tags as <a> ever
  • T93973: Remove state from Cite extension
  • Log with supplied x-request-id header

Monday, Mar 30, 2015 around 1pm PST: Yes Deployed 29a5dafb

  • Skip link validity tests for strings that won't be used as hrefs: Eliminates erroneous "bad title text" logging messages
  • cleanupAndSaveDataParsoid should be done in its own pass: Fixes incorrect HTML generated in <li>-hack scenarios when v2 API is used
  • Replace duplicate ids in wikitext: Allows Parsoid to handle pages with duplicated ids without corrupting them on serialization (T93739 is an instance of this)
  • T93926: Never serialize a-tag as html
  • T64881: Add original dimension information for images.
  • T93839: Normalize wikilink targets to strip leading "./"
  • T63165, T93715: Ensure reference index is reset at the end of document
  • Use tokenizer info to fix/cleanup tdFixups dom pass

Wednesday, Mar 25, 2015 around 1pm PST: Yes Deployed 0313fcc7

  • T87069: Pop comments from the end of table tag attributes
  • Strip out X-Parsoid-Performance headers and associated code -- no longer useful since Parsoid now sends lots of metrics to statsd
  • Bug fix setting TSR in defn lists - fixes DSR inconsistency warnings
  • T93369: Nulls in DSR computation should not be coerced to 0
  • Edge case fix for definition lists: Only return colon when ignoring in tags
  • Associate data-parsoid with duplicated ids (copy-paste in VE can introduce duplicate element ids)
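
The T93369 item above concerns a general JavaScript pitfall: in arithmetic, `null` silently coerces to 0, so an unknown offset can masquerade as a real one. A minimal sketch of keeping "unknown" distinct from 0 follows; the `rangeWidth` helper is hypothetical and is not Parsoid's DSR code.

```javascript
'use strict';

// In JavaScript arithmetic null coerces to 0: (10 - null) === 10.
// Hypothetical helper: width of a source range, or null when unknown.
function rangeWidth(start, end) {
  if (start === null || end === null) {
    return null;
  }
  return end - start;
}

console.log(rangeWidth(3, 10));    // 7
console.log(rangeWidth(null, 10)); // null
```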

Monday, Mar 23, 2015 around 1:25pm PST: Yes Deployed a5d7483f

  • T88081: Fix tokenizing redirect context
  • Use more specific warning labels to help sift through logs in Kibana
  • Use fatal/request instead of fatal when we can't serialize a <ref> ( https://gerrit.wikimedia.org/r/198176 ). This should send a 500 response, not kill the entire worker.

Thursday, Mar 19, 2015 around 6:45 pm PST: Yes Deployed 99d1b214

  • T93228: Don't strip id attributes from DOM nodes -- required for <ref> tags
  • T73708: Serialize category redirects with a ':'
  • Additional logging to help debug Visual Editor issues

Thursday, Mar 19, 2015 around 9 am PST: Yes Deployed f5f5f0ed

  • T93228: Abort html -> wt serialization when we encounter a <ref> DOM id without a matching DOM element
  • Log errors when Parsoid-like element ids are stripped from HTML elements

Wednesday, Mar 18, 2015 around 1pm PST: Yes Deployed b48f6e25

  • Don't serialize HTML id attributes with Parsoid-like elt ids
  • T54341: Ensure that alt image option is handled properly even when it has complex wikitext
  • v2 API: Explicitly set a utf-8 charset in text content-types

Monday, Mar 16, 2015 around 1pm PST: Yes Deployed ccf4c140

  • T69850, T90028: Handle entities/nowikis in templated attributes
  • T52683: Enforce single-line context in the serializer
  • T71123: Table cells not properly parsed in an implicit-td context
  • T53961: Improve escaping and nowikiing of template arguments
  • Additional fixes to selective serializer around reusing original source in lists and list items
  • Additional instrumentation (input/output sizes, init times) of Parsoid endpoints

Wednesday, Mar 11, 2015 around 1pm PST: Yes Deployed 73bf3162

  • T88318: Fix serialization of table cells with "-" and "+" in them
  • T71482: Convert | to {{!}} in template parameters
  • T92177: Eliminate fatal assertion failures seen in production (found in Kibana)
  • T71950: Improvements to <nowiki> wrapping for strings that needed them
  • Fixes to DSR computation algo to eliminate negative DSR deltas (should eliminate the warnings seen in Kibana)
  • Updated sitematrix.json to latest changes
  • Explicitly pass rawcontinue=1 to the MediaWiki API (to eliminate deprecation warnings logged on the MediaWiki API end)
  • Log MediaWiki API warnings (so we can find and fix API deprecations in future)
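
The T71482 item above (converting | to {{!}} in template parameters) can be illustrated with a deliberately naive sketch. It assumes the parameter value is plain text with no nested links or templates, where a pipe would have to be left alone; the helper name `escapePipes` is hypothetical and this is not the serializer's actual logic.

```javascript
// Hypothetical, naive helper: replace every literal pipe in a plain-text
// template parameter value with {{!}} so it is not read as a separator.
// Assumption: the value contains no nested [[links]] or {{templates}}.
function escapePipes(value) {
  return value.replace(/\|/g, '{{!}}');
}

console.log(escapePipes('a|b'));      // a{{!}}b
console.log(escapePipes('no pipes')); // no pipes
```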

Monday, Mar 9, 2015 around 1pm PST: Yes Deployed c8370a48

  • T59910, T89627: Several fixes to serialization of lists
  • T72582: Change how LST <section>s are output

Wednesday, Mar 4, 2015 around 1pm PST: Yes Deployed 06c8cf33

  • T90517: Fix selser bugs that would occasionally lose newly added comments
  • T85782: Fix broken serialization in some scenarios after table columns are deleted
  • Fix broken performance timer code (broken in the Monday, Mar 2 deploy)

Monday, Mar 2, 2015 around 1:15pm PST: Yes Deployed 08643f53

  • T88290: Remove duplication of <ref> content in the data-mw.body.html property of <ref> tags
  • T88017: Remove more cases of data-parsoid.src from mw:Extensions
  • Memory usage reports are now generated once every 5 mins and sent to the statsd server

Wednesday, Feb 25, 2015 around 1:00 pm PST: Yes Deployed 5a3aaf71

  • Serialize new anchor links (w/o rel) as internal
  • Amend timing metrics
  • T87708: Open tags only affect line when parsing definition list colon
  • T90452: Fix nowiki escaping for <td>

Monday, Feb 23, 2015 around 1pm PST: Yes Deployed d9ac8c21

  • Workaround for T90463. (Will be reverted once citoid bug is fixed: T90479.)
  • T90309: Ensure that implicitly-added <references> output have unique ids
  • Fix serializing categories without indent-pre protection (tweaks #REDIRECT handling as well)
  • Don't crash when revision is hidden
  • T88495: Handle more templated <td>-attr scenarios
  • Tweaked naming of selser-related timing stats.
  • Enable timing stats in production (localsettings.js change).

Wednesday, Feb 18, 2015 around 1:30 pm PST: Yes Deployed 17f68256

  • T88660: Emit reflists for <ref> with no explicit <references>
  • T85232, T66171: Enable timing stats for Parsoid wt2html and html2wt requests
    • not yet enabled in production (requires change to localsettings.js)

Monday, Feb 16, 2015 around 1pm PST: Yes Deployed 86e76a30

  • T51075: Handle template-generated DISPLAYTITLE and DEFAULTSORT
  • T89383: Fix selser regression introduced in the Feb 11 deploy
  • T89411: Fix selser in v2 API (to be used by RESTBase)
  • For older MW APIs that don't provide the list of supported link protocols, default to the cached enwiki config

Wednesday, Feb 11, 2015, around 1:35pm PST: Yes Deployed 4fc3b43d

  • T88017: Remove data-parsoid.src for elts with valid data-mw and DSR info
  • T88019: Remove unnecessary <meta> transclusion tags
  • Fixes to handle high load on Parsoid cluster
    • Don't reprocess same token in AttributeExpander unless necessary (eliminates infinite loop scenarios found on some pages)
    • Fixes to make sure fatal errors more consistently force process restarts without leaving behind stuck processes
  • Categories on their own line don't need nowikis around any leading whitespace
  • Non-word characters shouldn't terminate tag names
  • T52373: Hoist categories, language links, redirects, comments out of headings when serializing them
  • T72960: Fix serializing new links with "./" in content string

Monday, Feb 9, 2015, around 1pm PST: dd98dea0 to be deployed (Cancelled)

  • T88017: Remove data-parsoid.src for elts with valid data-mw and DSR info
  • T88019: Remove unnecessary <meta> transclusion tags
  • Fixes to handle high load on Parsoid cluster
    • Don't reprocess same token in AttributeExpander unless necessary (eliminates infinite loop scenarios found on some pages)
    • Fixes to make sure fatal errors more consistently force process restarts without leaving behind stuck processes
  • Categories on their own line don't need nowikis around any leading whitespace
  • Non-word characters shouldn't terminate tag names
  • T52373: Hoist categories, language links, redirects, comments out of headings when serializing them

Deployment cancelled. We found some regressions and the fixes for them are still going through round trip testing at this time. So, we'll get these out on Wednesday.

Friday, Feb 6, 2015, around 9:10pm PST: Hotfix of Gerrit change 189036 cherry-pick

The Jan 28, 2015 deploy of T48811 exposed a longstanding bug in Parsoid, which was fixed by Gerrit change 189036. On some pages, because of T88864 (some templates weren't being expanded), the Attribute Expander was effectively being asked to re-expand the same template over and over in an infinite loop. This was triggered on a few enwiki pages today and caused processes to get stuck without being restarted. This hotfix prevents the infinite loop.

Friday, Feb 6, 2015, around 11:20am PST: Hotfix of Gerrit change 188982 cherry-pick

A bug in our process restart (on fatal errors) was exposed by unrelated bugs in our parse pipeline which manifested as stuck processes on the cluster. This hotfix fixes that by ensuring that fatals continue to restart processes.

Wednesday, Feb 4, 2015, around 1pm PST: Yes Deployed dd4721f4

  • Switch to using the compression package instead of the outdated version bundled with connect. In testing, that cleared up the memory leak noticed since the Jan. 15th deploy.
  • Some cleanup including:
    • Changing a few on handlers to once.
    • Using request's qs option for apiargs instead of stringifying those manually.
    • Better error handling for config requests.
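
The cleanup item above about changing `on` handlers to `once` relies on standard Node.js EventEmitter behaviour: a `once` listener is removed after its first invocation, so it cannot fire repeatedly or linger on a long-lived emitter. A minimal sketch; the event name and counters are illustrative only and not taken from Parsoid:

```javascript
const EventEmitter = require('events');

const emitter = new EventEmitter();
let onCalls = 0;
let onceCalls = 0;

// A listener added with on() stays registered until explicitly removed.
emitter.on('done', () => { onCalls += 1; });
// A listener added with once() is removed automatically after it fires.
emitter.once('done', () => { onceCalls += 1; });

emitter.emit('done');
emitter.emit('done');

console.log(onCalls, onceCalls);            // 2 1
console.log(emitter.listenerCount('done')); // 1
```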

Monday, Feb 2, 2015, around 1pm PST: Yes Deployed e3c9ae99

  • Set X-Forwarded-Proto when proxying https. This fixes timeouts for ruwikinews which is strict about accepting only https connections.
  • Some performance tweaks to attribute expander to eliminate useless work and useless memory allocation
  • T86902: Fixes to resource module loading URI in the <head> section of Parsoid HTML
  • Fixes to tokenizer to ensure that strings starting with '-' are parsed for directives like language variant markup

Friday, Jan 30, 2015 around 2:35 pm PST: Yes Deployed 2abd0eb6

The Jan 15th deploy, in which Parsoid started using sitematrix info for configuring wikis, was missing special handling for some wikis (commonswiki being one of them). This caused timeouts, which repeatedly exercised an existing memory leak, leading to a slow buildup of leaked memory on the cluster and a higher than normal CPU load. This special Friday deploy fixed the config issues.

Specifically, the following two patches were deployed:

  • Some special wikis should use the default proxy
  • Strip TLS from sitematrix url if we're using the default proxy

Wednesday, Jan 28, 2015 around 1pm PST: Yes Deployed 88605a4a

  • T48811: Correctly handle templates that generate part-attribute and part-content of a DOM node.
  • T73412: Preserve blank template parameters
  • T71859: Cleanup of behavior switch production
  • Updates to wikitext serializer to simplify and enable more robust wikitext escaping
  • T66300, T67278, T73462: Magic link fixes (wt2html and html2wt nowiki handling)

Thursday, Jan 15, 2015 around 1pm PST: Yes Deployed 2fdf9298

  • T86216, T75385, T60677: Default WMF wikis served by Parsoid fetched from sitematrix API call
  • T85958: Positional params with = in extlink are serialized as named parameters

On Jan 14th at 1:20 pm PST, we reverted Parsoid to the older deployed version after dirty diffs were seen during post-deploy testing. It turned out that the dirty diffs weren't related to the Parsoid deploy; now that those issues have been fixed, we'll revisit the Parsoid deployment on Thursday.

Monday, Jan 12, 2015 around 1pm PST: Yes Deployed 2cd6fefa

  • Include location of titles in timeout logs
  • Tweaks to Parsoid's Cite port to generate the same ref ids as the native Cite implementation

Wednesday, Jan 7, 2015 around 1pm PST: Yes Deployed 904fab9e

  • T85744: Improved handling of extremely large lists -- fixes the load issues seen in production on Jan 3rd
    • Removed hardcoded HTTP 500 response for urwiki:نام_مقامات_اے (deployed on Jan 3rd to prevent this page from overloading the cluster)
  • T85627: Fix edge case tokenizing table lines

Monday, Jan 5, 2015 around 2pm PST: Yes Deployed 0e2997d2

Wikitext -> HTML

  • T72786: data-parsoid stripped from template content
  • T71219: Context-aware parsing of definition list colon
  • T58916: Parse extension parameters as plain text
  • T57531: Stray is parsed to meta
  • Marginal improvement parsing templates in definition lists

HTML -> Wikitext

  • T74844, T84921: Several improvements and fixes to nowiki protection for quotes
  • Other improvements and bug fixes to nowiki protection in headings, lists, tables.
  • T72791: Insert an extra newline after new content and existing headings

Other (API, logging, etc)

  • Add logging for html2wt API endpoints
  • Fix robots.txt route
  • Send SIGKILL to kill a timed out worker
  • T75955: API v2 parsing and serialization routes