Parsoid/Deployments/2016
Wednesday, December 21, 2016 around 5:03 am PT: Deployed e7e3a4dc on the deploy-20161221 branch
- ApiRequest: Clone the request options before modifying them.
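The following is a minimal sketch of the general pattern behind a fix like this, not Parsoid's actual ApiRequest code. The helper name `prepareOptions`, the option fields, and the default values are all hypothetical and chosen only for illustration.

```javascript
'use strict';

// Hypothetical helper (not Parsoid's API): copy the options before changing
// them so that an object shared between requests is never mutated.
function prepareOptions(requestOptions) {
    // Shallow-clone the top-level object.
    var opts = Object.assign({}, requestOptions);
    // Clone the nested headers object as well, because it is modified below.
    opts.headers = Object.assign({}, requestOptions.headers);
    opts.headers['User-Agent'] = 'example-client';
    opts.timeout = opts.timeout || 40000;
    return opts;
}

var shared = { uri: 'http://localhost/api.php', headers: {} };
var prepared = prepareOptions(shared);
console.log(shared.timeout);   // undefined: the shared object is untouched
console.log(prepared.timeout); // 40000
```

Without the clone, every request built from `shared` would see headers and timeouts set by earlier requests.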
Tuesday, December 20, 2016 around 7:48 am PT: Deployed 5eb649e8
- Use mwApiServer as the provider of the full URI of the MW API
- Add a mwApiServer configuration variable
- Add arbcom_cswiki to site matrix
Thursday, December 15, 2016 around 10:24 am PT: Deployed 6719e240
- T96555: Ignore self-closed tags when extending source
- Drop native LST altogether
- Fix DOMDiff annotations
- Linter:
- Fix bug in self-closing-tag category + other cleanup
- Fix crasher when linting a gallery
- Apply lint sampling when sending it to the logger as well
- Don't provide 'src'
Wednesday, December 14, 2016 around 1:24 pm PT: Deployed 60ee19ac
wt2html:
- T119265: Add more page-level metadata that MCS can use
- Support extension tags which shadow block level elements
- Move section handling to the LST extension
- T104523: Prevent infinite recursion
- T104662: Allow nested ref tags only in templates
Linting (disabled in production):
- Use ApiRequest.js to post results
- Handle MW API errors that come with a HTTP 200
Debugging:
- Let extensions supply the pp tracing name
Monday, December 12, 2016 around 1:35 pm PT: Updated production config
- Bump table cell and list item resource limits to 40K (from 30K)
Wednesday, December 7, 2016 around 1:21 pm PT: Deployed 3cf19c6b
- Bump HTML contentVersion to 1.3.0 (see updated spec)
- T151570: Update SiteMatrix data fork for last 3 wiki creations
- T149209: Deal with newlines in <td> and <th> cells
- T150213: Suppress logs for known unknown contentmodels
- T152073: Reduce request timeout to 110s (from 3min) and worker timeout to 115s (from 3min); Increase M/W batcher API timeout to 65s
- Some configurations moved to vars.yaml in the deploy repo
- s/warning/warn/ to match service-runner's levels
- Don't entity escape extension attribute values from data-mw
- Normalize all extension options, not just native
- Remove unused package gelf-stream
- Linter: Add linting of self-closed tags
- Testing:
- Remove scrolling by access key
- require('should') in lintertests.js for standalone runs
Monday, November 7, 2016 around 1:29 pm PT: Deployed 2c2fe425
- Cleanup http redirects
- Send error responses in the requested format
- Fix processing listeners in node v7.x
Wednesday, November 2, 2016 around 1:27 pm PT: Deployed 173d7e32
- T149241: Whitelist content model fallback
- Testing:
- Don't expose dev routes in production
- Get rid of simple debug helpers
- T119228: Stop testing on node v0.10.x
- Linter:
- Add node name for missing-end-tag
- Remove higher resource limits (max wikitext page size, max # list items, max # table cells per page) and fall back to default limits.
Plus the commits from the attempted Oct. 26th deploy (ede4353):
- T141723: Bump mediawiki-title
- T141905: Fix crasher and other bugs of that category
- service-runner doesn't recognize warning level
- Stop asserting that we'll never be encapsulating a flipped range
- Lots of linter fixes / features (currently, linting is disabled in production though)
- Remove html5 treebuilder in favour of domino's
- Bump domino to 1.0.27
- T147742: Trim template target after stripping comments
- T48580, T133320: Allow extensions to handle specific contentmodels
Tuesday, November 1, 2016: Parsoid cluster upgraded to node v4.6
Ops upgraded node on the Parsoid eqiad cluster to node v4.6. The (backup) codfw cluster had been upgraded on Monday.
Monday, October 31, 2016 around 1:34 pm PT: Deployed e503e801
- T149504: Fix reflected XSS
Wednesday, October 26, 2016 around 1:15 PT: Attempted deploy of ede4353; reverted to 63f1e151 because of contentmodel errors
- T141723: Bump mediawiki-title
- T141905: Fix crasher and other bugs of that category
- service-runner doesn't recognize warning level
- Stop asserting that we'll never be encapsulating a flipped range
- Lots of linter fixes / features (currently, linting is disabled in production though)
- Remove html5 treebuilder in favour of domino's
- Bump domino to 1.0.27
- T147742: Trim template target after stripping comments
- T48580, T133320: Allow extensions to handle specific contentmodels
Monday, October 24, 2016 around 1:42 pm PT: Deployed 63f1e151
Wednesday, September 21, 2016 around 1:17 pm PT: Deployed a802de0
- Tokenizer:
- Encapsulate protected table attributes from wt
- Inline generic_attribute_newline_value and table_attribute_value
- Set srcOffsets for table_attribute and generic_newline_attribute
- HTTP API:
- Page id and revid aren't the same thing
- html2html should require an original or previous revision
Wednesday, September 14, 2016 around 1:11 pm PT: Deployed aed15dda
- Let native extensions add stylesheets
- Move getAPIProxy to parsoidConfig
- Other minor refactorings and parserTest changes
Monday, September 12, 2016 around 1:40 pm PT: Deployed f7c43009
- Handle HTML tags in attribute text properly
- AttributeExpander: Tweak check for improved code readability
- Testing:
- Bump worker_heartbeat_timeout to 2mins for testing
- Allow specifying a specific revision for roundtrip-test.js
Tuesday, September 6, 2016 around 10:37 am PT: Deployed 7863e6ad
- T142617: Handle invalid titles in transclusions
- Sanitizer fixes:
- Decode all char refs in text
- Ignore some fields when freezing SanitizerConstants for node v6.5 -- no-op for Wikimedia cluster that runs node v4.x
- node-module updates:
- Bump service-runner to v2.1.0
- Remove bunyan
- Some minor cleanups
Monday, August 29, 2016 around 1:10 pm PT: Deployed 48cf803e
- Run localSettings.setup after assigning options
- Use service-runner's metrics reporter in the http api
- Updates in preparation for supporting version 2.x content in the future -- should be no-op for version 1.x content
- Support downgrading 2.x content to 1.x
- No content reuse from semantically different content versions
- T143356: Establish precedence for data-mw in 2.0.0 content
Monday, August 22, 2016 around 1:12 pm PT: Deployed df53a991
- T142998: html2wt: Fix crasher in DOM normalization code
- T141370: Use service-runner's logger as a backend to Parsoid's logger
Wednesday, August 17, 2016 around 1:09 pm PT: Deployed 3cf877bb
- html2wt: Always emit canonical wikitext for url links
- html2wt: Emit url-links where appropriate no matter what rel attribute says
Monday, August 15, 2016 around 1:09 pm PT: Deployed f039dcf6
- migrateTrailingNLs DOM pass: Code simplifications and some subtle edge case bug fixes
- T138864: Deal with edge cases serializing links
- Remove deprecated "disablepp" MediaWiki API param and pass "disablelimitreport" instead
- Increase resource limits for wikitext size, max table cells, and max list items
- With the upgrade to node v4, we have more breathing room for parsing large pages
Wednesday, August 10, 2016 around 1:10 pm PT: Deployed 4de49e26
- Handle caption-like text outside tables
- Table captions: Remove unneeded mw:TSRMarker meta token + add TSR info in tokenizer which leads to more accurate DSR offsets.
- When table wikitext shows up outside tables and is converted to strings, strip attached mw:TSRMarker tags
- computeDSR: Fix source of pathological O(n^2) behavior
Tuesday, August 9, 2016 around 11:15 am PT: Deployed a577d80e
- Fix crasher in escapeWikitext
- T140898: Update site matrix for tcy.wikipedia.org
Tuesday, August 2, 2016 - Tuesday August 9, 2016: Upgrade Parsoid cluster to node v4.x and Jessie
- T135176: Over the week, Operations upgraded the cluster gradually.
- The eqiad cluster was fully migrated by Friday, August 5th.
- The codfw cluster was fully migrated by Tuesday, August 9th.
Monday, August 1, 2016 around 1:15 pm PT: Deployed abf396eb
- Fix title parsing of subpages during initialization (addresses crashers while parsing these pages)
- Only apply data-* attributes in /pagebundle/ paths (API cleanup)
- Determine the content version in the html2wt direction, enabling content upgrade
Tuesday, July 26, 2016 around 10:12 am PT: Deployed 285b6983
- Use mediawiki-title package to replace homegrown Title code (resolves T113322, T133425, and T139135)
- Reintroduce a 3-minute request timeout
- Bump some minor / patch level versions of dependencies (addresses a security advisory)
- Prevent JSON.stringify circular refs in template wrapping trace/error logs
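A common way to keep `JSON.stringify` from throwing on circular structures is a replacer that remembers visited objects. The sketch below illustrates that general technique and is not the code Parsoid deployed. The function name `safeStringify` and the `'[Circular]'` marker are assumptions. Note that this simple version also replaces repeated references that are not actually circular.

```javascript
'use strict';

// Sketch only: any object that has already been visited is replaced by a
// marker string, so JSON.stringify never recurses into a cycle.
function safeStringify(value) {
    var seen = new WeakSet();
    return JSON.stringify(value, function(key, val) {
        if (typeof val === 'object' && val !== null) {
            if (seen.has(val)) { return '[Circular]'; }
            seen.add(val);
        }
        return val;
    });
}

var range = { name: 'tpl-range' };
range.self = range; // JSON.stringify(range) would throw a TypeError here
console.log(safeStringify(range)); // {"name":"tpl-range","self":"[Circular]"}
```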
Thursday, July 21, 2016 around 9:30 am PT: Deployed ed2f8228
- Test deploy to verify that trebuchet deployment was not broken by the tinkering done during the service-runner deploy. The deployed change only affects parser tests.
Wednesday, July 20, 2016 between 7:30 - 8:20 am PT: Deployed 45beb6c0
- T90668: Update Parsoid to use the service-runner framework
- In collaboration with Services & Ops teams
- wtp1001 and wtp1002 were transitioned over July 19, 2016 between 8:00 - 9:00 am PT
Monday, July 11, 2016 around 1:10 pm PT: Deployed e738c415
- T131564: Respect $wgInterwikiMagic setting while parsing lang-links
- T139388: DOMDiff: Skip over encapsulated content rather than about-id content (fixes problem with lost edits in content nested in elements with templated attributes)
- Code cleanup (don't expect functional changes): Use a more appropriate DOM helper (s/hasParsoidAboutId/isEncapsulationWrapper/) where appropriate
Monday, June 27, 2016 around 1:08 pm PT: Deployed dd8e644d
- Template wrapping: Eliminate pathological tpl-range nesting scenario
Thursday, June 23, 2016 around 10:30 am PT: Deployed 18022c96
- Emit single newline separator in table wikitext for new content
- Make the http connect timeout configurable
- Update many deps by minor version
- T137406: Ensure newlines are added where required around thead/tbody/tfoot
- T96195: Remove node 0.8 support (does not affect WMF deploy of Parsoid)
Wednesday June 15, 2016 around 1:10 pm PT: Deployed 3445eceb
- T137406: Emit |- between thead/tbody/tfoot
Non-functional changes (these will come into play once we move to v2.0.0 of Parsoid HTML):
- Roundtrip 2.0.0 content
- T114413: Provide HTML2HTML endpoint in Parsoid
Monday, June 6, 2016 around 1:15 pm PT: Deployed e8d6092e
- Normalize all lists to not mix wikitext and HTML list syntax (selser prevents unnecessary dirty diffs in production)
Thursday, June 2, 2016 around 10:40pm PT: Deployed 7188080b
- T134389: Serialize content in HTML tables using HTML tags
- T125419: Fix selser issues serializing first table row
- Selser: Bug fix reusing separator text from original source
Wednesday, June 1, 2016 around 1:15 pm PT: Deployed afb0d522
- Bump core-js from v1.2.6 to v2.4.0
- Bump yargs from v1.3.1 to v4.7.1
- Don't use non-standard array generic functions (Array.reduce, etc.) - removed from newer version of core-js
- Use normalized form of default page "Main_Page" instead of "Main Page"
- T135596: Return client error for missing data attributes
- Fix up the internal forms to use v3 post endpoint
- Add a page/wikitext/:title route to GET wikitext for a page
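For the array generics item in this deploy, the usual replacement is to borrow the standard prototype method. The sketch below shows that substitution on an array-like object; the sample data is made up and the snippet is not taken from the Parsoid code base.

```javascript
'use strict';

// An array-like object, standing in for something such as a NodeList.
var arrayLike = { 0: 'a', 1: 'b', 2: 'c', length: 3 };

// Non-standard generic form (no longer provided by newer core-js):
//   Array.reduce(arrayLike, fn, '')
// Standard equivalent: call the prototype method with an explicit receiver.
var joined = Array.prototype.reduce.call(arrayLike, function(acc, item) {
    return acc + item;
}, '');

console.log(joined); // abc
```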
Thursday, May 19, 2016 around 11:38am PT: Deployed 67816adf
- T100681: Remove deprecated v1/v2 HTTP APIs.
- T130638: Content negotiation; Add data-mw as separate JSON blob in the page bundle.
- Strict Accept header checking is turned off; we will return 1.2.x format if an invalid Accept header is provided (which is allowed by RFC 2616).
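The lenient Accept handling described above can be pictured roughly as follows. This is a sketch under stated assumptions, not Parsoid's implementation: the function name `negotiateVersion` is hypothetical, the profile pattern is modelled on the `mediawiki.org/specs/html/<version>` strings quoted elsewhere in this log, and `1.2.1` is an assumed stand-in for the "1.2.x" default.

```javascript
'use strict';

// Assumed default; the log only says "1.2.x".
var DEFAULT_VERSION = '1.2.1';

// Hypothetical helper: pull the HTML spec version out of an Accept header's
// profile parameter, falling back to the default instead of rejecting.
function negotiateVersion(acceptHeader) {
    var re = /profile="[^"]*specs\/html\/(\d+\.\d+\.\d+)"/i;
    var m = re.exec(acceptHeader || '');
    return m ? m[1] : DEFAULT_VERSION;
}

console.log(negotiateVersion('text/html; profile="mediawiki.org/specs/html/1.2.0"')); // 1.2.0
console.log(negotiateVersion('not a valid header')); // 1.2.1
```

A strict variant would return an HTTP 406 in the fallback branch instead of the default version.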
CLEARED DIRTY REPOS which had this patch applied as root during the restbase/changeprop/parsoid outage:
diff --git a/lib/api/routes.js b/lib/api/routes.js
index 4d08922..d372c2f 100644
--- a/lib/api/routes.js
+++ b/lib/api/routes.js
@@ -377,6 +377,7 @@ module.exports = function(parsoidConfig, processLogger) {
     var v1Wt2html = function(req, res, wt) {
         var env = res.locals.env;
         var p = apiUtils.startWt2html(req, res, wt).then(function(ret) {
+            if ( ret.oldid === 106801025 ) { return false; }
             if (typeof ret.wikitext === 'string') {
                 return apiUtils.parseWt(ret)
                     // .timeout(REQ_TIMEOUT)
Wednesday, May 4, 2016 around 1:15 pm PT: Deployed b0d015fa
- T134017: Update cached SiteMatrix, mainly for jamwiki
Monday, May 2, 2016 around 1:15 pm PT: Deployed 0a26f3a4
- html -> wt: For invalid links, text doesn't need escaping in link context
- DOMDiff: Fix marking data-is-block on extra base nodes
- Add autoload mechanism for user extension code -- proof-of-concept for future use
- Update shrinkwrap after 23c97752
- Code cleanup: should not affect functionality
- Keep the data-* attributes at the edges of the DOM
- Remove ParsoidCacheRequest
- Organize post-processors distinguishing handlers
- Move the dumper to DOMUtils and use more widely
Monday, April 25, 2016 around 1:05 pm PT: Deployed d5363193
- T130645: Pass the right title to PHPParseRequest
- Don't allow unclosed extension tags
- Code cleanup: should not affect functionality
- T95325: Move tsrDelta to dp.tmp
- Rename DU.serializeChilden to DU.serializeToXML
- storeDataParsoid is an env variable, not a Parsoid config property
Monday, April 11, 2016 around 1:15pm PT: Deployed e3766b79
- Count api version use
- Don't dom-diff on a cloned node
- T95325: Migrate temporary data to dp.tmp
- Suppress errors raised when getting debugging info
- Code cleanup: should not affect functionality
- Fix some variable shadowing
- Stop working on cloned nodes in parserTests
- Rename timer to stats, since we do counting too
- Fix regression testing tool
- Fix crasher and more informative rt errors
Wednesday, April 6, 2016 around 1:15 pm PT: Deployed 5f6c0c60
- T116020, T53852: Serialize localized image options (already cherry-picked yesterday)
- Stop suppressing escaping errors
- Remove the broken_template rule in the PEG tokenizer -- no need to wrap {{, {{{, }}, }}} in <nowiki> spans
- Code cleanup: should not affect functionality
- Cleanup some fallback rules in the PEG tokenizer
- Use Util.placeholder in a few more places
- Be consistent with dp.src check
Tuesday, April 5, 2016 around 2:40pm PT: Deployed a5be1cdc
- T116020, T53852: Cherry-pick of image option localization patch to match alias reordering in mediawiki core version 1.27.0-wmf.20.
- Deployed cherry-pick from deploy-20160405 branch.
Monday, April 4, 2016 around 1:10 pm PT: Deployed 579ec3e6
- Fix log type in cite implementation
- Code cleanup: should not affect functionality
- Move dp.src handlers to their respective dom handlers
- Add new env.normalizeAndResolvePageTitle helper and use it
Wednesday, March 30, 2016 around 1:15 pm PT: Deployed a20ef276
- Bump HTML version number to 1.2.1
- Declare charset with <meta charset>
- Add html/dp version numbers in <head> instead of full content type
- T113331: Move auto-generated refs flag from data-parsoid to data-mw
- Default ParsoidConfig.loadWMF to false
- Bump node-uuid to 1.4.7 for nsp
Wednesday, March 23, 2016 around 1:15 pm PT: Deployed 5538d868
- Don't construct regexp with a regexp when flags need to be set
- Don't export Namespace since it isn't used anywhere else
- T129752: Include user agent in request logs
- Tweak error prefixes for ease of browsing in logstash
- Promisify the exposed batching methods
- T128659: Handle async createSocket
Monday, March 7, 2016 around 1:15pm PT: Deployed 5db1d28b
- Cleanup and tweaks of transclusion formatting for clarity and fewer dirty diffs
- Fix breakage in counting of HTTP status codes (broken by fix for T127983)
Tuesday, March 1, 2016 around 10:50am PT: Deployed 1f7ed5d0
- T128319: Fix bug in formatting of transclusions for block-format templates
- Remove overloading of pipe stop in the PEG tokenizer -- eliminates incorrect parsing of pipes in external links
Monday, February 29, 2016 around 1:25pm PT: Deployed d809ad7a
- T127983: Don't crash on misconfigured statsd host
- T108134: Match html5 unquoted attribute parsing
- Break for [[ in table attribute values too
Wednesday, Feb 24, 2016 around 1:15 pm PT: Deployed 581a43c7
- Bump HTML content-type version to 1.2.0 (from 1.1.0) and data-parsoid content-type version to 0.0.2 (from 0.0.1)
- Update parsoid content type meta tags in the <head>
- <meta property="mw:parsoidVersion" content="0"/> is now changed to <meta property="mw:html-content-type" content='text/html; charset=utf-8; profile="mediawiki.org/specs/html/1.2.0"'/> to be more consistent with the version information that is output in the response headers.
- For the non-pagebundle API endpoints, <meta property="mw:data-parsoid-content-type" content='application/json; charset=utf-8; profile="mediawiki.org/specs/data-parsoid/0.0.2"'/> is also emitted.
- T125266: Remove user/contribution information from header
- T90479: Assert param value serializes to a string
- T104599, T111674: Fetch and use templatedata while serializing transclusions
- data-parsoid semantics updated to use 'foo=bar' as the default transclusion arg spacing.
- Remove data-mw.body.extsrc for the <references> tag (unused, and bloats data-mw)
Thursday, Feb 18, 2016 around 11:00 am PT: Deployed dfbafb60
- T127218: Update sitematrix for ady.wikipedia.org
Wednesday, Feb 10, 2016 around 1:15 pm PT: Deployed 8976ab93
- Assert when flipped ranges are expected in template wrapping
- This should have no functional changes in parsing. At best, it will catch a bug / failed expectation in the template wrapping code.
Monday, Feb 8 2016 around 1:15 pm PT: Deployed 4d44fcc7
- Fix worker shutdown code in server.js + use it to restart stuck workers and to shutdown the Parsoid service
- Expect that this will fix the scenario with stuck worker processes when Parsoid service is restarted during deploys.
Wednesday, Feb 3, 2016 around 2:45 pm PT: Deployed 98619f7f
- Fix complex single-line nowiki handling
- More robust algorithm + can eliminate some spurious nowikis
- T115289: Disable migrateTrailingNLs if table has had content fostered out of it
- Some code cleanup
- Removed some FIXMEs in nowiki escaping in <td>s
- Tweaks to attribute parsing in the PEG tokenizer
- Warn if prefix/domain is not unique during configuration
- ParsoidConfig changes: Don't proxy nonglobal wikis (temporary special handling for labswiki and labstestwiki)
- Config changes:
- Remove hardcoded references to internal API LVS endpoint.
- Removed references to unused parsoidcache.
- Removed explicit config entry for labswiki (ParsoidConfig handles it now).
Monday, Feb 1, 2016 around 1:15 pm PT: Cancelled deploy of 2fcc841f to fix nowiki regressions
- Warn if prefix/domain is not unique during configuration
- Fix complex single-line nowiki tests
- Can eliminate some spurious nowikis
- But, can introduce spurious nowikis around [{{echo|foo}}] style wikitext -- 0.07% of pages in rt testing were affected, but with selective serialization, we expect impact to be small. We will consider possible solutions to minimize nowikis in this scenario, nevertheless.
- T115289: Disable migrateTrailingNLs if table has had content fostered out of it
- Config changes: Remove hardcoded references to internal API LVS endpoint + removed references to unused parsoidcache.
Wednesday, Jan 20, 2016 around 1:45 pm PT: Deployed f1ddfb88
- T122816: Record when a range is subsumed from overlapping
- Temporarily disable the request timeout (since they don't abort request processing and cancel cpu timeouts as well)
- Reduce cpu timeout value to 3 minutes
Monday, Jan 11, 2016 around 1:15 pm PT: Deployed 07494cf2
wt2html
- T73154: Remove the vestiges of pipetrick entirely
- T114225, T121611: Note that DOM tree building uses restrictive checks (documentation fix)
- T122054: Strip nowiki spans from templated / extension content
- Match permitted attributes to php's getAttribsRegex
html2wt
- Normalize DOM by stripping \u200e, \u200f next to category links (This is controlled by a config switch that we will turn on, if necessary)
- Edge case fixes to serializing lists with templated portions
T119883: Performance fixes (for large DOMs)
- Use startsWith() instead of regex to match tag names in the DOM
- Optimise shadow meta deletion
- Bump domino to 1.0.21 (with performance fixes)
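The `startsWith()` item above swaps an anchored regular expression for a plain prefix test. The sketch below shows the two forms side by side on made-up input. The function names and the `meta` prefix are illustrative assumptions, not the actual check in Parsoid.

```javascript
'use strict';

// Hypothetical prefix check written both ways; the results are identical.
function hasPrefixRegex(nodeName) {
    return /^meta/.test(nodeName);
}

function hasPrefixStartsWith(nodeName) {
    // A plain string comparison, with no regex matching involved.
    return nodeName.startsWith('meta');
}

['meta', 'metadata', 'span', 'a'].forEach(function(name) {
    console.log(name, hasPrefixRegex(name) === hasPrefixStartsWith(name));
});
```

Whether this is measurably faster depends on the engine and on how hot the code path is; the log attributes the gain to large DOMs.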
Other