
Topic on Talk:Parsoid/Archive 2

Performance tuning guide

Summary by T0lk

Adjusting the Nginx configuration had the largest performance impact.

T0lk (talkcontribs)

Is there any information about performance tuning for Parsoid/VisualEditor? I was running into a lot of timeout errors until I increased HTTP_CONNECT_TIMEOUT (var HTTP_CONNECT_TIMEOUT = 5 * 1000; in ApiRequest.js) to something like 20 or 30 seconds, and even then it's not enough to handle the largest pages on my wiki.

Arlolra (talkcontribs)

Unfortunately, I can't think of any specific documentation to point you to for performance tuning, but there's a lot of general information on the web about node.js applications.

But we can dig into this issue. HTTP_CONNECT_TIMEOUT is the amount of time Parsoid is willing to wait for your MediaWiki API to accept the incoming connection. So the size of the page shouldn't influence that variable (well, ignoring the number of competing resource requests on the page, which probably increases proportionally with its size).
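For later readers: rather than editing ApiRequest.js directly, the connect timeout can usually be overridden from Parsoid's localsettings.js. This is a sketch only; the property names follow the ParsoidConfig.js defaults linked later in this thread, and the API URI is a placeholder, so check them against your Parsoid version:

```javascript
// In Parsoid's localsettings.js -- a sketch, assuming a Parsoid version
// whose ParsoidConfig exposes the mwApi timeout settings.
exports.setup = function (parsoidConfig) {
    // Point Parsoid at your MediaWiki API (placeholder URI).
    parsoidConfig.setMwApi({ uri: 'http://localhost/w/api.php' });

    // Wait up to 20s (instead of the 5s default) for the MediaWiki API
    // to accept the incoming TCP connection.
    parsoidConfig.timeouts.mwApi.connect = 20 * 1000;
};
```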

How much concurrency does your MediaWiki API allow? HTTP_MAX_SOCKETS may be a more useful knob. You can try playing around with a tool like ApacheBench to verify.

T0lk (talkcontribs)

I'll try that tool, thank you. Is the concurrency MediaWiki allows different from the HTTP_MAX_SOCKETS defined in Parsoid's ApiRequest.js? It's set to 15 there.

Arlolra (talkcontribs)

HTTP_MAX_SOCKETS is how many concurrent connections Parsoid will open with the MediaWiki API.

What you need to check is how many concurrent connections the server (Apache?) serving the MediaWiki API permits.

You can try setting HTTP_MAX_SOCKETS low, like 2, and see if that has any effect on the timeouts. But, more usefully, you want to look at increasing the number of connections your server accepts.
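Since HTTP_MAX_SOCKETS was a hard-coded constant rather than a config option in Parsoid at the time, the experiment above means editing the source. A sketch of what that edit looks like (the exact line may differ between Parsoid versions):

```javascript
// In Parsoid's lib/mw/ApiRequest.js -- the constant mentioned above.
// var HTTP_MAX_SOCKETS = 15;  // default: 15 concurrent API connections
var HTTP_MAX_SOCKETS = 2;      // temporarily lowered, to test whether
                               // connection contention is causing timeouts
```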

T0lk (talkcontribs)

I really appreciate your help Arlolra. What follows is mostly for other users who might benefit from what I've learned.

I'm running Nginx on a fairly small Amazon EC2 instance. The summary is that I saw significant performance gains by configuring worker_processes and worker_connections correctly for my server size. See this guide. Just doing that cut parsing times in half. However, they were still prohibitively high, and I was getting timeouts on most large pages. The second thing I did was create a dedicated Parsoid server, spinning up instances of various sizes to test how CPU and RAM changed parsing times. A "t2.micro" instance with 1 GB of RAM handles Parsoid requests just fine. Sending requests there cut parsing times in half again, and now a normal-size page loads almost instantly. Only 2 pages on my wiki still give timeout errors. This setup has the advantage of separating page-load traffic demands from parsing demands, which can be high.
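For readers following along, the kind of tuning described above looks roughly like this. The values here are placeholders, not recommendations; they should be matched to your instance's core count and open-file limit:

```nginx
# /etc/nginx/nginx.conf -- a sketch only; tune to your EC2 instance size.
worker_processes auto;      # one worker per CPU core

events {
    # Maximum simultaneous connections per worker;
    # keep this below the `ulimit -n` open-file limit.
    worker_connections 1024;
}
```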

Arlolra (talkcontribs)

Great, glad it's working for you and thanks for the summary. I'm sure others will benefit. Can we consider the matter resolved?

For the pages that are still timing out, which timeout are they hitting? There are various timers in Parsoid. And, out of curiosity, how big are the pages?

T0lk (talkcontribs)

Yes, I think I'm hitting hardware limitations. I listed the full log here: Topic:Tapfo239pcgkrikd. First it's Failed API requests, {"error":{"code": "ETIMEDOUT"},"retries-remaining":1}, followed by Template Expansion failure for "97bc6349e16b5680ae9006d5c0b88d0b": Error: ETIMEDOUT.

The more templates, images, or citations a page uses, the higher the chance of a timeout, but LongPages reports the failing pages are in the neighborhood of 135,000 to 167,000 bytes. It's probably safe to say a page over 150,000 bytes that uses templates will fail.

EDIT: I'm wondering now if building in some form of caching, like RESTBase, may be the solution to my problem with larger pages.

Arlolra (talkcontribs)

I see. That error is more likely the result of the following timeout, not the one discussed above:

https://github.com/wikimedia/parsoid/blob/b0d015fac7bc52d239b2e2b25abfef89e0d9f68c/lib/mw/ApiRequest.js#L652

https://github.com/wikimedia/parsoid/blob/b0d015fac7bc52d239b2e2b25abfef89e0d9f68c/lib/config/ParsoidConfig.js#L39

You can try bumping it in your config with parsoidConfig.timeouts.mwApi.preprocessor = 45 * 1000;
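In context, that override sits in localsettings.js something like this (a sketch; the 45-second value is just the example above, and the setting names should be checked against the ParsoidConfig.js defaults linked earlier):

```javascript
// In Parsoid's localsettings.js
exports.setup = function (parsoidConfig) {
    // Allow template (preprocessor) expansion requests to the MediaWiki
    // API up to 45s before failing with ETIMEDOUT.
    parsoidConfig.timeouts.mwApi.preprocessor = 45 * 1000;
};
```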

(Also, maybe check whether any particular expansion is slow, and dig into why that template performs so poorly on the wiki.)

Another thing you can try is batching your requests:

Extension:ParsoidBatchAPI

This should help, since it sounds like you've been hitting MediaWiki CPU limits with lots of templates/images. Did you try increasing the size of the instance that MediaWiki itself is running on?
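If you install that extension on the MediaWiki side, Parsoid also needs to be told to use it. In Parsoid versions of this era that was a localsettings.js flag; the name below is my recollection, so verify it against your version's documentation:

```javascript
// In Parsoid's localsettings.js, after installing
// Extension:ParsoidBatchAPI on the MediaWiki side.
exports.setup = function (parsoidConfig) {
    parsoidConfig.useBatchAPI = true;  // batch template/image API requests
};
```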

Also, the TLS connection between MediaWiki and Parsoid is probably adding some additional unnecessary overhead, but try the above recommendations before worrying about that.

T0lk (talkcontribs)

Awesome, thank you for telling me about the batch API. The suggestion about looking into templates was a huge help too.

I had a custom template that was checking the page's title (via mw.title.getCurrentTitle().nsText) to see if it contained a certain word. That turned out to be very expensive. Removing that check reduced parse times from 80,000 ms to 20,000 ms on a page that used the template 200 times. That allowed me to edit pages up to 800,000 bytes, the largest on my wiki.
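For anyone who needs to keep such a check: a common mitigation in Scribunto is to compute the title once at module scope, so 200 template invocations on the same page share a single lookup rather than each calling into mw.title. This is a hypothetical module sketch (the module and namespace names are made up), not the template from this thread:

```lua
-- Scribunto module sketch (hypothetical). The module-level local is
-- evaluated once per page parse, not once per #invoke.
local p = {}
local nsText = mw.title.getCurrentTitle().nsText  -- computed once

function p.check(frame)
    -- 'Project' is a placeholder namespace name for illustration.
    if nsText == 'Project' then
        return 'yes'
    end
    return ''
end

return p
```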

I'm still having trouble on pages with more than 300 citations. Various changes to the citation templates didn't do anything, but this was a very expensive page even before Parsoid got involved: after "Show preview", the parser profiling data said it was consuming 18 seconds of CPU time. Obviously I'm maxing out my CPU here and it's creating bottlenecks down the line. If I could get rid of the CPU bottleneck, that might resolve my last Parsoid slowness issue.

If I have extra time later I might boot an expensive instance with plenty of resources and see what happens.

Arlolra (talkcontribs)
T0lk (talkcontribs)

I have spent the last hour reading about HHVM, and I am very excited by it. It seems like MediaWiki-Vagrant is the easiest way to go about implementing HHVM. Would you recommend trying to set up Vagrant with HHVM, or just HHVM by itself?

T0lk (talkcontribs)

I migrated servers last night from my old 32-bit Ubuntu paravirtual (PV) instance to a 64-bit hardware-assisted virtual (HVM) instance in preparation for HHVM.

Here's a highlight from some of the pages I was keeping track of each time I made changes. One page is 151,838 bytes; before I began any modifications it was taking 35,000 ms to process. Simply upgrading to a very large server, with no other changes, brought that down to 23,000 ms. Back on the slow server, after tuning Nginx, it was down to 20,000 ms. After removing expensive template calls, and post-migration, we're at 2,600 ms. Another page, 39,462 bytes but with enough images and citations to always cause timeouts (~40,000 ms), is parsing now in 9,000 ms. The 167,101-byte page with 300 citations is parsing in just 12,000 ms (down from something like 150,000). I haven't done extensive testing, but my guess is that I won't be getting any more timeout errors! My new HVM server actually has 0.7 GB less RAM than the old PV one (at the same cost, however).

Arlolra (talkcontribs)

Congrats! Unfortunately, I don't have much experience with HHVM. The setup is probably complicated by the extensions you have enabled on your wiki, but I'd start with HHVM by itself and just see how that goes.