Hi, I see you're listed as Parsoid lead.
Recently I've submitted quite a few bug reports, and some WMF staff have expressed puzzlement at why my Flow experience is so miserable and why I run into a steady stream of problems, up to and including "Unable to parse content due to a Parsoid failure". I think I've figured out a big issue. I was looking at Parsoid/Round-trip_testing. It looks like the WMF attempted to collect a large body of representative test data, and parsoid-tests currently claims the following (a sketch of the round trip being measured follows the list):
- 100% parsed without errors,
- 99.84% round-tripped without semantic differences, and
- 74.56% round-tripped with no character differences at all.
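For reference, the round trip being measured is wikitext → Parsoid HTML → wikitext. Here is a minimal sketch of that trip, assuming a locally running Parsoid service exposing the v3 transform endpoints; the base URL, port, and form field names are assumptions that may differ by version or setup:

```python
import requests

# Assumed base URL for a local Parsoid service (commonly port 8000);
# adjust the domain/port for your installation.
BASE = "http://localhost:8000/en.wikipedia.org/v3/transform"

def round_trip(wikitext):
    """Wikitext -> Parsoid HTML -> wikitext via the v3 transform endpoints."""
    html = requests.post(BASE + "/wikitext/to/html/",
                         data={"wikitext": wikitext}).text
    return requests.post(BASE + "/html/to/wikitext/",
                         data={"html": html}).text

sample = "''Hello'' [[world]]\n"
restored = round_trip(sample)
print("char-identical:", restored == sample)
```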
But if your body of test data isn't actually representative of real-world use, you're going to miss a lot of issues that rarely or never appear in the data you did test against.
The majority of edits are not saves: most editing actually happens within an edit-preview cycle, possibly ending with a single save. I can tell you, you're going to find a *lot* of stuff that shows up within the edit-preview cycle that rarely shows up in a final save. It also seems you didn't test against edits outside of article space, so there's another big chunk of stuff you're missing. (In Flow, merely clicking between the two editing modes triggers a round trip that routinely mangles what I had... and some of those no-semantic-difference changes are gross.)
Is there any chance you can collect a large body of preview diffs to run a round-trip test against? Preferably one test set drawn from article space, and one sampled from all non-article spaces?
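To make the ask concrete, here is a hypothetical harness that buckets a corpus of captured preview wikitext into the same three categories as the stats above. It reuses round_trip() from the sketch earlier; the whitespace-based normalization is only a stand-in for whatever the real RT suite does to judge semantic equivalence:

```python
from collections import Counter

def classify(original, restored):
    """Bucket one round-trip outcome the way the RT stats report them."""
    if restored == original:
        return "char-identical"
    # Placeholder normalization: a real harness would canonicalize
    # insignificant whitespace/markup before calling two strings
    # semantically equivalent.
    if restored.split() == original.split():
        return "semantic-identical"
    return "semantic-diff"

def report(samples):
    """Print the percentage of samples falling into each bucket."""
    tallies = Counter(classify(s, round_trip(s)) for s in samples)
    total = sum(tallies.values()) or 1
    for bucket, n in sorted(tallies.items()):
        print(f"{bucket}: {100 * n / total:.2f}%")
```

Running report() once over article-space samples and once over non-article-space samples would give exactly the two test sets I'm asking for.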