
Topic on Extension talk:CodeMirror/5

Is there any way to disable it by default every time?

ԱշոտՏՆՂ (talkcontribs)

When I open very large pages, CodeMirror makes them very slow to open. Is there any way (maybe a JS script?) to make the editor always open with CodeMirror off, so that I can turn it on only when I need it?

2001:861:4CD0:200:6447:7E59:9A13:A3C3 (talkcontribs)

Syntax coloring on very large files can be slow even outside wikis, e.g. in local text editors like Notepad++. The reason is that the coloring tries to apply everything at once to the whole file, even though, when opening the file, coloring is only needed for the currently viewed part (usually the top portion) and could be done in the background, only on the visible part (plus roughly one page above and below it). The most critical part, however, is not the syntax analysis but applying the stylesheet and transforming the document into many spans: this requires a lot of memory allocations and many calls to internal APIs (on a web page this means using the DOM API).

But even without syntax coloring, the same slowdown can occur just from splitting the document into lines and counting them (notably when automatic line wrap is enabled, which forces not just parsing but also computing display metrics), simply to determine what to show or where to position the scrollbars.
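
As an illustration of the "only highlight what is visible" idea, here is a minimal sketch in plain JavaScript. It is not a real CodeMirror API: #wpTextbox1 is the classic wikitext textarea, and highlightRange is a hypothetical incremental highlighter.

<syntaxhighlight lang="javascript">
// Sketch: determine which lines of a large textarea are actually visible,
// so that syntax highlighting could be restricted to that window.
function visibleLineRange( textarea ) {
	var style = window.getComputedStyle( textarea );
	var lineHeight = parseFloat( style.lineHeight ) || 20; // fallback when "normal"
	var first = Math.floor( textarea.scrollTop / lineHeight );
	var last = Math.ceil( ( textarea.scrollTop + textarea.clientHeight ) / lineHeight );
	return { first: first, last: last + 1 }; // one extra line of margin
}

var textbox = document.getElementById( 'wpTextbox1' );
if ( textbox ) {
	textbox.addEventListener( 'scroll', function () {
		var range = visibleLineRange( textbox );
		// highlightRange( range.first, range.last ); // hypothetical incremental highlighter
		console.log( 'would highlight lines', range.first, 'to', range.last );
	} );
}
</syntaxhighlight>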

I don't think that loading the needed JavaScript is a real issue (it should not be related to network delays with many requests; the scripts should be loaded once and cached in the browser), nor loading the file content itself (unless it is very large); the cost comes from the fact that all of this requires a lot of internal work.

If files are really large, one approach would be to load them into an editable internal temporary cache split into separate segments (e.g. one page every 1000 lines or roughly 4 KB), along with an internal index that allows loading and parsing them on demand, saving a lot of memory. This would allow fast scrolling, e.g. to the end of the file, without having to parse and colorize everything from the beginning and without needing many network requests to sync each segment. Only when the edited file is saved/submitted would this split editing cache, stored in some temporary format (whose actual segment sizes would be variable, allowing fast insertions/deletions without moving a lot of memory or performing many reallocations), be joined back into a sequential stream. Such a temporary "paged" cache would have a binary structure and would be a bit larger than the actual file content, but editing would be much faster and would stress memory much less. The editor would also work within much lower memory constraints (notably if the browser uses a 32-bit implementation or runs in a constrained environment).
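
A rough sketch of that "paged" cache (illustrative names only, not part of CodeMirror or MediaWiki): the document is held as a list of small segments instead of one giant string, so an insertion only rebuilds one segment, and the segments are joined into a single stream only on save.

<syntaxhighlight lang="javascript">
// Illustrative segmented buffer: ~4 KB segments, joined only on save/submit.
function SegmentedBuffer( text, segmentSize ) {
	segmentSize = segmentSize || 4096;
	this.segments = [];
	for ( var i = 0; i < text.length; i += segmentSize ) {
		this.segments.push( text.slice( i, i + segmentSize ) );
	}
}

// Insert text at an absolute offset: only the segment containing the offset
// is rebuilt, not the whole document.
SegmentedBuffer.prototype.insert = function ( offset, text ) {
	for ( var i = 0; i < this.segments.length; i++ ) {
		if ( offset <= this.segments[ i ].length ) {
			this.segments[ i ] = this.segments[ i ].slice( 0, offset ) +
				text + this.segments[ i ].slice( offset );
			return;
		}
		offset -= this.segments[ i ].length;
	}
	this.segments.push( text ); // offset past the end: append
};

// Join everything back into a sequential stream only when submitting the edit.
SegmentedBuffer.prototype.toString = function () {
	return this.segments.join( '' );
};
</syntaxhighlight>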

However, such an improved design would require a major redesign of the editor code to use this "divide-and-conquer" optimization strategy. Basic text editors do not use it, but various well-known IDEs used for code editing on large projects have implemented it.

ԱշոտՏՆՂ (talkcontribs)

I understand. I'm okay with it being slow. I just don't want it to start highlighting automatically because sometimes I don't need it. I was wondering if there is a way to make it disabled every time you open the editor but with the option to turn it on if you need it.

Verdy p (talkcontribs)

This is a valid request, in my opinion, to have CodeMirror enabled only on demand, and disabled by default, possibly with a button on the basic editor UI to activate it. But there will still be a similar issue with the basic editor, which also has size limits.

And CodeMirror should not turn on automatically for pages larger than some reasonable threshold (which might eventually be tunable in the preferences); above that threshold, syntax highlighting could also fall back to a simpler parser that applies far fewer style changes to the edited text, fragmenting it into fewer, larger spans.
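
A minimal sketch of what such a client-side threshold could look like (the 100 KB limit is arbitrary, and the disable hook in the comment is hypothetical, not an existing CodeMirror API):

<syntaxhighlight lang="javascript">
// Sketch of a size threshold: only let the highlighter start when the
// page being edited is below some limit.
var LIMIT = 100 * 1024; // e.g. 100 KB of wikitext, an arbitrary threshold
var textbox = document.getElementById( 'wpTextbox1' );
if ( textbox && textbox.value.length > LIMIT ) {
	// hypothetical switch the extension could expose:
	// mw.hook( 'ext.CodeMirror.disable' ).fire();
	console.log( 'Page too large, CodeMirror would stay off by default.' );
}
</syntaxhighlight>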

There's a problem here: can we make assumptions about reasonable thresholds without performing some measurements, i.e. monitoring timed metrics, both for successes and for failures/timeouts/crashes, on both the server side and the client side (with enough precision about the client-side platform or browser type and its capabilities)? Ideally such metrics would also be useful for quality assurance on the CodeMirror extension, i.e. for A/B testing when the implementation changes, or for tracking possible critical problems such as vulnerabilities or DDoS attacks against the server or against some ranges of client types, or just to help developers track what they need to change and where improvements should be made. With such tuning, we could avoid the need for user-specific settings in the preferences, or the server-side editor could display warning banners if an edit would likely crash the client, or a blocking error if it could cause damage on the server side.
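
For what it's worth, a client-side timing of that kind only takes a few lines; performance.now() and mw.track() are standard, but the metric topic name here is hypothetical:

<syntaxhighlight lang="javascript">
// Sketch: measure how long the highlighter setup takes and report it,
// so thresholds could be tuned from real data rather than guesses.
var start = performance.now();
// ... CodeMirror initialisation would run here ...
var elapsed = performance.now() - start;
mw.track( 'timing.codemirror-init', { duration: elapsed } ); // hypothetical topic name
</syntaxhighlight>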

This also applies to "syntaxhighlight" tags within editable MediaWiki pages, which fail on the server for large pages and do not always fall back automatically to the basic "plain text" language model once the maximum size is reached.

This is also the same problem as enabling VisualEditor by default on large pages, when they can only safely be edited using the simpler MediaWiki code editor: there are size limits for the MediaWiki online editor, already implemented in the MediaWiki settings for maximum page sizes, but another limit could also count the maximum number of tokens generated by the parser, beyond which it would give up on VisualEditor and fall back to the basic code editor. Note that the maximum page size applies as well to pages edited with more capable external editors and transferred by bulk uploads (but this could cause problems for rendering page histories if differences can't be computed, which also requires complex parsing that may exhaust server resources).

That's also the reason why most media files are generally not parsed at all and are just handled as blobs within edit histories, which only track the existence of different versions (detecting differences by computing a digital fingerprint such as SHA-1 over the whole blob). Limited parsers may still check whether the file encoding is safe and whether it embeds some limited metadata that can be differentiated in histories, notably in photos or for embedded licences. Their actual editing is done externally, and media files are transferred by full file downloads/uploads; MediaWiki has no tools by default to edit them more incrementally (except for a few file formats like tabular data, which require a custom extension).
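
As an aside, a sketch of that kind of fingerprinting with the Web Crypto API (the helper name is mine; crypto.subtle.digest( 'SHA-1', … ) is a standard browser API):

<syntaxhighlight lang="javascript">
// Sketch: compute a SHA-1 fingerprint of a file blob, the same idea used to
// detect whether two versions of a media file are identical without parsing them.
function sha1Hex( blob ) {
	return blob.arrayBuffer().then( function ( buffer ) {
		return crypto.subtle.digest( 'SHA-1', buffer );
	} ).then( function ( hash ) {
		return Array.from( new Uint8Array( hash ) ).map( function ( b ) {
			return b.toString( 16 ).padStart( 2, '0' );
		} ).join( '' );
	} );
}

// Usage: sha1Hex( fileInput.files[ 0 ] ).then( console.log );
</syntaxhighlight>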

Maybe there's room for improvement in CodeMirror, so that stored pages would have some metadata attached and stored separately (just like the existing metadata indicating the page content model and other page properties), indicating whether the advanced CodeMirror editor should be enabled or disabled for specific large pages, notably if they require frequent updates by many cooperating users.

MediaWiki is not very well tuned to load/save and track changes in the history of very large editable files. Such files should be kept small in your project; try splitting overly large files if your design allows it. This will also help other users cooperate on your project when they have a less capable browser, with less memory or a slower CPU. This is in fact not specific to MediaWiki; it is a common issue for other collaborative projects as well, e.g. within GitHub (even if their servers are very large and support very complex projects).

[There's a similar but unrelated problem with very large lists of user-tracked pages (watchlists), but the issue there is not in the client editor but in the data model on the server. The client UI loads these huge lists very slowly, then makes lots of modifications in the DOM, exhausting the browser's resources, so it may become unresponsive on load or crash the session. Correcting this by using more selective filters requires changes in MediaWiki on the server, so this is unrelated. For now, all users can do is purge their tracking list completely, disable the automatic addition of edited pages to it (which is enabled by default even for minor changes), or download the list, filter/sort it in an external editor, and upload it back using a more advanced MediaWiki feature in their user profile. There have been improvements, but the issue remains; it only affects users editing many pages on the same wiki over a long period of time, so not a lot of users.]

Jdforrester (WMF) (talkcontribs)

The preference is 'sticky', so if you enable it whilst making one edit, the next time you open the editor it will be enabled (and vice versa); as a quick short-term solution, if you use it, you can try to remember to disable it again before publishing. Obviously this isn't great, but it's a start.

Adding a user preference to always disable CodeMirror is an understandable request, but should be weighed against the cost of doing so (mostly, the added complexity for other users). We could also add a runtime performance check to auto-disable on load based on some metric?

ԱշոտՏՆՂ (talkcontribs)

@Jdforrester (WMF) thanks. Is there any way to control it via a user script? If you point me in the right direction, I can write it myself. Any API and/or example?

Maybe override the click event on the Publish changes button and disable it with JS?

Jdforrester (WMF) (talkcontribs)

The preference key is usecodemirror and it appears to be off by default. There's advice at API:Options on how to set your preferences, but I'm in meetings right now, sorry!
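
Following the advice at API:Options, the simplest user-script or console version is probably something like this (untested sketch; mediawiki.api and saveOption() are the standard way to change a preference):

<syntaxhighlight lang="javascript">
// Turn the sticky CodeMirror preference off via the options API
// (requires being logged in).
mw.loader.using( 'mediawiki.api' ).then( function () {
	return new mw.Api().saveOption( 'usecodemirror', 0 );
} ).then( function () {
	mw.notify( 'usecodemirror preference set to 0.' );
} );
</syntaxhighlight>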
