Manual:$wgUseTidy

From mediawiki.org
Tidy: $wgUseTidy
Use tidy to make sure HTML output is sane.
Introduced in version: 1.3.0
Deprecated in version: 1.26.0 (Gerrit change 235401; git #2c6c954e)
Removed in version: 1.33.0 (Gerrit change 467972; git #6db35b3c)
Allowed values: (boolean)
Default value: false

Details

Use "HTML Tidy" to make sure HTML output is sane.

HTML Tidy is a free tool that fixes broken HTML. See the Wikipedia article "HTML Tidy" and http://www.w3.org/People/Raggett/tidy/

You may wish to set up this tool and set $wgUseTidy = true to ensure that the wiki outputs reasonably clean and compliant HTML, even when malicious or foolish users add corrupt or badly formatted HTML to wiki pages.
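As a minimal sketch, enabling the setting in LocalSettings.php might look like the fragment below. It assumes a MediaWiki version in which $wgUseTidy still exists (the setting was removed in 1.33.0) and that a working Tidy installation is already available on the server.

```php
// LocalSettings.php (fragment, not a complete file)
// Assumption: MediaWiki older than 1.33.0, where $wgUseTidy is still read.
// The default is false, so the setting must be turned on explicitly.
$wgUseTidy = true;
```

Because the default is false, leaving the line out keeps the parser output unfiltered by Tidy.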

Note that MediaWiki already performs some built-in checks and corrections on users' HTML, and limits the range of HTML tags and attributes that can be used (unless you set $wgRawHtml = true, which is dangerous). The limitations are described at Help:HTML in wikitext. The logic for this is found in includes/parser/Sanitizer.php. As such, you may decide that running HTML Tidy over the output is not necessary.

HTML Tidy can irreversibly and unexpectedly mangle valid HTML markup. For example, wikitext like [[Link|<div>Text</div>]] will not actually produce a clickable link. Several dozen specific bugs have been identified that are unlikely ever to be fixed (see task T4542 and its list of blockers). If you enable Tidy, expect to run into these problems.

Configuration

The location of the Tidy configuration file can be set using $wgTidyConf. Before MediaWiki 1.10, setting this was required; later versions provide a working default.

However, the default may not work on every installation. See $wgTidyInternal for more installation information.
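If the default does not work, overriding the configuration file location might look roughly like the sketch below. The path is purely hypothetical and must be replaced with the real location of a Tidy configuration file on your server. The comment on $wgTidyInternal reflects a general understanding of that setting rather than anything stated on this page, so check its own manual page before relying on it.

```php
// LocalSettings.php (fragment, not a complete file)
// Assumption: MediaWiki older than 1.33.0, where the Tidy settings still exist.
$wgUseTidy = true;

// Hypothetical path, used here for illustration only.
$wgTidyConf = '/path/to/your/tidy.conf';

// Assumption: false asks MediaWiki to run an external tidy program
// instead of PHP's built-in tidy extension. See Manual:$wgTidyInternal.
$wgTidyInternal = false;
```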


Effects

Tidy is still required to mix wiki table syntax with HTML table syntax, as well as simple wikicode with HTML-style markup.

Mixed nested tags.

Example code:
    {|
    || foo
    <tr><td>bar</td></tr>
    |}

Parser without Tidy:
    <table>
    <tr>
    <td> foo
    <p>&lt;tr&gt;&lt;td&gt;bar&lt;/td&gt;&lt;/tr&gt;</p>
    </td></tr></table>

Tidy:
    <table>
    <tr><td>foo</td></tr>
    <tr><td>bar</td></tr>
    </table>

Mixed open/close tags.

Example code:
    '''foo</b>

Parser without Tidy:
    <b>foo&lt;/b&gt;</b>

Tidy:
    <b>foo</b>

Definition list nesting

Example code:
    ; hi
    :# one

Parser without Tidy:
    <dl><dt> hi
    <ol><li> one</li></ol>
    </dt></dl>

Tidy:
    <dl><dt>hi</dt><dd>
    <ol><li>one</li></ol>
    </dd></dl>

Tidy can correct most bad HTML, whether it comes from bad user input such as

<table><tr></td></table>

or from conflicting or badly written extensions (and even some bugs in the core software).

However, it does not resolve all strict XHTML validation issues, such as duplicate XML id attribute values or IDs that start with a number.

See also