
Topic on Project:Support desk/Flow

Why disallow all the different spellings of Special?

Summary by Biologically

The answers explain both why robots.txt lists the different language variants of namespace names and when those variants are needed.

Biologically (talkcontribs)

I found this part in the robots.txt file of both mediawiki.org and Wikipedia.

Disallow: /wiki/Special:
Disallow: /wiki/Spezial:
Disallow: /wiki/Spesial:
Disallow: /wiki/Special%3A
Disallow: /wiki/Spezial%3A
Disallow: /wiki/Spesial%3A

Why is it necessary to block all these different spellings?

Wargo (talkcontribs)

Otherwise the same pages would be reachable under multiple URLs, which is bad for search engines.

Biologically (talkcontribs)

I don't understand. Those sites are disallowing, or blocking, the indexing of these pages, so how would that create multiple URLs?

Wargo (talkcontribs)

Special pages do not need to be indexed because they are only tools, not actual content.

Bawolff (talkcontribs)

Because different languages spell "Special" differently (for example, on de.wikipedia.org you have https://de.wikipedia.org/wiki/Spezial:Leerseite ).

Blocking them isn't strictly necessary. Web spiders crawling through things like Special:Search can sometimes slow down the site, though. Most special pages also contain robots meta tags that stop indexing anyway.
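To illustrate how such a rule list is built up, here is a minimal Python sketch that generates the Disallow lines for a set of localized namespace names. The function name `disallow_rules` and the `/wiki/` prefix are assumptions for this example only. The `%3A` form is the percent-encoded colon; it is presumably listed so that crawlers which compare the raw URL text still match.

```python
def disallow_rules(namespace_names, prefix="/wiki/"):
    """Return robots.txt Disallow lines for each namespace name.

    Every name is emitted twice: once with a literal colon and once
    with the colon percent-encoded as %3A, mirroring the rules quoted
    in this thread. `prefix` is an assumed article path.
    """
    rules = []
    for separator in (":", "%3A"):
        for name in namespace_names:
            # MediaWiki URLs use underscores in place of spaces.
            url_name = name.replace(" ", "_")
            rules.append(f"Disallow: {prefix}{url_name}{separator}")
    return rules


if __name__ == "__main__":
    print("User-agent: *")
    print("\n".join(disallow_rules(["Special", "Spezial", "Spesial"])))
```

Passing only the names that actually exist on a given wiki keeps the file short.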

Biologically (talkcontribs)

This answer is perfect. I understand now.

For the reason you mention, I blocked the special pages using:

User-agent: *
Disallow: /wiki/Special:

So, according to your explanation I should probably block those other variations too, as my wiki is also multilingual and global.

Should I do the same for the other "words" blocked in the robots.txt file, e.g. "/wiki/Template:" (here, the word "Template")?

Bawolff (talkcontribs)

You only need to block the versions that correspond to the content language ($wgLanguageCode) of your wiki.

So even if your wiki is multilingual, if your $wgLanguageCode is set to 'en' (like https://meta.wikimedia.org) then you only need to block the English variant.

The same basically applies to other namespaces.

You may also want to check out the $wgNamespaceRobotPolicies and $wgExemptFromUserRobotsControl configuration settings.
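As a rough way to check what a rule set blocks, the sketch below feeds an English-only robots.txt to Python's standard `urllib.robotparser`. The rule set, the `is_blocked` helper, and the `ExampleBot` user agent are made up for illustration. Python's parser only approximates how real crawlers interpret the file, so treat the result as a sanity check and not a guarantee.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rule set for a wiki whose content language is English.
ROBOTS_TXT = """\
User-agent: *
Disallow: /wiki/Special:
Disallow: /wiki/Special%3A
"""


def is_blocked(path, robots_txt=ROBOTS_TXT, user_agent="ExampleBot"):
    """Return True if the given URL path is disallowed by robots_txt."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    for path in ("/wiki/Special:Search", "/wiki/Main_Page"):
        print(path, "blocked" if is_blocked(path) else "allowed")
```

The same helper can be pointed at rules for other namespaces, such as a "/wiki/Template:" entry, by passing a different `robots_txt` string.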

Biologically (talkcontribs)

Now I understand completely. Thank you so much.