Manual:Pywikibot/Compat/standardize notes.py
This page is documentation for Pywikipedia Compat, which is no longer supported. This page is kept for historical interest. It may document scripts and features that are obsolete and/or no longer supported. Do not rely on the information here being up-to-date.
standardize_notes.py is a Python bot script for improving references and citations.
The current version:
- Uses the en:Wikipedia:Footnotes numbered-link format, implemented with the "ref" and "note" templates.
- Converts inline URLs to Footnotes-format links that point to citation entries.
- Converts older link formats, such as Footnotes2, to the Footnotes format.
- Searches for existing footnotes or citations in several section names, such as "Notes" or "References".
- Creates citation entries using citation templates.
- For citations for which only a URL is known, attempts to access the URL.
- Attempts to get title information from HTTP or PDF information.
- "Web reference" citations are created by default.
- "News reference" citations are created for certain URLs.
- Rebuilds the notes section in the same sequence as the references in the text.
- Converts some multiple references to "ref_label" and "note_label" multiple links.
- Detects some duplicate citations and labels them "see above".
- Keeps existing notes; does not yet examine the content of notes beyond the leading "*" or "#" list indicator.
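The conversion of inline URLs into "ref" links with a matching notes list can be sketched roughly as follows. This is a minimal illustration, not the script's actual code: the function name convert_inline_urls, the autonoteN label scheme, and the regular expression are all assumptions, and a repeated URL simply reuses its label here instead of being converted to "ref_label"/"note_label" or marked "see above".

```python
import re

# Matches a bare bracketed external link such as [http://example.org/page]
# (no link text). This pattern is an assumption made for illustration.
INLINE_URL = re.compile(r"\[(https?://[^\s\]]+)\]")


def convert_inline_urls(text):
    """Replace bare inline URLs with {{ref|...}} and return (new_text, notes).

    The notes list is built in the order in which URLs first appear in the
    text, mirroring the "same sequence as text references" behaviour.
    """
    labels = {}  # URL -> hypothetical label; a repeated URL reuses its label
    notes = []   # note lines, in order of first appearance

    def substitute(match):
        url = match.group(1)
        if url not in labels:
            labels[url] = "autonote%d" % (len(labels) + 1)
            notes.append("# {{note|%s}} %s" % (labels[url], url))
        return "{{ref|%s}}" % labels[url]

    new_text = INLINE_URL.sub(substitute, text)
    return new_text, notes


if __name__ == "__main__":
    sample = "Claim one [http://a.example/x] and claim two [http://b.example/y]."
    converted, note_lines = convert_inline_urls(sample)
    print(converted)
    print("\n".join(note_lines))
```

In this sketch each note line carries only the URL; the script itself is described as wrapping such entries in citation templates.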
By default, the script displays the intended changes and asks for confirmation before performing them.
If no title information is found or there are errors in accessing the URL, the title will contain the URL.
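The fallback rule for titles can be illustrated with a small sketch. It assumes the page content has already been fetched (or that the fetch failed, represented here by None); the function name title_or_url and the regular expression are hypothetical and are not taken from the script.

```python
import re
from html import unescape

# Simple pattern for the contents of an HTML <title> element (an assumption;
# a real implementation might use a proper HTML parser instead).
TITLE_RE = re.compile(r"<title[^>]*>(.*?)</title>", re.IGNORECASE | re.DOTALL)


def title_or_url(html, url):
    """Return the page title found in html, or url if no title is usable."""
    if html is None:  # stands in for an error while accessing the URL
        return url
    match = TITLE_RE.search(html)
    if not match:
        return url
    # Decode entities and collapse runs of whitespace inside the title.
    title = " ".join(unescape(match.group(1)).split())
    return title or url


if __name__ == "__main__":
    page = "<html><head><title>Example Domain</title></head></html>"
    print(title_or_url(page, "http://example.org/"))
    print(title_or_url(None, "http://example.org/"))
```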
Command line options (in addition to the general options for all bots):
-sql | Retrieve information from a local SQL dump (cur table, see https://dumps.wikimedia.org). |
-file | Work on all pages given in a local text file. |
-cat | Work on all pages which are in a specific category. |
-page | Only edit a single page. |
-regex | Make replacements using regular expressions. (Obsolete; always True) |
-except:XYZ | Ignore pages which contain XYZ. If the -regex argument is given, XYZ will be regarded as a regular expression. |
-namespace:n | Namespace to process. Works only with an SQL dump. |
-always | Do not prompt for each replacement. |
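Because -regex is described as always True, an -except:XYZ value is effectively treated as a regular expression. A minimal sketch of such a page filter, assuming the exceptions have been collected into a list, might look like this; the function name should_skip is hypothetical and is not part of the script.

```python
import re


def should_skip(page_text, exceptions):
    """Return True if any exception pattern matches somewhere in page_text."""
    return any(re.search(pattern, page_text) for pattern in exceptions)


if __name__ == "__main__":
    # Hypothetical patterns, as they might be passed via -except:...
    patterns = [r"\{\{disambig\}\}", r"#REDIRECT"]
    print(should_skip("This is a {{disambig}} page.", patterns))
    print(should_skip("Ordinary article text.", patterns))
```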
To process a single page, one can use:
python standardize_notes.py -page:Somepage