Manual:Pywikibot/PAWS
- See Wikitech:PAWS for details
This document gives a quick overview of using Pywikibot in a notebook running on an instance in the Wikimedia Cloud Services environment through PAWS (PAWS: A Web Shell).
bash file.sh
Create a Wikimedia account
To follow this guide, a Wikipedia/Wikimedia account is all you need. To create an account, use Special:CreateAccount.
Once the account is created, check that you are logged in by verifying that your username appears at the top right of the page https://test.wikipedia.org
If you are a new user on Wikimedia, log in with your account on Meta-Wiki, Wikipedia, Wikidata, and Commons, and on each of them read and dismiss any pending messages (shown at the top of the page).
Access a notebook
To start a notebook, go to https://hub-paws.wmcloud.org/hub
Click "Sign in with MediaWiki", then click "Allow" to permit "Use OAuth for Authentication". The first time you access PAWS, you must create a server. Click "Start My Server". The server may take a few minutes to start.
Once it is ready, you will be redirected to the URL https://paws.wmflabs.org/paws/user/<username>/tree
Start a terminal
To start a new terminal:
- go to your home
- click: File > New > Terminal
A new window will open with the Linux '$' prompt.
The PAWS terminal is not an emulator but a real bash shell running on a real GNU/Linux installation in a Docker container, so you can run any bash command and use any command available on the system.
You can see some of the available commands by running ls /bin/.
$ ls /bin/
bash cat domainname journalctl mkdir pwd stty tar zcmp
unzip2 chacl echo kill mknod rbash su tempfile zdiff
../..
$ ls /usr/bin/
2to3-3.4 dvipdf lcf printf systemd-path
X11 dwp ld prlimit systemd-run
../..
To see them all, press TAB twice.
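The same inventory can be taken from Python. The following is a minimal sketch, not part of PAWS or Pywikibot: `list_executables` is a hypothetical helper that collects the names of executable files in a list of directories, and the demonstration uses a throwaway directory so it runs anywhere on a Unix-like system.

```python
import os
import tempfile
from pathlib import Path


def list_executables(directories):
    """Return the sorted names of executable regular files in the given directories."""
    names = set()
    for directory in directories:
        try:
            entries = list(Path(directory).iterdir())
        except OSError:
            continue  # Skip directories that are missing or unreadable.
        for entry in entries:
            if entry.is_file() and os.access(entry, os.X_OK):
                names.add(entry.name)
    return sorted(names)


# Demonstration in a temporary directory with one executable and one plain file.
with tempfile.TemporaryDirectory() as demo_dir:
    tool = Path(demo_dir) / "hello"
    tool.write_text("#!/bin/sh\necho hello\n")
    tool.chmod(0o755)
    notes = Path(demo_dir) / "notes.txt"
    notes.write_text("not executable\n")
    notes.chmod(0o644)
    demo_names = list_executables([demo_dir])
print(demo_names)

# The directories the shell searches, taken from the PATH environment variable.
path_dirs = [d for d in os.environ.get("PATH", "").split(os.pathsep) if d]
print(len(list_executables(path_dirs)), "commands found on PATH")
```

In a PAWS terminal or notebook, passing `["/bin", "/usr/bin"]` would cover the two directories listed above.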
Log in to the wiki
This will establish your account on the server and allow you to log in from the command line. The following command should confirm that you can connect to testwiki. The connection uses OAuth, so you should not need to enter a password.
$ pwb.py login
Logging in to wikipedia:test as <username>
Logged in on wikipedia:test as <username>.
You can connect Pywikibot to a different wiki by creating a file named user-config.py in your $HOME directory (/home/paws) and adding the mylang and family variables:
mylang = 'test'
family = 'wikipedia'
You can type vim user-config.py in the terminal, press I to enter insert mode, add the text, press Esc to exit insert mode, then type :wq and press Enter to save and finish editing.
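If you would rather not use vim, the file can also be written from Python. This is a minimal sketch under stated assumptions: `write_user_config` is a hypothetical helper (not part of Pywikibot), it writes only the two variables mentioned above, and it refuses to overwrite an existing file unless asked to. The demonstration writes into a temporary directory; on PAWS you would pass your home directory instead.

```python
import tempfile
from pathlib import Path


def write_user_config(directory, mylang, family, overwrite=False):
    """Write a minimal user-config.py that sets only mylang and family."""
    path = Path(directory) / "user-config.py"
    if path.exists() and not overwrite:
        # Refuse to clobber an existing configuration by accident.
        raise FileExistsError(f"{path} already exists")
    path.write_text(
        f"mylang = {mylang!r}\nfamily = {family!r}\n", encoding="utf-8"
    )
    return path


# Demonstration in a throwaway directory; on PAWS, pass Path.home() instead.
with tempfile.TemporaryDirectory() as demo_dir:
    config_path = write_user_config(demo_dir, "test", "wikipedia")
    demo_text = config_path.read_text(encoding="utf-8")
print(demo_text)
```

The default of not overwriting is a deliberate choice, since an existing user-config.py may hold settings you want to keep.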
Create a page
You can create a page by entering the following command in the terminal, replacing '<username>' with your username and pressing 'Y' when asked to confirm the changes.
$ pwb.py add_text -up -talk -page:"User talk:<username>" -text:"Hello. ~~~~"
Loading User talk:<username>...
>>> User talk:<username> <<<
@@ -0,0 +1 @@
+ Hello. ~~~~
Do you want to accept these changes? ([Y]es, [N]o, [a]ll, open in [b]rowser): Y
Page [[User talk:<username>]] saved
You have completed your first edit. You can view the changes made to the page at https://test.wikipedia.org/wiki/User_talk:<username>.
You can learn more about the scripts used by running them with the '-help' option.
$ pwb.py add_text -help
...
Fetch a page
Multiple pages can be downloaded at once with the "listpages" script.
To download the content of the page you created in the previous section, enter the following command:
$ pwb.py listpages -page:"User talk:<username>" -save
1 <username>
Saving User talk:<username> to /home/paws/User_talk_<username>
1 page(s) found
Now, if you run ls, the saved page should be listed.
A real script example
When a website used on Wikipedia changes its URLs, the links on Wikipedia become outdated, and possibly also dead if the website does not redirect from the old URLs to the new ones. For example, Encyclopedia Britannica (EB) has changed its links, such as moving pages from http://www.britannica.com/EBchecked/media/ to http://www.britannica.com/topic/[topic name]/images-videos/*. You can find the list of usages of the old URL on English Wikipedia at w:Special:LinkSearch/http://www.britannica.com/EBchecked/media. Updating all those links manually would be very time-consuming. Thankfully, EB has maintained redirects from its old URLs to the new URLs, so this does not need to be fixed immediately.
For a simpler example, English Wikipedia currently contains links to http://britannica.com/EBchecked/ instead of http://www.britannica.com/EBchecked/; i.e. a 'www.' subdomain is missing in the URL.
There are currently 14 cases on English Wikipedia: w:Special:LinkSearch/http://britannica.com/EBchecked/
Wikipedias in other languages also have this problem; for example, there is one case on German Wikipedia: w:de:Spezial:Weblinksuche/http://britannica.com/EBchecked/
To fix those links, we can use Pywikibot's replace.py script. In this demo we will use the '-simulate' argument to avoid writing to the wiki, as there are strict rules about automated editing of English Wikipedia.
First, let's list all of the pages which link to http://britannica.com/EBchecked/.
$ pwb.py listpages -lang:en -weblink:"britannica.com/EBchecked/"
1 Bhatner fort
2 Mohammad Ishaq Khan
3 Fringe theories/Noticeboard/Archive 7
4 El Riego phase
5 Catalonia/Archive 4
6 Stephen I of Hungary
7 Stephen I of Hungary/Archive 1
8 Väinö Tanner
9 Tokaji
10 Transylvania/Archive5
11 Hungarians in Romania
12 Transylvania
13 Uttarakhand
14 Françoise Giroud
14 page(s) found
Now we check that those pages actually contain the literal URL in their wikitext, i.e. that the link is not produced by a template.
$ pwb.py listpages -lang:en -weblink:"britannica.com/EBchecked/" -grep:"britannica.com\/EBchecked"
1 Bhatner fort
2 Mohammad Ishaq Khan
3 Fringe theories/Noticeboard/Archive 7
4 El Riego phase
5 Catalonia/Archive 4
6 Stephen I of Hungary
7 Stephen I of Hungary/Archive 1
8 Väinö Tanner
9 Tokaji
10 Transylvania/Archive5
11 Hungarians in Romania
12 Transylvania
13 Uttarakhand
14 Françoise Giroud
14 page(s) found
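The effect of the -grep filter can be pictured as a regular-expression search over each page's wikitext. The following is a minimal sketch, assuming that reading of the option; the two page texts and the {{Britannica}} template call are invented for illustration and are not taken from the wiki.

```python
import re

# Same pattern as the -grep argument above. The unescaped "." matches any
# character, which is harmless here but slightly looser than a literal dot.
PATTERN = re.compile(r"britannica.com\/EBchecked")


def has_literal_url(wikitext):
    """Return True if the wikitext itself contains the URL fragment."""
    return PATTERN.search(wikitext) is not None


# Hypothetical page texts: one with a literal link, one using a template.
pages = {
    "Literal link": "See [http://britannica.com/EBchecked/topic/1 Britannica].",
    "Template link": "See {{Britannica|1|Example}}.",
}
matching = [title for title, text in pages.items() if has_literal_url(text)]
print(matching)
```

A page whose link comes from a template would show up in the -weblink results but not pass this text check, which is why the second listing is worth running before any replacement.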
Now use replace to add the missing "www."
$ pwb.py replace -lang:en -simulate -weblink:"britannica.com/EBchecked/" -grep:"britannica.com\/EBchecked" "http://britannica.com/EBchecked/" "http://www.britannica.com/EBchecked/"
The summary message for the command line replacements will be something like: Bot: Automated text replacement (-http://britannica.com/EBchecked/ +http://www.britannica.com/EBchecked/)
Press Enter to use this automatic message, or enter a description of the
changes your bot will make:
Logging in to wikipedia:en as <username>
Retrieving 14 pages from wikipedia:en.
Retrieving 14 pages from wikipedia:en.
>>> Stephen I of Hungary <<<
@@ -47 +47 @@
- Stephen's birth date is uncertain because it was not recorded in contemporaneous documents.{{sfn|Györffy|1994|p=64}} Hungarian and Polish chronicles written centuries later give three different years: 967, 969 and 975.{{sfn|Kristó|2001|p=15}} The unanimous testimony of his three late 11th-century or early 12th-century [[hagiographies]] and other Hungarian sources, which state that Stephen was "still an adolescent" in 997,<ref>''Hartvic, Life of King Stephen of Hungary'' (ch. 5), p. 381.</ref> substantiate the reliability of the later year (975).{{sfn|Györffy|1994|p=64}}{{sfn|Kristó|2001|p=15}} Stephen's ''[[Life of Saint Stephen, King of Hungary (Vita minor)|Lesser Legend]]'' adds that he was born in [[Esztergom]],{{sfn|Györffy|1994|p=64}}{{sfn|Kristó|2001|p=15}}<ref name=Britannica>{{cite encyclopedia|title=Stephen I|url=http://britannica.com/EBchecked/topic/565415/Stephen-I|encyclopedia=[[Encyclopædia Britannica]]|publisher=Encyclopædia Britannica, Inc.|year=2008|accessdate=2008-07-29}}</ref> which implies that he was born after 972 because his father, [[Géza, Grand Prince of the Hungarians]], chose Esztergom as royal residence around that year.{{sfn|Györffy|1994|p=64}} Géza promoted the spread of Christianity among his subjects by force, but never ceased worshipping pagan gods.{{sfn|Kontler|1999|p=51}}{{sfn|Berend|Laszlovszky|Szakács|2007|p=331}} Both his son's ''[[Life of Saint Stephen, King of Hungary (Vita maior)|Greater Legend]]'' and the nearly contemporaneous [[Thietmar of Merseburg]] described Géza as a cruel monarch, suggesting that he was a despot who mercilessly consolidated his authority over the rebellious Hungarian lords.{{sfn|Berend|Laszlovszky|Szakács|2007|p=331}}{{sfn|Bakay|1999|p=547}}
+ Stephen's birth date is uncertain because it was not recorded in contemporaneous documents.{{sfn|Györffy|1994|p=64}} Hungarian and Polish chronicles written centuries later give three different years: 967, 969 and 975.{{sfn|Kristó|2001|p=15}} The unanimous testimony of his three late 11th-century or early 12th-century [[hagiographies]] and other Hungarian sources, which state that Stephen was "still an adolescent" in 997,<ref>''Hartvic, Life of King Stephen of Hungary'' (ch. 5), p. 381.</ref> substantiate the reliability of the later year (975).{{sfn|Györffy|1994|p=64}}{{sfn|Kristó|2001|p=15}} Stephen's ''[[Life of Saint Stephen, King of Hungary (Vita minor)|Lesser Legend]]'' adds that he was born in [[Esztergom]],{{sfn|Györffy|1994|p=64}}{{sfn|Kristó|2001|p=15}}<ref name=Britannica>{{cite encyclopedia|title=Stephen I|url=http://www.britannica.com/EBchecked/topic/565415/Stephen-I|encyclopedia=[[Encyclopædia Britannica]]|publisher=Encyclopædia Britannica, Inc.|year=2008|accessdate=2008-07-29}}</ref> which implies that he was born after 972 because his father, [[Géza, Grand Prince of the Hungarians]], chose Esztergom as royal residence around that year.{{sfn|Györffy|1994|p=64}} Géza promoted the spread of Christianity among his subjects by force, but never ceased worshipping pagan gods.{{sfn|Kontler|1999|p=51}}{{sfn|Berend|Laszlovszky|Szakács|2007|p=331}} Both his son's ''[[Life of Saint Stephen, King of Hungary (Vita maior)|Greater Legend]]'' and the nearly contemporaneous [[Thietmar of Merseburg]] described Géza as a cruel monarch, suggesting that he was a despot who mercilessly consolidated his authority over the rebellious Hungarian lords.{{sfn|Berend|Laszlovszky|Szakács|2007|p=331}}{{sfn|Bakay|1999|p=547}}
Do you want to accept these changes? ([y]es, [N]o, [e]dit, open in [b]rowser, [a]ll, [q]uit): N
...
In PAWS, and any terminal that supports color, the diff of changes will show the added "www." in green text color, making it easier to find the proposed changes.
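To see what the simulated replacement does to a page's text, here is a minimal sketch in plain Python. It assumes a plain (non-regex) substitution, which matches how the command above is invoked without a -regex option, and it uses the standard difflib module to produce a diff similar in spirit to the one Pywikibot prints. The sample text is a shortened, illustrative excerpt, and `simulate_replace` is a hypothetical helper name.

```python
import difflib

OLD = "http://britannica.com/EBchecked/"
NEW = "http://www.britannica.com/EBchecked/"


def simulate_replace(text, old=OLD, new=NEW):
    """Return the replaced text and a unified diff, without saving anything."""
    updated = text.replace(old, new)
    diff = list(
        difflib.unified_diff(
            text.splitlines(), updated.splitlines(), lineterm="", n=0
        )
    )
    return updated, diff


# Shortened sample of wikitext containing the outdated URL.
sample = (
    "Intro line.\n"
    "<ref>{{cite encyclopedia|title=Stephen I"
    "|url=http://britannica.com/EBchecked/topic/565415/Stephen-I}}</ref>\n"
)
updated, diff = simulate_replace(sample)
print("\n".join(diff))
```

Running the function a second time on its own output changes nothing, because the corrected URL no longer contains the old string.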
Inside Pywikibot
Next we will use the PAWS Python session.
- Go to your PAWS home,
- click 'New' on the right hand side, and
- select 'Python 3'.
This will open a new window.
In the text box, enter the following, then select 'Run' in the Cell menu (or press Shift+Enter).
import pywikibot
A new text box will appear below. Run the following to create an APISite object connected to https://test.wikipedia.org/:
site = pywikibot.Site('test', 'wikipedia')
Describe "site" by entering it into the new text box and selecting "Run".
site
It should show
Out[3]: APISite("test", "wikipedia")
Create a page object:
page = pywikibot.Page(site, 'test')
Check it exists by running:
page.exists()
It should output
VERBOSE:pywiki:Found 1 wikipedia:test processes running, including this one.
Out[5]: True
Show the text on the page:
page.text
Change the page text in the object:
page.text = 'Hello world'
Save the page to the wiki:
page.save()
The response should be:
Page [[Test]] saved
INFO:pywiki:Page [[Test]] saved
The interactive Python 3 notebook allows many lines to be run together. The above could be put into one text box and run:
import pywikibot
site = pywikibot.Site('test', 'wikipedia')
page = pywikibot.Page(site, 'test')
page.text = 'Hello world!'
page.save()
The log of your interactive Python session can be saved or downloaded for future reference.
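When a cell like the one above is run repeatedly, it can be useful to skip the save if the text is already what you want. The following is a minimal sketch of that pattern. `FakePage` is a stand-in class so the example runs without network access, and `set_text_if_changed` is a hypothetical helper; with Pywikibot you would pass a real pywikibot.Page object, whose save() method accepts a summary argument.

```python
class FakePage:
    """Stand-in for pywikibot.Page so the sketch runs without a wiki."""

    def __init__(self, text=""):
        self.text = text
        self.saved_with = None  # Records the summary of the last save.

    def save(self, summary=None):
        self.saved_with = summary


def set_text_if_changed(page, new_text, summary):
    """Save only when the text differs; return True if a save happened."""
    if page.text == new_text:
        return False
    page.text = new_text
    page.save(summary=summary)
    return True


page = FakePage("Hello world")
first = set_text_if_changed(page, "Hello world!", "Add punctuation")
second = set_text_if_changed(page, "Hello world!", "Add punctuation")
print(first, second)
```

The helper only relies on a `text` attribute and a `save` method, so it works with any object that offers both.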
Accessing online documentation in PAWS
Pywikibot documentation may be found at wmdoc:pywikibot. It is primarily sourced from docstrings, which can be loaded in the interactive Python 3 notebook using the Python built-in function help().
For example, to look at the arguments for the save method above, run either:
help(page.save)
or
help(pywikibot.Page.save)
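When the full help() output is more than you need, the standard inspect module can pull out just the signature and the first docstring line. This is a minimal sketch; `describe` is a hypothetical helper, and the `save` function below is an invented stand-in, not the Pywikibot method. In a PAWS notebook you could call `describe(pywikibot.Page.save)` instead.

```python
import inspect


def describe(obj):
    """Return the call signature and the first docstring line of a callable."""
    doc = inspect.getdoc(obj) or ""
    summary = doc.splitlines()[0] if doc else ""
    return f"{obj.__name__}{inspect.signature(obj)}: {summary}"


def save(summary=None, minor=True):
    """Save the current text (hypothetical stand-in, not the Pywikibot method)."""


description = describe(save)
print(description)
```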
Editing Pywikibot scripts
The Pywikibot library and scripts are located in /srv/paws, and are read-only. The installed Pywikibot library cannot be modified in PAWS.
Scripts may be modified after copying them into your PAWS home.
For example, to run a modified "checkimages.py":
- In the terminal, enter
cp /srv/paws/pwb/scripts/checkimages.py ~
- In a browser, go to your PAWS home and click on the file checkimages.py.
- In the browser, you can edit the file.
Edit the code: for instance, just after the start = time.time() code on line 1775, add a new line 1776 that will print out your name: print("MYNAME's version.")
- In the editing interface, use the File menu and click Save to save your modifications.
- In the terminal, enter
pwb.py ~/checkimages.py -simulate -limit:10
(If no '-limit:x' is given, the program runs until all images have been checked, which may take a long time.)
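The copy-and-edit steps above can also be scripted. The following is a minimal sketch, not a Pywikibot feature: `copy_and_insert` is a hypothetical helper that copies a file and adds one line after the first line containing a marker string, reusing that line's indentation. The demonstration runs on a small invented file in a temporary directory rather than on checkimages.py itself.

```python
import shutil
import tempfile
from pathlib import Path


def copy_and_insert(source, destination, marker, new_code):
    """Copy source to destination and add new_code after the first marker line.

    Returns the 1-based line number of the inserted line.
    """
    shutil.copyfile(source, destination)
    text = Path(destination).read_text(encoding="utf-8")
    lines = text.splitlines(keepends=True)
    for index, line in enumerate(lines):
        if marker in line:
            if not line.endswith("\n"):
                lines[index] = line + "\n"  # Marker was the unterminated last line.
            indent = line[: len(line) - len(line.lstrip())]
            lines.insert(index + 1, f"{indent}{new_code}\n")
            break
    else:
        raise ValueError(f"marker {marker!r} not found in {source}")
    Path(destination).write_text("".join(lines), encoding="utf-8")
    return index + 2


# Demonstration on a small invented script.
with tempfile.TemporaryDirectory() as demo_dir:
    original = Path(demo_dir) / "original.py"
    original.write_text(
        "import time\n\ndef main():\n    start = time.time()\n    return start\n",
        encoding="utf-8",
    )
    edited = Path(demo_dir) / "edited.py"
    inserted_at = copy_and_insert(
        original, edited, "start = time.time()", 'print("MYNAME\'s version.")'
    )
    edited_text = edited.read_text(encoding="utf-8")
print(edited_text)
```

Searching for the marker text rather than a fixed line number is a deliberate choice, since the line number quoted above may differ between Pywikibot versions.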
See also
- wikitech:PAWS/PAWS and Pywikibot
- Using Pywikibot with PAWS tutorial - A tutorial that helps users get started with using Pywikibot and PAWS
- Example notebooks using Pywikibot - A list of notebooks hosted on PAWS that use Pywikibot
- PAWS cheatsheet by one user (e.g. about API and database access)
- Source code on GitHub
- Small wiki toolkits workshop about running basic Pywikibot scripts
- Self-study materials based on the small wiki toolkits workshop
- Workshop handbook based on the small wiki toolkits workshop
- If you need more help setting up your Pywikibot, visit the #pywikibot IRC channel or the pywikibot@ mailing list.