GET/POST request to parse the content of a page and obtain the output.
API documentation
The following documentation is the output of Special:ApiHelp/parse, automatically generated by the pre-release version of MediaWiki that is running on this site (MediaWiki.org).
oldid
Parse the content of this revision. Overrides page and pageid.
Type: integer
prop
Which pieces of information to get:
text
Gives the parsed text of the wikitext.
langlinks
Gives the language links in the parsed wikitext.
categories
Gives the categories in the parsed wikitext.
categorieshtml
Gives the HTML version of the categories.
links
Gives the internal links in the parsed wikitext.
templates
Gives the templates in the parsed wikitext.
images
Gives the images in the parsed wikitext.
externallinks
Gives the external links in the parsed wikitext.
sections
Gives the sections in the parsed wikitext.
revid
Adds the revision ID of the parsed page.
displaytitle
Adds the title of the parsed wikitext.
subtitle
Adds the page subtitle for the parsed page.
headhtml
Gives parsed doctype, opening <html>, <head> element and opening <body> of the page.
modules
Gives the ResourceLoader modules used on the page. To load, use mw.loader.using(). Either jsconfigvars or encodedjsconfigvars must be requested jointly with modules.
jsconfigvars
Gives the JavaScript configuration variables specific to the page. To apply, use mw.config.set().
encodedjsconfigvars
Gives the JavaScript configuration variables specific to the page as a JSON string.
indicators
Gives the HTML of page status indicators used on the page.
iwlinks
Gives interwiki links in the parsed wikitext.
wikitext
Gives the original wikitext that was parsed.
properties
Gives various properties defined in the parsed wikitext.
limitreportdata
Gives the limit report in a structured way. Gives no data, when disablelimitreport is set.
limitreporthtml
Gives the HTML version of the limit report. Gives no data, when disablelimitreport is set.
parsetree
The XML parse tree of revision content (requires content model wikitext)
parsewarnings
Gives the warnings that occurred while parsing content (as wikitext).
parsewarningshtml
Gives the warnings that occurred while parsing content (as HTML).
headitems
Deprecated. Gives items to put in the <head> of the page.
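Several prop values can be requested at once by joining them with the pipe character, as with other multi-value Action API parameters. The following is a minimal sketch rather than part of the generated help: build_parse_params is a hypothetical helper that joins the requested values and enforces the rule stated above that modules must be requested jointly with jsconfigvars or encodedjsconfigvars. It only builds the parameters and sends no request.

```python
def build_parse_params(page, props):
    """Build a query-parameter dict for action=parse (hypothetical helper)."""
    props = list(props)
    companions = {"jsconfigvars", "encodedjsconfigvars"}
    # The help text says modules must be requested together with
    # either jsconfigvars or encodedjsconfigvars.
    if "modules" in props and not companions & set(props):
        raise ValueError(
            "'modules' requires 'jsconfigvars' or 'encodedjsconfigvars'"
        )
    return {
        "action": "parse",
        "page": page,
        # Multi-value parameters are joined with the pipe character.
        "prop": "|".join(props),
        "format": "json",
    }


if __name__ == "__main__":
    print(build_parse_params("Pet door", ["text", "categories", "sections"]))
```

The returned dict can be passed as params to requests, in the same way as PARAMS in the sample code on this page.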
onlypst
Do a pre-save transform (PST) on the input, but don't parse it. Returns the same wikitext, after a PST has been applied. Only valid when used with text.
useskin
Apply the selected skin to the parser output. May affect the following properties: text, langlinks, headitems, modules, jsconfigvars, indicators.
One of the following values: apioutput, authentication-popup, cologneblue, fallback, json, minerva, modern, monobook, timeless, vector, vector-2022
contentformat
Content serialization format used for the input text. Only valid when used with text.
One of the following values: application/json, application/octet-stream, application/unknown, application/x-binary, text/css, text/javascript, text/plain, text/unknown, text/x-wiki, unknown/unknown
contentmodel
Content model of the input text. If omitted, title must be specified, and default will be the model of the specified title. Only valid when used with text.
One of the following values: Chart.JsonConfig, GadgetDefinition, Json.JsonConfig, JsonSchema, Map.JsonConfig, MassMessageListContent, NewsletterContent, Scribunto, SecurePoll, Tabular.JsonConfig, css, flow-board, javascript, json, sanitized-css, text, translate-messagebundle, unknown, wikitext
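contentformat and contentmodel apply only when raw input is supplied through text. The sketch below shows one way to assemble such a request. build_text_parse_params is a hypothetical helper, and the sample wikitext is an assumption chosen for illustration. It encodes the rule above that title must be specified when contentmodel is omitted, and it sends no request.

```python
def build_text_parse_params(text, title=None, contentmodel=None):
    """Build action=parse parameters for arbitrary input text.

    Hypothetical helper, not part of any MediaWiki client library.
    """
    # Per the help text: if contentmodel is omitted, title must be
    # specified so that the model of that title can be used.
    if contentmodel is None and title is None:
        raise ValueError("specify 'title' when 'contentmodel' is omitted")
    params = {"action": "parse", "text": text, "format": "json"}
    if title is not None:
        params["title"] = title
    if contentmodel is not None:
        params["contentmodel"] = contentmodel
    return params


if __name__ == "__main__":
    # Sample wikitext is illustrative only.
    print(build_text_parse_params("'''Hello''' world", contentmodel="wikitext"))
```

Because text can be long, sending these parameters in a POST body rather than in the query string is a reasonable choice.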
mobileformat
Return parse output in a format suitable for mobile devices.
Maximum number of values is 50 (500 for clients that are allowed higher limits).
templatesandboxtitle
Parse the page using templatesandboxtext in place of the contents of the page named here.
templatesandboxtext
Parse the page using this page content in place of the page named by templatesandboxtitle.
templatesandboxcontentmodel
Content model of templatesandboxtext.
One of the following values: Chart.JsonConfig, GadgetDefinition, Json.JsonConfig, JsonSchema, Map.JsonConfig, MassMessageListContent, NewsletterContent, Scribunto, SecurePoll, Tabular.JsonConfig, css, flow-board, javascript, json, sanitized-css, text, translate-messagebundle, unknown, wikitext
templatesandboxcontentformat
Content format of templatesandboxtext.
One of the following values: application/json, application/octet-stream, application/unknown, application/x-binary, text/css, text/javascript, text/plain, text/unknown, text/x-wiki, unknown/unknown
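The templatesandbox parameters are meant to be used together: the page is parsed as if the page named by templatesandboxtitle contained templatesandboxtext. A minimal sketch of assembling such a request follows. build_sandbox_params and the template name in the usage line are hypothetical, and requiring both values is this sketch's own choice rather than a documented constraint.

```python
def build_sandbox_params(page, sandbox_title, sandbox_text,
                         sandbox_model="wikitext"):
    """Build action=parse parameters that preview a template change.

    Hypothetical helper: `page` is parsed as if the page named by
    `sandbox_title` contained `sandbox_text`.
    """
    # Assumption of this sketch: both values are required together.
    if not sandbox_title or not sandbox_text:
        raise ValueError("sandbox_title and sandbox_text are both required")
    return {
        "action": "parse",
        "page": page,
        "templatesandboxtitle": sandbox_title,
        "templatesandboxtext": sandbox_text,
        "templatesandboxcontentmodel": sandbox_model,
        "format": "json",
    }


if __name__ == "__main__":
    # "Template:Example" is a made-up title for illustration.
    print(build_sandbox_params("Pet door", "Template:Example", "Draft text"))
```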
Python
#!/usr/bin/python3

"""
    parse.py

    MediaWiki API Demos
    Demo of `Parse` module: Parse content of a page

    MIT License
"""

import requests

S = requests.Session()

URL = "https://en.wikipedia.org/w/api.php"

PARAMS = {
    "action": "parse",
    "page": "Pet door",
    "format": "json"
}

R = S.get(url=URL, params=PARAMS)
DATA = R.json()

print(DATA["parse"]["text"]["*"])
PHP
<?php

/*
    parse.php

    MediaWiki API Demos
    Demo of `Parse` module: Parse content of a page

    MIT License
*/

$endPoint = "https://en.wikipedia.org/w/api.php";
$params = [
    "action" => "parse",
    "page" => "Pet door",
    "format" => "json"
];

$url = $endPoint . "?" . http_build_query( $params );

$ch = curl_init( $url );
curl_setopt( $ch, CURLOPT_RETURNTRANSFER, true );
$output = curl_exec( $ch );
curl_close( $ch );

$result = json_decode( $output, true );

echo( $result["parse"]["text"]["*"] );
JavaScript
/**
 * parse.js
 *
 * MediaWiki API Demos
 * Demo of `Parse` module: Parse content of a page
 *
 * MIT License
 */

const url = "https://en.wikipedia.org/w/api.php?" +
    new URLSearchParams({
        origin: "*",
        action: "parse",
        page: "Pet door",
        format: "json",
    });

try {
    const req = await fetch(url);
    const json = await req.json();
    console.log(json.parse.text["*"]);
} catch (e) {
    console.error(e);
}
MediaWiki JS
/**
 * parse.js
 *
 * MediaWiki API Demos
 * Demo of `Parse` module: Parse content of a page
 * MIT License
 */

const params = {
    action: 'parse',
    page: 'Pet door',
    format: 'json'
};
const api = new mw.Api();

api.get( params ).done( data => {
    console.log( data.parse.text[ '*' ] );
} );
Example 2: Parse a section of a page and fetch its table data
{
    "parse": {
        "title": "Wikipedia:Unusual articles/Places and infrastructure",
        "pageid": 38664530,
        "wikitext": {
            "*": "===Antarctica===\n<!--[[File:Grytviken church.jpg|thumb|150px|right|A little church in [[Grytviken]] in the [[Religion in Antarctica|Antarctic]].]]-->\n{| class=\"wikitable\"\n|-\n| '''[[Emilio Palma]]'''\n| An Argentine national who is the first person known to be born on the continent of Antarctica.\n|-\n| '''[[Scouting in the Antarctic]]'''\n| Always be prepared for glaciers and penguins.\n|}"
        }
    }
}
Sample code
parse_wikitable.py
#!/usr/bin/python3

"""
    parse_wikitable.py

    MediaWiki Action API Code Samples
    Demo of `Parse` module: Parse a section of a page, fetch its table data
    and save it to a CSV file

    MIT license
"""

import csv
import requests

S = requests.Session()

URL = "https://en.wikipedia.org/w/api.php"

TITLE = "Wikipedia:Unusual_articles/Places_and_infrastructure"

PARAMS = {
    'action': "parse",
    'page': TITLE,
    'prop': 'wikitext',
    'section': 5,
    'format': "json"
}

def get_table():
    """ Parse a section of a page, fetch its table data and save it to a CSV file
    """
    res = S.get(url=URL, params=PARAMS)
    data = res.json()
    wikitext = data['parse']['wikitext']['*']
    # Table rows in wikitext are separated by "|-"
    lines = wikitext.split('|-')
    entries = []

    for line in lines:
        line = line.strip()
        # Skip the chunk before the first row (heading, comment, table opener)
        if line.startswith("|"):
            table = line[2:].split('||')
            # First cell: article link without bold/link markup; second cell: description
            entry = table[0].split("|")[0].strip("'''[[]]\n"), table[0].split("|")[1].strip("\n")
            entries.append(entry)

    # newline="" lets the csv module control line endings; the file is closed automatically
    with open("places_and_infrastructure.csv", "w", newline="") as file:
        writer = csv.writer(file)
        writer.writerows(entries)

if __name__ == '__main__':
    get_table()
Possible errors
Code
Info
missingtitle
The page you specified doesn't exist.
nosuchsection
There is no section section in page.
pagecannotexist
Namespace doesn't allow actual pages.
invalidparammix
The parameters page, pageid, oldid, text cannot be used together.
The parameters page, pageid, oldid, title cannot be used together.
The parameters page, pageid, oldid, revid cannot be used together.
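A client can check for these codes before reading the parse object. The sketch below assumes the default JSON error format, in which a failed request returns a top-level error object carrying code and info. ParseApiError and extract_parsed are hypothetical names, and the sample dicts stand in for decoded responses.

```python
class ParseApiError(Exception):
    """Raised when the API response carries an error object (hypothetical)."""

    def __init__(self, code, info):
        super().__init__(f"{code}: {info}")
        self.code = code
        self.info = info


def extract_parsed(response):
    """Return the "parse" object from a decoded JSON response, or raise."""
    # Assumption: errors arrive as {"error": {"code": ..., "info": ...}}.
    error = response.get("error")
    if error is not None:
        raise ParseApiError(error.get("code", "unknown"), error.get("info", ""))
    return response["parse"]


if __name__ == "__main__":
    failed = {"error": {"code": "missingtitle",
                        "info": "The page you specified doesn't exist."}}
    try:
        extract_parsed(failed)
    except ParseApiError as exc:
        print(exc.code)
```

In the samples above, the result of R.json() or res.json() could be passed through extract_parsed before indexing into it.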