Hi, I was searching the Latvian wiki with a full-text string search. This query does not seem to work properly: https://w.wiki/B3Ct returns results, but if you look for "neizmantots" in the wikicode of the returned pages, it cannot be found.
Talk:Wikidata Query Service/User Manual/MWAPI
This page and Wikidata Query Service/User Manual#MediaWiki API give examples using the following params:
gsrsearch, gsrlimit, gcmprop, gcmlimit
Where are they documented?
The page says "It is permissible to add input parameters not specified in the configuration, they will be passed to the service query. Please refer to the API documentation for the lists of parameters each service has". I searched in API:Query#Generators and can't find them there.
It would be very useful for SPARQL devs to have a full list of params listed on this page, maybe with links to their definitions in the MW API page.
These are the same parameters you put in an actual API request, e.g. when using the API sandbox. There's no full list of parameters, because each API has its own parameters and those can be anything. So I would suggest first using an API tool, such as the API sandbox, to assemble the API call and ensure it works properly, and then copying the parameter names and values from there to the MWAPI call in WDQS.
It would be very useful if you could illustrate finding info in the API sandbox. Eg I wanted to see the params for "Generator" but the sandbox field "action" doesn't have such choice. API:Query#Generators doesn't mention "gsrsearch".
Please make it easier for folk who know SPARQL but not MWAPI to use this extension. Thanks in advance!
I agree with Vladimir - I found these through a random StackOverflow answer. Further attempts to understand how they work (or even what they are doing) have been fruitless, because seemingly there is zero - zip - nada - no documentation whatsoever. I can't even find the source code to read!
Dunno, the documentation at Special:ApiSandbox seems quite readable to me (after you click around a bit). E.g. the generator/search parameters gsrsearch and gsrlimit are documented by clicking on the “generator=search” section in the left menu of e.g. Special:ApiSandbox#action=query&format=json&generator=search&formatversion=2&gsrsearch=test.
- If I start from Special:ApiSandbox and select action=query, WHERE do I select "generator"?
- Below "action" there's an "expand" that shows links like https://www.mediawiki.org/w/api.php?action=help&modules=query
- Searching there for "search" brings me to https://www.mediawiki.org/w/api.php?action=help&modules=query%2Bsearch, which documents srsearch, srlimit (not gsrsearch, gsrlimit; although gsrsearch appears in a sample link at the bottom), and doesn't document gcmprop, gcmlimit. @Mormegil Can you find the documentation of gsrsearch, gsrlimit, gcmprop, gcmlimit somewhere?
I'm sure the interactive API and the automatically generated documentation are very useful as a reference, if you already know the MWAPI well. But they are hard to use if it's the first time you're trying MWAPI.
When you select action=query, a section for action=query appears in the main tree on the left. When you click the section, all available parameters for action=query are offered, and generator is among them. You can choose search in its dropdown; after that, the corresponding generator=search section appears in the main tree again. When you click on it, all parameters for generator=search appear, and among those, gsrsearch and gsrlimit. (gcmprop and gcmlimit are parameters for generator=categorymembers, so by choosing categorymembers for generator and clicking on the generator=categorymembers section that appears in the main tree, you'll see those offered.)
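As far as I understand it, the naming rule behind this is that a module used as a generator takes the same parameters as when it is used as a list, with a "g" put in front of each name (srsearch becomes gsrsearch, cmlimit becomes gcmlimit). Here is a minimal Python sketch of that rule. The helper name as_generator_params and the category title are made up for illustration, and nothing is sent to the API:

```python
from urllib.parse import urlencode


def as_generator_params(module, list_params):
    """Build action=query parameters that use `module` as a generator.

    Each list-style parameter name (e.g. "srsearch") gets a "g" prefix
    (e.g. "gsrsearch"). Hypothetical helper, not part of any library.
    """
    params = {"action": "query", "generator": module}
    for name, value in list_params.items():
        params["g" + name] = value
    return params


# generator=search: srsearch/srlimit become gsrsearch/gsrlimit
search = as_generator_params("search", {"srsearch": "test", "srlimit": "5"})

# generator=categorymembers: cmtitle/cmlimit become gcmtitle/gcmlimit
# ("Category:Example" is a placeholder title)
members = as_generator_params(
    "categorymembers", {"cmtitle": "Category:Example", "cmlimit": "max"}
)

print(search)
print("https://www.mediawiki.org/w/api.php?" + urlencode(search))
```

The keys of the resulting dictionaries (other than action) are the names you would then write as mwapi:... parameters in the SERVICE block.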
Take a look at the Examples at the bottom of the last page:
- Search for meaning.
- api.php?action=query&list=search&srsearch=meaning [open in sandbox]
- Search texts for meaning.
- api.php?action=query&list=search&srwhat=text&srsearch=meaning [open in sandbox]
- Get page info about the pages returned for a search for meaning.
- api.php?action=query&generator=search&gsrsearch=meaning&prop=info [open in sandbox]
A novice user like me would have the following questions:
- What's the difference between list=search and generator=search, and how do I know which to use in which case?
- Why do two links use srsearch but one uses gsrsearch, and how do I know which to use when?
What is even the difference between the 3 examples?
- How is "search for" different from "search texts for"? The two examples return the same data
- Ok, I get how the third call is different: it returns page title and metadata, not search hits. From this I can surmise that generator returns a list of pages, whereas list returns a list of search hits
Can this service handle output elements that contain text nodes? I'm struggling to get back any output for the pageviews property:
SELECT ?title ?wd ?pageviews WHERE {
SERVICE wikibase:mwapi {
bd:serviceParam wikibase:api "Generator" .
bd:serviceParam wikibase:endpoint "en.wikipedia.org" .
bd:serviceParam mwapi:titles "List of mountain peaks by prominence" .
bd:serviceParam mwapi:generator "links" .
bd:serviceParam mwapi:gplprop "ids|title|type" .
bd:serviceParam mwapi:gpllimit "max" .
bd:serviceParam mwapi:pvipmetric "pageviews" .
bd:serviceParam mwapi:pvipdays "1" .
bd:serviceParam wikibase:limit 50 .
?title wikibase:apiOutput mwapi:title.
?wd wikibase:apiOutputItem mwapi:item.
?pageviews wikibase:apiOutput "pageviews/pvip/text()".
}
}
I've tried a variety of XPaths, even pageviews/pvip/@date, but the ?pageviews column always ends up empty.
Each item in the API response looks like this:
<page _idx="220167" pageid="220167" ns="0" title="Aconcagua">
<pageviews>
<pvip date="2021-04-27" xml:space="preserve">1266</pvip>
</pageviews>
</page>
This seems to work:
SELECT ?title ?wd ?pageviews WHERE {
SERVICE wikibase:mwapi {
bd:serviceParam wikibase:api "Generator" .
bd:serviceParam wikibase:endpoint "en.wikipedia.org" .
bd:serviceParam mwapi:titles "List of mountain peaks by prominence" .
bd:serviceParam mwapi:generator "links" .
bd:serviceParam mwapi:gplprop "ids|title|type" .
bd:serviceParam mwapi:gpllimit "max" .
bd:serviceParam mwapi:prop "info|pageprops|pageviews" .
bd:serviceParam mwapi:pvipdays "1" .
bd:serviceParam wikibase:limit 50 .
?title wikibase:apiOutput mwapi:title.
?wd wikibase:apiOutputItem mwapi:item.
?pageviews wikibase:apiOutput "pageviews/pvip/text()".
}
}
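To check offline what the output path should select, the response item quoted in the question can be parsed locally. This is only a sketch using Python's standard library, not what WDQS runs internally. ElementTree supports just a subset of XPath, so the trailing text() and @date steps are written as .text and .get("date") here:

```python
import xml.etree.ElementTree as ET

# The response item quoted in the question above.
SAMPLE = """<page _idx="220167" pageid="220167" ns="0" title="Aconcagua">
  <pageviews>
    <pvip date="2021-04-27" xml:space="preserve">1266</pvip>
  </pageviews>
</page>"""

page = ET.fromstring(SAMPLE)

# Paths are relative to each <page> element, like the apiOutput paths.
pvip = page.find("pageviews/pvip")

title = page.get("title")  # roughly "@title"
views = pvip.text          # roughly "pageviews/pvip/text()"
date = pvip.get("date")    # roughly "pageviews/pvip/@date"

print(title, date, views)
```

So the path itself matches this XML. The visible difference in the working query is that it requests the pageviews property through mwapi:prop, which suggests the <pageviews> element was simply absent from the earlier responses.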
The following query gets the number of items by gender in a Wikipedia article.
SELECT ?gender ?genderLabel (COUNT(?item) AS ?count)
WHERE {
SERVICE wikibase:mwapi {
bd:serviceParam wikibase:endpoint "fr‧wikipedia.org";
wikibase:api "Generator";
mwapi:generator "links";
mwapi:titles "Sociologie";.
?item wikibase:apiOutputItem mwapi:item.
}
FILTER BOUND (?item) # Safeguard to not get a timeout from unbound items when using ?item below
?item wdt:P21 ?gender .
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
GROUP BY ?gender ?genderLabel
I've been using this query regularly for the last two weeks and there was no bug, but since last night (March 2nd, 2021) it doesn't work anymore. Do you have any explanation for this change in behaviour? PAC2 (talk) 20:18, 3 March 2021 (UTC)
I doubt very much that this query has ever worked. The MWAPI endpoint is misspelled: "fr‧wikipedia.org" (the dot is not an ordinary full stop) instead of "fr.wikipedia.org".
PS: I recommend using (COUNT(*) AS ?count) instead of (COUNT(?item) AS ?count). There is no need here to check that ?item is bound and that its value is error-free, which is what the latter version does. The change won't save much time here, but it is a good thing to remember to use when possible, as it can save considerable time when counting large numbers.
Following the examples in the page https://www.mediawiki.org/wiki/Wikidata_Query_Service/User_Manual/MWAPI#Examples, I try to get the list of articles created by a user inside SPARQL. I've tried the following request :
SELECT ?title WHERE {
SERVICE wikibase:mwapi {
bd:serviceParam wikibase:endpoint "fr‧wikipedia.org";
wikibase:api "Generator";
mwapi:generator "usercontribs";
mwapi:user "PAC2";
mwapi:show "new";.
?title wikibase:apiOutput mwapi:title.
}
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
But this doesn't work. Is it possible to do this or not?
As you can see at API:Query, "usercontribs" is only available in API action=query calls as a list parameter, and not as a generator. List queries are not a directly supported service for MWAPI. A possible hack is to make a query API call with both a generator and a list section. The drawback is that only one result can be fetched (to the same variables) per API call, because the output configuration will be for the generator part of the call.
There is a recent example of an MWAPI call with a combined generator and list query for usercontribs at https://www.wikidata.org/wiki/Wikidata:Request_a_query/Archive/2021/02#Wikidata_items_I_created
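For comparison, this is roughly what the plain list request looks like outside WDQS. It is a sketch that only builds the URL and sends nothing. usercontribs_url is a made-up helper name, and I'm assuming the uc-prefixed parameters (ucuser, ucshow, uclimit) as shown for list=usercontribs in the API sandbox:

```python
from urllib.parse import urlencode


def usercontribs_url(endpoint, user, limit=10):
    """Build an action=query URL listing page creations by `user`.

    Hypothetical helper: it only assembles the URL, no request is made.
    """
    params = {
        "action": "query",
        "format": "json",
        "list": "usercontribs",  # a list module, not a generator
        "ucuser": user,
        "ucshow": "new",         # only edits that created a page
        "uclimit": str(limit),
    }
    return f"https://{endpoint}/w/api.php?{urlencode(params)}"


url = usercontribs_url("fr.wikipedia.org", "PAC2")
print(url)
```

Note that the parameters are ucuser and ucshow, not user and show as in the query above. Pasting the URL into a browser or the API sandbox is a quick way to confirm the call works before trying to wrap it in MWAPI.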
How is the MWAPI technically implemented? I.e. is it some Blazegraph extension, or is there some external code on GitHub etc.?
It's a Blazegraph extension that you can find here: https://github.com/wikimedia/wikidata-query-rdf/tree/master/blazegraph/src/main/java/org/wikidata/query/rdf/blazegraph/mwapi
Hi,
I tried to query Wikidata with EntitySearch and got no results. A few weeks ago everything was working.
Also the first example in this article does not return any result.
Has anything changed or is this a temporary issue?
Same here, entity search returns empty results, even when one of the examples from the article is used.
The WDQS team have identified the issue and are working on it - see the task here: https://phabricator.wikimedia.org/T263952.
A better place to contact the development team is on Wikidata here - they keep track of this page: https://www.wikidata.org/wiki/Wikidata:Contact_the_development_team/Query_Service_and_search#Mwapi_service_not_working_for_the_last_couple_of_days_(September_30)
Hi, now that https://wcqs-beta.wmflabs.org is up and running, I was experimenting with how to combine SDC SPARQL queries with information stored in the SQL database, like category membership, presence of specific templates, etc. I could not find any way to do this except for the wikibase:mwapi service. I tried
SELECT ?file ?wd ?fileStr {
SERVICE wikibase:mwapi {
bd:serviceParam wikibase:api "Generator" .
bd:serviceParam wikibase:endpoint "commons.wikimedia.org" .
bd:serviceParam mwapi:gcmtitle "Category:Artworks with mismatching structured data P6243 property" .
bd:serviceParam mwapi:generator "categorymembers" .
bd:serviceParam mwapi:gcmtype "page" .
bd:serviceParam mwapi:gcmlimit "max" .
bd:serviceParam mwapi:gcmsort "timestamp" .
?pageid wikibase:apiOutputItem mwapi:pageid.
?ns wikibase:apiOutput "@ns".
}
#?file schema:contentUrl ?url .
FILTER (?ns = "6") # files only
BIND (replace(str(?pageid),'http://www.wikidata.org/entity/','https://commons.wikimedia.org/entity/M') as ?fileStr)
BIND (str(?file) as ?fileStr)
?file wdt:P6243 ?wd .
}
but so far I have not managed to get it to work. I was thinking that since
SELECT ?file ?wd ?fileStr {
BIND (str(?file) as ?fileStr)
?file wdt:P6243 ?wd .
} limit 10
and
SELECT ?fileStr {
SERVICE wikibase:mwapi {
bd:serviceParam wikibase:api "Generator" .
bd:serviceParam wikibase:endpoint "commons.wikimedia.org" .
bd:serviceParam mwapi:gcmtitle "Category:Artworks with mismatching structured data P6243 property" .
bd:serviceParam mwapi:generator "categorymembers" .
bd:serviceParam mwapi:gcmtype "page" .
bd:serviceParam mwapi:gcmlimit "max" .
bd:serviceParam mwapi:gcmsort "timestamp" .
?pageid wikibase:apiOutputItem mwapi:pageid.
?ns wikibase:apiOutput "@ns".
}
#?file schema:contentUrl ?url .
FILTER (?ns = "6") # files only
BIND (replace(str(?pageid),'http://www.wikidata.org/entity/','https://commons.wikimedia.org/entity/M') as ?fileStr)
} limit 10
both create ?fileStr values like "https://commons.wikimedia.org/entity/M9094174", I could then combine them in order to query SDC statements within a category. Any idea how to get this to work?
I think that just converting the ?fileStr to a URI should make it a proper M-item for SDC. However, my example query below is pretty slow, so I think it may need to be split into two (like here).
SELECT ?file ?p6243 {
SERVICE wikibase:mwapi {
bd:serviceParam wikibase:api "Generator" .
bd:serviceParam wikibase:endpoint "commons.wikimedia.org" .
bd:serviceParam mwapi:gcmtitle "Category:Artworks with mismatching structured data P6243 property" .
bd:serviceParam mwapi:generator "categorymembers" .
bd:serviceParam mwapi:gcmtype "page" .
bd:serviceParam mwapi:gcmlimit "max" .
bd:serviceParam mwapi:gcmsort "timestamp" .
?pageid wikibase:apiOutputItem mwapi:pageid.
?ns wikibase:apiOutput "@ns".
}
#?file schema:contentUrl ?url .
FILTER (?ns = "6") # files only
BIND (URI(replace(str(?pageid),'http://www.wikidata.org/entity/','https://commons.wikimedia.org/entity/M')) as ?file)
?file wdt:P6243 ?p6243
} limit 10
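To make the string manipulation in that BIND easier to follow, here is the same transformation as a small Python sketch. It assumes (judging from the replace pattern in the queries above) that ?pageid comes back as the page id appended to the Wikidata entity prefix. The function name is made up:

```python
WIKIDATA_ENTITY = "http://www.wikidata.org/entity/"
COMMONS_ENTITY = "https://commons.wikimedia.org/entity/"


def pageid_uri_to_mediainfo(uri):
    """Turn an entity-style URI wrapping a page id into a Commons M-id URI.

    Assumption: `uri` is the Wikidata entity prefix followed by a page id,
    which is what the replace() pattern in the queries above expects.
    """
    if not uri.startswith(WIKIDATA_ENTITY):
        raise ValueError("unexpected URI prefix: " + uri)
    page_id = uri[len(WIKIDATA_ENTITY):]
    return COMMONS_ENTITY + "M" + page_id


print(pageid_uri_to_mediainfo("http://www.wikidata.org/entity/9094174"))
```

In the SPARQL version the extra URI(...) call is what turns the resulting string into an IRI, so that it can be matched against ?file in the SDC triples.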
With help from other forums I managed to get the query to work. See c:Commons:SPARQL_query_service/queries/examples#Wikidata_items_of_files_in_Category:Artworks_with_structured_data_with_redirected_P6243_property .
Hi, I tried to fetch a revision like this, but I could not figure out how to access the actual content, which should be under the key "*". Do you know how I should do that?
SOLVED: Example is now fixed based on answer below
SELECT * WHERE {
BIND(wd:Q42 AS ?item)
?item wdt:P18 ?image.
BIND(STRAFTER(wikibase:decodeUri(STR(?image)), "http://commons.wikimedia.org/wiki/Special:FilePath/") AS ?fileTitle)
SERVICE wikibase:mwapi {
bd:serviceParam wikibase:endpoint "commons.wikimedia.org";
wikibase:api "Generator";
wikibase:limit "once";
mwapi:generator "allpages";
mwapi:gapfrom ?fileTitle;
mwapi:gapnamespace 6; # NS_FILE
mwapi:gaplimit 1;
mwapi:prop "revisions";
mwapi:rvprop "content".
?contentmodel wikibase:apiOutput 'revisions/rev/@contentmodel'.
?contentformat wikibase:apiOutput 'revisions/rev/@contentformat'.
?content wikibase:apiOutput 'revisions/rev/text()' .
}
}
There is no key "*". MWAPI requests its output from the API in XML format and uses the XPath query language to find the wanted elements in the XML output. The XML has the content as the text of a "rev" element whose parent element is "revisions", so you have to add the triple
?content wikibase:apiOutput 'revisions/rev/text()' .
to the "SERVICE wikibase:mwapi" call in your SPARQL query.
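A small side-by-side sketch of why the "*" key is not there. Both snippets below are shortened, made-up responses for the same revision, assuming the usual legacy JSON shape where the content sits under "*". In the XML shape the same content is the text of the rev element:

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical, shortened responses for one revision (not real API output).
JSON_REV = (
    '{"contentformat": "text/x-wiki", "contentmodel": "wikitext",'
    ' "*": "== Summary =="}'
)
XML_PAGE = (
    '<page><revisions>'
    '<rev contentformat="text/x-wiki" contentmodel="wikitext"'
    ' xml:space="preserve">== Summary ==</rev>'
    '</revisions></page>'
)

# JSON: the content is stored under the "*" key.
json_content = json.loads(JSON_REV)["*"]

# XML: no "*" anywhere; the content is the text node of <rev>, and the
# other fields are attributes on the same element.
rev = ET.fromstring(XML_PAGE).find("revisions/rev")
xml_content = rev.text                   # roughly revisions/rev/text()
content_model = rev.get("contentmodel")  # roughly revisions/rev/@contentmodel

print(json_content == xml_content, content_model)
```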
It worked! Thank you very much.
Is there an example of how to get the WD items of all members of a WP category recursively? Other tools maybe?
You can do it with PetScan if you just need a list of Wikidata IDs:
1.) Select the target categories and wiki in the "Categories" tab
2.) Set the "use wiki" value to "Wikidata" in the "Other sources" tab so it will fetch the Wikidata IDs
3.) Select the preferred format in the "Output" tab
Example query - https://petscan.wmflabs.org/?psid=17439495