Extension:WikiSearch/api
#WikiSearchFrontend on the page that contains #WikiSearchConfig, which will automatically create a search engine.
Performing a search requires:
- A search configuration page on your wiki containing a search configuration created by #WikiSearchConfig;
- An API request performing the actual search, pointing to the page ID of your page containing the configuration.
Configuration page
See Extension:WikiSearch/usage for this documentation.
Performing a search
Performs a search and returns the list of search results. If the API is in debug mode, this endpoint also returns the raw ElasticSearch query that was used to perform the search.
Parameters
Parameter | Type | Description |
---|---|---|
pageid | integer | The MediaWiki page ID of the page from which the search configuration should be retrieved. Needs to be a valid page ID of a page containing a configuration. |
term | string | The search term to use for the main free-text search. This corresponds to the main search field on a search page. Defaults to the empty string. When no term is given, all results are returned. |
from | integer | The cursor to use for pagination. from specifies the offset of the results to return. Defaults to 0. |
limit | integer | The limit on the number of results to return (inclusive). Defaults to 10. |
filter | list | The filters to apply to the search. Defaults to the empty list. See below for additional information about the syntax. |
aggregations | list | The aggregations to generate from the search. Defaults to the empty list. See below for additional information and how to specify the aggregations. |
sorting | list | The sortings to apply to the search. Defaults to the empty list. See below for additional information and how to specify the sortings. |
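The parameters above can be assembled with a small helper. The following is a minimal sketch, not part of WikiSearch: the function name buildSearchParams and its page and perPage options are hypothetical, and it assumes that the list parameters (filter, aggregations, sorting) are sent as JSON-encoded strings, as in the cURL example on this page.

```javascript
// Hypothetical helper: builds the parameter object for a WikiSearch query.
function buildSearchParams(options) {
    var page = options.page || 0;        // zero-based page number (assumption)
    var perPage = options.perPage || 10; // same as the documented default limit

    return {
        action: 'query',
        format: 'json',
        meta: 'WikiSearch',
        pageid: options.pageid,
        term: options.term || '',
        // "from" is an offset, so page N starts at N * perPage.
        from: page * perPage,
        limit: perPage,
        // Assumption: list parameters are passed as JSON strings.
        filter: JSON.stringify(options.filter || []),
        aggregations: JSON.stringify(options.aggregations || []),
        sorting: JSON.stringify(options.sorting || [])
    };
}

var params = buildSearchParams({
    pageid: 698,
    term: 'manual',
    page: 2,
    filter: [{ key: 'Class', value: 'Manual' }]
});
console.log(params.from); // 20
```

The resulting object can be passed to whatever HTTP client is used to call api.php.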
Example request
JavaScript
// The list parameters are serialised to JSON strings before sending,
// matching the format used in the cURL example.
var params = {
    action: 'query',
    format: 'json',
    meta: 'WikiSearch',
    filter: JSON.stringify([{"value":"5","key":"Average rating","range":{"gte":5,"lte":6}}]),
    from: '0',
    limit: '10',
    pageid: '698',
    aggregations: JSON.stringify([{
        "type": "range",
        "ranges": [
            { "from": 1, "to": 6, "key": "1" },
            { "from": 2, "to": 6, "key": "2" },
            { "from": 3, "to": 6, "key": "3" },
            { "from": 4, "to": 6, "key": "4" },
            { "from": 5, "to": 6, "key": "5" }
        ],
        "property": "Average rating"
    }])
};
var api = new mw.Api();
api.post(params).done(function(data) {
    console.log(data);
});
cURL
curl https://wiki.example.org/api.php \
  -d action=query \
  -d format=json \
  -d meta=WikiSearch \
  --data-urlencode 'filter=[{"value":"5","key":"Average rating","range":{"gte":5,"lte":6}}]' \
  -d from=0 \
  -d limit=10 \
  -d pageid=698 \
  --data-urlencode 'aggregations=[{"type":"range","ranges":[{"from":1,"to":6,"key":"1"},{"from":2,"to":6,"key":"2"},{"from":3,"to":6,"key":"3"},{"from":4,"to":6,"key":"4"},{"from":5,"to":6,"key":"5"}],"property":"Average rating"}]'
Example response
{
    "batchcomplete": "",
    "result": {
        "hits": "[<TRUNCATED, SEE BELOW FOR PARSING>]",
        "total": 1,
        "aggs": {
            "Average rating": {
                "meta": [],
                "doc_count": 1,
                "Average rating": {
                    "buckets": {
                        "1": { "from": 1, "to": 6, "doc_count": 1 },
                        "2": { "from": 2, "to": 6, "doc_count": 1 },
                        "3": { "from": 3, "to": 6, "doc_count": 1 },
                        "4": { "from": 4, "to": 6, "doc_count": 1 },
                        "5": { "from": 5, "to": 6, "doc_count": 1 }
                    }
                }
            }
        }
    }
}
Parsing the response
This section assumes you have successfully made a request to the API using PHP and have stored the raw API result in the variable $response.
The $response variable is a JSON-encoded string and needs to be decoded before it can be used:
$response = json_decode($response, true);
After decoding $response, the response usually contains two keys (three if debug mode is enabled):
Field | Type | Description |
---|---|---|
batchcomplete | string | Added by MediaWiki and not relevant for API users. |
result | object | Contains the result object of the performed search. |
query | object | The raw ElasticSearch query used to perform this search. This field is only available when debug mode is enabled. |
Generally, we are only interested in the API result object, so we can create a new variable only containing that field:
$result = $response["result"];
The $result variable will look something like this:
{
    "hits": "[<TRUNCATED, SEE BELOW FOR PARSING>]",
    "total": 1,
    "aggs": {
        "Average rating": {
            "meta": [],
            "doc_count": 1,
            "Average rating": {
                "buckets": {
                    "1": { "from": 1, "to": 6, "doc_count": 1 },
                    "2": { "from": 2, "to": 6, "doc_count": 1 },
                    "3": { "from": 3, "to": 6, "doc_count": 1 },
                    "4": { "from": 4, "to": 6, "doc_count": 1 },
                    "5": { "from": 5, "to": 6, "doc_count": 1 }
                }
            }
        }
    }
}
The hits field
The hits field contains a JSON-encoded string of the ElasticSearch search results. This field needs to be decoded using json_decode before it can be used. The field directly corresponds to the hits.hits field from the ElasticSearch response. See the ElasticSearch documentation for very detailed documentation about what this field looks like.
To get the associated page name of any search result, the subject.namespacename and subject.title hit-fields in the hits field may be concatenated using a colon, like so:
$hits = json_decode($result["hits"], true);
foreach ($hits as $hit) {
    $namespace_name = $hit["subject"]["namespacename"];
    $page_title = $hit["subject"]["title"];
    $page_name = sprintf("%s:%s", $namespace_name, $page_title);
    echo $page_name;
}
The subject.namespacename hit-field contains the name of the namespace in which the search result lives, and the subject.title hit-field contains the name of the page that matched the search (without a namespace prefix). To get the full URL for this page, you can prepend http://<wikiurl>/index.php/ to the page name.
The hits field also contains the generated highlighted snippets, if they are available. These can be accessed through the highlight hit-field, like so:
$hits = json_decode($result["hits"], true);
foreach ($hits as $hit) {
    // The highlight hit-field is only present when snippets are available
    $highlights = $hit["highlight"] ?? [];
    foreach ($highlights as $highlight) {
        // $highlight is an array of highlighted snippets
        $highlight_string = implode("", $highlight);
        echo $highlight_string;
    }
}
See also the ElasticSearch Highlighting documentation.
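For clients that consume the API from JavaScript rather than PHP, the same two steps (decoding hits, then reading the page name and highlights of each hit) might look like the sketch below. The function name parseHits is hypothetical, and the sample data, including the highlighted field name body, is made up for illustration.

```javascript
// Decodes the JSON-encoded "hits" string and returns one entry per hit.
function parseHits(result) {
    var hits = JSON.parse(result.hits);
    return hits.map(function (hit) {
        // Namespace name and title joined with a colon give the page name.
        var pageName = hit.subject.namespacename + ':' + hit.subject.title;
        // "highlight" may be absent; each of its values is a list of snippets.
        var highlight = hit.highlight || {};
        var snippets = Object.keys(highlight).map(function (field) {
            return highlight[field].join('');
        });
        return { pageName: pageName, snippets: snippets };
    });
}

// Made-up sample result, shaped like the fields described above.
var sampleResult = {
    hits: JSON.stringify([
        {
            subject: { namespacename: 'Help', title: 'Search' },
            highlight: { body: ['a <b>search</b>', ' engine'] }
        },
        { subject: { namespacename: 'Manual', title: 'Setup' } }
    ]),
    total: 2
};

var parsedHits = parseHits(sampleResult);
console.log(parsedHits[0].pageName); // Help:Search
```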
The aggs field
The aggs field directly corresponds to the aggregations field from the ElasticSearch response. See the ElasticSearch documentation for further details.
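As an illustration, the bucket counts of one aggregation can be flattened into a simple lookup table. This is a sketch only: the helper name bucketCounts is hypothetical, and the nesting it relies on (aggregation name, then the same name again, then buckets) is taken from the example response on this page and may differ for other aggregation types.

```javascript
// Returns { bucketKey: doc_count } for the aggregation with the given name.
// Assumes the nesting shown in the example response.
function bucketCounts(aggs, name) {
    var buckets = aggs[name][name].buckets;
    var counts = {};
    Object.keys(buckets).forEach(function (key) {
        counts[key] = buckets[key].doc_count;
    });
    return counts;
}

// Shortened copy of the aggs field from the example response.
var exampleAggs = {
    'Average rating': {
        meta: [],
        doc_count: 1,
        'Average rating': {
            buckets: {
                '1': { from: 1, to: 6, doc_count: 1 },
                '5': { from: 5, to: 6, doc_count: 1 }
            }
        }
    }
};

var ratingCounts = bucketCounts(exampleAggs, 'Average rating');
console.log(ratingCounts);
```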
The total field
The total field contains the total number of results found by ElasticSearch. This field is not influenced by the limit and always displays the total number of results available, regardless of how many were actually returned.
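Because total is independent of limit, it can be combined with the from and limit request parameters to drive pagination. A minimal sketch, with hypothetical helper names:

```javascript
// Number of pages needed to show `total` results, `limit` results per page.
function pageCount(total, limit) {
    return Math.ceil(total / limit);
}

// True if results remain after the batch that started at offset `from`
// and contained `returned` hits.
function hasMoreResults(total, from, returned) {
    return from + returned < total;
}

console.log(pageCount(25, 10));          // 3
console.log(hasMoreResults(25, 10, 10)); // true
```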
Filters syntax
The filter parameter takes a list of objects. Each object has one of the following forms:
PropertyRangeFilter
This filter only returns pages that have the specified property with a value in the specified range.
{
    "key": "Age",
    "range": {
        "gte": 0,
        "lt": 100
    }
}
The above filter only includes pages where property Age has a value that is greater than or equal to 0, but strictly less than 100.
The range parameter takes an object that allows the following properties:
- gte: Greater than or equal to
- gt: Strictly greater than
- lte: Less than or equal to
- lt: Strictly less than
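To make the meaning of the four range properties concrete, the sketch below mirrors them as a local predicate. It is for illustration only: the function name matchesRange is hypothetical, and the actual filtering is performed by ElasticSearch, not by client code.

```javascript
// Returns true if `value` satisfies every bound present in `range`.
function matchesRange(value, range) {
    if (range.gte !== undefined && !(value >= range.gte)) return false;
    if (range.gt !== undefined && !(value > range.gt)) return false;
    if (range.lte !== undefined && !(value <= range.lte)) return false;
    if (range.lt !== undefined && !(value < range.lt)) return false;
    return true;
}

var ageRange = { gte: 0, lt: 100 };
console.log(matchesRange(0, ageRange));   // true: gte is inclusive
console.log(matchesRange(100, ageRange)); // false: lt is exclusive
```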
PropertyValueFilter
This filter only returns pages that have the specified property with the specified value.
{
    "key": "Class",
    "value": "Manual"
}
The above filter only includes pages where the property Class has the value Manual. The value may be any of the following data types:
- string
- boolean
- integer
- float
- double
PropertyValuesFilter
This filter only returns pages that have the specified property with any of the specified values.
{
    "key": "Class",
    "value": ["foo", "bar"]
}
The above filter only includes pages where the property Class has the value foo or bar.
HasPropertyFilter
This filter only returns pages that have the specified property with any value.
{
    "key": "Class",
    "value": "+"
}
The above filter only includes pages that have the property Class. It does not take the value of the property into account.
PropertyTextFilter
This filter only returns pages that have the specified property with a value that matches the given search query string.
{
    "key": "Class",
    "value": "Foo | (Bar + Quz)",
    "type": "query"
}
The above filter executes the given query and only includes pages that matched the executed query. The query syntax is identical to the simple query syntax used by ElasticSearch.
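The filter objects described in this section can be produced by small factory functions and then serialised into a single value for the filter parameter. The helper names below are hypothetical, and the sketch assumes the parameter is sent as a JSON string, as in the cURL example on this page.

```javascript
// Hypothetical factories, one per filter form described above.
function rangeFilter(key, range) {
    return { key: key, range: range };
}
function valueFilter(key, value) {
    // `value` may be a single value or a list of values.
    return { key: key, value: value };
}
function hasPropertyFilter(key) {
    return { key: key, value: '+' };
}
function textFilter(key, query) {
    return { key: key, value: query, type: 'query' };
}

var filters = [
    rangeFilter('Age', { gte: 0, lt: 100 }),
    valueFilter('Class', ['foo', 'bar']),
    hasPropertyFilter('Genre')
];

// Assumption: the list is passed to the API as one JSON string.
var filterParam = JSON.stringify(filters);
console.log(filterParam);
```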
Aggregations syntax
The aggregations parameter takes a list of objects. Each object has one of the following forms:
PropertyRangeAggregation
{
    "type": "range",
    "ranges": [
        { "to": 50 },
        { "from": 50, "to": 100 },
        { "from": 100 }
    ],
    "property": "Price",
    "name": "Prices"
}
The from parameter is inclusive, and the to parameter is exclusive. This means that for an aggregation from (and including) 1 up to and including 5, the from and to parameters should be 1 and 6 (!) respectively.
PropertyAggregation
{
    "type": "property",
    "property": "Genre",
    "name": "Genres"
}
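Returning to the range aggregation: because its to bound is exclusive, ranges over whole numbers are easy to get wrong by one. The sketch below generates the "N and up" ranges used in the example request from inclusive bounds. The helper name ratingRanges is hypothetical, and the approach assumes the property holds integer values.

```javascript
// Builds one range per starting value, each running up to and including
// `max`. The exclusive "to" bound is therefore max + 1.
function ratingRanges(min, max) {
    var ranges = [];
    for (var i = min; i <= max; i++) {
        ranges.push({ from: i, to: max + 1, key: String(i) });
    }
    return ranges;
}

var ratingAggregation = {
    type: 'range',
    ranges: ratingRanges(1, 5),
    property: 'Average rating'
};
console.log(ratingAggregation.ranges[0]); // { from: 1, to: 6, key: '1' }
```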
Sortings syntax
The sorting parameter takes a list of objects. These objects have the following form:
PropertySort
{
    "type": "property",
    "property": "Genre",
    "order": "asc"
}
The above sorting orders the results by the value of the property Genre in ascending (asc) order. It is also possible to sort in descending (desc) order.
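A sort object can be built and validated before it is sent. This is a minimal sketch: the helper name propertySort is hypothetical, and it assumes the list is passed to the API as a JSON string, as in the cURL example on this page.

```javascript
// Builds a PropertySort object; only "asc" and "desc" are accepted.
function propertySort(property, order) {
    order = order || 'asc';
    if (order !== 'asc' && order !== 'desc') {
        throw new RangeError('order must be "asc" or "desc"');
    }
    return { type: 'property', property: property, order: order };
}

// Assumption: the list is passed to the API as one JSON string.
var sortingParam = JSON.stringify([propertySort('Genre', 'desc')]);
console.log(sortingParam);
```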
Highlight API
The highlight API has the following properties:
- query: The query to generate highlighted terms from
- properties: The properties over which the highlights need to be calculated
- page_id: The page ID of the page on which the highlights need to be calculated
- limit: The number of highlighted terms to calculate; this does not always correspond directly with the number of terms returned, since duplicates are removed after the query to ElasticSearch
- size: The (approximate) size of snippets to generate; leave blank to highlight individual words