Wikidata - Wikisource Integration Modules
Overview and Background
The following documentation helps you deploy several MediaWiki modules and configure a bot so that your Wikisource can retrieve a book's metadata from Wikidata and display it on its index pages. For example, this index page on Punjabi Wikisource displays the title, author, translator, publisher, address, and year from its Wikidata item. The pages involved and their functions are:
- Modules
- MediaWiki:Proofreadpage index data config: Data configuration presentation for index pages.
- MediaWiki:Proofreadpage index template: Interface text for index pages.
- Module:Index data: To retrieve data from Wikidata.
- Module:Index template: To display the retrieved data on the index page form.
- A significant part of the modules was written by Tpt from French Wikisource, with further improvements by Bodhisattwa from Bengali Wikisource and Tshrinivasan as part of the WikiCite Project Grant.
- Bot
- User:WD-WS Integration Bot: While the above modules retrieve data from Wikidata and display it on the index pages, they work only after the respective Wikidata QIDs are added to the index page form, which has to be done manually. The bot helps to automate the process of adding Wikidata QIDs to the index pages, to an extent. With the help of the index page, the bot traces the main pages of books and then their linked Wikidata items.
- The bot has been programmed by Tshrinivasan as part of the WikiCite Project Grant.
Implementation
Note: Please post a message on the talk page if you need help deploying the modules on your Wikisource.
Modules
Step 1: Proofreadpage index data config
TODO:
- Please copy only the highlighted lines of code (the "wikidata_item" object) from the code below and add them to your Wikisource's "MediaWiki:Proofreadpage index data config"; the URL is langcode.wikisource.org/wiki/MediaWiki:Proofreadpage_index_data_config.
For example, https://bn.wikisource.org/wiki/মিডিয়াউইকি:Proofreadpage_index_data_config.
- Please add the code after the "Type" object ends, as shown below.
- Please translate or transliterate the label "Wikidata Item" (the "label" field of the "wikidata_item" object below) into your language.
{
"Type": {
"type": "string",
"size": 1,
"default": "book",
"label": "Type",
"header": true,
"values": {
"book": "Book",
"journal": "Journal",
"collection": "Collection",
"phdthesis": "Phdthesis",
"dictionary": "Dictionary",
"film": "Film",
"audio": "Audio"
},
"help": "Select the type of the book",
"data": "type"
},
"wikidata_item": {
"type": "wikibase-itemid",
"size": 1,
"default": "",
"label": "Wikidata Item",
"header": true,
"data": "wikibase-itemid"
},
Step 2: Proofreadpage index template
TODO: Please copy only the highlighted line of code from the code below and add it to your Wikisource's "MediaWiki:Proofreadpage index template"; the URL is langcode.wikisource.org/wiki/MediaWiki:Proofreadpage_index_template.
For example, https://bn.wikisource.org/wiki/মিডিয়াউইকি:Proofreadpage_index_template.
{{#invoke:Index template|indexTemplate
|type={{{Type|}}}
|wikidata_item={{{wikidata_item|}}}
|title={{{Title|}}}
|subtitle={{{Subtitle|}}}
|volume={{{Volume|}}}
|edition={{{Edition|}}}
|author={{{Author|}}}
|translator={{{Translator|}}}
|editor={{{Editor|}}}
|illustrator={{{Illustrator|}}}
|publisher={{{Publisher|}}}
|address={{{Address|}}}
|printer={{{Printer|}}}
|year={{{Year|}}}
|source={{{Source|}}}
|image={{{Image|}}}
|progress={{{Progress|}}}
|pages={{{Pages|}}}
|volumes={{{Volumes|}}}
|remarks={{{Remarks|}}}
|notes={{{Notes|}}}
}}
Step 3: Index data
TODO:
- Please copy the entire code below.
- Please create a new page "Module:Index data" on your Wikisource; the URL is langcode.wikisource.org/wiki/Module:Index_data.
For example, https://pa.wikisource.org/wiki/Module:Index_data.
- Please pay attention to the highlighted text and to the comment above each such line for instructions regarding translation and customization.
local wikidataTypeToIndexType = {
['Q3331189'] = 'book',
['Q1238720'] = 'journal',
['Q28869365'] = 'journal',
['Q191067'] = 'journal',
['Q23622'] = 'dictionary',
['Q187685'] = 'phdthesis'
}
-- Maps index form fields to Wikidata property IDs (as defined on www.wikidata.org).
local indexToWikidata = {
['subtitle'] = 'P1680',
['volume'] = 'P478',
['edition'] = 'P393',
['author'] = 'P50',
['translator'] = 'P655',
['editor'] = 'P98',
['illustrator'] = 'P110',
['publisher'] = 'P123',
['printer'] = 'P872',
['address'] = 'P291',
['publishedin'] = 'P1433',
['year'] = 'P577',
['parts'] = 'P527',
}
function indexDataWithWikidata(frame)
local args = {}
for k,v in pairs(frame.args) do
if v ~= '' then
args[k] = v
end
end
local item = nil
if args.wikidata_item then
item = mw.wikibase.getEntity(args.wikidata_item)
if item == nil then
--Please translate this warning into your language.
mw.addWarning('The Wikidata item ID [[d:' .. args.wikidata_item .. '|' .. args.wikidata_item .. ']] entered in the "Wikidata Item" field of the index page does not seem valid.')
end
end
if not item then
return {
['args'] = args,
['item'] = nil
}
end
if not args.type then
for _, statement in pairs(item:getBestStatements('P31')) do
if statement.mainsnak.datavalue ~= nil then
-- For item-valued statements the value is a table; its "id" field holds the QID.
local typeId = statement.mainsnak.datavalue.value.id
if wikidataTypeToIndexType[typeId] then
args.type = wikidataTypeToIndexType[typeId]
end
end
end
end
if not args.image then
for _, statement in pairs(item:getBestStatements('P18')) do
if statement.mainsnak.datavalue ~= nil then
args.image = statement.mainsnak.datavalue.value
end
end
end
if not args.title then
local value = item:formatStatements('P1476')['value'] or ''
if value == '' then
value = item:getLabel() or ''
end
if value ~= '' then
local siteLink = item:getSitelink()
if siteLink then
value = '[[' .. siteLink .. '|' .. value .. ']]'
end
--Please translate the text "View and edit data on Wikidata" into your language.
args.title = value .. ' [[File:OOjs UI icon edit-ltr.svg|View and edit data on Wikidata|10px|baseline|class=noviewer|link=d:' .. item.id .. '#P1476]]'
end
end
if not args.year then
for _, statement in pairs(item:getBestStatements('P577')) do
if statement.mainsnak.datavalue ~= nil then
local current_year = statement.mainsnak.datavalue.value.time
args['year'] = mw.ustring.sub(current_year, 2, 5)
end
end
end
for arg, propertyId in pairs(indexToWikidata) do
if not args[arg] then
local value = item:formatStatements(propertyId)["value"]
if value ~= '' then
args[arg] = value
end
end
end
return {
['args'] = args,
['item'] = item
}
end
local p = {}
function p.indexDataWithWikidata(frame)
return indexDataWithWikidata(frame)
end
return p
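The precedence rule that Module:Index data follows (a value typed into the index form wins, and Wikidata only fills the gaps) can be sketched in plain Lua without the mw library. This is a minimal illustration, not the module itself: fillFromWikidata and yearFromTime are hypothetical helper names, and the two input tables stand in for frame.args and for values that the real module reads through mw.wikibase. The year helper assumes a Wikidata time string with a leading sign and a four-digit year, as the module's sub(2, 5) call does.

```lua
-- Hypothetical helper: copy non-empty form fields, then fill missing ones
-- from values that would normally come from Wikidata.
local function fillFromWikidata(formArgs, wikidataValues)
    local args = {}
    for key, value in pairs(formArgs) do
        -- Empty form fields are treated as "not set", as in the module.
        if value ~= '' then
            args[key] = value
        end
    end
    for key, value in pairs(wikidataValues) do
        -- Only fill fields the editor left empty.
        if args[key] == nil and value ~= '' then
            args[key] = value
        end
    end
    return args
end

-- Hypothetical helper: extract the year from a time string such as
-- "+1913-05-01T00:00:00Z" (characters 2 to 5, skipping the sign).
local function yearFromTime(time)
    return string.sub(time, 2, 5)
end

local merged = fillFromWikidata(
    { title = 'My edition', author = '' },
    { author = '[[Author:Example]]', year = yearFromTime('+1913-05-01T00:00:00Z') }
)
print(merged.title, merged.author, merged.year)
```

Here the title comes from the form, while the empty author field and the missing year are taken from the stand-in Wikidata values.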
Step 4: Index template
TODO:
- Please copy the entire code below.
- Please create a new page "Module:Index template" on your Wikisource; the URL is langcode.wikisource.org/wiki/Module:Index_template.
For example, https://pa.wikisource.org/wiki/Module:Index_template.
- Please pay attention to the highlighted text and to the comment above each such line for instructions regarding translation and customization.
- Please create the required categories after the modules are deployed, if they do not already exist.
function withWikidataLink(wikitext, category)
if wikitext == nil then
return nil
end
local new_wikitext = mw.ustring.gsub(wikitext, '%[%[([^|%]]*)%]%]', function(page)
return addWikidataToLink(page, mw.ustring.gsub(page, '%.*/', '') , category)
end)
if new_wikitext ~= wikitext then
return new_wikitext
end
return mw.ustring.gsub(wikitext, '%[%[([^|]*)|([^|%]]*)%]%]', function(page, link)
return addWikidataToLink(page, link, category)
end)
end
function addWikidataToLink(page, label, category)
local title = mw.title.new( page )
if title == nil then
return '[[' .. page .. '|' .. label .. ']]'
end
if title.isRedirect then
title = title.redirectTarget
end
local tag = mw.html.create('span')
local itemId = mw.wikibase.getEntityIdForTitle(title.fullText)
tag:wikitext('[[' .. page .. '|' .. label .. ']]')
if itemId ~= nil then
--Translate "View information on Wikidata"
tag:wikitext(' [[Image:Wikidata.svg|10px|link=d:' .. itemId .. '|View information on Wikidata]]')
if category ~= nil then
tag:wikitext('[[Category:' .. category .. ']]')
end
end
return tostring(tag)
end
function addRow(metadataTable, key, value)
if value then
metadataTable:tag('tr')
:tag('th')
:attr('scope', 'row')
:css('vertical-align', 'top')
:wikitext(key)
:done()
:tag('td'):wikitext(value)
end
end
function splitFileNameInFileAndPage(title)
local slashPosition = string.find(title.text, "/")
if slashPosition == nil then
return title.text,nil
else
return string.sub(title.text, 1, slashPosition - 1), string.sub(title.text, slashPosition + 1)
end
end
function indexTemplate(frame)
local data = (require 'Module:Index_data').indexDataWithWikidata(frame)
local args = data.args
local item = data.item
local page = mw.title.getCurrentTitle()
local html = mw.html.create()
--Translate "Books with a Wikidata ID" and "Books without a Wikidata ID"
if item then
html:wikitext('[[Category:Books with a Wikidata ID]]<indicator name="wikidata">[[File:Wikidata.svg|20px|Wikidata item|link=d:' .. item.id .. ']]</indicator>')
else
html:wikitext('[[Category:Books without a Wikidata ID]]')
end
local left = html:tag('div')
if args.remarks or args.notes then
left:css('width', '53%')
end
left:css('float', 'left')
if args.image then
local imageContainer = left:tag('div')
:css({
float = 'left',
overflow = 'hidden',
border = 'thin grey solid'
})
local imageTitle = nil
if tonumber(args.image) ~= nil then
imageTitle = mw.title.getCurrentTitle():subPageTitle(args.image)
else
imageTitle = mw.title.new(args.image, "Media")
end
if imageTitle == nil then
imageContainer:wikitext(args.image)
else
local imageName, imagePage = splitFileNameInFileAndPage(imageTitle)
if imagePage ~= nil then
imageContainer:wikitext('[[File:' .. imageName .. '|page=' .. imagePage .. '|250px]]')
else
imageContainer:wikitext('[[File:' .. imageName .. '|250px]]')
end
end
end
local metadataContainer = left:tag('div')
if args.image then
metadataContainer:css('margin-left', '150px')
end
local metadataTable = metadataContainer:tag('table')
if args.title then
if item then
addRow(metadataTable, 'Title', withWikidataLink(args.title))
else
addRow(metadataTable, 'Title', '[[' .. args.title .. ']]')
end
else
--Translate "You must enter the title field of the form"
mw.addWarning('You must enter the title field of the form.')
end
addRow(metadataTable, 'Subtitle', withWikidataLink(args.subtitle))
--Translate "Books with volume" and "Books without volume"
if args.volume then
addRow(metadataTable, 'Volume', '[[' .. args.volume .. ']]' )
html:wikitext('[[Category:Books with volume]]')
else
html:wikitext('[[Category:Books without volume]]')
end
--Translate "Books with edition" and "Books without edition"
if args.edition then
addRow(metadataTable, 'Edition', '[[' .. args.edition .. ']]')
html:wikitext('[[Category:Books with edition]]')
else
html:wikitext('[[Category:Books without edition]]')
end
--Translate "Books with author", "Books without author", and "Books by"
if args.author then
if item then
addRow(metadataTable, 'Author', withWikidataLink(args.author))
html:wikitext('[[Category:Books with author]]')
local authors = item:formatPropertyValues( 'P50', { mw.wikibase.entity.claimRanks.RANK_NORMAL } )['value']
for author in string.gmatch(authors, '([^,]+)') do
html:wikitext('[[Category:Books by ' .. author .. ']]')
end
else
addRow(metadataTable, 'Author', '{{Al|' .. args.author .. '}}')
end
else
html:wikitext('[[Category:Books without author]]')
end
--Translate "Books with translator" and "Books without translator"
if args.translator then
if item then
addRow(metadataTable, 'Translator', withWikidataLink(args.translator))
html:wikitext('[[Category:Books with translator]]')
else
addRow(metadataTable, 'Translator', '{{Al|' .. args.translator .. '}}')
end
else
html:wikitext('[[Category:Books without translator]]')
end
--Translate "Books with editor" and "Books without editor"
if args.editor then
if item then
addRow(metadataTable, 'Editor', withWikidataLink(args.editor))
html:wikitext('[[Category:Books with editor]]')
else
addRow(metadataTable, 'Editor', '{{Al|' .. args.editor .. '}}')
end
else
html:wikitext('[[Category:Books without editor]]')
end
--Translate "Books with illustrator" and "Books without illustrator"
if args.illustrator then
addRow(metadataTable, 'Illustrator', withWikidataLink(args.illustrator))
html:wikitext('[[Category:Books with illustrator]]')
else
html:wikitext('[[Category:Books without illustrator]]')
end
--Translate "Books with publisher" and "Books without publisher"
if args.publisher then
addRow(metadataTable, 'Publisher', withWikidataLink(args.publisher))
html:wikitext('[[Category:Books with publisher]]')
else
html:wikitext('[[Category:Books without publisher]]')
end
--Translate "Books with place of publication" and "Books without place of publication"
if args.address then
addRow(metadataTable, 'Address', withWikidataLink(args.address))
html:wikitext('[[Category:Books with Place of Publication]]')
else
if args.publishedin then
addRow(metadataTable, 'Published In', withWikidataLink(args.publishedin))
html:wikitext('[[Category:Books with Place of Publication]]')
else
html:wikitext('[[Category:Books without Place of Publication]]')
end
end
--Translate "Books with year", "Books without year", and "Books published in"
if args.year then
addRow(metadataTable, 'Year', withWikidataLink(args.year))
html:wikitext('[[Category:Books with year]]')
html:wikitext('[[Category:Books published in ' ..args.year..']]')
else
html:wikitext('[[Category:Books without year]]')
end
--Translate "Books with printer" and "Books without printer"
if args.printer then
addRow(metadataTable, 'Printer', withWikidataLink(args.printer))
html:wikitext('[[Category:Books with printer]]')
else
html:wikitext('[[Category:Books without printer]]')
end
if args.source == 'djvu' or args.source == 'pdf' then
addRow(metadataTable, 'Source', '[[:File:' .. mw.title.getCurrentTitle().text .. '|' .. args.source .. ']]')
--Replace "bn" in the Wikisource URL and in the label language of the query below with your Wikisource's language code.
local query = 'SELECT ?item ?itemLabel ?pages ?page WHERE {\n ?item wdt:P996 <http://commons.wikimedia.org/wiki/Special:FilePath/' .. mw.uri.encode(mw.title.getCurrentTitle().text, 'PATH') .. '> .\n OPTIONAL { ?page schema:about ?item ; schema:isPartOf <https://bn.wikisource.org/> . }\n OPTIONAL { ?item wdt:P304 ?pages . }\n SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],bn".\n}}'
--Translate "Wikidata items"
html:wikitext('<indicator name="index-scan-wikidata">[[File:Wikidata Query Service Favicon.svg|20px|Wikidata items|link=https://query.wikidata.org/embed.html#' .. mw.uri.encode(query, 'PATH') .. ']]</indicator>')
else
addRow(metadataTable, 'Source', args.source)
end
--Replace the following proofread status categories with the ones currently being used on your Wikisource.
if args.progress == 'T' then
addRow(metadataTable, 'Progress', '[[Category:Completed Books]] [[:Category:Completed Books | Completed]]')
elseif args.progress == 'V' then
addRow(metadataTable, 'Progress', '[[category:Books to validate]] [[:Category:Books to validate | To validate]]')
elseif args.progress == 'C' then
addRow(metadataTable, 'Progress', '[[category:Books to correct]] [[:category:Books to correct | To correct]]')
elseif args.progress == 'OCR' then
addRow(metadataTable, 'Progress', '[[category:Books without a text layer]] [[:category:Books without a text layer | Add an OCR text layer]]')
elseif args.progress == 'L' then
addRow(metadataTable, 'Progress', '[[category:Books to repair]] <span style="color: #FF0000;"> [[:Category:Books to repair|Defective source file]]</span>')
elseif args.progress == 'X' then
addRow(metadataTable, 'Progress', '[[category:Extracts and compilations]] [[:category:Extracts and compilations | Incomplete source:extract or compilation]]')
else
addRow(metadataTable, 'Progress', '[[Category:Unknown progress books]] [[:category:Unknown progress books | Unknown progress]]')
end
addRow(metadataTable, 'Series', args.volumes)
if args.pages then
left:tag('div'):css('clear', 'both')
left:tag('h3'):wikitext('Pages')
left:tag('div'):attr('id', 'pagelist'):css({
background = '#F0F0F0',
['padding-left'] = '0.5em',
['text-align'] = 'justify'
}):newline():wikitext(args.pages):newline()
else
mw.addWarning('You must enter the pagination of the facsimile (Pages field) ')
end
if args.remarks or args.notes then
local right = html:tag('div'):css({
width = '44%',
['padding-left'] = '1em',
float = 'right'
})
if args.remarks then
right:tag('div'):attr('id', 'remarks'):wikitext(args.remarks)
end
if args.notes then
right:tag('hr'):css({
['margin-top'] = '1em',
['margin-bottom'] = '1em'
})
right:tag('div'):attr('id', 'notes'):wikitext(args.notes)
end
end
--Please translate or replace the following type categories.
if args.type == 'book' then
html:wikitext('[[Category:Index - Books]] ')
elseif args.type == 'journal' then
html:wikitext('[[Category:Index - Periodicals]] ')
elseif args.type == 'collection' then
html:wikitext('[[Category:Index - Collections]] ')
elseif args.type == 'dictionary' then
html:wikitext('[[Category:Index - Dictionaries]] ')
elseif args.type == 'phdthesis' then
html:wikitext('[[Category:Index - Theses]] ')
end
html:wikitext('[[Category:Index]] ')
--The format-specific categories are checked separately so that they remain reachable for non-DjVu sources.
if args.source ~= 'djvu' then
html:wikitext('[[Category:Non djvu book]] ')
end
if args.source == 'pdf' then
html:wikitext('[[Category:PDF book]]')
elseif args.source == 'ogg' then
html:wikitext('[[Category:OGG file]]')
elseif args.source == 'webm' then
html:wikitext('[[Category:webm file]]')
end
if not args.remarks then
html: wikitext ('[[Category:Indexed pages]]')
end
return tostring(html)
end
local p = {}
function p.indexTemplate( frame )
return indexTemplate( frame )
end
return p
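The two-pass link handling in withWikidataLink can be tried outside MediaWiki with the standard string library. The sketch below is an illustration under assumptions: decorateLinks and mark are hypothetical names, string.gsub stands in for mw.ustring.gsub, and the callback only appends a marker where the real module would call addWikidataToLink to add the Wikidata icon. Unlike the module, the first pass here reuses the full page name as the label.

```lua
-- Hypothetical stand-in for withWikidataLink: rewrite wikilinks through a callback.
local function decorateLinks(wikitext, decorate)
    if wikitext == nil then
        return nil
    end
    -- First pass: links without a label, e.g. [[Page]].
    local firstPass = string.gsub(wikitext, '%[%[([^|%]]*)%]%]', function(page)
        return decorate(page, page)
    end)
    if firstPass ~= wikitext then
        return firstPass
    end
    -- Second pass: links with a label, e.g. [[Page|Label]].
    local secondPass = string.gsub(wikitext, '%[%[([^|]*)|([^|%]]*)%]%]', function(page, label)
        return decorate(page, label)
    end)
    return secondPass
end

-- Hypothetical callback: rebuild the link and append a marker.
local function mark(page, label)
    return '[[' .. page .. '|' .. label .. ']]*'
end

print(decorateLinks('[[Author:Example]]', mark))
print(decorateLinks('[[Author:Example|Example]]', mark))
```

Because the second pass runs only when the first pass changed nothing, a value that mixes labelled and unlabelled links has only its unlabelled links decorated, which mirrors the behaviour of the module code above.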
Bot
Please contact the bot operator to get the bot running on your Wikisource. Please keep the category "Category:Books without a Wikidata ID" handy before you contact them.