MediaWiki Developer Meet-Up 2009/Notes/WikiWord
- WikiWord extracts a thesaurus from Wikipedia.
- Homepage: http://brightbyte.de/page/WikiWord
- Thesis excerpt: http://brightbyte.de/page/WikiWord/Excerpt
- Navigator: http://toolserver.org/~daniel/wikiword/wikiword.php
- Thesaurus supplies relations:
  - term <-> concept (meaning relation)
  - concept <-> concept (related, similar, broader, narrower)
- concepts = wiki articles
- terms = title, redirect, anchor text, sort key, etc.
- multilingual
  - concepts from multiple Wikipedias combined
  - terms in multiple languages referring to one concept
- useful for indexing, disambiguation
- plan: multilingual image search for Commons (German blog post)
- ideas for improvement:
  - get magic names and patterns from pywikipediabot config
  - use incremental updates as much as possible
  - look at co-occurrence in paragraphs, look at co-co-occurrence
  - for image search: index by image caption (used images)
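
To make the notes above concrete, here is a minimal Python sketch of the two relation types (term <-> concept and concept <-> concept) and of paragraph-level co-occurrence counting. It is not WikiWord's actual implementation: the `Thesaurus` class, its method names, the concept ids, and the choice to lowercase terms are all assumptions made for illustration only.

```python
from collections import Counter, defaultdict
from itertools import combinations


class Thesaurus:
    """Toy in-memory thesaurus (hypothetical, not the WikiWord schema)."""

    def __init__(self):
        # (language, term) -> concept ids: the "meaning" relation
        self.meanings = defaultdict(set)
        # (concept id, relation name) -> concept ids, e.g. "broader"
        self.relations = defaultdict(set)

    def add_term(self, lang, term, concept):
        # Assumption: terms are matched case-insensitively.
        self.meanings[(lang, term.lower())].add(concept)

    def relate(self, source, relation, target):
        self.relations[(source, relation)].add(target)

    def concepts_for(self, lang, term):
        # All concepts a term may mean in one language (disambiguation).
        return self.meanings.get((lang, term.lower()), set())

    def related(self, concept, relation):
        return self.relations.get((concept, relation), set())


def paragraph_cooccurrence(paragraphs):
    """Count how often two concepts appear in the same paragraph.

    Each paragraph is an iterable of concept ids; a pair is counted
    once per paragraph, with the ids in sorted order.
    """
    counts = Counter()
    for concepts in paragraphs:
        for a, b in combinations(sorted(set(concepts)), 2):
            counts[(a, b)] += 1
    return counts


# Terms from two languages pointing at the same (made-up) concept id.
thesaurus = Thesaurus()
thesaurus.add_term("en", "Jaguar", "Jaguar_(animal)")
thesaurus.add_term("de", "Jaguar", "Jaguar_(animal)")
thesaurus.add_term("en", "Jaguar", "Jaguar_Cars")
thesaurus.relate("Jaguar_(animal)", "broader", "Felidae")

pairs = paragraph_cooccurrence([
    ["Jaguar_(animal)", "Felidae"],
    ["Jaguar_(animal)", "Felidae", "Jaguar_Cars"],
])
```

Here `thesaurus.concepts_for("en", "jaguar")` returns both concepts, which is the ambiguity that indexing and disambiguation would have to resolve. The `pairs` counter is one possible raw signal for deriving "related" links between concepts.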