Wikimedia Discovery/FAQ: Difference between revisions

From mediawiki.org
Line 5:
= Are you building a new search engine? =

Content deleted:
No. We are improving the existing [https://wikitech.wikimedia.org/wiki/Search CirrusSearch] infrastructure with better relevance, [https://www.mediawiki.org/wiki/Wikimedia_Engineering/2015-16_Q2_Goals#Search multi language], [https://www.mediawiki.org/wiki/Wikimedia_Engineering/2015-16_Q3_Goals#Search multi projects] search and incorporating new [https://maps.wikimedia.org data sources] for our projects. We want a relevant and consistent experience for users across searches for both [https://www.wikipedia.org/ wikipedia.org] and our project sites.

Content added:
We are not building Google. We are improving the existing [https://wikitech.wikimedia.org/wiki/Search CirrusSearch] infrastructure with better relevance, [https://www.mediawiki.org/wiki/Wikimedia_Engineering/2015-16_Q2_Goals#Search multi language], [https://www.mediawiki.org/wiki/Wikimedia_Engineering/2015-16_Q3_Goals#Search multi projects] search and incorporating new [https://maps.wikimedia.org data sources] for our projects. We want a relevant and consistent experience for users across searches for both [https://www.wikipedia.org/ wikipedia.org] and our project sites.

= But if you're adding new data sources, isn't that a search engine? =

Revision as of 20:37, 5 November 2015

What is the Knowledge Engine?

"Knowledge Engine" was an early term for a number of initiatives related to search and discovery of content. It was never a product; it was shorthand for what the Discovery team was focusing on. We've since stopped using the term because it caused confusion.

Are you building a new search engine?

We are not building Google. We are improving the existing CirrusSearch infrastructure with better relevance and with multi-language and multi-project search, and we are incorporating new data sources for our projects. We want users to have a relevant and consistent experience when they search, both on wikipedia.org and on our project sites.

But if you're adding new data sources, isn't that a search engine?

Our first new data source is OpenStreetMap data for Maps, which our Wikivoyage community is already starting to experiment with. There are other data sets that we could potentially surface (census, national gallery, etc.), but that will be up to our communities to decide. Some of these could certainly show up in search results, and we have Phabricator tasks around improving GeoData content (T112026). The goal is to expand the amount of knowledge available and to extend its context beyond purely textual search. We want to begin by showcasing content from other wiki projects, including content in appropriate languages based on the query input.

Does that mean we are looking to shift search traffic away from third parties?

No. We love all the third-party traffic that we get and hope that it increases over time. Our focus is on providing a search experience that doesn't look like this:

  1. Search on Google, Bing, etc.
  2. Follow a Wikipedia link
  3. Read
  4. Leave and search Google, Bing, etc. again, because you are looking for a specific Wikipedia article but couldn't find it using CirrusSearch

What does your overall strategy look like?

  • Year 0 - Look inward and improve the search experience across our projects
  • Year 1 - Look outward and see if we can incorporate new data streams and public curation models for relevance

What does year 0 include?

We call year 0 "Discovery" because we are focused on learning about and understanding user pathways and users' appreciation for other knowledge sources.

What does year 1 include?

Potential ideas that we need your feedback on:

  • Identify pathways for the community to improve relevance via Wikidata
  • Actively highlight difficult-to-find knowledge and make it possible to surface it in search, reading and editing flows
  • Research open sources of knowledge to continually strengthen the legitimacy of our content through curation by humans and machines

How does this align with strategy?

  • Relevance, accuracy and trustworthiness ratings on indexed entities
  • Extended context along geospatial, temporal, multimedia and relational paths of knowledge
  • Display of interwiki projects (internal) and potentially open data sources
  • Opportunity for a modern, consistent interface across mobile and voice
  • Multilingual experiences and results appropriate to users around the world

Strategic Consultation

How do you know if we are succeeding for our users?

How can I help?