Wikibase/Introduction to modeling data

From mediawiki.org

Overview

This guide is for you if:

  • You have an empty Wikibase before you
  • You have data you need to model using your Wikibase

Model your own data

There's one thing this guide can't do for you: it can't help you model your data.

  1. You and the other people who work with this data are the only ones with the information necessary to decide how best to model this data.
  2. There's no one right answer to how to model your data. Modeling is a series of choices about how to organize your data. Those choices depend on the data itself, which may change while you work with it, and they may look different once you're halfway through the process.
  3. Data models sometimes need to change. This can be a painful process if the new model entails recreating a significant portion of your data, yet some users have found this unavoidable as their data model evolved. Thus, it's advisable to take as much time as you need to create your data model before even logging into your Wikibase instance.

What this guide can do is point at some of the decisions you will need to make and offer you a starting point.

Concepts

In Wikibase, you'll need to think of your data in terms of the concepts Wikibase uses to store data: items, properties, statements, and so forth.

Example of an item page with statement: Jimmy Wales

In the above example you see an item page. It contains a statement: that Jimmy Wales (item) is an instance of (property) a human (item).
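As a rough illustration, the statement in the example can be thought of as three parts: an item, a property, and a value that is itself an item. The sketch below is not how Wikibase stores data internally. The `Statement` class and `describe` function are hypothetical names used only for illustration, and plain labels stand in for the identifiers a real Wikibase assigns to items and properties.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Statement:
    """Hypothetical, simplified view of a Wikibase statement."""
    subject: str   # label of the item the statement is about
    property: str  # label of the property
    value: str     # label of the item used as the value


# The statement from the example item page.
statement = Statement(subject="Jimmy Wales", property="instance of", value="human")


def describe(s: Statement) -> str:
    """Spell out which part of the statement plays which role."""
    return f"{s.subject} (item) -> {s.property} (property) -> {s.value} (item)"


print(describe(statement))
```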

It may be fairly straightforward to think of how data might be modeled as items, but the moment you start to consider modeling properties, you face some crucial decisions.

If you've never modeled data before or if you just don't know where to start, dip your toe in by looking at a robust, established Wikibase instance -- for example, Wikidata. Seeing how someone else modeled their data will help you model yours, if only by showing you a way you can immediately see is unsuitable for your data.

For more advanced concepts in data modeling, this primer on Wikibase data modeling is a must-read.

Data modeling example -- more properties

More properties, or more items?

If your intention were, for example, to model familial relationships, creating a property called "parent of" that expects an item as its data type is a perfectly fine decision. You will then also need properties like "child of", "sibling of", "cousin of", and so forth, yielding a model that has many properties and fewer items.
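A minimal sketch of this "many properties" approach, using plain tuples rather than a real Wikibase, might look like the following. The people (Alice, Bob, Carol) and the helper functions are invented for illustration.

```python
# Each statement is (subject item, property, value item).
# One dedicated property exists per relationship type.
statements = [
    ("Alice", "parent of", "Bob"),
    ("Bob", "child of", "Alice"),
    ("Bob", "sibling of", "Carol"),
]


def properties_used(stmts):
    """Return the distinct properties this model needs."""
    return sorted({prop for _, prop, _ in stmts})


def items_used(stmts):
    """Return the distinct items this model needs."""
    subjects = {subj for subj, _, _ in stmts}
    values = {val for _, _, val in stmts}
    return sorted(subjects | values)


print(properties_used(statements))
print(items_used(statements))
```

In this sketch every new kind of relationship adds a property, while the only items are the people themselves.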


Data modeling example -- more items

But you might also choose to create a single property: "has relationship". Then you would need to create one item for each type of relationship ("father", "sister") and have those constitute the list of valid values for that property ("has relationship: father"), and then conceive of a way to relate the entire statement to another item.
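The single-property approach could be sketched as follows. The `related_to` field is a hypothetical stand-in for whatever mechanism you choose to tie the statement to another item (in Wikibase, a qualifier on the statement is one possible option). The names and the reading of each statement ("the `related_to` person is the subject's father") are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RelationshipStatement:
    """Hypothetical statement shape for the single-property model."""
    subject: str     # person the statement is about
    property: str    # always "has relationship" in this model
    value: str       # item naming the relationship type
    related_to: str  # assumed extra link to the other person


statements = [
    RelationshipStatement("Bob", "has relationship", "father", related_to="Dan"),
    RelationshipStatement("Bob", "has relationship", "sister", related_to="Carol"),
]


def properties_used(stmts):
    """Return the distinct properties this model needs."""
    return sorted({s.property for s in stmts})


def relationship_items(stmts):
    """Return the items that must exist to name relationship types."""
    return sorted({s.value for s in stmts})


print(properties_used(statements))
print(relationship_items(statements))
```

Here a new kind of relationship adds an item rather than a property, and every statement needs the extra link to be meaningful.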


Adding more properties to your data model will lead to fewer items in your Wikibase and compel a certain way of thinking about every future item and statement you plan; adding fewer properties will lead to more items and an entirely different way of thinking.

Interoperability

Do you plan someday to have your data relate to (or map onto) the data in Wikidata or another Wikibase instance? Consider starting your Wikibase by creating some of the same properties that are in Wikidata. That way you know you will be able to map statements with those properties to Wikidata in the future.
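One way to picture such a mapping is a lookup table from your local property IDs to the corresponding Wikidata property IDs. In the sketch below the local IDs (`P1`, `P2`, `P3`, `P99`) and the item names are hypothetical; `P31` (instance of), `P40` (child), and `P3373` (sibling) are the Wikidata properties they are assumed to mirror. Only properties are mapped here; items would need their own mapping.

```python
# Hypothetical local property IDs mapped to Wikidata property IDs.
LOCAL_TO_WIKIDATA = {
    "P1": "P31",    # instance of
    "P2": "P40",    # child
    "P3": "P3373",  # sibling
}


def to_wikidata(statement):
    """Translate the property of a (subject, property, value) statement.

    Returns None when the local property has no known Wikidata equivalent.
    """
    subject, prop, value = statement
    wikidata_prop = LOCAL_TO_WIKIDATA.get(prop)
    if wikidata_prop is None:
        return None
    return (subject, wikidata_prop, value)


print(to_wikidata(("local-item-A", "P1", "local-item-B")))
print(to_wikidata(("local-item-A", "P99", "local-item-B")))
```

Statements whose property has no counterpart simply cannot be translated, which is why mirroring Wikidata's properties from the start keeps that gap small.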

To get a better sense of how interoperable we hope to make the entire Linked Open Data web, including your Wikibase if possible, peruse Wikimedia Deutschland's strategy documents.

Resources

One excellent way to learn about modeling data is to see how others did it. We’ve supplied a few good examples as well as some reference materials for your edification.