
Help talk:Tabular data


Sharing across wikis

Diegodlh (talkcontribs)

Hi! I've read in the JsonConfig extension documentation that JsonConfig configuration pages (such as these Tabular Data pages) can be shared across wikis. See the "Multiple configs shared in a cluster" section.

On the other hand, I see in the configuration of Wikimedia servers that JsonConfig seems to have been configured so that configuration pages in the "Data" namespace are stored locally in "Commons", but shared across wikis. Look for the if ( $wmgEnableJsonConfigDataMode ) block in CommonSettings.php.

I thought this meant I could create a Tabular Data page in "Meta", for example, under Data:Sample.tab, and that it would be stored in Commons. However, it doesn't seem to work that way.

What does "shared across wikis" mean, then? I see in the "Map Data" help page that one can use a Map Data page in Commons from any wiki (see the Usage section). Is that what "shared across wikis" means?

Thank you!

Reply to "Sharing across wikis"
Smihael (talkcontribs)

Would it be possible to raise or remove the limit of 400 characters per "cell" and the overall 2 MB limit? What would be the correct place to discuss this?

Jdforrester (WMF) (talkcontribs)

It is definitely not possible to extend the 2MiB overall limit, no; that's a fundamental limit from MediaWiki. If you're dealing with that much data, this simple system is probably not right for you.

For the cell limit, I'd suggest a discussion on Commons about making such a change and then filing a Phabricator task asking for the change.
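A minimal sketch for checking a table against the two limits mentioned in this thread before saving it. The numbers (400 characters per string cell, 2 MiB per page) are taken from the posts above and are assumptions rather than authoritative values; the byte count is only an approximation of what the wiki would store, and `check_limits` is a hypothetical helper name.

```python
import json

# Limits as quoted in this thread; treat them as assumptions.
MAX_CELL_CHARS = 400
MAX_PAGE_BYTES = 2 * 1024 * 1024  # 2 MiB


def check_limits(table):
    """Return a list of human-readable problems found in a tabular-data dict."""
    problems = []
    for row_index, row in enumerate(table.get("data", [])):
        for col_index, cell in enumerate(row):
            # Only string cells are checked against the per-cell limit.
            if isinstance(cell, str) and len(cell) > MAX_CELL_CHARS:
                problems.append(
                    f"row {row_index}, column {col_index}: "
                    f"{len(cell)} chars exceeds {MAX_CELL_CHARS}"
                )
    # Approximate the stored size by the UTF-8 length of the serialized JSON.
    size = len(json.dumps(table, ensure_ascii=False).encode("utf-8"))
    if size > MAX_PAGE_BYTES:
        problems.append(f"page size {size} bytes exceeds {MAX_PAGE_BYTES}")
    return problems


sample = {"data": [["ok", 1], ["x" * 401, 2]]}
print(check_limits(sample))
```

Running a check like this locally only tells you whether a dataset is likely to be rejected; it does not change the limits themselves.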

Reply to "Limitations"
Julio974fr (talkcontribs)

How do we add categories to tab data files? I tried using the category bar at the bottom of the page but it didn't work.

PerfektesChaos (talkcontribs)

IIRC it has already been discussed on Phabricator that these JSON and TAB pages need a /doc page mechanism, like Lua modules, to provide wikitext notes, hints, and links to related documentation and help pages. That goes for all kinds of code pages, including CSS and JavaScript, and even MediaWiki: system message pages. Such transcluded subpages can provide categories.

Reply to "Categories"

Is this still being developed?

أحمد (talkcontribs)

Are the tabular and map data features still being developed?

How are they implemented? Is there an extension to realise it in a MW instance? A patch?

I'm working on a wiki about urbanism that could benefit from such functionality.

Currently we store CSV files as files and resort to a combination of Extension:Data Transfer and Extension:External Data to process it, and plan to do work with Extension:Graph, but I'd like to bring the data storage to the wiki and the paradigm of the wikipage instead.

Any insights?

Jdforrester (WMF) (talkcontribs)
Elli (talkcontribs)

Is this at risk of being deprecated?

Jdforrester (WMF) (talkcontribs)

I don't believe so, but I don't make those decisions, sorry.

Elli (talkcontribs)

Thanks, just wanted to know, as I'd like to use this for a project on enwiki/commons but don't want to risk it being removed.

GreenC (talkcontribs)

Please don't abandon! :-) Very much liked.

Reply to "Is this still being developed?"
Zebulon84 (talkcontribs)

Is there a way to indicate the date of the data in the table, other than adding a column with the same date everywhere, or hacking the description with a fake "date" language?

In general, could we be allowed to add custom metadata to the JSON, even if that means names with a specific prefix, or an optional metadata object?

Yurik (talkcontribs)

@Zebulon84 I'm not sure I understand. Could you give an example of the data you are trying to store, and what it may look like? Thx

Zebulon84 (talkcontribs)

Data I store (for test purpose at the moment): c:Data:Sandbox/Zebulon84/Communes Grand Est.tab

My goal is to use this data in Lua to automatically generate various lists that combine it with data from other sources (like this one). Today we have to pass each name in the list as a separate parameter.

I'd like to know when the data was last updated (not the date of the upload to Commons, but the date of the source), so that I can compare it with dates from other files when using this data in a Lua module, and show it in the list title.

Yurik (talkcontribs)

@Zebulon84 If I understood you correctly, you are trying to add a timestamp (date) showing when the data was last refreshed for a given row? If so, just add it as a new string column, and parse it in Lua as a date. A different issue -- in the c:Data:Sandbox/Zebulon84/Communes Grand Est.tab sample, you have a string column "name". I think it is better to use Wikidata for that, unless you keep it for debugging? But looks good otherwise!
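A minimal sketch of the "string column parsed as a date" idea. The suggestion above is about Lua; this uses Python only to illustrate the data shape. The field names (`code`, `population`, `updated`) and all values are hypothetical, and the dates are assumed to be stored as ISO 8601 strings.

```python
import json
from datetime import date

# Hypothetical .tab content: field names and values are made up for illustration.
raw = """
{
  "license": "CC0-1.0",
  "schema": {"fields": [
    {"name": "code", "type": "string"},
    {"name": "population", "type": "number"},
    {"name": "updated", "type": "string"}
  ]},
  "data": [
    ["A001", 1000, "2017-01-01"],
    ["A002", 2500, "2017-01-01"]
  ]
}
"""

table = json.loads(raw)

# Find the position of the date column from the schema rather than hard-coding it.
names = [field["name"] for field in table["schema"]["fields"]]
updated_col = names.index("updated")

# Parse each row's ISO 8601 string into a real date object.
dates = [date.fromisoformat(row[updated_col]) for row in table["data"]]
latest = max(dates)
print(latest.isoformat())
```

Storing the date as a plain string keeps the page valid under the existing schema types, at the cost of repeating the value in every row.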

Zebulon84 (talkcontribs)

Not for a given row, but for the complete table. The source of this data updates it once per year, and the publish date is different from the update date. I'd like to have this update date. I'm going to include it in the description, but it would be easier to use if I can have it separately.

For the names, I may remove them, but I need to check if Wikidata is accurate first (will do it soon).
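A minimal sketch of the workaround of keeping a table-wide date in the description and reading it back out. The "(data as of YYYY-MM-DD)" wording is an assumed convention, not something the format defines, and `table_date` is a hypothetical helper.

```python
import re
from datetime import date

# Hypothetical description object; the wording is an assumed convention.
description = {
    "en": "Communes of Grand Est (data as of 2017-01-01)",
    "fr": "Communes du Grand Est (données au 2017-01-01)",
}

ISO_DATE = re.compile(r"\b(\d{4})-(\d{2})-(\d{2})\b")


def table_date(description, lang="en"):
    """Return the first ISO date found in the description for `lang`, or None."""
    match = ISO_DATE.search(description.get(lang, ""))
    if match is None:
        return None
    year, month, day = (int(part) for part in match.groups())
    return date(year, month, day)


print(table_date(description))
```

This is fragile, since anyone rewording the description can break the parser, which is the reason a separate metadata field would be easier to use.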

Reply to "Date"
FlyingChrysalis (talkcontribs)
FlyingChrysalis (talkcontribs)

Found a temporary solution: give an empty table with the wikitext link in the Sources.

Yurik (talkcontribs)
FlyingChrysalis (talkcontribs)

Righto, will do. Thanks! :)

Reply to "Redirects"

"The content format application/json+pretty is not supported by the content model wikitext"

Discasto (talkcontribs)

I'm trying to figure out how to use Tabular Data to store the logs of the Commons Wiki Loves contests I'm handling (Wiki Loves Monuments, Wiki Loves Earth and the like). At the moment I'm using CSV, which is easy to read and write from Pywikibot and to handle with Pandas (I wrap it in <pre></pre>). I've noticed that Tabular Data seems to be the current option in MediaWiki. I've created commons:Data:Wiki_Loves_in_Spain/Wiki_Loves_Earth/2017/Log.tab from PAWS using Pywikibot, but possibly in the wrong way (I don't know why yet). However, when I try to edit it to see how it differs from, for instance, commons:Data:Ncei.noaa.gov/weather/New York City.tab (the content seems to be the same), I get the cryptic message in this topic's subject and I can't actually edit the page. Any help on this?
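A minimal sketch of turning a CSV log into a tabular-data style JSON document. It assumes the usual layout of `license`, `description`, `sources`, `schema.fields` and `data`; the helper names, the sample CSV and the column-typing rule (a column is `number` only if every cell parses as a number) are illustrative assumptions. It also assumes at least one data row, rows of equal length, and header names that are acceptable as field names.

```python
import csv
import io
import json
import math


def parse_number(value):
    """Return int/float for numeric text, or None if it is not a finite number."""
    try:
        return int(value)
    except ValueError:
        pass
    try:
        number = float(value)
    except ValueError:
        return None
    return number if math.isfinite(number) else None


def csv_to_tabular(csv_text, description, sources=""):
    """Convert CSV text (first row is the header) into a tabular-data style dict."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, body = rows[0], rows[1:]
    columns = list(zip(*body))  # transpose rows into columns

    fields, converted = [], []
    for name, column in zip(header, columns):
        numbers = [parse_number(cell) for cell in column]
        if all(n is not None for n in numbers):
            fields.append({"name": name, "type": "number", "title": {"en": name}})
            converted.append(numbers)
        else:
            fields.append({"name": name, "type": "string", "title": {"en": name}})
            converted.append(list(column))

    return {
        "license": "CC0-1.0",
        "description": {"en": description},
        "sources": sources,
        "schema": {"fields": fields},
        "data": [list(row) for row in zip(*converted)],  # back to row order
    }


csv_text = "user,uploads\nAlice,12\nBob,7\n"
page = csv_to_tabular(csv_text, "Hypothetical contest log")
print(json.dumps(page, ensure_ascii=False, indent=2))
```

The resulting JSON text is what would be saved as the page content; how it is saved (content model and format) is a separate question.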

Yurik (talkcontribs)

@Discasto It seems that when you uploaded the data via the API, you used the wrong format, and somehow it got accepted (it should have been rejected during the save). Take a look at other .tab pages and see what content format they use via the API, so that you can set the same params in the edit API call.

Discasto (talkcontribs)

Hi @Yurik: I'm not using the API directly. I'm using the 'save' method in Pywikibot, and the content I'm saving is right (in fact, I'm copying and pasting the content of a "valid" tab, the one I mentioned). Unless you give me more information about what's "wrong" (I guess it has to do with the MIME type, but Pywikibot does not provide any means of setting the MIME type), I cannot talk to the Pywikibot guys. What can be saved in a Data page?

Additionally, I've followed a simple example (API:Edit/Editing with Python) to write a string encoding a valid JSON document, with the same result. It seems as if the API does not allow writing to the Data namespace. Any idea?

Yurik (talkcontribs)

@Discasto I don't know the exact settings for the pywikibot to edit data pages. I think you need to set contentformat and contentmodel parameters. See https://commons.wikimedia.org/w/api.php?action=help&modules=edit for available values, but more specifically, take a look at an existing page like this one: https://commons.wikimedia.org/w/api.php?action=query&prop=info&titles=Data:Sandbox/Doc_James/Obesity_Males_CC-BY-SA.tab -- as you can see, "contentmodel" = "Tabular.JsonConfig". Hope this helps.
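A minimal sketch of the two API calls described above, built as plain parameter sets without sending anything. The `contentmodel` value is the one quoted in the reply; `application/json` as the matching `contentformat` is an assumption to verify against the edit module's help page. The helper names, the example title and the token placeholder are hypothetical, and a real edit needs a logged-in session and a CSRF token.

```python
import json
from urllib.parse import urlencode

API = "https://commons.wikimedia.org/w/api.php"


def info_query_url(title):
    """URL that reports a page's content model (same query as linked above)."""
    params = {"action": "query", "prop": "info", "titles": title, "format": "json"}
    return API + "?" + urlencode(params)


def edit_params(title, table, csrf_token, summary=""):
    """POST parameters for action=edit that state the content model explicitly."""
    return {
        "action": "edit",
        "title": title,
        "text": json.dumps(table, ensure_ascii=False),
        "contentmodel": "Tabular.JsonConfig",  # value reported by prop=info
        "contentformat": "application/json",   # assumed format for this model
        "summary": summary,
        "token": csrf_token,
        "format": "json",
    }


# Hypothetical title and placeholder token, for illustration only.
params = edit_params("Data:Sandbox/Example.tab", {"data": []}, "PLACEHOLDER-TOKEN")
print(info_query_url("Data:Sandbox/Example.tab"))
print(sorted(params))
```

If the page already exists with the wrong content model, setting these parameters on a new edit may not be enough on its own, so checking the `prop=info` output first is the safer step.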

Discasto (talkcontribs)
Reply to ""The content format application/json+pretty is not supported by the content model wikitext""

How to make a graph based on Tabular Data?

Summary by 197.218.81.48

Access it using:

Or using userscripts.

Jarekt (talkcontribs)

I am looking for a way to create a graph based on c:Data:Photo challenge/statistics.tab. Are there any tools to insert a graph on a page based on that table? Also, how do I add links to other Commons pages?

197.218.81.48 (talkcontribs)

The only way to access such graphs is using Lua. Basically, just extract the data and write the code to create the graph.

See Extension:Graph/Guide#External Data for examples.

197.218.81.48 (talkcontribs)

Actually, only Lua can access the tabular / map data.

Yurik (talkcontribs)

@Jarekt, the above is not correct. You can use .tab & .map pages directly in the <graph> objects. See usage.

Yurik (talkcontribs)
Reply to "How to make a graph based on Tabular Data?"
Jasper Deng (talkcontribs)

What implications will this have for Wikidata? As far as I can tell, this does not seem to be a priori compatible with Wikidata's existing data model. I specifically would like assurance that the adoption of this will not be to the detriment of Wikidata usage.

Yurik (talkcontribs)

@Jasper Deng Tabular data and Wikidata pursue very different goals -- Wikidata is a FACTS database. You can store the height of Everest, but you cannot store daily weather on Everest for the past several years. While those are also "facts", storing them on Wikidata is not a good idea. In short, these two techs complement each other, and the Wikidata team plans to have much better dataset support going forward.

Reply to "Relation to Wikidata"
Mike Linksvayer (talkcontribs)
"license": "CC0-1.0+", which means the data can be used under the CC0 version 1.0, or (at your option) any later version

"Or later version" is superfluous for CC0-1.0, which has no conditions at all, let alone any pertinent to license versions. The "+" notation and "or (at your option) any later version" are advisable for A/L/GPLv2/3 because those licenses aren't compatible with later versions of themselves without such additional permission. There's no reason to copy these when implementing other licenses. Suggest instead:

"license": "CC0-1.0", which means the data has been released under CC0 version 1.0
Yurik (talkcontribs)

@Mike Linksvayer, thanks. IANAL, so I put it there because I suspected that it would be possible for CC to create another version of CC0 license in case for some weird legal reason CC0-1.0 would not cover some usecase, and so having the "+" would ensure that even if data under CC0-1.0 doesn't cover that usecase, the data can be automatically re-licensed under the better license version. I will poke our legal about this. Do you think having that "+" sign would ever hurt?

Mike Linksvayer (talkcontribs)

@Yurik not sure how taking material contributed under "CC0-1.0+" as available under hypothetical CC0-2.0 would hold up; probably depends on what "some usecase" is. Generally I wouldn't suggest adding something intended to be legally meaningful without poking your legal first. :-) Note that Wikidata does not purport to use "+".

Yurik (talkcontribs)

Hmm, perhaps we should change it then. Thanks for the heads up! I will poke our legal, and I think we should make it the same as WD.

Reply to "CC0-1.0+"