Jump to content

Manual:Extension support/What's out there query

From mediawiki.org

The process was a bit complex,

Raw Data

[edit]

First I visited the s23 stats for mediawiki, specified my "good" at 0, and lines as high as it would go (in this case 1350).

Selected the whole table, and copy+pasted it into OpenOffice Calc.

OO Calc

[edit]

I did a Select-All, and hit Format->Default formatting to remove links and HTML. Then I removed columns I didn't care about, and saved the file as a CSV so I could run regular expressions against it.

Text Editor

[edit]

OO Calc had converted a lot of my number columns after stripping into text fields instead of numbers.. so I first did a regular expression search+replace in PSPad for that

/"([0-9]*)"/</source> into <source lang="text">$1

Next I used another regexp to expand the version columns from

/"([0-9]+)\.([0-9]+)(:?\.([0-9]+))([^"]*)"/

into

$1,$2,$3,$4

— meaning I had major, minor, revision and special note columns. Some rows had N/A or garbage or blank for version, so I did one last regexp to handle those, doing

/^("[^"]+","[^"]")/

into <v lang="text">$1$2,,,</syntaxhighlight> to add extra blank columns.

OO Calc

[edit]

Back in OO Calc I now sorted by Major column, and deleted anything that didn't have a version. Then I re-saved my .csv file.

PHPMyAdmin

[edit]

Using PHPMyAdmin I created a simple table to hold my columns,

major,minor,revision,note,good,active

remember to make note a varchar and the rest integers. I used PHPMyAdmin's csv import tool to load the data from the file.

Then I ran a query to get the basic results:

SELECT v_major, v_minor, count(*) AS total
FROM `wiki_versions` 
WHERE good >= 10 AND active > 10 AND v_note <> "wmf1"
GROUP BY v_major, v_minor
ORDER BY v_major DESC, v_minor DESC

Here we've picked some arbritrary parameters.

  • We've eliminated wmf1 versions because these are really Wikimedia wiki's, not independant setups.
  • We've also said "good" should be at least 10, which means there are at least 10 content articles in the main namespace on the wiki
  • We've specified at least 10 active editors editing the wiki. s23 defined what active means, I'm not sure what it is. You'd be surprised how many very large and useful wikis have a small number of active editors.

A simple query should also give us the total number if we're too lazy to add, useful for generating percentages.

SELECT count(*) AS total
FROM `wiki_versions` 
WHERE good >= 10 AND active > 10 AND v_note <> "wmf1"

Make sure to use the same parameters as above in the WHERE section.

Wiki

[edit]

I copy and pasted the table back into wiki, and manually calculated the percentages. If you had a big table it may be worth doing it in the SQL or going back to OO Calc, but in my case there wasn't a lot of version disparity (which is good!).