Difficult task. Here are some intermediate results.
First I tried ALTER TABLE to binary. Maybe this will work with the Mediawiki application, but there is one reason why I did not continue to explore. Binary means we will not be able to read the contents of text columns in the MySQL shell. Example:
mysql> alter table category convert to character set binary;
mysql> select * from category limit 1;
+--------+----------------------+-----------+-------------+-----------+
| 151434 | 0x4C697374656E | 599 | 3 | 0 |
+--------+----------------------+-----------+-------------+-----------+
1 row in set (0,00 sec)
The result is not practical for me, so I stopped here.
Next idea was ALTER TABLE to utfmb4. This might work with Mediawiki, - I don't know -, but once more it did not solve my original problem within MySQL shell. Any special characters e.g. german umlauts appear to be cryptic glyphs in the shell, before ALTER TABLE as well as after.
mysql> alter table category convert to character set utf8mb4;
mysql> select * from category where cat_id=151642;
+--------+-----------+-----------+-------------+-----------+
| 151642 | Städte | 2505 | 0 | 0 |
+--------+-----------+-----------+-------------+-----------+
1 row in set (0,00 sec)
mysql> select * from category where cat_title='Städte';
Empty set (0,00 sec)
mysql> select * from category where cat_title like 'St%dte';
+--------+-----------+-----------+-------------+-----------+
| 151642 | Städte | 2505 | 0 | 0 |
+--------+-----------+-----------+-------------+-----------+
1 row in set (0,00 sec)
To solve the special characters issue, it seems to be necessary to first convert to binary and then in a second step to utf8mb4.
mysql> alter table category convert to character set binary;
mysql> alter table category convert to character set utf8mb4;
This is only half of the jackpot. The conversion to binary changes the columns' data types to varbinary, so the second ALTER just sets the table's defaults. It is necessary to restore the columns:
mysql> alter table category modify cat_title varchar(255) character set utf8mb4 unique not null default 'NULL';
Now everything is fine within the MySQL shell:
mysql> select * from category where cat_title='Städte';
+--------+-----------+-----------+-------------+-----------+
| 151642 | Städte | 2505 | 0 | 0 |
+--------+-----------+-----------+-------------+-----------+
1 row in set (0,00 sec)
The double conversion is not my invention, I found it on the web:
//nicj.net/mysql-converting-an-incorrect-latin1-column-to-utf8/
Other users seem to have the same problem. It is a "devil of a job" to manually convert each single table and column, especially because we'll have to restore any single column attribute which is very error-prone. Therefore the script this guy developed might be useful (//github.com/nicjansma/mysql-convert-latin1-to-utf8).
Currently I am running some tests with this script. Still having issues. There are a few duplicate keys, which I need to eliminate before doing the conversion, not a big issue. Some of the columns seem to be left unchanged for whatever reason. The drama is going on ...