Jump to content

Topic on Project:Support desk/Flow

Special:Mostlinked and WantedPages never ending

12
2003:C2:3F41:E00:896A:31C4:4B22:4203 (talkcontribs)
MediaWiki 1.35.5
PHP 7.4.3 (fpm-fcgi)
MySQL 8.0.29-0ubuntu0.20.04.3
ICU 66.1

Howdy,

we have changed to $wgMiserMode = true some months ago, updateSpecialPages is running every night. Great function, however there are 2 SpecialPages that are running and running for 5, 8, 12 hours and never coming to and end. For the moment, I have excluded them from being updated by the script so that the script runs through within some 5 minutes.

Is there a way to find out why Mostlinked/Wantedpages are hanging? How could I find out what they're doing all the time? Any hints appreciated.

PS. For whatever reason this problem emerged after changing the database's character set, Topic:Wqktznc6b8nyc29g, coincidence or causality? I can't imagine.

Bawolff (talkcontribs)

How many pages are in your wiki?

Different special pages take different time, and some of them can be quite slow, but 12 hours seems excessive even for a very large wiki. However maybe if the server runs out of ram and starts swapping stuff or something. Do other things work when this is happening? Is mysqld using cpu? I/o? During this time?

Its not impossible that the charset change effected something but it seems kind of unlikely

During this, is the query listed on the server via SHOW PROCESSLIST; ? Are there a large number of queries in flight? If you do SHOW EXPLAIN FOR <processid>; for the wanted pages query, what is the output.

2003:C2:3F0F:9800:68A6:A572:4C81:CA77 (talkcontribs)

Thank you Bawolff, and sorry, I have had private issues the last few days.


We have round about 133,000 pages, thereof 57,000 articles (approx. 40,000 redirects, don't know where the difference is)

I started the Mostlinked job manually 5 hrs ago and it is still running. The mysqld process uses 100...150% CPU but users working on articles will hardly notice the server load. It is a virtual system with 4 CPUs, total CPU usage varying between 15% and 40%. No noticeable swapping.


The database processlist shows the event scheduler, the special page, my shell process and a count of sleeping processes (usually 0 to 5 as far as I saw). Explaining the special page process:


mysql> explain for connection 158074;

+----+-------------+-----------+------------+-------+---------------+--------------+---------+-----------------------------+---------+----------+----------------------------------------------+

| id | select_type | table     | partitions | type  | possible_keys | key          | key_len | ref                         | rows    | filtered | Extra                                        |

+----+-------------+-----------+------------+-------+---------------+--------------+---------+-----------------------------+---------+----------+----------------------------------------------+

|  1 | SIMPLE      | pagelinks | NULL       | index | NULL          | pl_namespace | 1030    | NULL                        | 2391153 |   100.00 | Using index; Using temporary; Using filesort |

|  1 | SIMPLE      | page      | NULL       | ref   | name_title    | name_title   | 4       | ppdb.pagelinks.pl_namespace |    5684 |   100.00 | Using where; Using index                     |

+----+-------------+-----------+------------+-------+---------------+--------------+---------+-----------------------------+---------+----------+----------------------------------------------+

2 rows in set (0,00 sec)


Though I have some SQL experience, I don't understand the EXPLAIN output. There are 2.43 million records in the pagelinks table (not 2,39m as shown). And why is key_len = 1030? Here is the pagelinks description:


mysql> describe ppdb.pagelinks;

+-------------------+--------------+------+-----+---------+-------+

| Field             | Type         | Null | Key | Default | Extra |

+-------------------+--------------+------+-----+---------+-------+

| pl_from           | int unsigned | NO   | PRI | 0       |       |

| pl_namespace      | int          | NO   | PRI | 0       |       |

| pl_title          | varchar(255) | NO   | PRI |         |       |

| pl_from_namespace | int          | NO   | MUL | 0       |       |

+-------------------+--------------+------+-----+---------+-------+


mysql> show indexes from ppdb.pagelinks;

+-----------+------------+------------------------+--------------+-------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+

| Table     | Non_unique | Key_name               | Seq_in_index | Column_name       | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment | Visible | Expression |

+-----------+------------+------------------------+--------------+-------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+

| pagelinks |          0 | PRIMARY                |            1 | pl_from           | A         |      139218 |     NULL |   NULL |      | BTREE      |         |               | YES     | NULL       |

| pagelinks |          0 | PRIMARY                |            2 | pl_namespace      | A         |      227346 |     NULL |   NULL |      | BTREE      |         |               | YES     | NULL       |

| pagelinks |          0 | PRIMARY                |            3 | pl_title          | A         |     2391153 |     NULL |   NULL |      | BTREE      |         |               | YES     | NULL       |

| pagelinks |          1 | pl_backlinks_namespace |            1 | pl_from_namespace | A         |          20 |     NULL |   NULL |      | BTREE      |         |               | YES     | NULL       |

| pagelinks |          1 | pl_backlinks_namespace |            2 | pl_namespace      | A         |         165 |     NULL |   NULL |      | BTREE      |         |               | YES     | NULL       |

| pagelinks |          1 | pl_backlinks_namespace |            3 | pl_title          | A         |      379761 |     NULL |   NULL |      | BTREE      |         |               | YES     | NULL       |

| pagelinks |          1 | pl_backlinks_namespace |            4 | pl_from           | A         |     2356577 |     NULL |   NULL |      | BTREE      |         |               | YES     | NULL       |

| pagelinks |          1 | pl_namespace           |            1 | pl_namespace      | A         |          20 |     NULL |   NULL |      | BTREE      |         |               | YES     | NULL       |

| pagelinks |          1 | pl_namespace           |            2 | pl_title          | A         |      181466 |     NULL |   NULL |      | BTREE      |         |               | YES     | NULL       |

| pagelinks |          1 | pl_namespace           |            3 | pl_from           | A         |     2378003 |     NULL |   NULL |      | BTREE      |         |               | YES     | NULL       |

+-----------+------------+------------------------+--------------+-------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+

Bawolff (talkcontribs)

So keylen is how many bytes of the index is used (sometimes only part of the index is used).

Its normal for row estimates to be off. They are order of magnitude gueses.


So, the concerning thing i see from that join, is the second row for the join to the page table. I was expecting the number of rows to be 1, the key length to be more than 4, and filtered to be closer to 0. It sounds like its using only the namespace, instead of namespace and title as the index. This will significantly slow things down.

Maybe this is caused by charset issues - i dont know enough about mysql internals to really say, but maybe if pagelinks.pl_title and page.page_title were different charsets/collations, it might prevent the index from being used, or something like that.

The other possibility is its something to do with page_title not being in the group by clause, but i dont know why that would make a difference.

2003:C2:3F24:200:7936:FD16:1C06:D7CD (talkcontribs)

It seems we are on the track here. Invaluable hint, thank you ...! The pagelinks.pl_title and page.page_title are slightly different, the second one is ai_ci.


I am still struggling with correcting this (my own) mistake, trying to rebuild the pagelinks table but it takes a very long time.


I'll inform you later, in a few weeks because we are going on vacation now.

2003:C2:3F30:A000:55E0:6D6D:CE8:3CD5 (talkcontribs)

Continuing at the very bottom.

2003:C2:3F0F:9800:68A6:A572:4C81:CA77 (talkcontribs)

There is one mysterious message in the mysql log:


2022-06-28T09:11:00.315992Z 1 [ERROR] [MY-012574] [InnoDB] Unable to lock ./ibdata1 error: 11


repeated 15 times between 11.00am and 11.14am, 11am is exactly the time I startet my test.

2003:C2:3F0F:9800:68A6:A572:4C81:CA77 (talkcontribs)

OMG, sorry, please ignore the previous posting. The lock error appears again and again, once every minute, until 12.40, the last log entries are:


2022-06-28T09:12:40.346740Z 1 [ERROR] [MY-012574] [InnoDB] Unable to lock ./ibdata1 error: 11

2022-06-28T09:12:40.346916Z 1 [ERROR] [MY-012592] [InnoDB] Operating system error number 11 in a file operation.

2022-06-28T09:12:40.346980Z 1 [ERROR] [MY-012596] [InnoDB] Error number 11 means 'Resource temporarily unavailable'

2022-06-28T09:12:40.347071Z 1 [ERROR] [MY-012215] [InnoDB] Cannot open datafile './ibdata1'

2022-06-28T09:12:40.347233Z 1 [ERROR] [MY-012959] [InnoDB] Could not open or create the system tablespace. If you tried to add new data files to the system tablespace, and it failed here, you should now edit innodb_data_file_path in my.cnf back to what it was, and remove the new ibdata files InnoDB created in this failed attempt. InnoDB only wrote those files full of zeros, but did not yet use them in any way. But be careful: do not remove old data files which contain your precious data!

2022-06-28T09:12:40.347280Z 1 [ERROR] [MY-012930] [InnoDB] Plugin initialization aborted with error Cannot open a file.

2022-06-28T09:12:40.844981Z 1 [ERROR] [MY-010334] [Server] Failed to initialize DD Storage Engine

2022-06-28T09:12:40.845406Z 0 [ERROR] [MY-010020] [Server] Data Dictionary initialization failed.

2022-06-28T09:12:40.846542Z 0 [ERROR] [MY-010119] [Server] Aborting

2022-06-28T09:12:40.847841Z 0 [System] [MY-010910] [Server] /usr/sbin/mysqld: Shutdown complete (mysqld 8.0.29-0ubuntu0.20.04.3)  (Ubuntu).

(END)


What does this mean?

Bawolff (talkcontribs)

You might have to ask at a mysql forum for that one.

2003:C2:3F24:200:7936:FD16:1C06:D7CD (talkcontribs)

Sorry, false alarm. The lock error never appeared again, so it is clear that it has nothing to do with the repoted SpecialPage problem.

2003:C2:3F30:A000:55E0:6D6D:CE8:3CD5 (talkcontribs)

Bawolff, once again you are our hero.


Now I have corrected the pagelinks.pl_title and page.page_title collation (... and all other xxxx.title), so they all match utf8mb4_0900_as_ci ... and voilà, all of the special pages including Mostlinked and WantedPages come to an end within a few minutes.


Many, many thanks!!!

Bawolff (talkcontribs)

Glad to hear it worked out

Reply to "Special:Mostlinked and WantedPages never ending"