Manual talk:RefreshLinks.php

From mediawiki.org

Hi, can you give an example of a starting article? I'm trying php maintenance/refreshLinks.php [starting_article] but it doesn't work :( --Mark 07:57, 11 December 2009 (UTC)

php maintenance/refreshLinks.php 8000
will make it start with page id 8000. Added that. --88.130.74.229 19:06, 20 June 2014 (UTC)

What about langlinks?

Does the script also refresh the langlinks table?

Troubleshooting

I've added a troubleshooting section, as the known memory issues had not been dealt with here. See also Help:Download @de-Wiki. --DuyTrinh 08:45, 2 November 2011 (UTC)

Cool, good point. I added instructions to avoid this in the first place. Cheers --[[kgh]] 22:26, 2 November 2011 (UTC)

Script stops processing data

This script gets to

Refreshing links table.
Starting from page_id 1 of 3651.
100
200

and then just sits there. According to top, php and mysqld both drop to 0% CPU, so I can't imagine that it's still processing anything. I know it says it can take a long time, but after an hour it was still in the same place.

Is there a known problem?

I need to run this because I imported a large chunk of data into the DB (I exported the data, made a find/replace change, and imported it again), so the links tables need to be updated.
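One possible way to pick up after a stall, sketched here under assumptions: the script accepts a starting page id as its argument (as in the 8000 example further up), so a later run could start from the last counter that was printed. The helper name last_progress_id and the idea of saving the output to a log file are hypothetical, and this does not explain why the script stalls in the first place.

```shell
# last_progress_id is a hypothetical helper, not part of MediaWiki.
# It prints the last line of a log file that consists only of digits,
# i.e. the last progress counter shown in the output quoted above.
last_progress_id() {
    grep -E '^[0-9]+$' "$1" | tail -n 1
}

# Example log, copied from the output quoted in this thread.
log_file=$(mktemp)
printf '%s\n' 'Refreshing links table.' \
    'Starting from page_id 1 of 3651.' '100' '200' > "$log_file"

resume_id=$(last_progress_id "$log_file")

# Dry run: only print the command that would resume from that id.
echo "php maintenance/refreshLinks.php $resume_id"
rm -f "$log_file"
```

Treating the printed counter as a page id to restart from is an assumption; starting a little earlier than the last counter is the cautious choice.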

Chunking refreshLinks.php

Where and how to run this script?

num_pages=$(php /path/to/mediawiki/maintenance/showSiteStats.php | grep "Total pages" | sed 's/[^0-9]*//g')
end_id=0
delta=2000

echo "Beginning refreshLinks.php script"
echo "  Total pages = $num_pages"
echo "  Doing it in $delta-page chunks to avoid memory leak"

while [ "$end_id" -lt "$num_pages" ]; do
    start_id=$((end_id + 1))
    end_id=$((end_id + delta))
    echo "Running refreshLinks.php from $start_id to $end_id"
    php /path/to/mediawiki/maintenance/refreshLinks.php --e "$end_id" -- "$start_id"
done

# Just in case there are more IDs beyond the guess we made with showSiteStats, run
# one more unbounded refreshLinks.php starting at the last ID previously done
start_id=$((end_id + 1))
echo "Running final refreshLinks.php in case there are more pages beyond $num_pages"
php /path/to/mediawiki/maintenance/refreshLinks.php "$start_id"
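To check the chunk boundaries before running anything, the range arithmetic from the loop above can be pulled into a small function and run as a dry run. This is only a sketch: chunk_ranges is a hypothetical helper name, and 4500 is a made-up page count standing in for the showSiteStats.php result.

```shell
# chunk_ranges is a hypothetical helper, not part of MediaWiki.
# It prints "start end" pairs covering page IDs 1..max_id in steps of delta,
# using the same arithmetic as the loop in the script above.
chunk_ranges() {
    local max_id=$1 delta=$2 start=1 end
    while [ "$start" -le "$max_id" ]; do
        end=$((start + delta - 1))
        echo "$start $end"
        start=$((end + 1))
    done
}

# Dry run with an assumed page count of 4500: print the commands
# instead of executing them.
chunk_ranges 4500 2000 | while read -r start_id end_id; do
    echo php /path/to/mediawiki/maintenance/refreshLinks.php --e "$end_id" -- "$start_id"
done
```

As in the original script, the last chunk may end past the real page count, which is harmless for a dry run and is why the script finishes with one unbounded run.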

Copy the code to your console and hit enter to start

You simply need to copy the code to your command line, making sure that you adjust /path/to/mediawiki to match your installation.

Example

root@host/var/www/html

After pasting the code, press Enter.

Multithreaded run

Since this script takes an extremely long time to finish on a large wiki, I would like to run it in multiple processes, maybe like so:

Process 1: php /path/to/mediawiki/maintenance/run.php refreshLinks --e 10000 -- 1
Process 2: php /path/to/mediawiki/maintenance/run.php refreshLinks --e 20000 -- 10001
Process 3: php /path/to/mediawiki/maintenance/run.php refreshLinks --e 30000 -- 20001
Process 4: php /path/to/mediawiki/maintenance/run.php refreshLinks --e 40000 -- 30001

...and so on. Does anybody know if this script can run fine in parallel, or would there be some concerns with this? Buoysel (talk) 23:44, 9 November 2023 (UTC)
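For illustration, the four processes listed above could be started from one shell with background jobs. This is a sketch only: run_range and the DRY_RUN switch are hypothetical names, and the block prints the commands rather than running them. It does not answer whether concurrent runs are safe, which is the open question in this thread.

```shell
# run_range is a hypothetical wrapper, not part of MediaWiki.
# With DRY_RUN=1 (the default here) it only prints the command.
run_range() {
    local start_id=$1 end_id=$2
    local cmd="php /path/to/mediawiki/maintenance/run.php refreshLinks --e $end_id -- $start_id"
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$cmd"
    else
        $cmd
    fi
}

DRY_RUN=1

# Start one background job per 10000-page range, then wait for all of them.
for i in 0 1 2 3; do
    run_range $((i * 10000 + 1)) $(( (i + 1) * 10000 )) &
done
wait
```

Because the jobs run in the background, the printed lines can appear in any order.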