
Topic on Extension talk:Replace Text/Archive 2019 to 2024

2001:7C0:28E0:102:0:0:0:DA (talkcontribs)

Hello everyone,

I am in the middle of updating a really old MW that was still running the HarvardReferences Extension. The current style for citations there looks like this: [Author Year: S. Pagenumber] (S stands for Seite = page in German), so e.g. [Smith 2014: S. 124]. Since I have to replace all those references with usual Cite references, I tried the following regex to find them but it did not return any result:

^\\[([^]]+) (\\d{4}): S\\. (\\d+)\\]$

Can you help me in figuring out what I am doing wrong at the moment?


Thanks in advance!

Dinoguy1000 (talkcontribs)

You don't need to double-escape: ^\[([^]]+) (\d{4}): S\. (\d+)\]$ should work fine (though note that I haven't tested this regex, so it may have other issues as well).
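The single-escaped pattern can be sanity-checked outside the wiki. Below is a minimal sketch using Python's `re` module, which is an assumption made for illustration only: it is not the engine Replace Text uses, so a match here only shows that the pattern itself is well-formed. The helper name `parse_citation` is hypothetical; the sample string comes from the question.

```python
import re

# Single-escaped pattern from the reply above, written as a raw string
# so that Python itself leaves the backslashes alone.
CITATION = re.compile(r"^\[([^]]+) (\d{4}): S\. (\d+)\]$")

def parse_citation(text):
    """Return (author, year, page) for a Harvard-style citation, else None."""
    m = CITATION.match(text)
    return m.groups() if m else None

print(parse_citation("[Smith 2014: S. 124]"))
```

If this matches but the wiki search does not, the difference is more likely in the regex flavour or anchoring used by the search than in the pattern's structure.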

2001:7C0:28E0:102:0:0:0:DA (talkcontribs)

Thank you! I tried the expression you suggested but I still get no pages in return. Simpler expressions (such as 'a(.*)c') work, though.

Clump (talkcontribs)

Try escaping the closing square bracket in your first class (i.e., write [^\]] instead of [^]]) or perhaps removing the beginning and/or end of line markers (^ and $).

2A03:80:D3D:A400:21E0:382:80EB:B92F (talkcontribs)

Thanks! I just tried it but I still get the same result (no pages found). Could this be a DB issue or do you still think that this has something to do with the structure of the regex? As I said: simpler expressions do miraculously work...

Dinoguy1000 (talkcontribs)

At this point, the thing I'd suggest is to start with part of your target regex, test that it works, and then slowly add parts back to it until you find a part that causes it to stop working.
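The piece-by-piece approach can be sketched offline as follows. This is only an illustration in Python's `re` (an assumption, not the Replace Text engine); on the wiki the same steps would be done by hand in the search form. The split into `PIECES` and the helper `first_failing_piece` are hypothetical.

```python
import re

SAMPLE = "[Smith 2014: S. 124]"  # example citation from the question

# The target regex split into pieces that can be added back one at a time.
PIECES = [r"\[", r"([^]]+)", r" ", r"(\d{4})", r": S\. ", r"(\d+)", r"\]"]

def first_failing_piece(pieces, sample):
    """Grow the pattern piece by piece; return the index of the first
    piece whose addition stops the pattern from matching, or None."""
    pattern = ""
    for i, piece in enumerate(pieces):
        pattern += piece
        if re.search(pattern, sample) is None:
            return i
    return None

print(first_failing_piece(PIECES, SAMPLE))
```

The index returned points at the fragment to look at more closely, for example a shorthand or quantifier that the engine in question does not support.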

MvGulik (talkcontribs)

It's the "^" and "$" parts that didn't work as intended/expected when I tried it.

Try "\n" or else real CRs/Enters (which one depends on the Replace Text version).

Ergo: "\n\[([^]]+) (\d{4}): S\. (\d+)\]\n"

Or:

"<hidden Enter>

\[([^]]+) (\d{4}): S\. (\d+)\]<hidden Enter>

"

Both options worked with Replace Text 1.7 (cba3752) 18:03, 14 March 2023. (+10.5.24-MariaDB)

MvGulik (talkcontribs)

If I remember correctly, "^" and "$" only work for targeting the beginning and end of the whole text/page document.

Ergo: the whole text/page document should be treated as one long string with no 'real' internal line breaks for "^" and "$" to match.
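The same whole-string behaviour can be reproduced in Python's `re`, where "^" and "$" also refer to the whole string unless a multiline flag is set. This is a sketch for illustration only, on the assumption that Python behaves analogously here; it says nothing definite about the engine behind Replace Text. The `PAGE` text is made up.

```python
import re

PAGE = "Intro text.\n[Smith 2014: S. 124]\nMore text."
CORE = r"\[([^]]+) (\d{4}): S\. (\d+)\]"

# Without re.MULTILINE, ^ and $ refer to the whole string, so a citation
# sitting on an inner line is not found.
whole_text = re.search("^" + CORE + "$", PAGE)

# Explicit newlines around the citation do match the inner line.
with_newlines = re.search(r"\n" + CORE + r"\n", PAGE)

print(whole_text is None, with_newlines is not None)
```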

2001:7C0:28E0:102:0:0:0:242 (talkcontribs)

Thanks for your replies! I tried the expression(s) and still get no results, but I noticed that while the first part, \n\[([^]]+), works, the next part (the year) does not.

Clump (talkcontribs)

Perhaps the match quantifier isn't supported. Instead of \d{4}, have you tried \d\d\d\d?

MvGulik (talkcontribs)


Replace Text: 1.7 (cba3752) 18:03, 14 March 2023. (+10.5.24-MariaDB)

I don't know which versions of "Replace Text" support what. But if you're using an older version and can update it, that might be the best option at this point.


If your search string might also occur at the beginning and/or end of the page text, "(^|\n)...(\n|$)" should do the trick (partially tested).
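A minimal sketch of the "(^|\n)...(\n|$)" idea, again in Python's `re` as an assumed stand-in for the wiki's engine. The alternations are written as non-capturing groups here (a deviation from the pattern as posted) so that only author, year and page are returned; `find_citations` is a hypothetical helper.

```python
import re

CORE = r"\[([^]]+) (\d{4}): S\. (\d+)\]"

# Accept either the start/end of the whole text or a newline on each side.
EDGE_SAFE = re.compile(r"(?:^|\n)" + CORE + r"(?:\n|$)")

def find_citations(page):
    """Return (author, year, page) tuples for citations that fill a line."""
    return EDGE_SAFE.findall(page)

page = "[Smith 2014: S. 124]\ntext\n[Jones 1999: S. 7]"
print(find_citations(page))
```

One caveat of this form: each match consumes its trailing newline, so two citations on directly adjacent lines would not both be found in a single pass.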

2001:7C0:28E0:108:C02A:97A4:196F:36FB (talkcontribs)

\d\d\d\d also doesn't work. I really think that the cause is rooted somewhere else, but I don't know where...


The wiki is currently running:

  • MediaWiki 1.39.7
  • PHP 8.1.28
  • MySQL 5.7.43
  • ReplaceText 1.7 (d5b3a18)
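One possibility the thread does not test, stated here as an assumption rather than a confirmed diagnosis: if the search pattern is evaluated by the database, a POSIX-style regex engine (as in MySQL before 8.0) may not understand the Perl shorthand \d at all, which would explain why both \d{4} and \d\d\d\d fail while the MariaDB setup above works. The sketch below rewrites \d as the bracket expression [0-9]; the helper `to_posix_digits` is hypothetical and does not handle \d inside a character class.

```python
import re

def to_posix_digits(pattern):
    """Replace each unescaped \\d with [0-9].

    A backslash pair (a literal backslash) in front of 'd' is left alone.
    Limitation: \\d inside a bracket expression is not handled correctly.
    """
    return re.sub(r"(?<!\\)((?:\\\\)*)\\d", r"\1[0-9]", pattern)

original = r"\[([^]]+) (\d{4}): S\. (\d+)\]"
rewritten = to_posix_digits(original)
print(rewritten)

# The rewritten pattern still matches the sample citation in Python.
print(re.search(rewritten, "[Smith 2014: S. 124]") is not None)
```

Whether the rewritten pattern then returns pages in Replace Text would still need to be tested on the wiki itself.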