Extension talk:External Data/Archive 2023

accessing the internal variable (__json)

I need some help. Background: I'm working on an extension that uses a web call to retrieve a JSON object and converts it into concise HTML. The extension is now working properly, but lacks caching. The JSON object is complex, containing multiple nested arrays with a varying number of elements, and it took me a while to handle it in PHP.

My current line of thought is to pass the entire JSON object (which is stored in the internal variable __json) to my extension.

Question 1: how does one access the internal variables? The code below doesn't work.

Question 2: do you have any comments or suggestions on my approach?

{{#get_web_data:url=https://jisho.org/api/v1/search/words?keyword="先生"|format=json}}

  • first bullet: {{#external_value:__json}}
  • second bullet: {{#external_value:meta}}

{{#myextension:{{#external_value:__json}}|param1|param2}} Harro Kremer (talk) 21:20, 3 January 2023 (UTC)Reply

  • Since __json is not a string but a Lua table, it is only accessible from Lua (Scribunto is needed) and not from wikitext:
-- Fetch and decode the JSON; 'parsed' is a Lua table (or nil on failure).
local parsed, errors = mw.ext.externalData.getExternalData{ url = 'https://jisho.org/api/v1/search/words?keyword="先生"' }
if parsed then
    -- __json holds the whole decoded response as a nested Lua table.
    local json = parsed.__json
end

Alexander Mashin talk 03:46, 9 January 2023 (UTC)Reply
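For the original use case (turning the fetched JSON into wikitext without a separate PHP extension, while still benefiting from External Data's caching), a Scribunto module along the following lines may be enough. This is only a rough sketch: the data and slug fields are assumptions about the Jisho response rather than anything External Data defines, the module name is made up, and the format parameter is assumed to mirror the one of {{#get_web_data:}}.

-- Module:JishoList (hypothetical name); a sketch, not a tested implementation.
local p = {}

function p.list( frame )
    -- Fetch and decode the JSON; External Data handles the caching.
    local parsed = mw.ext.externalData.getExternalData{
        url = 'https://jisho.org/api/v1/search/words?keyword="先生"',
        format = 'json'
    }
    if not parsed or not parsed.__json then
        return "''no data''"
    end
    local lines = {}
    -- __json is the whole decoded response as a nested Lua table;
    -- 'data' and 'slug' are assumed field names of the Jisho API.
    for _, entry in ipairs( parsed.__json.data or {} ) do
        lines[#lines + 1] = '* ' .. ( entry.slug or '' )
    end
    return table.concat( lines, '\n' )
end

return p

It would be invoked from wikitext with something like {{#invoke:JishoList|list}}, and the keyword and fields to show could be passed through frame.args instead of being hard-coded.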


Using External Data in a Template

I want to fill a custom infobox template with data from External Data. The data will reside in a CSV file matching the non-namespace part of the importing page.

mynamespace:mypage content: {{template:infobox myinfobox}}
template:myinfobox content: {{infobox | header1 = header | data1 = {{#external_value:value_name|source=data|format=csv with header|delimiter=;|file name={{PAGENAME}}.csv}}
However, this tries to fetch data from a CSV file named "myinfobox" instead. What is a better way to achieve this?

Is the template called "infobox myinfobox", or just "myinfobox"? And do you see the problem on the "mypage" page, or right on the template page? Yaron Koren (talk) 00:34, 10 January 2023 (UTC)Reply
My template page is called "Vorlage:Infobox_VM". The problem occurs on "mypage", as the template page doesn't really have any visible content, nor does it have the CSV providing the information. "mypage" tries to read from a CSV called Vorlage:Infobox_VM.csv instead of mypage.csv.
  • This looks like a strange way to invoke a template: {{template:infobox myinfobox}}. Why not just {{myinfobox}}? And, as said above, fetching data from myinfobox.csv is to be expected on the template page itself, unless the code is wrapped in <includeonly>...</includeonly>.
    Alexander Mashin talk 02:11, 10 January 2023 (UTC)Reply
Yes, you're right; I tried being verbose for clarity. On my page I have only {{Infobox VM}}, and that seems to work well enough. I'm not really concerned about which CSV is being read on the template page itself, as that page isn't really meant to be looked at.
I want "mypage" to look at mypage.csv through the template; that's what doesn't work.
  • What are the settings for the data source data in LocalSettings.php ($wgExternalDataSources['data'])?
$wgExternalDataSources['data']['path'] = '/opt/infradb-data/vm/';
That path and all the files in it are world-readable, and that setup works in principle: I am able to pull info from the CSV, just not via the page name in the template.

Slightly different csv data files, only one works

I have two 48-line CSV files, each under 10 KB, with only a minor difference between them, but only one of them can be read by External Data under MediaWiki 1.39.

Simplified case https://johnbray.org.uk/expounder/Extdataproblem1 uses

get_web_data: url=https://files.johnbray.org.uk/Documents/Expounder/Q/532/9928/datagood and then
get_web_data: url=https://files.johnbray.org.uk/Documents/Expounder/Q/532/9928/databad 

checking that {{#external_value:ddescription}} contains something from a line of the CSV file. datagood works, but databad does not. The difference between the files is a few characters on one line, and the good file is actually longer than the bad one.

< "item",+1996-04-05T00:00:00Z,+1996-04-08T00:00:00Z,"{{link|Q111529509|Evolution}}","in Heathrow wit h {{link|Q312405|Vernor Vinge}}, {{link|Q472872|Jack Cohen}}, {{link|Q2927188|Bryan Talbot}}, {{link| Q7151782|Paul Kincaid}}, {{link|Q742918|Colin Greenland}}, {{link|Q62625219|Maureen Kincaid Speller}} ,","","","","","","",51.4673,-0.4529

> "item",+1996-04-05T00:00:00Z,+1996-04-08T00:00:00Z,"{{link|Q111529509|Evolution}}","in Heathrow wit h {{link|Q312405|Vernor Vinge}}, {{link|Q472872|Jack Cohen}}, {{link|Q2927188|Bryan Talbot}}, {{link| Q7151782|Paul Kincaid}}, {{link|Q742918|Colin Greenland}}AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAB BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBA,","","","","","","",51.4673,-0.4529

Both files are < 10k in size, 48 lines, and the good file is actually longer than the bad

wc datagood databad

 48   338  9245 datagood 
 48   341  9180 databad Vicarage (talk) 14:05, 13 January 2023 (UTC)Reply

What is the best practice to fetch many values (>300) from the same place?

Is it better to use the legacy method, like #get_web_data, and fetch all the values at once, then display them using #external_value; or is it better to use the new method, #external_value with the source parameter, 300 times? What performance considerations might there be?

Jeremi Plazas (talk) 17:51, 17 January 2023 (UTC)Reply

  • If you use caching, the difference is not that big. Using {{#get_web_data:}} will save cache lookups, but the legacy mode will stop working once MediaWiki is upgraded to use Parsoid, since Parsoid does not guarantee parsing order. The optimal solution is to handle data fetching and display with one Lua function, where you can save the fetched data into a variable and later display it (a sketch follows at the end of this thread).
    Alexander Mashin talk 03:22, 18 January 2023 (UTC)Reply
    Thanks, we'll look into Lua. We do have caching set up, so the standalone method might be fine, now that you've helped us iron out the kinks. Thanks again for the help! Jeremi Plazas (talk) 17:54, 19 January 2023 (UTC)Reply
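As an illustration of the single-Lua-function approach suggested above (fetch once, keep the parsed table in a local variable, and render as many values as needed from it), here is a rough sketch. The URL is a placeholder, and the Lua accessor is assumed to take the same url/format parameters as in the earlier example on this page.

-- A sketch only: one fetch, any number of reads, no per-value cache lookups.
local p = {}

function p.allValues( frame )
    local parsed = mw.ext.externalData.getExternalData{
        url = 'https://example.org/many-values.json',  -- placeholder URL
        format = 'json'
    }
    if not parsed or not parsed.__json then
        return "''no data''"
    end
    local out = {}
    -- All 300+ values come from this one table kept in a local variable.
    for name, value in pairs( parsed.__json ) do
        if type( value ) ~= 'table' then
            out[#out + 1] = string.format( "* '''%s''': %s", name, tostring( value ) )
        end
    end
    table.sort( out )
    return table.concat( out, '\n' )
end

return p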


Some JSONPATH doesn’t retrieve results

I have a JSON file with information about network interfaces in it. I am retrieving this information with get_file_data and two JSONPath expressions. However, only one of them seems to be executed/filled with data.

This is my JSON:
[{"ip-addresses": [{"prefix": 24, "ip-address": "1.1.1.1", "ip-address-type": "ipv4"}, {"ip-address-type": "ipv6", "ip-address": "fe80::aaaa:aaaa:aaaa:aaaa", "prefix": 64}], "hardware-address": "ab:ab:ab:ab:ab:ab", "name": "eth0"}, {"ip-addresses": [{"ip-address": "1.1.1.2", "ip-address-type": "ipv4", "prefix": 24}, {"ip-address-type": "ipv6", "ip-address": "fe80::ccff:ceff:feff:ceff", "prefix": 64}], "hardware-address": "ce:ff:ce:ff:ce:ff", "name": "eth1"}]
And this is the template code I use to retrieve the data:
{{#get_external_data: source=host_data |file name={{{{{ucfirst:{{{hostname}}}}}|dummy}}}_net.json |data=mymacaddress=$[?(@.name == '{{{interfacename}}}')].hardware-address, adressen=$[?(@.name == '{{{interfacename}}}')][ip-addresses][*].ip-address |use jsonpath |format=json }} {{{hostname}}} {{{interfacename}}} ({{#external_value:mymacaddress|invalid}}) {{#display_external_table: template=bulleted list|data=1=adressen}}
I call this template in the following fashion:
{{#display_external_table: template=Interface |data=interfacename={{{netnames}}},hostname={{{filename}}} }}
where netnames is just an array, usually with only one entry like ["ens18"], and filename points to the correct file.

  • If {{{netnames}}} is an array, I don't know how transcluding it within single quotes in a JsonPath query could work. Neither $[?(@.name == '["ens18"]')].hardware-address, nor $[?(@.name == '["eth1"]')].hardware-address is a working JsonPath. mymacaddress=$[?(@.name == 'eth1')].hardware-address would be. I would suggest replacing == with in, but this does not seem to be implemented.
    You could try a regular expression: $[?(@.name =~ 'eth0|eth1')].ip-addresses.[*].ip-address.
    Also, you can get the bulleted list without a template:
{{#for_external_table:|
 * {{{adressen}}}}}

Alexander Mashin talk 06:23, 19 January 2023 (UTC)Reply

Following up on this, I found some mistakes in the JSONPath.
Keys containing a "-" character need to be specified with this syntax: ["key-with-dash"].
So I changed that, but I still cannot retrieve the IP addresses.
This is my current JSONPath:
<code><nowiki>ipadressen=$.qemu_agent_interfaces[?(@["hardware-address"] != "00:00:00:00:00:00" )]["ip-addresses"][?(@["ip-address-type"] == "ipv4")]["ip-address"]</nowiki></code> 24.134.95.253 20:42, 29 April 2024 (UTC)Reply
The trick is to not have any spaces in the JSONPath. 77.22.6.114 10:26, 1 May 2024 (UTC)Reply
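If the JSONPath keeps being fragile, an alternative is to skip it entirely and filter the parsed table in Lua. The sketch below makes some assumptions: the JSON has the top-level array shape shown earlier in this thread (if it is nested under a qemu_agent_interfaces key, as the later JSONPath suggests, iterate parsed.__json['qemu_agent_interfaces'] instead), and the Lua accessor is assumed to accept the same source and file name parameters as the {{#get_external_data:}} call above.

-- Sketch: collect the IPv4 addresses of one interface by walking the parsed table.
local p = {}

function p.ipv4( frame )
    local wanted = frame.args.interfacename or 'eth0'
    local parsed = mw.ext.externalData.getExternalData{
        source = 'host_data',  -- assumed to accept the same parameters as {{#get_external_data:}}
        ['file name'] = ( frame.args.hostname or 'dummy' ) .. '_net.json',
        format = 'json'
    }
    if not parsed or not parsed.__json then
        return ''
    end
    local addresses = {}
    for _, iface in ipairs( parsed.__json ) do
        if iface.name == wanted then
            for _, addr in ipairs( iface['ip-addresses'] or {} ) do
                if addr['ip-address-type'] == 'ipv4' then
                    addresses[#addresses + 1] = '* ' .. addr['ip-address']
                end
            end
        end
    end
    return table.concat( addresses, '\n' )
end

return p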

Strange bug parsing CSV with pipe character

I have a field called title which is used in #for_external_table. All of the values have at least one pipe character, and the page cuts off everything before and including the first pipe character. The strange thing is that it only happens if {{{title}}} is located in a specific place.

The page can be seen at https://comprehensibleinputwiki.org/wiki/Mandarin_Chinese/Videos and the external data is at https://comprehensibleinputwiki.org/wiki/Data:Mandarin_Chinese/Videos. If you look at the wiki source of the first link, I have {{{title}}} twice, one of them in a hidden div. If you view the source of the page, the first one is missing part of the value, while the hidden one is complete. Dimpizzy (talk) 20:08, 22 January 2023 (UTC)Reply

  • Add delimiter=, to {{#get_web_data:}}.
    Alexander Mashin talk 02:01, 23 January 2023 (UTC)Reply
    It didn't seem to change anything. I changed it to:
    {{#get_web_data:url={{fullurl:Data:Mandarin_Chinese/Videos|action=raw}}|format=csv with header|data=language=Language, title=Title, videoId=Video ID, service=Service, level=Level, channel=Channel, index=Index|delimiter=,}} Dimpizzy (talk) 03:00, 23 January 2023 (UTC)Reply
      • At least, the videos are displayed now. If the current problem is the trimmed titles in the "Title" column, this is not directly related to the extension. The beginning of the title is treated as attributes of the <td> tag by the MediaWiki parser. Wrap {{{title}}} in nowiki, like this: {{#tag:nowiki|{{{title}}}}}, to see the first chunk of the title.
        UPD: Or, you can add a second {{!}} before {{{title}}}.
        Alexander Mashin talk 07:21, 23 January 2023 (UTC)Reply
        That worked, thanks! I didn't notice any issues on my end with the videos not displaying before, but good to know! Dimpizzy (talk) 09:36, 23 January 2023 (UTC)Reply


Parameter parsing problems

PHP 8.2.1
MediaWiki 1.38.4
PostgreSQL 14.6
External Data 3.2 (5d30e60) 08:38, 2. Nov. 2022

Hi. I use External Data to retrieve data from a PostgreSQL database. In most cases I use prepared statements, and I noticed that the passing of parameters does not seem to work correctly. Here is a self-contained test to illustrate what I mean.

In the database I have a table with a single column of type text:

SELECT * FROM public.test;
            txt
----------------------------
 a simple text
 another simple text
 a text with, a comma in it

Notice that the lines contain spaces and in one case a comma.

Then I have a search function that receives a parameter of type text and returns a set of text:

SETOF TEXT public.mw_test(p_search TEXT)

In LocalSettings.php, the configuration for this looks like this:

$wgExternalDataSources['wikidoc'] = [
    'server' => 'xxx',
    'type' => 'postgres',
    'name' => 'xxx',
    'user' => 'xxx',
    'password' => 'xxx',
    'prepared'  => [
       'test' => 'SELECT mw_test
                  FROM public.mw_test($1);'
    ]
];

In the wiki page the snippet is as follows:

{{#get_db_data:
  db         = wikidoc
| query      = test
| parameters = with
| data       = documentation=mw_test
}} 
{{#for_external_table:<nowiki />
{{{documentation}}}
}}
{{#clear_external_data:}}

It simply displays what it finds on a line.

What happens is that the list of parameters cannot contain a comma. The snippet as above works fine and returns:

a text with, a comma in it

But something like this does not:

{{#get_db_data:
  db         = wikidoc
| query      = test
| parameters = with, a
| data       = documentation=mw_test
}} 
{{#for_external_table:<nowiki />
{{{documentation}}}
}}
{{#clear_external_data:}}

The error is "Fehler: Es wurden keine Rückgabewerte festgelegt."

It is clear that a comma is used to separate parameters, and that is the reason why this does not work. My question is: how can I pass the whole string "with, a" as a single parameter?

I tried enclosing it in single and double quotes, but this did not help. It leads to this exception:

[6882cc0e426ebdb6cf6911bc] /w/index.php?title=IT/IT_Infrastructure/KOFDB_Uebersicht&action=submit TypeError: EDParserFunctions::formatErrorMessages(): Argument #1 ($errors) must be of type array, null given, called in /home/wiki/application/w/extensions/ExternalData/includes/EDParserFunctions.php on line 98

Any idea what I could do to solve this? Help is very appreciated. Thanks

It looks like I found a way to solve this: I can enclose the whole string in round parentheses and it works.

{{#get_db_data:
  db         = wikidoc
| query      = test
| parameters = (with, a)
| data       = documentation=mw_test
}} 
{{#for_external_table:<nowiki />
{{{documentation}}}
}}
{{#clear_external_data:}}

composer.json

When trying to install dependencies via composer, the current requirement says "composer/installers": "~2.1" on REL1_39 branch. The latest version is currently 2.5.4 which doesn't meet this requirement. Could this be updated to a more permissive requirement? Prod (talk) 18:29, 24 February 2023 (UTC)Reply

This is being addressed in phab:T330485. Prod (talk) 19:39, 13 March 2023 (UTC)Reply

Not pulling the correct result from the json

I have a json file that validates and have used the following code in the page:

{{#get_web_data:url=https://blah.json
| data = id=id,type=type,description=description
| use jsonpath $['Generic']['Defensive']['Wanted']:
| format=json
}}

I have tried all sorts of iterations of the code to get External Data to bring back the result for ['Wanted'], but it will only bring back the first entry in the JSON file, not the ['Wanted'] entry. The path checks out and I have checked it with a validator, so I am not sure what I am doing wrong.

Any help would be appreciated. SyrinxCat (talk) 23:22, 25 April 2023 (UTC)Reply

The correct syntax seems to be:
{{#get_web_data:
url=https://blah.json
| data = id=$.Generic.Defensive.Wanted.id,type=$.Generic.Defensive.Wanted.type,description=$.Generic.Defensive.Wanted.description
| use jsonpath
| format=json
}}
A simpler variant may also work, depending on the JSON contents:
{{#get_web_data:
url=https://blah.json
| data = id=id,type=type,description=description
| format=json
}} Alexander Mashin talk 12:47, 26 April 2023 (UTC)Reply
Thanks for those. The first example returns the error
ID: Error: no local variable "id" has been set.
Type: Error: no local variable "type" has been set.
Description(s): Error: no local variable "description" has been set.
I was calling the results with
  • ID: {{#external_value:id}}
  • Type: {{#external_value:type}}
  • Description(s): {{#external_value:description}}
The second example I had already tried and it has the same problem. SyrinxCat (talk) 16:52, 26 April 2023 (UTC)Reply
Then an example of the JSON would be helpful. Alexander Mashin talk 08:13, 28 April 2023 (UTC)Reply

Read BSON array from MongoDB

submit TypeError EDConnectorMongodb::getValueFromJSONArray(): Argument #1 ($origArray) must be of type array, MongoDB\Model\BSONDocument given

When I put data into {{#get_external_data: source=Mongo |from collection |find query={"email.domain":"some_domain"} |data=email=email.domain}}, I receive the error above. 93.153.250.62 09:47, 18 July 2023 (UTC)Reply

Selecting data via a view is not working anymore and an encoding issue

During testing on MW-1.39.4 (coming from MW-1.35) we noticed that selecting data via a view with {{#get_db_data:}}, as in the code below, stopped working.

The two examples below do exactly the same thing, but the one that selects the data from a view gives "Error: no local variable "TestString" has been set." The actual view code for testview is: SELECT * FROM view.

{{#get_db_data:db = Externalmachine
|from=test
|where=TestID= 2
|data=TestString=TestString
}}

* {{#external_value:TestString}}{{#clear_external_data:}}

{{#get_db_data:db = Externalmachine
|from=testview
|where=TestID= 2
|data=TestString=TestString
}}

* {{#external_value:TestString}}{{#clear_external_data:}}

The result is:

* Test String 2

* Error: no local variable "TestString" has been set.

It seems that in both cases the data is selected from the database because when I add a print statement for $value (ExternalData\includes\connectors\EDConnectorDb.php on line 159 ) the value for both queries is shown. But the {{#external_value:TestString}} does not print the value for the query where the data is coming from the view.


Another problem seems to be the mb_convert_encoding statement in the same file. When running the same query as above, where the result of TestString contains a special character, the PHP print $value; prints Test�String 1 and the following happens:

Internal error

[7f74c5f50b2707f31ceeb4ba] /wiki/Sandbox   ValueError: mb_convert_encoding(): Argument #3 ($from_encoding) must specify at least one encoding

Backtrace:

from C:\Program Files\Apache\htdocs\internalwiki\extensions\ExternalData\includes\connectors\EDConnectorDb.php(159)
 #0 C:\Program Files\Apache\htdocs\internalwiki\extensions\ExternalData\includes\connectors\EDConnectorDb.php(159): mb_convert_encoding()
 #1 C:\Program Files\Apache\htdocs\internalwiki\extensions\ExternalData\includes\connectors\EDConnectorDb.php(140): EDConnectorDb::processField()
 #2 C:\Program Files\Apache\htdocs\internalwiki\extensions\ExternalData\includes\connectors\EDConnectorDb.php(102): EDConnectorDb->processRows()
 #3 C:\Program Files\Apache\htdocs\internalwiki\extensions\ExternalData\includes\EDParserFunctions.php(90): EDConnectorDb->run()
 #4 C:\Program Files\Apache\htdocs\internalwiki\extensions\ExternalData\includes\EDParserFunctions.php(113): EDParserFunctions::get()
 #5 C:\Program Files\Apache\htdocs\internalwiki\extensions\ExternalData\includes\ExternalDataHooks.php(24): EDParserFunctions::fetch()
 #6 C:\Program Files\Apache\htdocs\internalwiki\includes\parser\Parser.php(3437): ExternalDataHooks::{closure}()
 #7 C:\Program Files\Apache\htdocs\internalwiki\includes\parser\Parser.php(3122): Parser->callParserFunction()
 #8 C:\Program Files\Apache\htdocs\internalwiki\includes\parser\PPFrame_Hash.php(275): Parser->braceSubstitution()
 #9 C:\Program Files\Apache\htdocs\internalwiki\includes\parser\Parser.php(2951): PPFrame_Hash->expand()
 #10 C:\Program Files\Apache\htdocs\internalwiki\includes\parser\Parser.php(1609): Parser->replaceVariables()
 #11 C:\Program Files\Apache\htdocs\internalwiki\includes\parser\Parser.php(723): Parser->internalParse()
 #12 C:\Program Files\Apache\htdocs\internalwiki\includes\content\WikitextContentHandler.php(301): Parser->parse()
 #13 C:\Program Files\Apache\htdocs\internalwiki\includes\content\ContentHandler.php(1721): WikitextContentHandler->fillParserOutput()
 #14 C:\Program Files\Apache\htdocs\internalwiki\includes\content\Renderer\ContentRenderer.php(47): ContentHandler->getParserOutput()
 #15 C:\Program Files\Apache\htdocs\internalwiki\includes\Revision\RenderedRevision.php(266): MediaWiki\Content\Renderer\ContentRenderer->getParserOutput()
 #16 C:\Program Files\Apache\htdocs\internalwiki\includes\Revision\RenderedRevision.php(237): MediaWiki\Revision\RenderedRevision->getSlotParserOutputUncached()
 #17 C:\Program Files\Apache\htdocs\internalwiki\includes\Revision\RevisionRenderer.php(221): MediaWiki\Revision\RenderedRevision->getSlotParserOutput()
 #18 C:\Program Files\Apache\htdocs\internalwiki\includes\Revision\RevisionRenderer.php(158): MediaWiki\Revision\RevisionRenderer->combineSlotOutput()
 #19 [internal function]: MediaWiki\Revision\RevisionRenderer->MediaWiki\Revision\{closure}()
 #20 C:\Program Files\Apache\htdocs\internalwiki\includes\Revision\RenderedRevision.php(199): call_user_func()
 #21 C:\Program Files\Apache\htdocs\internalwiki\includes\poolcounter\PoolWorkArticleView.php(91): MediaWiki\Revision\RenderedRevision->getRevisionParserOutput()
 #22 C:\Program Files\Apache\htdocs\internalwiki\includes\poolcounter\PoolWorkArticleViewCurrent.php(97): PoolWorkArticleView->renderRevision()
 #23 C:\Program Files\Apache\htdocs\internalwiki\includes\poolcounter\PoolCounterWork.php(162): PoolWorkArticleViewCurrent->doWork()
 #24 C:\Program Files\Apache\htdocs\internalwiki\includes\page\ParserOutputAccess.php(299): PoolCounterWork->execute()
 #25 C:\Program Files\Apache\htdocs\internalwiki\includes\page\Article.php(714): MediaWiki\Page\ParserOutputAccess->getParserOutput()
 #26 C:\Program Files\Apache\htdocs\internalwiki\includes\page\Article.php(528): Article->generateContentOutput()
 #27 C:\Program Files\Apache\htdocs\internalwiki\includes\actions\ViewAction.php(78): Article->view()
 #28 C:\Program Files\Apache\htdocs\internalwiki\includes\MediaWiki.php(542): ViewAction->show()
 #29 C:\Program Files\Apache\htdocs\internalwiki\includes\MediaWiki.php(322): MediaWiki->performAction()
 #30 C:\Program Files\Apache\htdocs\internalwiki\includes\MediaWiki.php(904): MediaWiki->performRequest()
 #31 C:\Program Files\Apache\htdocs\internalwiki\includes\MediaWiki.php(562): MediaWiki->main()
 #32 C:\Program Files\Apache\htdocs\internalwiki\index.php(50): MediaWiki->run()
 #33 C:\Program Files\Apache\htdocs\internalwiki\index.php(46): wfIndexMain()
 #34 {main}

When also printing $encoding, it is empty, so it seems that the correct encoding is not detected for the string by mb_detect_encoding on line 158. When I add some code to see what the encoding is, it comes back with "Quoted-Printable". I will keep investigating, but this is it for now.

I tested this on a clean install of MW-1.39.4 and where only ExternalData ( version: 3.3-alpha ) is enabled, no other extensions. Thank you. Felipe (talk) 13:04, 27 July 2023 (UTC)Reply

Upgrade the extension and try again.
Alexander Mashin talk 02:51, 28 July 2023 (UTC)Reply
Hello Alexander, thank you for the quick response. The SELECT from a view fault is fixed now, but the encoding fault is not. After some testing I found out that on some table columns (which hold the data retrieved by External Data) my encoding was set to latin1 and not to utf8. In the above-mentioned string there was not a normal space but a non-breaking space. The combination of latin1 and trying to check for UTF-8 made mb_detect_encoding fail to return any value. When the column is encoded as UTF-8 it works just fine. If I am correct, I can fix this completely by using utf8mb4_general_ci instead of utf8mb3_general_ci, but I need to update MariaDB first to be able to do this.
As for $encoding = mb_detect_encoding( $value, 'UTF-8', true ) ?? 'UTF-8';, it will still not return 'UTF-8' when the check "fails", because mb_detect_encoding() returns false rather than null, so the ?? fallback never applies. When you instead force $encoding to UTF-8 and $value is not UTF-8 encoded, you get another fault further "upstream", which is probably not what you want. It is probably better to fail in mb_convert_encoding because $encoding is empty. But that is my humble opinion. Thanks again, Felipe (talk) 11:28, 28 July 2023 (UTC)Reply

Retrieve a wiki page's revision fails with 3.x

With v2.0.1 it was possible to retrieve a wiki page's revision as follows:

{{#get_web_data:url=http://127.0.0.1{{SCRIPTPATH}}/api.php?action=query&prop=revisions&titles={{FULLPAGENAME}}&format=json
  |format=json
  |data=revision_id=revid
}}
{{#external_value:revision_id}}

This does not work anymore with v3.2, regardless of whether $wgExternalDataAllowGetters is set to true or false. Did I miss something? Planetenxin (talk) 18:28, 13 September 2023 (UTC)Reply

  • Works with HTTPS, yet there seem to be problems with HTTP, but only if {{FULLPAGENAME}}, rather than a constant page name, is used. It's strange; I'll investigate it further.
    While I was writing this, I got it. It is a caching issue. The page that you had put the API call on is new, and the API response, without any revisions yet, is stuck in the cache. You may want to reduce the caching time in {{#get_web_data:}}, but it will not go lower than the corresponding configuration setting; so you may need to set a low caching time just for API calls: $wgExternalDataSources['127.0.0.1']['min cache seconds'] = 5;. The alternative is to switch miser mode off and use {{REVISIONID}}.
    Alexander Mashin talk 04:17, 18 September 2023 (UTC)Reply


Upgrade to 1.39.4 produces exception on data retrieval

Using get_program_data under the same conditions that worked in version 1.35.4, this call now throws an exception: [26b84a74db9b6b0329b23d0d] index.php title=Benutzer:abcd/tests action submit TypeError: Argument 1 passed to ...Command::input() must be of the type string, null given, called in ...EDConnectorExe.php on line 184 (omitting the paths, because MediaWiki thinks it's preventing me from spamming links...)


This seems to indicate that no argument is available to the run() function that line 184 is part of. Is this due to some sort of caching? What can I do to fix this?

{{#get_program_data:
    program = phone_number_formatter
  | data = intl=intl,natl=natl,rfc=rfc,e164=e164,nosp=nosp
  | format = INI
  | number = 123123123
 }}

$wgExternalDataSources['phone_number_formatter'] = [
    'command'       => '/usr/bin/python3 /opt/scripts/prettyPhoneNumber.py $number$',
    'params'        => [ 'number'],
    'param filters' => [ 'number' => '/^[0-9+ ]+$/' ],
];
    1. I am glad to read that someone else uses {{#get_program_data:}},
    2. although using an external program to format telephone numbers seems an overkill,
    3. I also thank you for testing the extension under MediaWiki 1.39, for I have not had a chance to do it,
    4. this error is caused by broken backward compatibility in MediaWiki 1.39,
    5. and it is triggered by the fact that you pass the phone number to the Python script as a parameter, not as standard input (is this really the right way?)
    6. I have submitted a patch to fix the issue.
      Alexander Mashin talk 03:02, 20 September 2023 (UTC)Reply
Thanks for the quick help. Will the patch find its way into the tarball distributed by the extension distributor?
The script also adds a flag depending on the country the number is from ;) I may change the script to be a web server instead; it's kinda slow to launch Python so many times.
When I apply the patch I can look at my other usages of this extension, namely the MySQL connector and LDAP, to see if they work correctly. Will report back.
Out of curiosity, how would I pass the number to STDIN, and why do you think it's better? I can do it either way. 77.21.209.173 10:20, 20 September 2023 (UTC)Reply
      1. If the tarball is downloaded from Gerrit or GitHub, and not from the extension distributor, I think it will be updated. Use the master branch.
      2. I meant that this processing could be performed by a Lua script, especially aided by semantic data (flags, for example),
      3. $wgExternalDataSources['phone_number_formatter']['input'] = 'number'; will send the value of the parser function parameter number to program's standard input. This is, more or less, the only choice for long multi-line parameters like dot code for GraphViz. But, perhaps, it is not optimal for short data, like a phone number.
        Alexander Mashin talk 10:48, 20 September 2023 (UTC)Reply
        Ah, I see. I will apply the patch 11 hours from now. I'm using a Python library to do the heavy lifting. It's just quick and dirty and works for us; no time to reimplement the thing in Lua. 77.21.209.173 11:07, 20 September 2023 (UTC)Reply
        Patch works, and so do other methods of data retrieval. Thanks again. 62.72.73.138 09:25, 21 September 2023 (UTC)Reply

Fatal exception of type "TypeError" if get_db_data has no results

Hi, I get the error message "Fatal exception of type "TypeError"" if my query has no results (the whole page does not load).

If I change my WHERE filter to non-existing data I get the error; if I change it to return a result, it works. Even if I remove the WHERE clause and empty the table, I get the error message. In this example this means:

  • WHERE TestZahl = 1 -> works
  • WHERE TestZahl = 2 -> exception
  • with an empty table on SQL Server and without a WHERE clause -> exception


My Environment:

  • Windows Server, Apache, Microsoft SQL Server, PHP8, MediaWiki 1.40, ExternalData 3.2

SQL Server

CREATE TABLE [Wiki].[TestTabelle]([TestText] [nvarchar](50) NULL,[TestZahl] [int] NULL)
INSERT INTO [Wiki].[TestTabelle] VALUES('A',1)

LocalSettings.php

$wgExternalDataSources['connectionname'] = [
    'server' => 'mssqlservername,11431',
    'driver' => 'ODBC Driver 17 for SQL Server',
    'type' => 'odbc',
    'name' => 'databasename',
    'user' => 'wikisqlserveruser',
    'password' => 'wikisqlserverpassword',
    'prepared' => [
        'test' => <<<'ODBC'
            SELECT TestText FROM Wiki.TestTabelle WHERE TestZahl = 2
        ODBC
    ]
];

Wikipage

{{#get_db_data: db = connectionname
  | query=test
  | data=TestText=TestText}}

{| class="wikitable"; width="100%; 
| Test 
{{#for_external_table:<nowiki/>
{{!}}-
{{!}} {{{TestText}}}}}
|}

TomRamm (talk) 10:25, 9 November 2023 (UTC)Reply

In my debug log I found out that this error happens in EDParserFunctions.php.
If there is no result, the get function tries to format the error messages, but the error messages are empty.
I have two solutions that work for me, but neither is really correct, because they suppress all (including real) errors.
The correct solution would be for the get function not to run into an error when the result is empty.
Solution 1
Check whether the connector has error messages for the result.
$connector = EDConnectorBase::getConnector( $name, self::parseParams( $args ), $title );
if ( empty( $connector->errors() ) ) {
    if ( $connector->run() ) {
        return $connector->result();
    }
}
if ( empty( $connector->errors() ) ) {
    return null;
}
return $connector->suppressError() ? null : self::formatErrorMessages( $connector->errors() );
Solution 2
Always return null.
$connector = EDConnectorBase::getConnector( $name, self::parseParams( $args ), $title );
if ( empty($connector->errors()) ) {
    if ( $connector->run() ) {
        return $connector->result();
    }
}
if ( empty($connector->errors()) ) {
    return null;
}
return null;
TomRamm (talk) 08:14, 10 November 2023 (UTC)Reply
  • Fixed in January, therefore, upgrade.
    Alexander Mashin talk 12:20, 14 November 2023 (UTC)Reply
    • Thanks for the tip, that was the solution.
    • The server does not have a direct internet connection, so I used the recommended link on the extension page. However, this link points to a version from October 2022. Since the version number no longer changes, I did not notice that this version is outdated (current and October version are 3.22). The link should be updated so that others do not also download old versions. TomRamm (talk) 09:07, 22 November 2023 (UTC)Reply
      • Well, there will often be fixes and changes that require users to get the latest version of some extension, as opposed to the standard download version. With that said, this may be a good time to release a new version of External Data anyway. @Alex Mashin - what do you think? Yaron Koren (talk) 14:48, 24 November 2023 (UTC)Reply
        • I can't see how setting a new version number could help with obsolete links to the extension archive. Being rather busy at the moment, I am also not going to make significant contributions to the extension soon, except bug fixes. Please take this into account when making a decision about a new version release.
          Alexander Mashin talk 06:58, 26 November 2023 (UTC)Reply
          From my point of view, there are several ways to solve the problem (sometimes in combination).
          1. A regular update of the version number:
          For me as a "user" of your extension, it would be a great help if the version number was updated regularly.
          After installing an extension, I look under Special:Version to see if the extension is installed correctly and if it is the latest version. If this version is the latest, as stated in the description of the extension, I assume that I have installed the latest version (which in this case was not correct and led to my support request after a few hours of error analysis).
          2. Attach a note to the download link:
          The link on the download page could also say "You can download an older version here..."
          3. Change the download link:
          GitHub offers a dynamic link to a zip file that always contains the current master: https://github.com/wikimedia/mediawiki-extensions-ExternalData/archive/refs/heads/master.zip
          4. Use the ExtensionDistributor. Currently you say "not recommended"; instead, the information text could say "please download the current master version, not the offered wiki versions, because ...".
          PS: The link to the ExtensionDistributor should be https://www.mediawiki.org/wiki/Special:ExtensionDistributor/ExternalData TomRamm (talk) 07:04, 27 November 2023 (UTC)Reply
          Okay, I just released a new version of External Data, 3.3. Hopefully there are no major problems with this version! Yaron Koren (talk) 18:46, 27 November 2023 (UTC)Reply
          "External Data is not compatible with the current MediaWiki core (version 1.35.13), it requires: >= 1.37.0.".
          Was it necessary? Version 1.35 is still an LTS release.
          Alexander Mashin talk 13:59, 30 November 2023 (UTC)Reply
          That's true... but only until tomorrow. :) (At least, according to this.) I figured it was okay to drop that compatibility already. If you think it should be kept/restored, though, let me know. Yaron Koren (talk) 14:28, 30 November 2023 (UTC)Reply
          Yes, it should. So far, there are no features in External Data incompatible with 1.35; and for some reasons I am not yet able to upgrade to 1.39 or above; therefore, I will not be able to fix bugs, which I was trying to do.
          Alexander Mashin talk 14:40, 30 November 2023 (UTC)Reply
          Okay, that's good to know. Do you have any idea when you'll be able to upgrade to 1.39 (or higher)? Yaron Koren (talk) 16:46, 30 November 2023 (UTC)Reply
          In the spring; and I have a bug fix that I cannot test without doing something non-standard. So, I am afraid, the commit dropping MW 1.35 support should be reverted for now.
          Alexander Mashin talk 06:37, 1 December 2023 (UTC)Reply
          Okay, I just reverted that change, so now you should be able to use the latest ED code with MW 1.35 again. Yaron Koren (talk) 14:44, 1 December 2023 (UTC)Reply

http header in web request

I am trying to use #get_external_data / #get_web_data to fetch some data from a web server that requires authentication by providing a token in the HTTP headers of the GET request. Unfortunately, I see the extension only offers a url parameter.

Is there any way to insert the header, to simulate a web request like:

curl --header "token:123456789" <my url comes here>

78.11.8.249 15:33, 29 December 2023 (UTC)Reply

  • In LocalSettings.php: $wgExternalDataSources['<your url comes here>']['options']['headers'] = [ 'token' => '123456789' ];
    Alexander Mashin talk 10:58, 31 December 2023 (UTC)Reply
    Thanks for the hint, Alexander. Does this mapping allow putting wildcards in my_url, when I need to fetch a lot of data located under different subtrees and cannot provide a mapping for each single host URL? Something like $wgExternalDataSources['<my_url_comes_here>/api/list/hosts/*']['options']['headers'] = [ 'token' => '123456789' ]; 78.11.8.249 12:53, 2 January 2024 (UTC)Reply
      • You can use full domain or second-level domain name as a key to $wgExternalDataSources: e.g. $wgExternalDataSources['subdomain.example.com'] or $wgExternalDataSources['example.com']
        Alexander Mashin talk 13:42, 2 January 2024 (UTC)Reply
        Hi Alexander, dealing with parts of the domain name itself does not solve it. What I am looking for is a way to define one common token to be used for fetching different JSON data components published under the URLs of one domain, like www.myurl.com/api/lists/hosts/host1/ip or www.myurl.com/api/lists/hosts/host2/hostname, where these subtrees may grow. The whitelisting described on the manual page also seems to need a list of exact entries, with no wildcards, right? 78.11.8.249 15:50, 2 January 2024 (UTC)Reply
        I need to correct myself: removing https:// from the URL string made it work for any suffixes after the domain name. 78.11.8.249 15:57, 2 January 2024 (UTC)Reply