I couldn't find this mentioned anywhere; apologies if I missed it. I'm wondering how an MCR-aware workflow, such as saving media with structured info, will behave if extracting that structured info takes a non-zero amount of time: for example, if we need to open the media file, calculate some metadata, and add it to the media info content slot. It seems that we would want to store the revision and its main slot content, then kick off a background job which will come back and write media info to its own slot. Will this be in a new revision that somehow points to the old revision, or will it add the slot content to the old revision?
Topic on Talk:Requests for comment/Multi-Content Revisions
Revisions are immutable: once a revision exists, its content cannot be modified. So, if new information becomes known after a revision has been created, it has to be stored in a new revision. That doesn't mean copying any information: the content of slots that are not modified in a given revision is "inherited" from the parent revision.
So, short answer: if extracting some kind of information during upload has to be done asynchronously because it takes a non-trivial amount of time, storing it will create a new revision.
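To make the inheritance point concrete, here is a minimal toy model in Python. It is only a sketch and not the MediaWiki API: the `Revision` class, the `create_revision` helper, the slot role names, and the `blob:` addresses are all hypothetical. It assumes a revision simply maps slot roles to content addresses, so an inherited slot points at the same stored content as in the parent.

```python
from dataclasses import dataclass
from types import MappingProxyType
from typing import Mapping, Optional


@dataclass(frozen=True)
class Revision:
    """Hypothetical immutable revision: slot role -> content address."""
    rev_id: int
    parent_id: Optional[int]
    slots: Mapping[str, str]


def create_revision(rev_id, parent, changed_slots):
    """Build a new revision; slots not in changed_slots are inherited."""
    inherited = dict(parent.slots) if parent is not None else {}
    inherited.update(changed_slots)
    parent_id = parent.rev_id if parent is not None else None
    # MappingProxyType gives a read-only view, so slots cannot be edited later.
    return Revision(rev_id, parent_id, MappingProxyType(inherited))


# The upload creates revision 1 with only the main slot.
upload = create_revision(1, None, {"main": "blob:1"})

# The background job finishes later and stores its result in a NEW revision.
extracted = create_revision(2, upload, {"mediainfo": "blob:2"})

print(extracted.slots["main"])        # same address as in revision 1
print("mediainfo" in upload.slots)    # revision 1 itself is unchanged
```

In this sketch the second revision reuses the address of the main slot content rather than copying it, and the first revision never gains a second slot.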
Longer answer:
Modifiable slots for derived data were considered in the original MCR proposal, and may be added in the future. But a slot can be either human-editable or modifiable, not both. So the extracted data would have to live in yet another slot, not the MediaInfo slot. Information from that slot could then be merged with the human-editable information in the MediaInfo slot to define "virtual statements".
Another wrinkle: perhaps we don't need a modifiable slot, since we already have a place to store a metadata blob: img_metadata. This is not versioned and can be updated at will, and MediaInfo could use it to expose virtual statements.
Virtual statements for extracted data seem like a really nice idea, but a lot of nitty-gritty UI stuff needs to be sorted out to make them work. The idea has been floated several times, but it is not mature.
That's helpful, thank you for the explanation. So it seems tricky to retrieve a historical revision along with extracted data, since I would need to know the revision at which the main slot modification of interest happens, then search forwards for the next derived slot edit which happens before another main slot edit.
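The forward search described above could look roughly like the following Python sketch. Everything here is hypothetical: the `Rev` tuple, the `find_derived_revision` function, and the `"derived"` slot role are made up for illustration, and the sketch assumes the history is available in order with the set of slot roles each revision changed.

```python
from typing import FrozenSet, NamedTuple, Optional, Sequence


class Rev(NamedTuple):
    """Hypothetical history entry: revision ID plus the slot roles it changed."""
    rev_id: int
    changed: FrozenSet[str]


def find_derived_revision(
    history: Sequence[Rev],
    main_rev_id: int,
    main_role: str = "main",
    derived_role: str = "derived",
) -> Optional[Rev]:
    """Return the first later revision that changes the derived slot,
    stopping at the next main slot edit. Returns None if there is none."""
    ids = [rev.rev_id for rev in history]
    start = ids.index(main_rev_id)  # raises ValueError for an unknown ID
    for rev in history[start + 1:]:
        if main_role in rev.changed:
            # A newer main slot edit: any derived data from here on
            # belongs to that edit, not to the one we asked about.
            return None
        if derived_role in rev.changed:
            return rev
    return None


history = [
    Rev(1, frozenset({"main"})),     # upload
    Rev(2, frozenset({"derived"})),  # background job for revision 1
    Rev(3, frozenset({"main"})),     # re-upload, job never finished
    Rev(4, frozenset({"main"})),     # another re-upload
    Rev(5, frozenset({"derived"})),  # background job for revision 4
]

print(find_derived_revision(history, 1))
print(find_derived_revision(history, 3))
print(find_derived_revision(history, 4))
```

The second lookup comes back empty because revision 4 replaces the main slot before any derived data was written for revision 3, which is the awkward case in this scheme.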
This is all a hypothetical use case of course, so it isn't urgent or anything, I'm just hoping to better understand the internals.