For TimedMediaHandler's video transcoding, we currently use one job per output file; a job can run anywhere from a couple of minutes to many hours. This has been problematic when large floods of uploads need handling simultaneously, or when batch-reprocessing old files for changed output formats.
As part of a planned pivot of output formats from flat files to MPEG-DASH streaming, we'll be able to divide the output into many small files. That gives us the opportunity to split the large jobs up into (potentially a lot of) small jobs, each converting about 10 seconds of video.
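For a sense of scale, here's a quick sketch of how many segment jobs a video would produce at ~10 seconds per segment (the function name and numbers are illustrative, not existing code):

```python
import math

SEGMENT_SECONDS = 10  # assumed target segment duration

def segment_count(duration_seconds: float) -> int:
    """Number of ~10-second segment jobs needed to cover a video."""
    return math.ceil(duration_seconds / SEGMENT_SECONDS)

# A ~83-minute video (5000 s) turns into 500 segment jobs.
print(segment_count(5000))  # → 500
```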
I've still got open questions about how to handle this sort of job. If we have a long video with 500 segments, do we:
- queue up 500 segment jobs and let them drain out?
- or queue up one or a few jobs for the first chunks, which "fan out" to produce the following jobs as they complete?
- or something else?
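To make the "fan out" option concrete, here's a minimal simulation of the idea, assuming a generic FIFO job queue (all names are hypothetical, not TimedMediaHandler or MediaWiki APIs): we seed the queue with the first segment job, and each job, on completion, enqueues a small fixed number of successor jobs until the whole segment range is covered.

```python
from collections import deque

FAN_OUT = 2  # assumed branching factor: each job spawns up to 2 successors

def run_fan_out(total_segments: int) -> list[int]:
    """Simulate fan-out scheduling: each completed segment job enqueues
    its children (segments i*FAN_OUT+1 .. i*FAN_OUT+FAN_OUT), so every
    segment index in [0, total_segments) is scheduled exactly once."""
    queue = deque([0])            # seed with only the first segment job
    completed = []
    while queue:
        seg = queue.popleft()
        completed.append(seg)     # "transcode" segment seg here
        for child in range(seg * FAN_OUT + 1, seg * FAN_OUT + 1 + FAN_OUT):
            if child < total_segments:
                queue.append(child)
    return completed

# All 500 segments get covered, but the queue only ever holds at most
# one "level" of the tree rather than all 500 jobs up front.
print(len(run_fan_out(500)))  # → 500
```

The trade-off versus queueing all 500 jobs at once is that the queue never gets flooded, but a lost or failed job silently strands all of its descendants, so some retry/repair mechanism would be needed.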
Concerns:
- need to be able to cancel if the file is deleted, moved, etc. -- can we remove the queued jobs directly, or do we just have to set a flag that makes them bail out once they get run?
- want to avoid floods of batch jobs, but how do we set priorities? (new files > rework of old files? low resolutions > high resolutions?) The current scheme uses two queues to ensure that low resolutions are processed alongside high resolutions.
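If we can't remove already-queued jobs, the flag-based cancellation above might look something like this minimal sketch (everything here is hypothetical; in practice the flag would live in the database or cache, not a process-local set):

```python
# Hypothetical sketch: jobs check a per-file cancellation flag before
# doing any work, since many queue backends can't delete queued jobs.
cancelled_files: set[str] = set()  # stand-in for a DB/cache-backed flag

def cancel_file(file_key: str) -> None:
    """Mark a file cancelled (e.g. it was deleted or moved)."""
    cancelled_files.add(file_key)

def run_segment_job(file_key: str, segment: int) -> bool:
    """Return True if the segment was transcoded, False if the job
    bailed out because the source file was cancelled."""
    if file_key in cancelled_files:
        return False  # cheap no-op: the stale job drains out of the queue
    # ... transcode the ~10-second segment here ...
    return True

cancel_file("Example.webm")
print(run_segment_job("Example.webm", 0))  # → False
print(run_segment_job("Other.webm", 0))    # → True
```

The upside is that cancellation is just a flag write; the downside is that hundreds of stale segment jobs still occupy queue slots until each one is popped and discarded.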
Any thoughts?