Open
Labels: refactor (Improves code itself, but does not fix a bug or add new functionality.)
Description
Currently, our DeviceInterface is implicitly a video device interface. Inside SingleStreamDecoder, we dispatch differently based on audio and video:
torchcodec/src/torchcodec/_core/SingleStreamDecoder.cpp, lines 1322 to 1327 in 6d72f11:

```cpp
if (streamInfo.avMediaType == AVMEDIA_TYPE_AUDIO) {
  convertAudioAVFrameToFrameOutputOnCPU(avFrame, frameOutput);
} else {
  deviceInterface_->convertAVFrameToFrameOutput(
      avFrame, frameOutput, preAllocatedOutputTensor);
}
```
Ideally, we'd like to turn that branch into just:

```cpp
deviceInterface_->convertAVFrameToFrameOutput(
    avFrame, frameOutput, preAllocatedOutputTensor);
```

That is, we handle audio and video the same way. In order to do that, we need to somehow move `SingleStreamDecoder::convertAudioAVFrameToFrameOutputOnCPU` into a device interface. Design questions:
1. Should we extend `CpuDeviceInterface` to handle audio, in which case we'd dispatch inside of it?
2. Or should we keep `CpuDeviceInterface` video-only, and create a new kind of device interface that is just audio?
I think option 1 might be the better one, but I'm not sure. I'm confident we'll only ever do audio decoding on the CPU.
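
To make option 1 concrete, here is a rough, self-contained sketch of what dispatching inside `CpuDeviceInterface` could look like. It is not the actual torchcodec code: `MediaType`, `Frame`, `FrameOutput`, `convertAudio`, `convertVideo`, and `decodeFrame` are hypothetical stand-ins so the example compiles without FFmpeg, and the `preAllocatedOutputTensor` argument is left out. It also assumes the media type travels with the frame; how the device interface would actually learn the stream's media type is an open question.

```cpp
#include <string>

// Hypothetical stand-ins for the FFmpeg and torchcodec types.
enum class MediaType { Video, Audio };

struct Frame {
  MediaType mediaType;  // assumption: the frame carries its stream's media type
};

struct FrameOutput {
  std::string convertedBy;  // records which conversion path ran
};

class DeviceInterface {
 public:
  virtual ~DeviceInterface() = default;
  virtual void convertAVFrameToFrameOutput(
      const Frame& frame, FrameOutput& frameOutput) = 0;
};

// Option 1: the CPU interface handles both media types and dispatches
// internally, so callers no longer branch on audio vs. video.
class CpuDeviceInterface : public DeviceInterface {
 public:
  void convertAVFrameToFrameOutput(
      const Frame& frame, FrameOutput& frameOutput) override {
    if (frame.mediaType == MediaType::Audio) {
      convertAudio(frame, frameOutput);
    } else {
      convertVideo(frame, frameOutput);
    }
  }

 private:
  // Placeholder for the logic that lives today in
  // SingleStreamDecoder::convertAudioAVFrameToFrameOutputOnCPU.
  void convertAudio(const Frame&, FrameOutput& frameOutput) {
    frameOutput.convertedBy = "audio";
  }

  // Placeholder for the existing CPU video conversion.
  void convertVideo(const Frame&, FrameOutput& frameOutput) {
    frameOutput.convertedBy = "video";
  }
};

// Decoder-side call site: a single call, with no media-type branch.
FrameOutput decodeFrame(DeviceInterface& deviceInterface, const Frame& frame) {
  FrameOutput frameOutput;
  deviceInterface.convertAVFrameToFrameOutput(frame, frameOutput);
  return frameOutput;
}
```

With this shape, the decoder call site collapses to the single call we want, and the audio/video branch moves behind the interface. Option 2 would instead add a separate audio-only subclass of `DeviceInterface` and select it when the stream is opened.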