Skip to content

[ECCV 2024] Learning Video Context as Interleaved Multimodal Sequences

Notifications You must be signed in to change notification settings

showlab/MovieSeq

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

14 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

MovieSeq (ECCV'24)

Learning Video Context as Interleaved Multimodal Sequences
Kevin Qinghong Lin, Pengchuan Zhang, Difei Gao, Xide Xia, Joya Chen, Ziteng Gao, Jinheng Xie, Xuhong Xiao, Mike Zheng Shou

overview

TL;DR: MovieSeq aim to enhance Large Multimodal Models for improved Video In-Context Learning using Interleaved Multimodal Sequences (e.g., character photo, human dialogues, etc).

NOTE: Recognize the baseline used in the paper LLama2 is quite old, we have developed MovieSeq-4o -- lightweight practical code that can be easily integrated into existing LMMs (e.g., GPT-4o) for easy usage.

MovieSeq-4o connects Whisper, Character images, and Video Frames to build a good video context, it can easily integrate into other VLM or APIs (such as Gemini, Claude, etc) on your own videos!

Environments

conda create --name movieseq python=3.10
conda activate movieseq
conda install pytorch==2.0.0 torchaudio==2.0.0 pytorch-cuda=11.8 -c pytorch -c nvidia

pip install git+https://github.com/m-bain/whisperx.git
pip install tqdm moviepy openai opencv-python

Guideline

Please refer to example.ipynb to learn how MovieSeq works. Have fun!

BibTeX

If you find our work helpful, please kindly consider citing our paper. Thank you!

@inproceedings{lin2024learning,
  title={Learning video context as interleaved multimodal sequences},
  author={Lin, Kevin Qinghong and Zhang, Pengchuan and Gao, Difei and Xia, Xide and Chen, Joya and Gao, Ziteng and Xie, Jinheng and Xiao, Xuhong and Shou, Mike Zheng},
  booktitle={European Conference on Computer Vision},
  pages={375--396},
  year={2024},
  organization={Springer}
}

About

[ECCV 2024] Learning Video Context as Interleaved Multimodal Sequences

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published