Learning Video Context as Interleaved Multimodal Sequences
Kevin Qinghong Lin, Pengchuan Zhang, Difei Gao, Xide Xia, Joya Chen, Ziteng Gao, Jinheng Xie, Xuhong Xiao, Mike Zheng Shou
TL;DR: MovieSeq aims to enhance Large Multimodal Models for improved Video In-Context Learning by using Interleaved Multimodal Sequences (e.g., character photos, human dialogues, etc.).
NOTE: Since the baseline used in the paper (Llama 2) is now quite old, we have developed MovieSeq-4o, lightweight and practical code that can be easily integrated into existing LMMs (e.g., GPT-4o).
MovieSeq-4o connects Whisper transcripts, character images, and video frames to build a rich video context, and it can easily be integrated with other VLMs or APIs (such as Gemini, Claude, etc.) on your own videos!
```bash
conda create --name movieseq python=3.10
conda activate movieseq
conda install pytorch==2.0.0 torchaudio==2.0.0 pytorch-cuda=11.8 -c pytorch -c nvidia
pip install git+https://github.com/m-bain/whisperx.git
pip install tqdm moviepy openai opencv-python
```
Please refer to example.ipynb to learn how MovieSeq works.
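Below is a minimal sketch of how the MovieSeq-4o idea can be wired together with the packages installed above (whisperx, opencv-python, and the openai client): transcribe the dialogue, sample a few frames, and interleave character photos, dialogue, and frames into a single multimodal request. It is illustrative rather than the repository's exact code; helper names such as build_interleaved_content, the sample file names (clip.mp4, alice.jpg), and the choice of 8 uniformly sampled frames are assumptions for the example — see example.ipynb for the actual implementation.

```python
# Illustrative sketch of the MovieSeq-4o pipeline (not the repo's exact code):
# whisperx for timed dialogue, OpenCV for frame sampling, GPT-4o for reasoning
# over an interleaved sequence of character photos, dialogue, and frames.
import base64

import cv2
import whisperx
from openai import OpenAI


def encode_image(image):
    """Return a base64 JPEG string from an image file path or a BGR numpy frame."""
    if isinstance(image, str):
        image = cv2.imread(image)
    ok, buf = cv2.imencode(".jpg", image)
    assert ok, "JPEG encoding failed"
    return base64.b64encode(buf.tobytes()).decode("utf-8")


def transcribe(video_path, device="cuda"):
    """Run whisperx on the video's audio track and return timed dialogue segments."""
    model = whisperx.load_model("large-v2", device, compute_type="float16")
    audio = whisperx.load_audio(video_path)
    result = model.transcribe(audio, batch_size=16)
    return result["segments"]  # each segment has "start", "end", "text"


def sample_frames(video_path, num_frames=8):
    """Uniformly sample frames from the video and return them as base64 JPEGs."""
    cap = cv2.VideoCapture(video_path)
    total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    frames = []
    for i in range(num_frames):
        cap.set(cv2.CAP_PROP_POS_FRAMES, int(i * total / num_frames))
        ok, frame = cap.read()
        if ok:
            frames.append(encode_image(frame))
    cap.release()
    return frames


def build_interleaved_content(characters, segments, frames, question):
    """Interleave character photos, dialogue, and video frames into one prompt."""
    content = [{"type": "text", "text": "Characters appearing in this clip:"}]
    for name, photo_path in characters.items():
        content.append({"type": "text", "text": f"This is {name}:"})
        content.append({"type": "image_url",
                        "image_url": {"url": f"data:image/jpeg;base64,{encode_image(photo_path)}"}})
    dialogue = "\n".join(f"[{s['start']:.1f}s - {s['end']:.1f}s] {s['text'].strip()}"
                         for s in segments)
    content.append({"type": "text", "text": f"Dialogue:\n{dialogue}"})
    content.append({"type": "text", "text": "Uniformly sampled video frames:"})
    for b64 in frames:
        content.append({"type": "image_url",
                        "image_url": {"url": f"data:image/jpeg;base64,{b64}"}})
    content.append({"type": "text", "text": question})
    return content


if __name__ == "__main__":
    video = "clip.mp4"                                     # your own video (hypothetical path)
    characters = {"Alice": "alice.jpg", "Bob": "bob.jpg"}  # character reference photos
    content = build_interleaved_content(
        characters,
        transcribe(video),            # assumes a CUDA GPU for whisperx float16 inference
        sample_frames(video),
        "Who is speaking in the final scene, and what are they doing?",
    )
    client = OpenAI()  # expects OPENAI_API_KEY in the environment
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": content}],
    )
    print(response.choices[0].message.content)
```

The same interleaved content list can be adapted to other providers (e.g., Gemini or Claude) by converting it to their respective message formats.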
Have fun!
If you find our work helpful, please consider citing our paper. Thank you!
```bibtex
@inproceedings{lin2024learning,
  title={Learning video context as interleaved multimodal sequences},
  author={Lin, Kevin Qinghong and Zhang, Pengchuan and Gao, Difei and Xia, Xide and Chen, Joya and Gao, Ziteng and Xie, Jinheng and Xiao, Xuhong and Shou, Mike Zheng},
  booktitle={European Conference on Computer Vision},
  pages={375--396},
  year={2024},
  organization={Springer}
}
```