Vladislav Bargatin · Egor Chistov · Alexander Yakovenko · Dmitriy Vatolin
📄 Paper | 🌐 Project Page | 🚀 Colab | 🤗 Demo | 📦 Models
MEMFOF is a memory-efficient optical flow method for Full HD video that combines high accuracy with low VRAM usage.
Given a video sequence, our code can estimate its optical flow. Visit the demo page and try it with your own video.
🏞️ For real-world videos, prefer the MEMFOF-Tartan-T-TSKH model, which is trained with greater diversity and robustness in mind.
Install MEMFOF via the package manager:
```bash
pip3 install git+https://github.com/msu-video-group/memfof
```

Then use the following snippet to compute backward and forward optical flow for three consecutive frames:
```python
import torch

from memfof import MEMFOF

device = "cuda" if torch.cuda.is_available() else "cpu"
model = MEMFOF.from_pretrained("egorchistov/optical-flow-MEMFOF-Tartan-T-TSKH").eval().to(device)

with torch.inference_mode():
    # [B=1, T=3, C=3, H=1080, W=1920]
    example_input = torch.randint(0, 256, [1, 3, 3, 1080, 1920], device=device)
    # Each flow tensor is [B=1, C=2, H=1080, W=1920]
    backward_flow, forward_flow = model(example_input)["flow"][-1].unbind(dim=1)
```

Refer to the demo notebook for a quick start.
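To feed real frames rather than random data, a minimal sketch using torchvision is shown below. The video path is hypothetical, and the use of read_video and flow_to_image is an assumption layered on top of the API above, which only confirms the [B=1, T=3, C=3, H, W] input layout and the ["flow"][-1] output:

```python
import torch
from torchvision.io import read_video
from torchvision.utils import flow_to_image

from memfof import MEMFOF

device = "cuda" if torch.cuda.is_available() else "cpu"
model = MEMFOF.from_pretrained("egorchistov/optical-flow-MEMFOF-Tartan-T-TSKH").eval().to(device)

# Hypothetical input file; any clip with at least three frames works.
frames, _, _ = read_video("example.mp4", output_format="TCHW", pts_unit="sec")

with torch.inference_mode():
    # First three frames stacked into the expected [B=1, T=3, C=3, H, W] layout.
    # read_video yields uint8 frames; this sketch assumes the model accepts
    # the same 0-255 integer range as the snippet above.
    triplet = frames[:3].unsqueeze(0).to(device)
    backward_flow, forward_flow = model(triplet)["flow"][-1].unbind(dim=1)

# flow_to_image maps a [2, H, W] flow field to an RGB visualization.
visualization = flow_to_image(forward_flow[0].cpu())
```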
Available pretrained checkpoints:

- MEMFOF-Tartan
- MEMFOF-Tartan-T
- MEMFOF-Tartan-T-TSKH (✅ Best for real videos)
- MEMFOF-Tartan-T-TSKH-kitti
- MEMFOF-Tartan-T-TSKH-sintel
- MEMFOF-Tartan-T-TSKH-spring
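To try one of the benchmark-finetuned checkpoints instead, swapping the from_pretrained identifier should suffice. The sketch below assumes the Hugging Face repo IDs follow the same egorchistov/optical-flow-* naming pattern as the checkpoint used above:

```python
from memfof import MEMFOF

# Assumed repo ID, following the naming pattern of the checkpoint above.
model = MEMFOF.from_pretrained("egorchistov/optical-flow-MEMFOF-Tartan-T-TSKH-sintel").eval()
```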
To train, evaluate, or submit MEMFOF, you’ll need the dev installation. Run the following commands:
```bash
git clone https://github.com/msu-video-group/memfof.git
cd memfof
pip3 install --editable .[dev]
```

To train MEMFOF, you will need to download the required datasets: FlyingThings3D, Sintel, KITTI, HD1K, TartanAir, and Spring.
By default, datasets.py searches for the datasets in the locations shown below. You can create symbolic links in the datasets folder pointing to wherever you downloaded the datasets (a sketch follows the tree).
```
├── datasets
    ├── Sintel
    ├── KITTI
    ├── FlyingThings3D
    ├── HD1K
    ├── Spring
        ├── test
        ├── train
    ├── TartanAir
```

Please refer to eval.sh and submission.sh for more details.
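As a sketch of the symlink layout described above, using only the Python standard library (all download paths are hypothetical):

```python
from pathlib import Path

# Hypothetical locations where the datasets were downloaded.
downloads = {
    "Sintel": "/data/downloads/Sintel",
    "KITTI": "/data/downloads/KITTI",
    "FlyingThings3D": "/data/downloads/FlyingThings3D",
    "HD1K": "/data/downloads/HD1K",
    "Spring": "/data/downloads/Spring",
    "TartanAir": "/data/downloads/TartanAir",
}

datasets = Path("datasets")
datasets.mkdir(exist_ok=True)
for name, target in downloads.items():
    link = datasets / name
    if not link.exists():
        link.symlink_to(target)  # equivalent to: ln -s <target> datasets/<name>
```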
Our training setup is configured to use a fixed effective batch size across 4 nodes with 8 GPUs each.
You can train the model with fewer resources without altering the configs, but if you encounter out-of-memory (OOM) errors, try increasing the accumulate_grad_batches parameter in the configs. For example, set it to 4 when training on a single node with 8 GPUs, as sketched below.
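To see why 4 compensates for the missing nodes, here is a minimal sketch of the batch-size arithmetic, assuming the configs derive the per-GPU batch from the fixed effective batch size (the concrete numbers are illustrative):

```python
# Illustrative only: the real effective batch size lives in the configs.
effective_batch = 32

# Reference setup: 4 nodes x 8 GPUs, accumulate_grad_batches=1.
per_gpu_reference = effective_batch // (4 * 8 * 1)  # -> 1

# Single node with 8 GPUs: accumulate_grad_batches=4 keeps the per-GPU
# batch, and hence peak memory, the same as in the reference setup.
per_gpu_single_node = effective_batch // (1 * 8 * 4)  # -> 1

assert per_gpu_reference == per_gpu_single_node
```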
By default, training is configured for benchmark submissions and uses the *-full versions of the datasets. For experiments and ablations, however, switch to the versions without the -full suffix, which hold out a separate validation set.
Our training script is optimized for the Slurm workload manager. A typical submission script looks like this:
```bash
#!/bin/bash
# (submit.sh)
#SBATCH --nodes=4
#SBATCH --gres=gpu:8
#SBATCH --ntasks-per-node=8
#SBATCH --cpus-per-task=16
srun bash train.sh
```

Alternatively, multi-node training is also supported via other launch methods, such as torchrun:
```bash
OMP_NUM_THREADS=16 torchrun \
  --nproc_per_node=8 \
  --nnodes=4 \
  --node_rank <NODE_RANK> \
  --master_addr <MASTER_ADDR> \
  --master_port <MASTER_PORT> \
  --no-python bash train.sh
```

For more details, refer to the PyTorch Lightning documentation.
We use Weights & Biases (WandB) for experiment tracking by default. To disable logging, set the environment variable:
```bash
export WANDB_MODE=disabled
```

Feel free to open an issue if you have any questions.
```bibtex
@article{bargatin2025memfof,
  title={MEMFOF: High-Resolution Training for Memory-Efficient Multi-Frame Optical Flow Estimation},
  author={Bargatin, Vladislav and Chistov, Egor and Yakovenko, Alexander and Vatolin, Dmitriy},
  journal={arXiv preprint arXiv:2506.23151},
  year={2025}
}
```
This project relies on code from existing repositories: SEA-RAFT, VideoFlow, and GMA. We thank the original authors for their excellent work.