Code2Video: Video Generation via Code

English | 简体中文

Code2Video: A Code-centric Paradigm for Educational Video Generation

Yanzhe Chen*, Kevin Qinghong Lin*, Mike Zheng Shou
Show Lab @ National University of Singapore

📄 Paper | 🤗 Daily Paper | 🤗 Dataset | 🌐 Project Website | 💬 X (Twitter)

Demo video: code2video_light.mp4

Side-by-side comparison videos (columns: Veo3, Wan2.2, Code2Video (Ours)) for three learning topics:

  • Hanoi Problem
  • Large Language Model
  • Pure Fourier Series


🔥 Update

Contributions are welcome!

  • [2025.10.11] Due to issues with IconFinder, we have made the icons auto-collected by Code2Video available at MMMC as a temporary alternative.
  • [2025.10.6] We have updated the ground-truth human-made videos and metadata for the MMMC dataset.
  • [2025.10.3] Thanks to @_akhaliq for sharing our work on Twitter!
  • [2025.10.2] We released the arXiv paper, code, and dataset.
  • [2025.9.22] Code2Video has been accepted to the Deep Learning for Code (DL4C) Workshop at NeurIPS 2025.


🌟 Overview

(Overview figure)

Code2Video is an agentic, code-centric framework that generates high-quality educational videos from knowledge points.
Unlike pixel-based text-to-video models, our approach leverages executable Manim code to ensure clarity, coherence, and reproducibility.

Key Features:

  • 🎬 Code-Centric Paradigm — executable code as the unified medium for both temporal sequencing and spatial organization of educational videos.
  • 🤖 Modular Tri-Agent Design — Planner (storyboard expansion), Coder (debuggable code synthesis), and Critic (layout refinement with anchors) work together for structured generation.
  • 📚 MMMC Benchmark — the first benchmark for code-driven video generation, covering 117 curated learning topics inspired by 3Blue1Brown, spanning diverse areas.
  • 🧪 Multi-Dimensional Evaluation — systematic assessment on efficiency, aesthetics, and end-to-end knowledge transfer.

🚀 Try Code2Video

(Approach overview figure)

1. Requirements

cd src/
pip install -r requirements.txt

Please follow the official installation guide for Manim Community v0.19.0 to set up the rendering environment correctly.
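
To quickly confirm the Manim setup works before running the agents, you can render a minimal scene like the following (an illustrative sketch, not part of the repository):

from manim import Scene, Text, Write, FadeOut

# Minimal sanity-check scene: if this renders, Manim CE v0.19.0 is installed correctly.
class EnvCheck(Scene):
    def construct(self):
        title = Text("Code2Video environment check")
        self.play(Write(title))   # animate the text being drawn
        self.wait(1)
        self.play(FadeOut(title))

Save it as env_check.py and render a low-quality preview with: manim -pql env_check.py EnvCheck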

2. Configure LLM API Keys

Fill in your API credentials in api_config.json.

  • LLM API:

    • Required for Planner & Coder.
    • Best Manim code quality achieved with Claude-4-Opus.
  • VLM API:

    • Required for the Critic.
    • For layout and aesthetics optimization, provide a Gemini API key.
    • Best quality achieved with gemini-2.5-pro-preview-05-06.
  • Visual Assets API:

    • To enrich videos with icons, set ICONFINDER_API_KEY from IconFinder.
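
Before launching the agents, a quick sanity check that the credentials above are filled in can save a failed run. The key names below are illustrative assumptions (check api_config.json in the repository for the actual field names); only ICONFINDER_API_KEY is named above.

import json

# Hypothetical field names -- adjust to match the actual schema of api_config.json.
EXPECTED_KEYS = ["LLM_API_KEY", "VLM_API_KEY", "ICONFINDER_API_KEY"]

with open("api_config.json") as f:
    config = json.load(f)

missing = [k for k in EXPECTED_KEYS if not config.get(k)]
if missing:
    print("Missing or empty credentials:", ", ".join(missing))
else:
    print("All expected API credentials are present.")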

3. Run Agents

We provide two shell scripts for different generation modes:

(a) Any Query

Script: run_agent_single.sh

Generates a video from a single knowledge point specified in the script.

sh run_agent_single.sh --knowledge_point "Linear transformations and matrices"

Important parameters inside run_agent_single.sh:

  • API: specify which LLM to use.
  • FOLDER_PREFIX: output folder prefix (e.g., TEST-single).
  • KNOWLEDGE_POINT: target concept, e.g. "Linear transformations and matrices".

(b) Full Benchmark Mode

Script: run_agent.sh

Runs all (or a subset of) learning topics defined in long_video_topics_list.json.

sh run_agent.sh

Important parameters inside run_agent.sh:

  • API: specify which LLM to use.
  • FOLDER_PREFIX: name prefix for saving output folders (e.g., TEST-LIST).
  • MAX_CONCEPTS: number of concepts to include (-1 means all).
  • PARALLEL_GROUP_NUM: number of groups to run in parallel (see the grouping sketch below).
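
To illustrate how MAX_CONCEPTS and PARALLEL_GROUP_NUM interact, here is a rough sketch of splitting the benchmark topics into parallel groups. It assumes long_video_topics_list.json is a flat list of topic strings under json_files/; the actual schema and the repository's own grouping logic may differ.

import json

# Illustrative only: assumes the JSON file is a flat list of topic strings.
with open("json_files/long_video_topics_list.json") as f:
    topics = json.load(f)

MAX_CONCEPTS = -1        # -1 means generate videos for all topics
PARALLEL_GROUP_NUM = 4   # number of groups processed in parallel

if MAX_CONCEPTS != -1:
    topics = topics[:MAX_CONCEPTS]

# Round-robin split into PARALLEL_GROUP_NUM groups of roughly equal size.
groups = [topics[i::PARALLEL_GROUP_NUM] for i in range(PARALLEL_GROUP_NUM)]
for i, group in enumerate(groups):
    print(f"Group {i}: {len(group)} topics")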

4. Project Organization

A suggested directory structure:

src/
│── agent.py
│── run_agent.sh
│── run_agent_single.sh
│── api_config.json
│── ...
│
├── assets/
│   ├── icons/          # cache of visual assets downloaded via the IconFinder API
│   └── reference/      # reference images
│
├── json_files/         # JSON-based topic lists & metadata
├── prompts/            # prompt templates for LLM calls
├── CASES/              # generated cases, organized by FOLDER_PREFIX
│   ├── TEST-LIST/      # example multi-topic generation results
│   └── TEST-single/    # example single-topic generation results

📊 Evaluation on MMMC

We evaluate along three complementary dimensions:

  1. Knowledge Transfer (TeachQuiz)

    python3 eval_TQ.py
  2. Aesthetic & Structural Quality (AES)

    python3 eval_AES.py
  3. Efficiency Metrics (measured during generation; a rough logging sketch follows this list)

    • Token usage
    • Execution time
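
Since efficiency is measured during generation rather than by a separate script, here is a rough sketch of how token usage and execution time could be logged per run. The "usage" field layout is a hypothetical placeholder, not the repository's actual return format.

import json
import time

def timed_generation(generate_fn, *args, **kwargs):
    # Wrap a generation call and record wall-clock time plus token usage.
    # Assumes generate_fn returns a dict with a hypothetical "usage" entry
    # containing "prompt_tokens" and "completion_tokens".
    start = time.time()
    result = generate_fn(*args, **kwargs)
    elapsed = time.time() - start
    usage = result.get("usage", {})
    metrics = {
        "execution_time_s": round(elapsed, 2),
        "prompt_tokens": usage.get("prompt_tokens", 0),
        "completion_tokens": usage.get("completion_tokens", 0),
    }
    print(json.dumps(metrics, indent=2))
    return result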

👉 More data and evaluation scripts are available on HuggingFace: MMMC Benchmark


🙏 Acknowledgements

  • Video data is sourced from the 3Blue1Brown official lessons. These videos represent the upper bound of clarity and aesthetics in educational video design and inform our evaluation metrics.
  • We thank all the Show Lab @ NUS members for support!
  • This project builds upon open-source contributions from Manim Community and the broader AI research ecosystem.
  • High-quality visual assets (icons) from IconFinder and Icons8 were used to enrich the educational videos.

📌 Citation

If you find our work useful, please cite:

@misc{code2video,
      title={Code2Video: A Code-centric Paradigm for Educational Video Generation}, 
      author={Yanzhe Chen and Kevin Qinghong Lin and Mike Zheng Shou},
      year={2025},
      eprint={2510.01174},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2510.01174}, 
}

If you like our project, please give us a star ⭐ on GitHub to stay up to date with the latest updates.