[EMNLP 2025 Main] Diagram-Eval: Evaluating LLM-Generated Diagrams via Graphs

Paper

This repository contains the official implementation of our EMNLP 2025 Main paper:

Evaluating LLM-Generated Diagrams via Graphs
Chumeng Liang, Jiaxuan You
University of Illinois Urbana-Champaign

Diagrams play a central role in research papers for conveying ideas, yet they are often complex and labor-intensive to create. We propose DiagramEval, a novel evaluation metric designed to assess demonstration diagrams generated by LLMs. DiagramEval conceptualizes diagrams as graphs, treating text elements as nodes and their connections as directed edges, and evaluates diagram quality using two new groups of metrics: node alignment and path alignment.

What's Included

This repository provides:

Diagram Data Collection: Run one command to collect diagrams and paper context for a list of paper titles.
Diagram Generation (Demo): A demo LLM workflow that generates diagrams from paper context.
Diagram-Eval Metrics: Our proposed metrics for evaluating generated diagrams against a reference, which can be paper context (.txt) or a diagram image (.png).

Environment Setup

Dependency Installation

conda create -n diagram python=3.10
conda activate diagram
pip install -r requirements.txt

API Key Configuration

Create configs/key.yaml with your API keys:

openai_api_key: "your-openai-key"
google_api_key: "your-google-key"  # For Gemini image generation
claude_api_key: "your-claude-key"
gemini_api_key: "your-gemini-key"
nvidia_api_key: "your-nvidia-key"

Quick Start

0. Prepare A Paper List

Format:

[
  {
    "title": "Taming Transformers for High-Resolution Image Synthesis"
  },
  {
    "title": "Scalable Diffusion Models with Transformers"
  },
  ...
]
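The list can also be written programmatically. The following is a minimal sketch, assuming that `title` is the only field each entry needs, as in the format above. The helper name `write_paper_list` is hypothetical and not part of this repository.

```python
import json
from pathlib import Path


def write_paper_list(titles, path):
    """Write paper titles to `path` in the JSON list format shown above."""
    # Strip surrounding whitespace and drop blank entries; input order is kept.
    entries = [{"title": t.strip()} for t in titles if t.strip()]
    Path(path).write_text(
        json.dumps(entries, indent=2, ensure_ascii=False),
        encoding="utf-8",
    )
    return entries
```

Calling `write_paper_list(["Scalable Diffusion Models with Transformers"], "paper_list.json")` would produce a file that can be passed to `--paper-list` in the next step.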

1. Collect Reference Data

Run our data collection pipeline:

python script/prepare_inputs.py \
  --paper-list [PATH_TO_PAPER_LIST] \
  --crawl-output [OUTPUT_DIR]/crawled \
  --extract-output [OUTPUT_DIR]/extracted \
  --config-path configs/llm_config.yaml

2. Generate Diagrams

Basic generation:

python script/generate_diagram.py \
  [PATH_TO_PAPER_CONTEXT] \
  --output-dir [OUTPUT_DIR]/generated

With planning (focuses on methodology):

python script/generate_diagram.py \
  [PATH_TO_PAPER_CONTEXT] \
  --output-dir [OUTPUT_DIR]/generated \
  --use-planner

3. Run Diagram-Eval

Evaluate against reference:

# [PATH_TO_REFERENCE_FILE] may be a .txt or a .png file
python script/evaluate_diagram.py \
  [PATH_TO_GENERATED_DIAGRAM_FILE] \
  [PATH_TO_REFERENCE_FILE] \
  --output-dir [OUTPUT_DIR]/generated

Output:

[OUTPUT_DIR]/generated/[ARXIV_ID]_metrics.json - Evaluation metrics
[OUTPUT_DIR]/generated/[ARXIV_ID]_metrics.md - Human-readable report

4. Check Output File Structure

Output file structure:

[OUTPUT_DIR]
├── crawled/
│   └── 2212.09748/
│       ├── 2212.09748.pdf: PDF file from arXiv
│       └── 2212.09748.tar.gz: LaTeX source package from arXiv
├── extracted/
│   └── 2212.09748/
│       ├── 2212.09748_text.txt: reference paper context
│       └── 2212.09748_diagrams/
│           └── figure_overview.png: reference paper diagram
└── generated/
    ├── 2212.09748_generated_metrics.json: metric report in .json
    ├── 2212.09748_generated_metrics.md: metric report in .md
    └── 2212.09748_generated.png: generated diagram

See example/ for a sample input and output file structure.
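To confirm that a run produced the layout above, a small check like the following can help. It is a sketch only: the helper names `expected_files` and `missing_files` are hypothetical, the paths are copied from the tree above, and the reference diagram image is omitted because its filename is assumed to vary by paper.

```python
from pathlib import Path


def expected_files(output_dir, arxiv_id):
    """Return the file paths that the layout above implies for one paper."""
    root = Path(output_dir)
    return [
        root / "crawled" / arxiv_id / f"{arxiv_id}.pdf",
        root / "crawled" / arxiv_id / f"{arxiv_id}.tar.gz",
        root / "extracted" / arxiv_id / f"{arxiv_id}_text.txt",
        root / "generated" / f"{arxiv_id}_generated.png",
        root / "generated" / f"{arxiv_id}_generated_metrics.json",
        root / "generated" / f"{arxiv_id}_generated_metrics.md",
    ]


def missing_files(output_dir, arxiv_id):
    """List the expected files that do not exist on disk yet."""
    return [p for p in expected_files(output_dir, arxiv_id) if not p.is_file()]
```

An empty result from `missing_files("[OUTPUT_DIR]", "2212.09748")` would indicate that all three steps wrote their outputs.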

Configuration

All LLM/MLLM configurations are managed in configs/llm_config.yaml:

text_graph_extraction: Extract graph structure from text
image_graph_extraction: Extract graph structure from diagram images
node_alignment: Align nodes between generated and reference graphs
diagram_selection: Select overview diagrams from papers
layout_planner: Plan diagram layout and components
diagram_generator: Generate diagrams via the Google Gemini API

Modify model names, temperatures, and other parameters as needed.

Evaluation Metrics

DiagramEval conceptualizes diagrams as directed graphs:

Nodes: Text elements or components in the diagram
Edges: Directional connections (arrows) between components

Node Alignment

Measures how well the components in generated and reference diagrams correspond:

Precision: Proportion of generated nodes that match reference nodes
Recall: Proportion of reference nodes covered by generated nodes
F1 Score: Harmonic mean of precision and recall
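The three node-level scores can be illustrated with a short sketch. It assumes the node matches are already available as (generated node, reference node) pairs; in this repository the matching itself is configured under `node_alignment` and is not reproduced here. The function name `node_alignment_scores` and the example node labels are hypothetical.

```python
def node_alignment_scores(generated_nodes, reference_nodes, matches):
    """Compute precision, recall, and F1 from precomputed node matches."""
    generated = set(generated_nodes)
    reference = set(reference_nodes)
    # Ignore pairs that mention a node missing from either graph.
    valid = [(g, r) for g, r in matches if g in generated and r in reference]
    matched_generated = {g for g, _ in valid}
    matched_reference = {r for _, r in valid}

    precision = len(matched_generated) / len(generated) if generated else 0.0
    recall = len(matched_reference) / len(reference) if reference else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


scores = node_alignment_scores(
    generated_nodes=["Encoder", "Decoder", "Loss"],
    reference_nodes=["Image Encoder", "Text Decoder", "Codebook", "Discriminator"],
    matches=[("Encoder", "Image Encoder"), ("Decoder", "Text Decoder")],
)
```

In this example two of three generated nodes are matched (precision 2/3) and two of four reference nodes are covered (recall 1/2), which gives an F1 of 4/7.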

Path Alignment

Evaluates the structural relationships between components:

Precision: Proportion of generated paths that match reference paths
Recall: Proportion of reference paths covered by generated paths
F1 Score: Harmonic mean of precision and recall
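One possible reading of these scores is sketched below. It rests on assumptions that this README does not confirm: a "path" is taken to be an ordered pair (u, v) where v is reachable from u along directed edges, the node alignment is a one-to-one dictionary from generated to reference node names, and a generated path that touches an unaligned node counts as unmatched. The function names and example graphs are hypothetical.

```python
from collections import defaultdict


def reachable_pairs(edges):
    """Return every ordered pair (u, v) such that v is reachable from u."""
    successors = defaultdict(set)
    for src, dst in edges:
        successors[src].add(dst)
    pairs = set()
    for start in list(successors):
        # Iterative depth-first search from each node that has outgoing edges.
        stack = list(successors[start])
        seen = set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            pairs.add((start, node))
            stack.extend(successors.get(node, ()))
    return pairs


def path_alignment_scores(generated_edges, reference_edges, alignment):
    """Compare reachability pairs after renaming generated nodes via `alignment`."""
    generated_paths = reachable_pairs(generated_edges)
    reference_paths = reachable_pairs(reference_edges)
    # Translate generated paths into reference names; unaligned paths drop out.
    translated = {
        (alignment[u], alignment[v])
        for u, v in generated_paths
        if u in alignment and v in alignment
    }
    matched = translated & reference_paths

    precision = len(matched) / len(generated_paths) if generated_paths else 0.0
    recall = len(matched) / len(reference_paths) if reference_paths else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


path_scores = path_alignment_scores(
    generated_edges=[("Input", "Encoder"), ("Encoder", "Decoder")],
    reference_edges=[
        ("Image", "Image Encoder"),
        ("Image Encoder", "Codebook"),
        ("Codebook", "Text Decoder"),
    ],
    alignment={
        "Input": "Image",
        "Encoder": "Image Encoder",
        "Decoder": "Text Decoder",
    },
)
```

Here all three generated paths exist in the reference (precision 1.0), while the reference has six paths in total (recall 0.5). Under this reading, the generated Encoder to Decoder connection still matches even though the reference routes it through an extra Codebook node, which an edge-by-edge comparison would miss.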

Advanced Usage Guidance

Custom Prompt

Edit evaluate_figure_captions in utils/extract_text_diagram_from_paper.py to improve diagram extraction accuracy.

Edit _create_layout_plan in generate/workflow.py to customize the planning prompt.

Edit _request_diagram in generate/workflow.py to customize the diagram generation prompt.

Step-by-step data collection

Run utils/crawl_paper.py to crawl only the LaTeX source and PDF of a paper from arXiv.

Run utils/extract_text_diagram_from_paper.py to manually pick diagrams from the LaTeX files.

License

This project is licensed under the MIT License - see the LICENSE file for details.
