
@danielclough (Contributor)

Donut, Swin, and BART (models and examples)

This PR adds three interconnected models for document understanding and text generation:

  • BART (candle-transformers/src/models/bart/) - Encoder-decoder transformer for summarization and translation
  • Swin Transformer (candle-transformers/src/models/swin.rs) - Hierarchical vision transformer for image processing
  • Donut (candle-transformers/src/models/donut.rs) - OCR-free document understanding (Swin encoder + BART decoder)

I am submitting these together in a single PR because Donut relies on Swin as its encoder and BART as its decoder.

Each model also comes with examples demonstrating essential features.

Features

| Model | Capabilities |
| ----- | ------------ |
| BART  | Text summarization, mBART translation, beam search decoding |
| Swin  | ImageNet classification (Tiny/Small/Base/Large variants) |
| Donut | Receipt parsing, document VQA, document classification |

Key Implementation Details

  • Beam Search: Both single-sequence and batched beam search with efficient KV cache reordering
  • Weight Tying: Proper handling of tied embeddings between encoder/decoder/lm_head
  • Layer Norm Order: Auto-detection of pre-norm (mBART) vs post-norm (BART) architectures
  • Shifted Window Attention: Full Swin v1 implementation with attention masks
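The KV cache reordering mentioned above can be illustrated with a minimal, framework-free sketch. Plain `Vec`s stand in for a layer's per-beam key/value rows, and `reorder_kv_cache` is a hypothetical helper, not the PR's actual candle code (which would gather along the batch dimension of real tensors):

```rust
/// Select the cache rows for the beams that survived this decoding step.
/// After each step, beam search may keep several copies of one beam and
/// drop others, so the cache must be gathered to match the new beam order.
fn reorder_kv_cache(cache: &[Vec<f32>], beam_indices: &[usize]) -> Vec<Vec<f32>> {
    beam_indices.iter().map(|&i| cache[i].clone()).collect()
}

fn main() {
    // Three beams; this step kept beam 2 twice (it forked) and beam 0 once.
    let cache = vec![vec![1.0], vec![2.0], vec![3.0]];
    let reordered = reorder_kv_cache(&cache, &[2, 2, 0]);
    assert_eq!(reordered, vec![vec![3.0], vec![3.0], vec![1.0]]);
}
```

In the real model the same gather is applied to every layer's key and value tensors at once, which is why doing it efficiently matters.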

  • Full encoder-decoder architecture
  • Beam search decoding with configurable parameters
  • Causal language modeling head
  • Support for BART, mBART, and MBart50 variants
  • Shifted window multi-head self-attention
  • Patch merging for hierarchical feature maps
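As a rough illustration of the shifted-window mechanism listed above: before window partitioning, Swin cyclically shifts the feature map (a `torch.roll` in the reference implementation). The sketch below is a hypothetical stand-in on a flattened integer grid, not the PR's actual tensor code:

```rust
/// Cyclically shift a flattened H x W grid down and right by `shift`,
/// wrapping at the borders. This is the displacement step of Swin's
/// shifted-window attention; the inverse shift restores the layout,
/// and attention masks hide the wrapped-around positions.
fn cyclic_shift(grid: &[u32], h: usize, w: usize, shift: usize) -> Vec<u32> {
    let mut out = vec![0; h * w];
    for r in 0..h {
        for c in 0..w {
            let nr = (r + shift) % h;
            let nc = (c + shift) % w;
            out[nr * w + nc] = grid[r * w + c];
        }
    }
    out
}

fn main() {
    // A 2x2 grid shifted by 1: every element wraps to the opposite corner.
    let shifted = cyclic_shift(&[1, 2, 3, 4], 2, 2, 1);
    assert_eq!(shifted, vec![4, 3, 2, 1]);
}
```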
@ivarflakstad (Member) left a comment


👍

Comment on lines +4 to +6

> mBART models use SentencePiece tokenization which isn't directly supported
> by the Rust tokenizers crate. This script converts the tokenizer to the
> tokenizer.json format that can be loaded by the Rust example.

Is there a specific model you've encountered where they don't provide a tokenizer.json?

@danielclough (Author) replied:

Take a look at: https://huggingface.co/facebook/mbart-large-50-many-to-many-mmt/tree/main

The README.md explains that the SentencePiece tokenizer must be converted first:

```bash
# Step 1: Convert tokenizer
cd candle-examples/examples/bart
pip install transformers sentencepiece
python convert_mbart_tokenizer.py --model-id facebook/mbart-large-50-many-to-many-mmt

# Step 2: Run translation (English to French)
cargo run --example bart --release -- \
    --model-id facebook/mbart-large-50-many-to-many-mmt \
    --prompt "Hello, how are you today?" \
    --source-lang en_XX \
    --target-lang fr_XX \
    --sample-len 50
```

@danielclough (Author)

I noticed I forgot to run clippy with `-D warnings`. 🤦 I'll fix that.
