feat: Donut, Swin, and BART (models and examples) #3265
base: main
Conversation
- Full encoder-decoder architecture
- Beam search decoding with configurable parameters
- Causal language modeling head
- Support for BART, mBART, and MBart50 variants
- Shifted window multi-head self-attention
- Patch merging for hierarchical feature maps
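To make the "beam search decoding with configurable parameters" concrete, here is a minimal std-only Rust sketch of the decoding loop shape. It uses a deterministic toy scorer in place of a real decoder forward pass; all names (`score`, `beam_search`, the parameters) are illustrative and are not candle's or this PR's actual API.

```rust
/// Toy log-probability of `tok` following the last token of `seq`.
/// A real implementation would run the decoder and read logits instead.
fn score(seq: &[usize], tok: usize, vocab: usize) -> f64 {
    let last = *seq.last().unwrap_or(&0);
    let bucket = ((last + tok) % vocab) as f64;
    -(bucket + 1.0).ln()
}

/// Keep the `beam_width` highest-scoring partial sequences at each step,
/// expanding every beam by every vocabulary token, for `max_len` steps.
fn beam_search(start: usize, vocab: usize, beam_width: usize, max_len: usize) -> Vec<usize> {
    // each beam: (accumulated log-prob, token sequence)
    let mut beams: Vec<(f64, Vec<usize>)> = vec![(0.0, vec![start])];
    for _ in 0..max_len {
        let mut candidates: Vec<(f64, Vec<usize>)> = Vec::new();
        for (lp, seq) in &beams {
            for tok in 0..vocab {
                let mut ext = seq.clone();
                ext.push(tok);
                candidates.push((lp + score(seq, tok, vocab), ext));
            }
        }
        // sort descending by accumulated log-prob, keep the top beams
        candidates.sort_by(|a, b| b.0.partial_cmp(&a.0).unwrap());
        candidates.truncate(beam_width);
        beams = candidates;
    }
    beams.remove(0).1
}

fn main() {
    // → [0, 0, 0, 0, 0] with this toy scorer (tok 0 after 0 costs -ln(1) = 0)
    println!("{:?}", beam_search(0, 5, 3, 4));
}
```

A production version would add the configurable parameters the PR mentions (length penalty, early stopping, EOS handling) on top of this same expand-score-prune loop.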
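The patch-merging step listed above can also be sketched with plain vectors: each 2×2 spatial neighbourhood is concatenated channel-wise, halving the resolution and quadrupling the channel count (the real Swin layer then projects 4C down to 2C with a linear layer, omitted here). This is a std-only toy, not the candle implementation.

```rust
/// Merge 2×2 patches of a row-major H×W×C feature map into an
/// (H/2)×(W/2)×4C map by concatenating the four patches' channels.
fn patch_merge(x: &[f32], h: usize, w: usize, c: usize) -> Vec<f32> {
    assert!(h % 2 == 0 && w % 2 == 0 && x.len() == h * w * c);
    let (oh, ow) = (h / 2, w / 2);
    let mut out = Vec::with_capacity(oh * ow * 4 * c);
    for i in 0..oh {
        for j in 0..ow {
            // gather the 2×2 block at (2i, 2j) in Swin's order:
            // top-left, bottom-left, top-right, bottom-right
            for (di, dj) in [(0, 0), (1, 0), (0, 1), (1, 1)] {
                let base = ((2 * i + di) * w + (2 * j + dj)) * c;
                out.extend_from_slice(&x[base..base + c]);
            }
        }
    }
    out
}

fn main() {
    // 2×2 map with 1 channel collapses to a single 4-channel "pixel"
    println!("{:?}", patch_merge(&[1.0, 2.0, 3.0, 4.0], 2, 2, 1)); // → [1.0, 3.0, 2.0, 4.0]
}
```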
ivarflakstad left a comment:

👍
> mBART models use SentencePiece tokenization which isn't directly supported
> by the Rust tokenizers crate. This script converts the tokenizer to the
> tokenizer.json format that can be loaded by the Rust example.
Is there a specific model you've encountered where they don't provide a tokenizer.json?
Take a look at: https://huggingface.co/facebook/mbart-large-50-many-to-many-mmt/tree/main
The README.md explains converting the SentencePiece tokenizer first:

```shell
# Step 1: Convert tokenizer
cd candle-examples/examples/bart
pip install transformers sentencepiece
python convert_mbart_tokenizer.py --model-id facebook/mbart-large-50-many-to-many-mmt

# Step 2: Run translation (English to French)
cargo run --example bart --release -- \
  --model-id facebook/mbart-large-50-many-to-many-mmt \
  --prompt "Hello, how are you today?" \
  --source-lang en_XX \
  --target-lang fr_XX \
  --sample-len 50
```
I noticed I forgot to run clippy with -D warnings. 🤦 I'll fix that.
Donut, Swin, and BART (models and examples)
This PR adds three interconnected models for document understanding and text generation:
- BART (candle-transformers/src/models/bart/): Encoder-decoder transformer for summarization and translation
- Swin (candle-transformers/src/models/swin.rs): Hierarchical vision transformer for image processing
- Donut (candle-transformers/src/models/donut.rs): OCR-free document understanding (Swin encoder + BART decoder)

I am submitting the PR together because Donut relies upon Swin (encoder) and BART (decoder).
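The encoder/decoder composition that makes Donut depend on the other two models can be sketched abstractly. This is a std-only toy with illustrative trait and struct names (`Encoder`, `Decoder`, `Pipeline`, and the stand-in implementations are all hypothetical, not this PR's types).

```rust
// Hedged sketch: an OCR-free pipeline composes a vision encoder (Swin's role)
// with a text decoder that conditions on the encoded features (BART's role).
trait Encoder {
    fn encode(&self, pixels: &[f32]) -> Vec<f32>;
}
trait Decoder {
    fn decode(&self, memory: &[f32], prompt: &[u32]) -> Vec<u32>;
}

struct Pipeline<E: Encoder, D: Decoder> {
    encoder: E,
    decoder: D,
}

impl<E: Encoder, D: Decoder> Pipeline<E, D> {
    /// Encode the image once, then generate tokens conditioned on the features.
    fn generate(&self, pixels: &[f32], prompt: &[u32]) -> Vec<u32> {
        let memory = self.encoder.encode(pixels);
        self.decoder.decode(&memory, prompt)
    }
}

// Toy stand-ins so the sketch runs end to end.
struct MeanEncoder;
impl Encoder for MeanEncoder {
    fn encode(&self, pixels: &[f32]) -> Vec<f32> {
        vec![pixels.iter().sum::<f32>() / pixels.len() as f32]
    }
}
struct EchoDecoder;
impl Decoder for EchoDecoder {
    fn decode(&self, memory: &[f32], prompt: &[u32]) -> Vec<u32> {
        let mut out = prompt.to_vec();
        out.push(memory[0] as u32); // "reads" one value from the features
        out
    }
}

fn main() {
    let p = Pipeline { encoder: MeanEncoder, decoder: EchoDecoder };
    println!("{:?}", p.generate(&[2.0, 4.0], &[1])); // → [1, 3]
}
```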
Each model also comes with examples demonstrating essential features.
Features
Key Implementation Details