Batch Speculative Decoding

For the paper: Batch Speculative Decoding Done Right

Environment Setup

pip install -r requirements.txt
# Make sure you use transformers==4.51.3. Later releases redesigned the KV cache API, and this code will not work with them.
# For sglang, use requirements_sglang.txt.
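Because the pin matters, it can help to check the installed version before running anything. The snippet below is a minimal sketch and not part of this repository: `check_pinned` is a hypothetical helper name, and it relies only on the standard-library `importlib.metadata` module.

```python
from importlib.metadata import PackageNotFoundError, version

REQUIRED_TRANSFORMERS = "4.51.3"  # the version this README asks for


def check_pinned(package: str, required: str) -> tuple[bool, str]:
    """Return (ok, message) saying whether `package` is installed at exactly `required`.

    Hypothetical helper for illustration; it is not shipped with the repository.
    """
    try:
        installed = version(package)
    except PackageNotFoundError:
        return False, f"{package} is not installed; expected {required}"
    if installed != required:
        return False, f"{package}=={installed} found; expected {required}"
    return True, f"{package}=={installed} OK"


if __name__ == "__main__":
    ok, message = check_pinned("transformers", REQUIRED_TRANSFORMERS)
    print(message)
```

The check is an exact string comparison on purpose, since the note above names a single known-good release rather than a range.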

Key Files

Core Implementation

src/custom/batch_speculative.py - Main batch speculative decoding with unpad-update-repad pattern
src/custom/cross_batch_speculative.py - Cross-batch scheduling implementation
src/custom/verification.py - Prefill-refill oracle verification system
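To give a feel for what "unpad-update-repad" refers to, here is a toy sketch on plain Python lists. It is an illustration under assumptions, not the code in src/custom/batch_speculative.py: the function name, the pad id of 0, and the choice of left padding are all hypothetical, and the real implementation presumably works on tensors and the KV cache rather than on token lists. The idea shown is only that sequences in a batch can accept different numbers of draft tokens in one step, so the batch has to be realigned afterwards.

```python
PAD = 0  # hypothetical pad token id, assumed never to occur as a real token


def unpad_update_repad(batch, accepted, pad_id=PAD):
    """Toy version of the pattern on left-padded lists of token ids.

    batch:    list of equal-length, left-padded token id lists
    accepted: per-sequence lists of newly accepted token ids (lengths may differ)
    """
    # 1) Unpad: drop the leading pad tokens of every row.
    unpadded = []
    for row in batch:
        start = 0
        while start < len(row) and row[start] == pad_id:
            start += 1
        unpadded.append(row[start:])

    # 2) Update: append each sequence's own accepted tokens.
    updated = [seq + new for seq, new in zip(unpadded, accepted)]

    # 3) Repad: left-pad back to a common width so the batch is rectangular.
    width = max(len(seq) for seq in updated)
    return [[pad_id] * (width - len(seq)) + seq for seq in updated]


if __name__ == "__main__":
    batch = [[0, 0, 5, 6], [1, 2, 3, 4]]
    accepted = [[7], [5, 6, 7]]  # sequence 0 accepts 1 token, sequence 1 accepts 3
    print(unpad_update_repad(batch, accepted))
    # [[0, 0, 0, 0, 5, 6, 7], [1, 2, 3, 4, 5, 6, 7]]
```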

Testing & Validation

test_spec_decode_correctness.py - Comprehensive correctness test suite. It contains many tests; find the ones you need and turn them on to compare the outputs of the different speculative decoding methods against non-speculative outputs.
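When a speculative output does not match the non-speculative one, the position of the first differing token is usually the most useful piece of information. The helper below is a small sketch of that comparison on lists of token ids. It is not taken from test_spec_decode_correctness.py, and the name `first_mismatch` is hypothetical.

```python
def first_mismatch(reference, candidate):
    """Return the index of the first differing token, or None if the sequences are identical.

    reference: token ids from non-speculative decoding
    candidate: token ids from a speculative decoding method
    """
    for index, (ref_token, cand_token) in enumerate(zip(reference, candidate)):
        if ref_token != cand_token:
            return index
    # One sequence is a strict prefix of the other: they diverge where the shorter one ends.
    if len(reference) != len(candidate):
        return min(len(reference), len(candidate))
    return None


if __name__ == "__main__":
    print(first_mismatch([11, 12, 13], [11, 12, 13]))  # None
    print(first_mismatch([11, 12, 13], [11, 99, 13]))  # 1
    print(first_mismatch([11, 12], [11, 12, 13]))      # 2
```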

How To Run

Inference:

# EXSPEC 
CUDA_VISIBLE_DEVICES=0 python unified_benchmark.py --methods Ours-XBatch --input_file data/spec_bench/question.jsonl --num_prompts 100 --max_new_tokens 128 --n_draft_tokens 5 --batch_size 16 --window_size 48 --scheduling_strategy cross_batch --sort_by_length


# EQSPEC 
CUDA_VISIBLE_DEVICES=0 python unified_benchmark.py --methods Ours-Batch-Cache --input_file data/spec_bench/question.jsonl --num_prompts 100 --max_new_tokens 128 --n_draft_tokens 5 --batch_size 16 --enable_profiling
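To run the same benchmark at several batch sizes, the command line can be generated rather than edited by hand. The sketch below only builds and prints the commands; it does not run them. It uses the flags shown in the commands above and nothing else. `build_command` is a hypothetical helper, and the batch sizes in the loop are illustrative values, not recommended settings.

```python
import shlex


def build_command(method, batch_size, extra=()):
    """Build the argv list for unified_benchmark.py using the flags shown in this README."""
    return [
        "python", "unified_benchmark.py",
        "--methods", method,
        "--input_file", "data/spec_bench/question.jsonl",
        "--num_prompts", "100",
        "--max_new_tokens", "128",
        "--n_draft_tokens", "5",
        "--batch_size", str(batch_size),
        *extra,
    ]


if __name__ == "__main__":
    # Illustrative sweep; print each command so it can be pasted into a shell.
    for batch_size in (4, 8, 16):
        print(shlex.join(build_command("Ours-Batch-Cache", batch_size)))
```

Returning a list instead of a single string means the same value can be passed directly to `subprocess.run` without shell quoting problems, should you decide to launch the runs from Python.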

Verification experiment:

# To compare more methods, edit methods_batch_n and methods_batch_1 in verification_benchmark.py.
python verification_benchmark.py --input_file data/spec_bench/question.jsonl --num_prompts 480 --models glm4 --batch_sizes 4 8 --max_new_tokens 50 --output_dir test_verification

Bibtex

TBD

License

This project is licensed under the terms of the Apache 2.0 License.

