GitHub - deepseek-ai/DeepSpec: DeepSpec: a full-stack codebase for training and evaluating speculative decoding algorithms · GitHub
config
deepspec
eval_datasets
scripts
.gitignore
DSpark_paper.pdf
LICENSE
NOTICE
README.md
eval.py
requirements.txt
train.py
DeepSpec
DeepSpec is a full-stack codebase for training and evaluating draft models for speculative decoding. It contains data preparation utilities, draft model implementations, training code, and evaluation scripts.
Environment
Install the Python dependencies:
python -m pip install -r requirements.txt
Data preparation additionally requires an inference engine to serve the target model when regenerating answers; see scripts/data/README.md for details.
Workflow
Run the stages in order; each stage's output feeds the next:
1. Data Preparation: download prompts, regenerate target answers, and build the target cache.
2. Training: train a draft model against the cached target outputs.
3. Evaluation: measure speculative-decoding acceptance on benchmark tasks.
Data Preparation
See scripts/data/README.md for the step-by-step data pipeline:
download and split training data,
regenerate answers,
prepare the target cache (storage warning: this can be very large — roughly 38 TB for the default Qwen/Qwen3-4B setting).
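Given the size of the target cache, it may be worth checking free disk space before starting the cache step. The following is a minimal sketch using only the Python standard library. The helper name `has_room_for_cache` is hypothetical and not part of DeepSpec, and the threshold simply reuses the 38 TB estimate above, read as decimal terabytes (an assumption).

```python
import shutil

# Rough size quoted in this README for the default Qwen/Qwen3-4B setting.
# Assumption: "TB" is interpreted as decimal terabytes (10**12 bytes).
DEFAULT_CACHE_BYTES = 38 * 10**12


def has_room_for_cache(path: str, required_bytes: int = DEFAULT_CACHE_BYTES) -> bool:
    """Return True if the filesystem holding `path` has enough free space."""
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_bytes


def free_terabytes(path: str) -> float:
    """Free space on the filesystem holding `path`, in decimal terabytes."""
    return shutil.disk_usage(path).free / 10**12
```

Calling `has_room_for_cache("/path/to/target_cache_dir")` before the cache step gives an early warning. The actual requirement depends on the target model and dataset, so the default threshold is only a starting point.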
Training
bash scripts/train/train.sh
train.sh launches train.py, which spawns one worker per visible GPU. Select the algorithm and target model by pointing config_path at one of the configs under config/ (e.g. config/dspark/dspark_qwen3_4b.py); see the script header for the full list of configs, how to override config_path / target_cache_dir, and how to use --opts to override individual config fields. Checkpoints are written to ~/checkpoints///step_*.
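Hardware: the default configs and scripts assume a single node with 8 GPUs. To train on fewer GPUs, set CUDA_VISIBLE_DEVICES to a shorter list of device IDs (for example, CUDA_VISIBLE_DEVICES=0,1 for two GPUs); train.py spawns one worker for each device listed.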
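As an illustration of GPU selection, the sketch below launches the training script from Python with a restricted device list. The helper names `visible_gpu_ids` and `launch_training` are hypothetical. The sketch also assumes that train.sh honors a CUDA_VISIBLE_DEVICES value inherited from the environment rather than overwriting it, so check the script header before relying on this.

```python
import os
import subprocess


def visible_gpu_ids(env=None):
    """Parse CUDA_VISIBLE_DEVICES into a list of device-ID strings.

    Returns None when the variable is unset, in which case CUDA exposes
    every GPU on the node.
    """
    env = os.environ if env is None else env
    value = env.get("CUDA_VISIBLE_DEVICES")
    if value is None:
        return None
    return [part.strip() for part in value.split(",") if part.strip()]


def launch_training(gpu_ids, script="scripts/train/train.sh", dry_run=True):
    """Run the training script with only `gpu_ids` visible.

    With dry_run=True nothing is executed; the command and device string
    are returned so they can be inspected first.
    """
    devices = ",".join(str(g) for g in gpu_ids)
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=devices)
    cmd = ["bash", script]
    if not dry_run:
        subprocess.run(cmd, env=env, check=True)
    return cmd, devices
```

For example, `launch_training([0, 1], dry_run=False)` would start training with two visible devices, and therefore two workers.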
Evaluation
bash scripts/eval/eval.sh
eval.sh runs eval.py against a trained draft checkpoint over the speculative-decoding benchmarks in eval_datasets/ (gsm8k, math500, aime25, humaneval, mbpp, livecodebench, mt-bench, alpaca, arena-hard-v2). Set:
target_name_or_path — the target model the draft was trained against (e.g. Qwen/Qwen3-4B),
draft_name_or_path — the draft checkpoint, e.g. ~/checkpoints/deepspec/dspark_block8_qwen3_4b/step_latest.
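To evaluate a specific numbered checkpoint rather than step_latest, a script can look up the path itself. The sketch below picks the highest-numbered step directory in a run directory. It assumes checkpoint directories are named `step_<integer>`, which the step_* pattern suggests but this README does not state. The helper name `newest_numbered_step` is hypothetical.

```python
import re
from pathlib import Path
from typing import Optional

# Assumption: numbered checkpoints look like "step_1000"; other entries
# such as "step_latest" are ignored by this pattern.
_STEP_RE = re.compile(r"^step_(\d+)$")


def newest_numbered_step(run_dir: Path) -> Optional[Path]:
    """Return the step_<N> subdirectory with the largest N, or None if absent."""
    best_step, best_path = -1, None
    for child in run_dir.iterdir():
        match = _STEP_RE.match(child.name)
        if match and child.is_dir() and int(match.group(1)) > best_step:
            best_step, best_path = int(match.group(1)), child
    return best_path
```

With the example path above, `newest_numbered_step(Path("~/checkpoints/deepspec/dspark_block8_qwen3_4b").expanduser())` would return the directory to pass as draft_name_or_path.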
Supported Algorithms
Currently, DeepSpec includes three draft models: DSpark, DFlash and Eagle3.
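Whatever the draft architecture, the quantity of interest is how many drafted tokens the target model accepts per verification step. The toy sketch below illustrates greedy verification on plain token-ID lists. It is a generic illustration of speculative decoding, not DeepSpec's eval.py. The function names are hypothetical, and sampling-based acceptance rules are not covered.

```python
from typing import Sequence


def accepted_prefix_len(draft: Sequence[int], target: Sequence[int]) -> int:
    """Count leading draft tokens that equal the target's greedy choices.

    target[i] is the token the target model would emit at position i,
    given the committed prefix followed by draft[:i].
    """
    n = 0
    for d, t in zip(draft, target):
        if d != t:
            break
        n += 1
    return n


def tokens_per_step(draft: Sequence[int], target: Sequence[int]) -> int:
    """Tokens committed in one verification step under greedy decoding.

    The accepted draft prefix is kept and the target contributes one more
    token: the correction at the first mismatch, or a bonus token when
    every draft token matched. Assumes len(target) == len(draft) + 1.
    """
    return accepted_prefix_len(draft, target) + 1


# Example: the first two draft tokens match, the third does not.
print(tokens_per_step([5, 7, 9], [5, 7, 2, 4]))  # prints 3
```

A better draft model yields longer accepted prefixes, so more tokens are committed per target forward pass.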
License
DeepSpec is released under the MIT License. It includes code adapted from third-party projects under their own licenses; see NOTICE for the full attribution.
Acknowledgements
DeepSpec builds on the ideas and code of several excellent open-source projects:
SpecForge (Apache-2.0) — the overall training framework and Eagle3 implementation; portions of the Eagle3 modeling, loss, optimizer, attention, and evaluation code are adapted from it. Adapted files carry an in-file attribution comment, and the full notice is recorded in NOTICE.
DFlash (MIT) — the DFlash draft-model design and training recipe.
Qwen3 and Gemma — the target model families supported in this repo.
We thank the authors and maintainers of these projects. Contributions of new algorithms are welcome.