Deltatensors – store model fine-tunes as compressed weight deltas

AaravGaur | 1 pt | 1 comment

GitHub - AaravGaurdev/deltatensors · GitHub



deltatensors

Near-lossless delta compression for fine-tuned neural network models.

Instead of storing 50 full fine-tunes of the same base model, store one base and 50 small .wdelta files. deltatensors compresses the delta between a base and a fine-tuned model, then reconstructs the fine-tuned weights from the base plus the delta. In the test below, reconstruction changes perplexity by less than 1%.
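
The idea can be illustrated without the library. The following is a minimal sketch, assuming state dicts are plain mappings from tensor name to NumPy array; `compute_delta` and `apply_delta` are hypothetical helper names, not part of the deltatensors API, and the "fine-tune" is simulated as a small random perturbation of the base.

```python
import numpy as np

def compute_delta(finetuned, base):
    """Per-tensor difference between two state dicts with identical keys and shapes."""
    return {name: finetuned[name] - base[name] for name in base}

def apply_delta(base, delta):
    """Rebuild the fine-tuned state dict from the base and an (uncompressed) delta."""
    return {name: base[name] + delta[name] for name in base}

rng = np.random.default_rng(0)
base = {"layer.weight": rng.standard_normal((4, 4)).astype(np.float32)}

# Assumption: fine-tuning is simulated as a small perturbation of the base weights.
finetuned = {
    name: w + 0.01 * rng.standard_normal(w.shape).astype(np.float32)
    for name, w in base.items()
}

delta = compute_delta(finetuned, base)
recon = apply_delta(base, delta)

print("largest |delta| entry:", float(np.abs(delta["layer.weight"]).max()))
print("largest |base| entry: ", float(np.abs(base["layer.weight"]).max()))
```

Here the delta is stored uncompressed, so reconstruction is exact up to floating-point rounding. The storage savings come from compressing the delta, which is usually much smaller in magnitude than the weights themselves.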

Tested on Qwen2.5-0.5B fine-tuned on WikiText-2:

Perplexity: 19.11 (original) → 19.22 (reconstructed), a 0.58% difference

Less degradation than standard int4 quantization of the full model

294 MB delta vs 953 MB fine-tuned model (3.2x)

~2.8x total storage reduction across 10 fine-tunes

base_model.safetensors   1.0 GB
checkpoint_01.wdelta     294 MB
checkpoint_02.wdelta     294 MB
...
checkpoint_10.wdelta     294 MB
─────────────────────────────────
Total                    3.9 GB vs 11 GB naive
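
The quoted figures can be checked with a few lines of arithmetic. This sketch takes the numbers above as given; the variable names are illustrative, and the 11 GB naive total is used as stated rather than derived.

```python
# Figures quoted above; sizes in GB.
BASE_GB = 1.0
DELTA_GB = 0.294
FINETUNE_GB = 0.953
N_FINETUNES = 10
NAIVE_TOTAL_GB = 11.0  # taken as stated in the listing above

ppl_original, ppl_reconstructed = 19.11, 19.22
ppl_change_pct = 100.0 * (ppl_reconstructed - ppl_original) / ppl_original

# Size of one full fine-tuned checkpoint relative to one delta file.
per_file_ratio = FINETUNE_GB / DELTA_GB

# One base plus ten delta files.
with_deltas_gb = BASE_GB + N_FINETUNES * DELTA_GB
total_ratio = NAIVE_TOTAL_GB / with_deltas_gb

print(f"perplexity change: {ppl_change_pct:.2f}%")
print(f"per-file ratio:    {per_file_ratio:.1f}x")
print(f"with deltas:       {with_deltas_gb:.2f} GB")
print(f"total ratio:       {total_ratio:.1f}x")
```

Note that the 11 GB naive figure corresponds to roughly 1 GB per checkpoint plus the base. Counting each checkpoint at 953 MB instead gives about 10.5 GB naive and a total ratio of roughly 2.7x.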

Install

pip install deltatensors
pip install torch safetensors  # for loading from safetensors directories

Quick start

import deltatensors as dt

# save delta between a fine-tuned and base model (streaming, O(1) RAM)
dt.save_delta_from_paths("checkpoint.wdelta", "qwen-wiki/", "qwen-base/", strategy="int4")

# reconstruct without loading the full base into RAM
recon_sd = dt.load_delta_from_paths("checkpoint.wdelta", "qwen-base/")

# inspect a delta file without a base model
info = dt.inspect("checkpoint.wdelta")
print(info)
# {'path': 'checkpoint.wdelta', 'size_mb': 294.2, 'strategy': 'int4', 'n_tensors': 290, ...}

Compression strategies

| Strategy | Quality | Compression |
|---|---|---|
| int4 | near-lossless (~0.5% PPL) | best |
| sparse | tunable via sparsity= | good |
| quantized | BitDelta-style 1-bit | aggressive |
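
To give a feel for the sparse and 1-bit rows, here is a rough NumPy sketch of the two general techniques. It is not the library's implementation: `sparsify` and `one_bit` are hypothetical names, magnitude-based selection is an assumption about how sparsity is applied, and the 1-bit version uses only the initial per-tensor scale (mean absolute value) described for BitDelta, without any later scale tuning.

```python
import numpy as np

def sparsify(delta, sparsity):
    """Keep the largest-magnitude (1 - sparsity) fraction of entries; zero the rest."""
    flat = delta.ravel()
    n_keep = max(1, int(round(flat.size * (1.0 - sparsity))))
    # argpartition places the n_keep largest magnitudes in the last n_keep slots.
    keep_idx = np.argpartition(np.abs(flat), flat.size - n_keep)[-n_keep:]
    out = np.zeros_like(flat)
    out[keep_idx] = flat[keep_idx]
    return out.reshape(delta.shape)

def one_bit(delta):
    """One sign bit per entry plus a single per-tensor scale (mean |delta|)."""
    scale = np.abs(delta).mean()
    signs = np.where(delta >= 0, 1.0, -1.0).astype(delta.dtype)
    return scale * signs

def rel_error(approx, exact):
    """Relative Frobenius-norm error of an approximation."""
    return float(np.linalg.norm(approx - exact) / np.linalg.norm(exact))

rng = np.random.default_rng(0)
delta = (0.01 * rng.standard_normal((64, 64))).astype(np.float32)

sparse = sparsify(delta, sparsity=0.9)
binary = one_bit(delta)

print("non-zeros kept:", int(np.count_nonzero(sparse)), "of", delta.size)
print("sparse relative error:", rel_error(sparse, delta))
print("1-bit relative error: ", rel_error(binary, delta))
```

The synthetic Gaussian delta is only a stand-in; error on real fine-tune deltas will differ.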

int4 uses outlier extraction (the top k% of weights are stored in float16) plus 4-bit quantization for the remainder. This is the strategy used for the Qwen2.5-0.5B example at the start.
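
A minimal sketch of this kind of scheme, under stated assumptions, might look as follows. The function names and the `outlier_frac` parameter are hypothetical, outliers are chosen by magnitude, a single symmetric per-tensor scale is used, and the 4-bit codes are kept one per int8 rather than packed two per byte. The actual .wdelta format may differ on all of these points.

```python
import numpy as np

def quantize_int4_with_outliers(delta, outlier_frac=0.01):
    flat = delta.astype(np.float32).ravel()
    n_out = max(1, int(flat.size * outlier_frac))

    # Largest-magnitude entries are pulled out and stored in float16.
    out_idx = np.argpartition(np.abs(flat), flat.size - n_out)[-n_out:]
    out_val = flat[out_idx].astype(np.float16)

    # The remainder is quantized symmetrically to integer codes in [-7, 7].
    rest = flat.copy()
    rest[out_idx] = 0.0
    max_abs = float(np.abs(rest).max())
    scale = max_abs / 7.0 if max_abs > 0 else 1.0
    codes = np.clip(np.round(rest / scale), -7, 7).astype(np.int8)

    return {"shape": delta.shape, "scale": scale, "codes": codes,
            "out_idx": out_idx, "out_val": out_val}

def dequantize(packed):
    flat = packed["codes"].astype(np.float32) * packed["scale"]
    # Outliers overwrite their quantized (zeroed) slots.
    flat[packed["out_idx"]] = packed["out_val"].astype(np.float32)
    return flat.reshape(packed["shape"])

rng = np.random.default_rng(0)
delta = (0.01 * rng.standard_normal((64, 64))).astype(np.float32)

packed = quantize_int4_with_outliers(delta, outlier_frac=0.01)
recon = dequantize(packed)
max_err = float(np.abs(recon - delta).max())

print("quantization step:", packed["scale"])
print("max abs error:    ", max_err)
```

Removing the outliers first shrinks the range the 4-bit grid has to cover, so the quantization step is smaller and the bulk of the weights are represented more precisely.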

Why not LoRA?

LoRA constrains the delta to be low-rank during training, which limits expressiveness. deltatensors compresses arbitrary full fine-tune deltas after training, so it places no constraints on how you fine-tune.
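
The rank constraint can be made concrete with a small synthetic example. This sketch compares a LoRA-style delta (a product of two thin matrices) with a dense matrix standing in for a full fine-tune delta. The dense Gaussian matrix is an assumption chosen for illustration and is close to a worst case for low-rank approximation; real fine-tune deltas may be considerably more compressible by rank.

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 64, 4

# LoRA-style delta: B @ A with inner dimension r, so its rank is at most r.
B = rng.standard_normal((d, r))
A = rng.standard_normal((r, d))
lora_delta = B @ A

# Stand-in for a full fine-tune delta: a dense matrix with no rank constraint.
full_delta = rng.standard_normal((d, d))

def best_rank_r_error(m, r):
    """Relative Frobenius error of the best rank-r approximation (truncated SVD)."""
    s = np.linalg.svd(m, compute_uv=False)
    return float(np.sqrt((s[r:] ** 2).sum() / (s ** 2).sum()))

print("rank of LoRA-style delta:", int(np.linalg.matrix_rank(lora_delta)))
print("rank-r error, LoRA-style:", best_rank_r_error(lora_delta, r))
print("rank-r error, dense:     ", best_rank_r_error(full_delta, r))
```

A rank-r factorization captures the LoRA-style delta exactly but leaves most of the dense matrix unexplained, which is why a post-hoc method that does not assume low rank is needed for full fine-tunes.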

Roadmap

Lineage: chain multiple .wdelta files to track and reconstruct full fine-tuning histories
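
Since this feature is not yet released, the following is only a conceptual sketch of what chaining could mean, assuming each delta is stored relative to the previous checkpoint. `reconstruct_history` is a hypothetical name, and the deltas here are uncompressed NumPy arrays rather than .wdelta files.

```python
import numpy as np

def reconstruct_history(base, deltas):
    """Apply per-stage deltas in order and return every intermediate checkpoint."""
    history = []
    current = base
    for delta in deltas:
        current = {name: current[name] + delta[name] for name in current}
        history.append(current)
    return history

base = {"w": np.array([1.0, 2.0])}
deltas = [
    {"w": np.array([0.5, -0.5])},   # stage 1, relative to base
    {"w": np.array([0.25, 0.25])},  # stage 2, relative to stage 1
]

history = reconstruct_history(base, deltas)
for stage, checkpoint in enumerate(history, start=1):
    print(f"stage {stage}:", checkpoint["w"])
```

With lossy deltas, approximation error would presumably accumulate along the chain, so a real design would need to decide whether each delta is computed against the original or the reconstructed parent.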

License

MIT

p.s. If you find deltatensors useful, please consider leaving a ⭐ star on the repository to help others find it!
