PostgreSQL extension makes LLM knowledge available as an index for similarity searches, inference


gregburd/pg_infer: A PostgreSQL extension that makes LLM model knowledge available as an index for similarity searches, inference, and more. - Codeberg.org


39 commits, 5 branches, 1 tag, 4 MiB. Languages: Rust 99.7%, other 0.2%.


Latest commit: 7b4a09c38d by Greg Burd


Classify project as experimental (v0.1.0-alpha)

Add maturity notice to README: functional and tested but API unstable,
no production deployments, hardware-specific compute paths, vindex
format not frozen. Sets expectations for early adopters.

2026-05-13 12:42:17 -04:00

 name                | updated                    | last commit
---------------------+----------------------------+-------------
 .forgejo/workflows  | 2026-05-12 12:26:25 -04:00 | Add production hardening: tracing, AVX2 SIMD, tests, CI/CD
 .github             | 2026-05-12 13:17:51 -04:00 | Port upstream Priority 1-2: FP8 dtypes, DeepSeek-V4, MXFP4, config validation, cfg(unix), SVD
 benches             | 2026-05-09 14:01:27 -04:00 | Add pgbench suite + remote-backend deployment doc
 crates              | 2026-05-13 12:31:56 -04:00 | Mark Per-layer FFN type as DONE in heading
 docs                | 2026-05-13 09:45:49 -04:00 | Add mdBook documentation, CUDA prefill, residual cache, PLE test, and model enhancements
 examples            | 2026-05-12 08:11:51 -04:00 | Add attention backward, CachedLayerGraph predict, and fix all clippy warnings
 expected            | 2026-04-27 15:01:19 -04:00 | initial import
 scripts             | 2026-05-09 13:59:29 -04:00 | Add mock-server integration test + live-server runner
 sql                 | 2026-05-12 18:49:04 -04:00 | Address code review findings: AM safety, embedding consistency, describe_layers
 src                 | 2026-05-13 06:07:06 -04:00 | Add Phase 2 AM (SQ8+HNSW), GGUF skip-dequant, and CUDA backend
 .editorconfig       | 2026-05-12 08:11:51 -04:00 | Add attention backward, CachedLayerGraph predict, and fix all clippy warnings
 .gitignore          | 2026-05-12 08:11:51 -04:00 | Add attention backward, CachedLayerGraph predict, and fix all clippy warnings
 Cargo.toml          | 2026-05-12 12:26:25 -04:00 | Add production hardening: tracing, AVX2 SIMD, tests, CI/CD
 demo.sql            | 2026-04-28 12:48:05 -04:00 | WIP
 DESIGN              | 2026-04-27 15:01:19 -04:00 | initial import
 flake.lock          | 2026-05-13 10:45:30 -04:00 | Add Nix flake dev shell and fix merge conflict in infer-inference
 flake.nix           | 2026-05-13 10:45:30 -04:00 | Add Nix flake dev shell and fix merge conflict in infer-inference
 LICENSE             | 2026-04-27 15:04:48 -04:00 | add license
 pg_infer.control    | 2026-05-12 08:11:51 -04:00 | Add attention backward, CachedLayerGraph predict, and fix all clippy warnings
 README              | 2026-05-13 12:42:17 -04:00 | Classify project as experimental (v0.1.0-alpha)
 renovate.json       | 2026-05-12 13:17:51 -04:00 | Port upstream Priority 1-2: FP8 dtypes, DeepSeek-V4, MXFP4, config validation, cfg(unix), SVD
 rust-toolchain.toml | 2026-05-12 08:11:51 -04:00 | Add attention backward, CachedLayerGraph predict, and fix all clippy warnings
 SECURITY.md         | 2026-05-12 08:11:51 -04:00 | Add attention backward, CachedLayerGraph predict, and fix all clippy warnings

README

pg_infer

Status: EXPERIMENTAL (v0.1.0-alpha)

A PostgreSQL extension that exposes transformer model knowledge as
SQL-queryable relations. Built on pgrx 0.17 for PostgreSQL 18+.

This project is at an early experimental stage. It is functional, tested
(769+ tests passing), and performant on supported hardware, but:

- The SQL API may change without notice between releases
- No production deployments exist yet
- Some compute paths require specific hardware (Apple Metal, NVIDIA CUDA)
- The vindex format is not yet frozen

Use pg_infer if you want to explore what's possible at the intersection
of transformer model internals and relational databases. Do not use it
for production workloads without accepting the risk of breaking changes.

pg_infer lets you ask questions about what a language model "knows" --
its internal feature activations, learned associations, and semantic
relationships -- directly from SQL, without running inference.

QUICK START

-- Load a vindex (extracted model knowledge)
SELECT infer_create_model('qwen05b', '/data/qwen-0.5b.vindex');
-- infer_create_model
-- qwen05b
-- (1 row)

-- What does the model know about France?
SELECT * FROM describe('France');
--  relation  |   target   | confidence | layer
-- -----------+------------+------------+-------
--  capital   | Paris      |       42.7 |    18
--  language  | French     |       38.1 |    17
--  continent | Europe     |       35.4 |    16
--  currency  | euro       |       29.8 |    19
--  leader    | president  |       24.3 |    20
-- (5 rows)

--...
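Because describe() appears in a FROM clause above, its rows can presumably be filtered and sorted like any other relation. The sketch below is an illustration only, not pg_infer itself. It copies the sample rows from the README output into an in-memory SQLite table so that it runs without the extension. The table name describe_france and the confidence cutoff of 30 are hypothetical, chosen only for the example.

```python
import sqlite3

# Stand-in for the rows that describe('France') returns in the README's
# sample output. With pg_infer installed these would come from the
# extension itself; the table name describe_france is hypothetical.
SAMPLE_ROWS = [
    ("capital", "Paris", 42.7, 18),
    ("language", "French", 38.1, 17),
    ("continent", "Europe", 35.4, 16),
    ("currency", "euro", 29.8, 19),
    ("leader", "president", 24.3, 20),
]

# The confidence cutoff of 30 is arbitrary, chosen only for illustration.
QUERY = """
    SELECT relation, target, confidence
    FROM describe_france
    WHERE confidence > 30
    ORDER BY confidence DESC
"""


def strongest_relations():
    """Run QUERY against an in-memory copy of the sample rows."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE describe_france "
            "(relation TEXT, target TEXT, confidence REAL, layer INTEGER)"
        )
        conn.executemany(
            "INSERT INTO describe_france VALUES (?, ?, ?, ?)", SAMPLE_ROWS
        )
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for relation, target, confidence in strongest_relations():
        print(f"{relation:<10} {target:<10} {confidence}")
```

Against a real pg_infer installation, the same WHERE and ORDER BY clauses would presumably go directly on SELECT ... FROM describe('France'). That is an assumption based on the sample output, and the SQL API is documented as unstable.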
