Show HN: Shivvr – Ephemeral semantic embedding and cognitive agent service


shivvr 🔪 v0.3.0

Ephemeral semantic embedding & cognitive agent service.

Chunk text. Embed with GTR-T5-base. Search via hybrid FST-BM25 vector rank fusion. Stream turn-based cognitive agent reasoning via native Model Context Protocol (MCP).


Status: uptime 2298s · compute CUDA · inversion enabled

Capabilities

Ingest

Sentence-boundary chunking + GTR-T5-base embeddings (768d). Chunks are held in memory behind an RwLock and never written to disk: pure ephemeral compute.
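As a rough illustration of sentence-boundary chunking, here is a minimal sketch. It is not shivvr's chunker: the regex splitter, the `chunk_sentences` name, and the `max_chars` limit are all assumptions made for the example.

```python
import re

def chunk_sentences(text, max_chars=200):
    """Group whole sentences into chunks of at most max_chars (hypothetical limit)."""
    # Split after ., ! or ? followed by whitespace; a naive boundary rule.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s.strip()]
    chunks, current = [], ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            # Adding this sentence would overflow, so close the current chunk.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

A sentence longer than `max_chars` is kept whole rather than cut mid-sentence, which is the point of splitting on sentence boundaries.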

FST-BM25 Hybrid Search

Reciprocal Rank Fusion (RRF) blends dense vector rankings with sparse BM25 lexical rankings. FST dictionary scanning adds microsecond-scale query guardrails and intent-entity score boosting.
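Reciprocal Rank Fusion itself is simple to sketch: each chunk scores the sum of 1 / (k + rank) across the ranked lists it appears in. The version below is the textbook form, not shivvr's code; the constant `k=60` is the conventional default and an assumption here, as are the chunk IDs.

```python
def rrf_fuse(rankings, k=60):
    """Fuse several ranked lists of chunk IDs with Reciprocal Rank Fusion."""
    scores = {}
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            # Each list contributes 1 / (k + rank) for the chunks it contains.
            scores[chunk_id] = scores.get(chunk_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for determinism.
    return sorted(scores, key=lambda chunk_id: (-scores[chunk_id], chunk_id))

dense_ranking = ["c1", "c2", "c3"]    # hypothetical vector-search order
lexical_ranking = ["c3", "c1", "c4"]  # hypothetical BM25 order
fused = rrf_fuse([dense_ranking, lexical_ranking])
```

Because only ranks are used, the dense and lexical scores never need to be put on a common scale.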

Cognitive GhostAgent

Turn-based agent loop with integrated memory search, document ingestion, session indexing, and sandboxed command execution. Powered by OpenAI/Anthropic.
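The agent streams its turns over Server-Sent Events, so a client needs a small SSE parser. The sketch below handles the generic `event:` / `data:` wire format only. The event names in the sample (`thought`, `answer`) are hypothetical, since the page does not list the names shivvr emits.

```python
def parse_sse(stream_text):
    """Parse captured SSE text into a list of (event_name, data) tuples."""
    events = []
    event_name, data_lines = "message", []
    for line in stream_text.splitlines():
        if line == "":
            # A blank line dispatches the event collected so far.
            if data_lines:
                events.append((event_name, "\n".join(data_lines)))
            event_name, data_lines = "message", []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # one leading space after the colon is dropped
        if field == "event":
            event_name = value
        elif field == "data":
            data_lines.append(value)
    if data_lines:
        # Flush a trailing event; a strict SSE parser would discard it instead.
        events.append((event_name, "\n".join(data_lines)))
    return events

# Hypothetical capture; real event names and payloads may differ.
sample = "event: thought\ndata: searching memory\n\nevent: answer\ndata: Known Opossum\n\n"
parsed = parse_sse(sample)
```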

Native MCP Server

Complete Model Context Protocol HTTP/SSE server endpoint (/mcp/sse) allowing immediate, zero-config integration with Claude Code, Antigravity, and Codex.
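MCP messages are JSON-RPC 2.0 objects. For clients that talk to the endpoint directly rather than through an MCP-aware tool, a request body can be built as in this sketch. `tools/list` is a standard MCP method; which tools shivvr returns is not stated on this page, and the helper name is an assumption.

```python
import json

def jsonrpc_request(method, params=None, request_id=1):
    """Serialize a JSON-RPC 2.0 request suitable for POSTing to an MCP server."""
    message = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        message["params"] = params
    return json.dumps(message)

# Ask the server which tools it exposes.
body = jsonrpc_request("tools/list")
```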

Crypto

Per-agent orthogonal matrix rotation applied to embeddings. Cosine similarity is preserved under encryption, so encrypted vectors remain searchable. Keys are held in memory only.
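Why an orthogonal transform preserves cosine similarity can be shown in a few lines. This sketch composes two Householder reflections, which together form a rotation. It stands in for shivvr's key scheme, which the page does not describe, so `make_key`, the seed, and the key layout are assumptions.

```python
import math
import random

def make_key(dim, seed):
    """Hypothetical key: two random unit vectors, each defining a reflection."""
    rng = random.Random(seed)
    key = []
    for _ in range(2):
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(x * x for x in v))
        key.append([x / norm for x in v])
    return key

def reflect(vec, unit):
    # Householder reflection x - 2 (u . x) u: orthogonal and its own inverse.
    dot = sum(a * b for a, b in zip(vec, unit))
    return [a - 2.0 * dot * b for a, b in zip(vec, unit)]

def encrypt(vec, key):
    # Two reflections compose into a rotation.
    return reflect(reflect(vec, key[0]), key[1])

def decrypt(vec, key):
    # Undo the reflections in reverse order.
    return reflect(reflect(vec, key[1]), key[0])

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))
```

Orthogonal maps preserve dot products and norms, so `cosine(encrypt(a, key), encrypt(b, key))` matches `cosine(a, b)` up to floating-point error, and search can run on rotated vectors without decrypting them.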

Dual embedding

The organize role uses GTR-T5-base (768d). The retrieve role uses OpenAI text-embedding-ada-002 (1536d); pass your own key per request or set one server-side.

API

Method  Endpoint                    Description
GET     /health                     Status, model info, live counts
GET     /mcp/sse                    MCP Server SSE handshake
POST    /mcp/message                MCP Server JSON-RPC message router
POST    /sessions/:id/agent/chat    Stream non-blocking GhostAgent cognitive turns (SSE)
POST    /sessions/:id/ingest        Chunk + embed text into session
GET     /sessions/:id/search?q=...  Semantic search (supports RRF hybrid & lexical_only)
GET     /sessions/:id               Session metadata
DELETE  /sessions/:id               Delete session
GET     /temp                       List temp stores with TTL
POST    /temp/:name/ingest          Ingest into temp store (2 hr TTL)
GET     /temp/:name/search?q=...    Search temp store
DELETE  /temp/:name                 Delete temp store
POST    /agent/:id/register         Register per-agent orthogonal key
POST    /agent/:id/encrypt          Encrypt embeddings
POST    /agent/:id/decrypt          Decrypt embeddings
POST    /invert                     Reconstruct text from embedding vector

Quick start

# Ingest into session
curl -X POST https://shivvr.nuts.services/sessions/my-session/ingest \
  -H "Content-Type: application/json" \
  -d '{"text": "Supreme Raven is protected by Known Opossum.", "source": "vault_specs"}'

# Autonomous Agent Conversational Chat (Streams Thoughts, ToolCalls, & Answer via SSE)
curl -i -X POST http://localhost:8085/sessions/my-session/agent/chat \
  -H "Content-Type: application/json" \
  -d '{"message": "Who protects the Supreme Raven?"}'

# Vector-Lexical Hybrid RRF Search
curl "http://localhost:8085/sessions/my-session/search?q=Known+Opossum&hybrid=true"

# High-speed Lexical-Only BM25 Search (Bypasses ONNX embedder)
curl "http://localhost:8085/sessions/my-session/search?q=Opossum&lexical_only=true"

# Synchronize Claude Code or Antigravity with shivvr's Native MCP Server
nemesis8 mcp add http://localhost:8085/mcp/sse
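For scripted use, the search URLs from the curl examples can be assembled programmatically. The helper below is a sketch with a hypothetical name. It only builds the URL and sends nothing, and it assumes booleans are passed as `true` / `false`, as the curl examples suggest.

```python
from urllib.parse import quote, urlencode

def build_search_url(base_url, session_id, query, **params):
    """Build a GET /sessions/:id/search URL from keyword parameters."""
    pairs = {"q": query}
    for name, value in params.items():
        if isinstance(value, bool):
            # Match the lowercase booleans used in the curl examples.
            value = "true" if value else "false"
        pairs[name] = value
    return f"{base_url}/sessions/{quote(session_id, safe='')}/search?{urlencode(pairs)}"

hybrid_url = build_search_url(
    "http://localhost:8085", "my-session", "Known Opossum", hybrid=True
)
```

The resulting string can be passed to `urllib.request.urlopen` or any HTTP client.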

Search parameters

Param                 Default   Description
q                     required  Query text
n                     5         Number of results
hybrid                false     Blend semantic vectors + BM25 scores (Reciprocal Rank Fusion)
lexical_only          false     Bypass vector embedder, execute pure BM25 search
guardrail             true      Enable FST toxic term scanning and automatic query blocking
role                  organize  organize (768d local) or retrieve (1536d OpenAI)
time_weight           0.0       Blend semantic + recency score (0–1)
decay_halflife_hours  168       Recency decay half-life in hours
include_nearby        false     Return temporally adjacent chunks
agent_id              -         Agent ID for encrypted search
openai_api_key        -         Per-request OpenAI key for retrieve role (overrides server key)
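To give a feel for how `time_weight` and `decay_halflife_hours` might interact, here is a sketch of a half-life recency blend. The exact formula shivvr uses is not documented on this page. The exponential decay and linear blend below are assumptions that only mirror the parameter names and defaults.

```python
def blended_score(semantic, age_hours, time_weight=0.0, decay_halflife_hours=168.0):
    """Blend a semantic score with a recency score (assumed formula)."""
    # Recency halves every decay_halflife_hours: 1.0 when new, 0.5 after one half-life.
    recency = 0.5 ** (age_hours / decay_halflife_hours)
    # time_weight = 0 ignores recency; time_weight = 1 ranks by recency alone.
    return (1.0 - time_weight) * semantic + time_weight * recency

# A chunk one week old (168 h), weighted half semantic and half recency.
score = blended_score(0.8, age_hours=168.0, time_weight=0.5)
```

Under these assumptions, the default `time_weight` of 0.0 leaves ranking purely semantic, and the 168-hour default means a week-old chunk carries half the recency score of a fresh one.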

Environment

Variable                Default                                  Description
PORT                    8080                                     Listen port
MODEL_PATH              models/gtr-t5-base.onnx                  GTR-T5-base ONNX embedder
TOKENIZER_PATH          models/tokenizer.json                    Tokenizer
OPENAI_API_KEY          -                                        Enables OpenAI completions and retrieve embeddings
ANTHROPIC_API_KEY       -                                        Enables Anthropic completions and GhostAgent loops
NUTS_AUTH_JWKS_URL      -                                        Enable auth (open dev mode if unset)
NUTS_AUTH_VALIDATE_URL  https://auth.nuts.services/api/validate  API token validation endpoint

Stack

Layer         Choice
Runtime       Rust + Tokio + axum
Cognition     GhostAgent cognitive RAG turn loop (OpenAI / Anthropic compat)
MCP Server    HTTP/SSE JSON-RPC 2.0 Model Context Protocol transport layer
Hybrid Index  Tantivy FST deterministic phrase engine + BM25F field indexer
Embedding     GTR-T5-base (768d) via ONNX Runtime 2.0 (local, required)
Storage       Ephemeral RwLock (no disk, no volume mounts)
GPU           CUDA 12.6 via ort EP on Cloud Run L4 (CPU fallback automatic)
Inversion     vec2text gtr-base (projection + T5 enc/dec) (optional)


shivvr · Rust + ONNX Runtime

DeepBlueDynamics · nuts.services · hyperia · @deepbluedynamic
