Dropstone 1.5: Technical Report

Dropstone 1.5: Technical Report — Research | Blankline svg]:px-3 h-9 py-0 px-4 text-[12px] font-semibold uppercase tracking-[0.08em] rounded-none shadow-none focus-visible:z-10 transition-colors duration-300 bg-cyan-500 text-zinc-900 hover:bg-cyan-400" style="font-family:var(--font-sf-pro), -apple-system, BlinkMacSystemFont, sans-serif">Try Dropstone

li::marker]:!text-cyan-600 [&_ol>li::marker]:!font-semibold [&_ul>li::marker]:!text-cyan-600 prose-headings:font-sans prose-headings:font-semibold prose-headings:tracking-tight prose-headings:text-cyan-500 prose-p:font-sans prose-p:text-zinc-600 prose-p:leading-[1.8] prose-p:text-[17px] prose-a:text-cyan-600 prose-a:no-underline hover:prose-a:underline prose-strong:text-cyan-600 prose-strong:font-semibold prose-blockquote:border-l-zinc-300 prose-blockquote:text-zinc-500 prose-blockquote:not-italic prose-img:rounded-xl prose-img:border prose-img:border-black/[0.08] prose-li:font-sans prose-li:text-zinc-600 prose-li:text-[17px] ">Abstract Dropstone is a versioned, model-agnostic agentic coding runtime. The version number tracks the integration generation, not the underlying weights. Each release cycle we re-evaluate the leading open-weight frontier models on a public eval harness (the Joule Index) and re-baseline to whichever wins on the agentic-coding workload. Dropstone 1.5, the generation reported here, composes DeepSeek V4 Flash for the Fast tier, DeepSeek V4 Pro for the Pro tier, and Moonshot Kimi K2.6 for the Heavy tier. The report documents four things: the eval cycle as a product mechanism, the runtime architecture that makes model origin a non-issue for end-user security, the inference path on US-hosted providers, and the cost mechanics that let Pro users sustain roughly 450 heavy-coding turns per week at $ 15 a month. The cost claim is grounded in measured prefix-cache behavior on real Dropstone sessions. Sessions reach a steady-state hit rate above 95% once the cache warms, with a population-mean per-turn hit rate of approximately 82% averaged across mixed session lengths. We are explicit about what we do not claim. We did not pre-train the underlying models. We cannot audit their weights. The cache economics come from vLLM-style prefix caching at a US-hosted provider, not from DeepSeek's first-party disk cache. The report's contributions are a published cadence for monthly re-baselining of a coding CLI, a measurement of cache hit rates on real agentic coding sessions, and a cost model (Session-Amortized Token Cost, or SATC) that explains the per-turn economics directly. 1. What Dropstone Is Dropstone is not a foundation model. It is a runtime that turns an open-weight frontier model into a safe, affordable, agentic coding CLI. The runtime owns five things the underlying model does not. The agent loop. Planning, tool dispatch, multi-step execution, recovery from tool errors. The safety boundary. Explicit user approval before any state-changing action. The inference path. US-hosted providers, no retention, region-constrained routing. The pricing mechanism. Credit-based billing whose unit cost reflects measured cache economics rather than naive per-token list price. The eval cycle. A recurring public process that decides which open-weight model the next generation composes. The model is a swappable component. The runtime is the product. That sentence determines how we name releases, publish results, and price the product. "Dropstone 1.5" denotes the generation of the runtime that, at the moment of release, composes DeepSeek V4 Flash (Fast), DeepSeek V4 Pro (Pro), and Moonshot Kimi K2.6 (Heavy). "Dropstone 1.6" will denote whichever composition wins the next eval cycle. Users do not re-platform when a better open model lands. We do that for them. Perplexity has routed conversational queries across Claude, GPT, and DeepSeek for some time. As far as we know, Dropstone is the first coding-CLI vendor to formalize monthly re-baselining with public evals as the product cadence and to tie release version numbers to the integration generation. Cline and Aider expose model choice to the user via BYOK. Cursor and Claude Code lock to a single lab's family. We pick, we measure, and we publish. This report is the first artifact of the cycle. 1.1 Why Dropstone instead of going direct to the underlying model The underlying open-weight models are publicly available. Anyone can sign up for DeepSeek's API or self-host Kimi K2.6 on rented GPUs. The question follows: what does Dropstone add over going direct? Six things, each of which the user would otherwise have to build, buy, or absorb as risk. US-hosted compliance by default. DeepSeek's first-party API is hosted in China. Many US and EU buyers cannot route inference there under their compliance posture. Every Dropstone request lands at a US-hosted endpoint with data_collection: deny enforced at the API layer. The user configures nothing. Predictable flat pricing. 15 dollars a month replaces metered API anxiety. A...

Dropstone 1.5: Technical Report

Related Articles

The Newest Instagram "Exploit" Is the Goofiest I've Seen

It's Not Just X. It's Y

Amazon, Facebook, FBI have access to a private intelligence-sharing network

Show HN: GoPeek – open links in live mini browser windows without new tabs

Agent Memory: An Anatomy