Killing a `Cow` made my JSON formatter 42% faster



Em dashes: 12

May 5, 2026

A formatter styles code consistently, usually improving human readability. If you are unfamiliar with formatters, tokenizers, or abstract syntax trees, see my article on how formatters work

The JavaScript ecosystem has a gap—most formatters are either featureful or fast. Prettier is the standard in the JavaScript ecosystem, but it’s not the fastest. Oxfmt has made major strides—supporting parity with Prettier’s JavaScript formatting with much improved performance. One report found a migration to Oxfmt formatted their repo 6.5x faster—from 13.9 seconds to 2.1

However, these gains do not apply to JSON. Oxfmt is 10–20% slower than Prettier for JSON. Oxfmt has a native JavaScript formatter, but it reasonably delegates to Prettier for file types without native implementations[1]. Oxfmt plans for a native JSON implementation this year!

Prettier and Oxfmt both take more than a second to format a 2MB JSON file, while my formatter, JJPWRGEM, takes around 30 milliseconds

Dense floats benchmark: execution time

Parsing and stringifying a 2.2MB JSON file with lots of lightly nested arrays. See the full benchmarks for more details

| Command | Median Time |
| --- | --- |
| jsonxf | 14.0ms |
| jjp | 31.0ms |
| bun | 70.0ms |
| node | 89.0ms |
| jq | 112.0ms |
| dprint | 665.0ms |
| prettier | 1149.0ms |
| oxfmt | 1218.0ms |

To be fair, JJPWRGEM doesn’t have Prettier feature parity—but what if it did? It’s a great base, and I am confident there are performance gains to be found while implementing features

I started with a well scoped and impactful change—normalizing numbers like Prettier—and I ended up speeding up my formatter by 42% in the process. If you’re interested in how that was calculated, jump to the final benchmarks

Number normalization

Let’s start with 3 normalizations related to scientific notation for simplicity. They all build nicely on each other

Lowercases exponent symbol (from 1E5 to 1e5)

Strips + (from 1e+5 to 1e5)

Removes 0 exponents (from 1e0 to 1)

Baseline

At the moment, JJPWRGEM leaves numbers untouched and represents them as a single Cow. I’ll explain more about Cow as relevant

I am benchmarking against several files with varying data shapes. Here are a few samples

dense floats

```json
{ "type": "FeatureCollection",
"features": [{
"type": "Feature",
"properties": { "name": "Canada" },
"geometry": {"type":"Polygon","coordinates":[
[[-65.613616999999977,43.420273000000009],
[-65.619720000000029,43.418052999999986],
[-65.625,43.421379000000059], ...]
]}
}]
```

many objects and integers

```json
{
"events": {
"138586341": {
"description": null,
"id": 138586341,
"name": "30th Anniversary Tour",
"subTopicIds": [337184269, 337184283],
"topicIds": [324846099, 107888604]
},
"areaNames": {
"205705993": "Arrière-scène central",
"205705994": "1er balcon central"
```

strings and deep nesting

```json
{
"statuses": [{
"metadata": { "result_type": "recent", "iso_language_code": "ja" },
"created_at": "Sun Aug 31 00:29:15 +0000 2014",
"id": 505874924095815700,
"text": "@aym0566x \n\n名前:前田あゆみ\n第...",
...
}]
```

Deser refers to deserializing into an abstract syntax tree. Performance is reported as throughput to avoid size-based discrepancies. A 1MB file will be formatted faster than a 3MB file, but MB/s is comparable for both!
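To make the throughput comparison concrete, here is a minimal sketch of the arithmetic. The function name and the timings are hypothetical and chosen only for illustration; they are not measurements from these benchmarks.

```rust
/// Throughput in MB/s for `bytes` processed in `seconds`.
/// (Hypothetical helper, not part of JJPWRGEM.)
fn throughput_mb_per_s(bytes: u64, seconds: f64) -> f64 {
    (bytes as f64 / 1_000_000.0) / seconds
}

fn main() {
    // Made-up timings: the 3MB file takes three times as long as the 1MB file.
    let small = throughput_mb_per_s(1_000_000, 0.004);
    let large = throughput_mb_per_s(3_000_000, 0.012);

    // The wall-clock times differ, but the throughput is the same.
    println!("small: {:.1} MB/s, large: {:.1} MB/s", small, large);
}
```

The larger file takes longer in absolute terms, yet both runs report the same MB/s, which is why throughput is the fairer number to compare across differently sized inputs.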

| benchmark | baseline (MB/s) |
| --- | --- |
| deser/dense floats | 115.6 |
| deser/many objects and integers | 385.5 |
| deser/strings and deep nesting | 257.8 |

Iteration 0: Build a new normalized string

The simplest approach is to build new strings for improperly formatted numbers

Unfortunately, each number needs to be traversed twice, once to find the range and again to validate it. Even if there are no numbers in our input, it requires slow heap allocations
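As a rough sketch of what "build a new string" could look like for the three exponent rules, consider the function below. The name `normalize_number` is hypothetical, the input is assumed to already be a valid JSON number, and dropping a negative zero exponent such as `1e-0` is my assumption. This is not JJPWRGEM's actual code.

```rust
/// Normalizes the exponent of a JSON number by always returning a new `String`.
/// Assumes `number` has already been validated as a JSON number.
fn normalize_number(number: &str) -> String {
    // Scan for the exponent marker, if there is one.
    let e_index = match number.find(|c: char| c == 'e' || c == 'E') {
        Some(index) => index,
        // No exponent: the text is unchanged, but we still allocate a copy.
        None => return number.to_owned(),
    };
    let mantissa = &number[..e_index];
    let exponent = &number[e_index + 1..];

    // Rule 2: strip a leading `+`.
    let exponent = exponent.strip_prefix('+').unwrap_or(exponent);

    // Rule 3: drop the exponent entirely when it is zero.
    let digits = exponent.strip_prefix('-').unwrap_or(exponent);
    if digits.bytes().all(|b| b == b'0') {
        return mantissa.to_owned();
    }

    // Rule 1: always write a lowercase `e` while copying both parts.
    let mut out = String::with_capacity(mantissa.len() + 1 + exponent.len());
    out.push_str(mantissa);
    out.push('e');
    out.push_str(exponent);
    out
}

fn main() {
    for input in ["1E5", "1e+5", "1e0", "-65.625"] {
        println!("{} -> {}", input, normalize_number(input));
    }
}
```

Every call returns an owned `String`, so even a number that is already normalized, like `-65.625`, costs a heap allocation and a copy.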

We need a more complex approach that takes references to chunks of the input string

Iteration 1: 2 Cows

Only 2 parts of each number matter: the mantissa to the left of the e and the exponent to the right. You may recognize this as scientific notation: 1e5 represents $1 \times 10^5$

Say we have 1E+5. We can store 1 and 5 and reconstruct 1e5. If we have an exponent, the e or E is always written back in lowercase. Any +s can be skipped since 1e5 and 1e+5 are equivalent
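The split-and-reconstruct idea can be sketched as follows. For brevity this uses plain `&str` slices, and both helper names (`split_number`, `write_number`) are hypothetical. It assumes the input is a valid JSON number and skips the zero-exponent rule.

```rust
/// Splits a valid JSON number into borrowed parts: the mantissa and, if
/// present, the exponent without its `e`/`E` marker or a leading `+`.
fn split_number(number: &str) -> (&str, Option<&str>) {
    match number.find(|c: char| c == 'e' || c == 'E') {
        Some(index) => {
            let exponent = &number[index + 1..];
            let exponent = exponent.strip_prefix('+').unwrap_or(exponent);
            (&number[..index], Some(exponent))
        }
        None => (number, None),
    }
}

/// Writes the number back out; the exponent marker is always a lowercase `e`.
fn write_number(out: &mut String, mantissa: &str, exponent: Option<&str>) {
    out.push_str(mantissa);
    if let Some(exponent) = exponent {
        out.push('e');
        out.push_str(exponent);
    }
}

fn main() {
    let (mantissa, exponent) = split_number("1E+5");
    let mut out = String::new();
    write_number(&mut out, mantissa, exponent);
    println!("{}", out);
}
```

Both parts are slices of the original input, so nothing is copied until the formatter writes its output.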

```rust
use std::borrow::Cow;

pub enum Token<'a> {
    Number {
        // Digits to the left of the `e`
        mantissa: Cow<'a, str>,
        // Digits to the right of the `e`
        exponent: Cow<'a, str>,
    },
    // ...
}
```

This approach works, but lowers performance across the board. Cow comes with overhead that a raw str avoids, which also hurts datasets without many numbers
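One plausible source of that overhead is size. This is my assumption, not something measured here: a `Cow<str>` has to be able to hold an owned `String`, so it is larger than a borrowed `&str`, and every read has to check which variant it holds. A quick way to see the size difference on your own machine:

```rust
use std::borrow::Cow;
use std::mem::size_of;

fn main() {
    // `&str` is a pointer plus a length.
    println!("&str:     {} bytes", size_of::<&str>());

    // `Cow<str>` must also fit a `String` (pointer, length, capacity).
    println!("Cow<str>: {} bytes", size_of::<Cow<'static, str>>());
}
```

Two `Cow` fields per number token multiply whatever difference this prints.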

| benchmark | baseline (MB/s) | this (MB/s) | delta |
| --- | --- | --- | --- |
| deser/dense floats | 115.6 | 101.1 | -12.5% |
| deser/many objects and integers | 385.5 | 289.3 | -25.0% |
| deser/strings and deep nesting | 257.8 | 214.8 | -16.7% |

What is a Cow anyways?

Cow stands for Clone On Write. It references data until mutation is needed, then duplicates it. This pays off when most values are unchanged—only data needing modification is cloned
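A minimal sketch of that behavior using the standard library's `std::borrow::Cow` (the values are arbitrary and only for illustration):

```rust
use std::borrow::Cow;

fn main() {
    let input = String::from("1e5");

    // Borrowed: `number` points at `input`, nothing is copied.
    let mut number: Cow<str> = Cow::Borrowed(&input);
    assert!(matches!(number, Cow::Borrowed(_)));

    // The first mutation clones the data into an owned `String`.
    number.to_mut().push('0');
    assert!(matches!(number, Cow::Owned(_)));

    // The original input is untouched; only the clone changed.
    assert_eq!(input, "1e5");
    assert_eq!(number, "1e50");
}
```

Until `to_mut` is called, the `Cow` is just a reference. The clone happens once, at the first write.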

A to_lowercase function is a great use case—if the input is already lowercase, you can return a reference to the same string! In Rust terms, the data is “borrowed”

```rust
fn to_lowercase(string: &str) -> Cow<'_, str> {
    if...
```
