We Built NeuroAutomata: protein variant effect prediction

Why We Built NeuroAutomata | Axon Agentic


About This Post

Hello everyone,

Before the team’s writeup on NeuroAutomata, a note on who wrote what and how this post was produced.

NeuroAutomata is built on ESM-2 (Evolutionary Scale Modeling 2), a protein language model developed by Meta AI. I built the tool with agentic AI tools, similar to the approach I used for “Building an AI Multi-Agent System to Enable Natural Language Queries for Human Protein Atlas (HPA) data”. The HPA project is still under active development.

In my past roles at life sciences and biotech companies, I kept running into the same pattern: my scientist co-workers had web-based tools that needed updates, and waiting for internal IT meant launch dates slipped and marketing goals were missed. Because I also code, I’d help update their apps and run the marketing alongside. That pattern is how NeuroAutomata started: it is a tool I’d want to hand to those same co-workers.

I’m the only human (solopreneur) building this at the time of writing, so I rely on a set of AI agents for specific roles (meet the full AI staff):

Amara — marketing director (strategy, content, outreach)

Astro — website and frontend engineering

Kiran — product manager for NeuroAutomata (product direction, software engineering, scientific research)

Veritas — independent claim verification (fact-checking against primary sources)

Folio — content authoring (blog posts, methodology and landing pages, shared glossary)

Scout — source resolution (finding open-access versions of cited papers, confirming links stay live)

Regarding AI-generated content: I adapted what I learned from the HPA project and built a verification system to reduce AI-generated slop. My take is that this is early days. LLMs have come a long way, but they are far from perfect. I take responsibility for what I publish. Every serious blog post is run through Veritas, our independent verification agent. Veritas extracts claims, checks them against primary sources across numerical, clinical, and semantic layers, and blocks publication on contradictions. Claims that could not be verified from primary sources are flagged in the post or in the verification report rather than silently passed. The verification report for this post is linked here.
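As a rough illustration of that gate (not Veritas's actual code; the class names, field names, and function name below are hypothetical), the publish decision can be sketched in Python. A contradiction blocks publication, while an unverified claim is flagged and does not block:

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Tuple


class Verdict(Enum):
    SUPPORTED = "supported"        # primary source agrees with the claim
    CONTRADICTED = "contradicted"  # primary source disagrees with the claim
    UNVERIFIED = "unverified"      # no primary source could be checked


@dataclass
class ClaimCheck:
    claim: str
    layer: str        # assumed labels: "numerical", "clinical", or "semantic"
    verdict: Verdict


def publication_gate(checks: List[ClaimCheck]) -> Tuple[bool, List[str]]:
    """Return (may_publish, claims_to_flag) for one post.

    Hypothetical policy sketch: any contradiction blocks the post;
    unverified claims are collected so they can be flagged to readers.
    """
    blocked = any(c.verdict is Verdict.CONTRADICTED for c in checks)
    flagged = [c.claim for c in checks if c.verdict is Verdict.UNVERIFIED]
    return (not blocked, flagged)
```

The point of returning the flagged list separately is that an unverifiable claim is surfaced to the reader instead of being treated as either a pass or a failure.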

This disclosure approach is new as of April 2026. Earlier posts predate this policy; I am applying it going forward, and both the disclosure and the underlying policy will continue to evolve.

With that, here is the team’s analysis of why we built NeuroAutomata.

ESM-2 Ranks 45th of 97 on the Live ProteinGym Leaderboard. It Also Requires a GPU to Run.

You have 50 VUS (variants of uncertain significance: genetic variants found in a patient that haven't been classified as definitively pathogenic or benign) in your queue. A patient review is scheduled. Your current tool is PolyPhen-2.

PolyPhen-2 was published in 2010. Protein language models that outperform it by a documented margin have existed since 2021. (A protein language model, or PLM, is a deep learning model trained on millions of protein sequences to predict how mutations affect function; NeuroAutomata uses ESM-2, a PLM developed by Meta AI.) The problem isn’t that better tools don’t exist; it’s that running them requires a Python environment, GPU access, 2.5 GB of model weight downloads, and custom code for tokenization and output formatting. That’s not a workflow for a pharmacogenomics lab under clinical time pressure, where pharmacogenomics is the study of how genetic variants affect drug response: which patients metabolize drugs faster, slower, or differently due to inherited differences in drug-metabolizing enzymes. It’s a side project.
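To give a sense of the glue code involved, here is a minimal Python sketch of the input parsing and scoring step that sits around a protein language model. It assumes the common log-likelihood-ratio convention (log-probability of the mutant residue minus that of the wild-type residue at the same position) and takes per-position log-probabilities as input instead of calling ESM-2 itself. The function names and the input layout are hypothetical, and this is not a description of NeuroAutomata's implementation.

```python
import re
from typing import Dict, List, Tuple

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
_VARIANT_RE = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_variant(variant: str, sequence: str) -> Tuple[str, int, str]:
    """Parse a substitution such as 'R175H' and check it against the sequence.

    Returns (wild_type, 1-based position, mutant). Raises ValueError when the
    string is malformed or the wild-type residue does not match the sequence.
    """
    match = _VARIANT_RE.match(variant)
    if match is None:
        raise ValueError(f"not a single-residue substitution: {variant!r}")
    wt, pos, mut = match.group(1), int(match.group(2)), match.group(3)
    if wt not in AMINO_ACIDS or mut not in AMINO_ACIDS:
        raise ValueError(f"unknown amino acid in {variant!r}")
    if not 1 <= pos <= len(sequence):
        raise ValueError(f"position {pos} is outside the sequence")
    if sequence[pos - 1] != wt:
        raise ValueError(
            f"sequence has {sequence[pos - 1]} at {pos}, variant says {wt}"
        )
    return wt, pos, mut


def llr_score(
    variant: str, sequence: str, log_probs: List[Dict[str, float]]
) -> float:
    """Log-likelihood ratio: log p(mutant) - log p(wild type) at the position.

    `log_probs[i]` is assumed to map each amino acid letter to the model's
    log-probability at 0-based position i. More negative scores mean the
    model finds the mutant residue less plausible than the wild type.
    """
    wt, pos, mut = parse_variant(variant, sequence)
    row = log_probs[pos - 1]
    return row[mut] - row[wt]
```

Even this small piece has to handle 1-based variant notation, wild-type mismatches, and malformed input before any model output is touched, which is part of why the setup tends to stay a side project.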

The performance gap between legacy tools and modern protein language models is documented in the ProteinGym substitution benchmark (benchmark design: Notin et al. 2023, NeurIPS), a standardized benchmark suite for protein variant effect predictors covering 217 deep mutational scanning assays across diverse protein families. ESM-2, a protein language model by Meta AI trained on 250 million protein sequences, predicts how amino acid mutations affect protein function from sequence alone, with no structure required. It ranks 45th of 97 models on the live leaderboard. SIFT and PolyPhen-2 were not evaluated in the ProteinGym benchmark.

NOTE

NeuroAutomata scores are Research Use Only (RUO), a regulatory designation meaning the tool provides research scores, not clinical diagnoses. The same label used by REVEL, CADD, AlphaMissense, and...
