The AI x TechBio Bingo Card: A framework for what "good" looks like for startups

The AI x TechBio Bingo | MMC

Advika Jalan, Charlotte Barttelot

19.05.26

Let’s say you’re the type of person who likes to bet. If you had a 2% chance of getting $200 billion (spread over 20 years), would you take it? The catch is that you’ll have to spend $2 billion or so to place the bet, and you’ll only know whether you won at the end of a 10-year period. Oh, and if the financial investment wasn’t daunting enough, millions of lives are at stake if you don’t win.

That’s what drug discovery looks like today.

The probability of success is low (90% of drugs fail to make it through clinical trials), the costs are incredibly high ($2 billion for a drug), and the entire process takes a decade. Even if the drug is commercialised, c.55% of approved drugs don’t make enough money to recover their development costs. Nevertheless, the payoffs can be massive if you get it right: AbbVie’s drug Humira is widely seen as the best-selling drug ever, with over $200 billion in lifetime sales.
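To make the opening bet concrete, here is a minimal back-of-the-envelope sketch in Python. It uses only the figures quoted in the bet (a 2% chance, a $200 billion payoff, a $2 billion stake). The function names are illustrative, and ignoring discounting over the 10-year wait and the 20-year payout is a simplifying assumption, not how a real drug programme would be valued.

```python
def expected_net_bn(p_win: float, payoff_bn: float, stake_bn: float) -> float:
    """Undiscounted expected net value of the bet, in $ billions."""
    return p_win * payoff_bn - stake_bn


def breakeven_probability(payoff_bn: float, stake_bn: float) -> float:
    """Win probability at which the expected payoff equals the stake."""
    return stake_bn / payoff_bn


# Figures from the opening bet (assumption: no discounting, single all-or-nothing payoff).
P_WIN = 0.02
PAYOFF_BN = 200.0
STAKE_BN = 2.0

net = expected_net_bn(P_WIN, PAYOFF_BN, STAKE_BN)
breakeven = breakeven_probability(PAYOFF_BN, STAKE_BN)

print(f"Expected net value: ${net:.1f}bn")
print(f"Break-even win probability: {breakeven:.1%}")
```

On these numbers the undiscounted expected payoff is about $4 billion against a $2 billion stake, and the bet breaks even at a 1% chance of winning. Discounting a payoff that arrives a decade or more later would narrow that margin.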

Against this backdrop, AI x TechBio has been touted as something that evens the odds (by speeding up the time to preclinical candidates, lowering costs and increasing the effectiveness of medicines). That is why we’re seeing Big Pharma make multiple billion-dollar AI deals (think Isomorphic Labs’ $3 billion worth of partnerships with Eli Lilly and Novartis, or NVIDIA’s $1 billion collaboration with Eli Lilly for an AI co-innovation lab). However, Jayatunga et al. report that AI-discovered molecules achieve 80–90% success rates in Phase I of clinical trials (well above historical averages) but drop to around 40% in Phase II (based on a limited sample), which is broadly in line with industry norms.
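One way to read those figures is to multiply the phase success rates together, which gives the chance of an AI-discovered molecule clearing both Phase I and Phase II. The sketch below uses only the rates quoted from Jayatunga et al. Treating them as simple conditional probabilities that multiply is an assumption for illustration, and the Phase II figure rests on a limited sample.

```python
from typing import Iterable


def cumulative_success(phase_rates: Iterable[float]) -> float:
    """Probability of clearing every phase, assuming each rate is
    conditional on having passed the previous phase."""
    prob = 1.0
    for rate in phase_rates:
        prob *= rate
    return prob


# Rates quoted in the text: Phase I at 80-90%, Phase II at around 40%.
PHASE_II = 0.40
low = cumulative_success([0.80, PHASE_II])
high = cumulative_success([0.90, PHASE_II])

print(f"Chance of clearing Phase I and II: {low:.0%} to {high:.0%}")
```

On these assumptions roughly a third of AI-discovered molecules entering Phase I would get through Phase II, so the Phase I advantage is largely given back one stage later.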

Overcoming Phase II failure is critical to validating AI in drug discovery. Clinical trials drive over 60% of total development costs, and Phase II is where most drugs fail to progress. The questions then become: how can we help AI-discovered drugs succeed in Phase II and beyond (as they so clearly have in Phase I)? How can we improve AI drug discovery and development so that we can get the right drugs to the right patients quickly, cheaply, and with improved efficacy and safety?

These are questions we’re answering in our report, based on 40+ interviews with senior pharma leaders and startup founders. We’ve broken down what good looks like for (1) proprietary data; (2) algorithms and models; and (3) lab-in-the-loop infrastructure and agentic AI workflows, and illustrated them with case studies (everything from model generalisation through new architectures like JEPA to curiosity-based agentic workflows driving novel discoveries). We’ve also talked about how to validate early that your AI model works (even if you don’t have clinical trial data for a drug generated using your AI model yet).

"For every patient waiting on a breakthrough, drug discovery has been a decade-long coin flip. AI is rewriting those odds: turning brute-force probability into rapid iteration, tighter biological insight, and faster paths from biology to medicine."

– Daniel Rabina, Healthcare & Life Sciences Startups at AWS

We distilled all of this into what we call the AI x TechBio Bingo card. Think of it as a slightly tongue-in-cheek checklist of everything a compelling TechBio company should be able to say (and, most importantly, prove). From data scale and quality to hypothesis-free discovery and translation models, these tiles capture the ingredients that keep showing up in the most promising approaches. Few companies will tick every box – but the closer you get to a full house, the more likely you are to bend the odds in your favour.

Significant unmet (technological) need: Data

Biology foundation models are only as good as the data they’ve been trained on, yet getting that training data is the hardest problem to solve, for a multitude of reasons – and it is substantially different from trying to train another ChatGPT. For starters:

Generating real biological data takes time: There’s a line in Recursion’s S-1 filing that we really liked: “no amount of resources can compress the time it takes to observe naturally occurring biological processes”, and we fully concur with the sentiment. Or, if you want the more vivid version usually attributed to Warren Buffett, “You can’t produce a baby in one month by getting nine women pregnant.” Real biological processes and their corresponding datasets (e.g. longitudinal patient datasets that span over 10 years) will simply take time to build.

Capturing relevant data is difficult and expensive, because:

We don’t fully understand human biology, so figuring out what to capture and measure itself can be challenging.

Even if you knew what data you wanted to capture and measure, the technologies and methods for it may not yet exist – after all, AlphaFold would not exist without the help of technologies like X-ray crystallography to determine the...
