Biohub Releases Protein Biology World Model to Address Disease
GEN – Genetic Engineering and Biotechnology News
Credit: Christoph Burgsted/Science Photo Library/Getty Images
Biohub, the non-profit research organization co-founded by Priscilla Chan, MD, and Mark Zuckerberg, has now unveiled the latest update to the ESM protein language model family, with expanded capabilities in binder design and protein function mapping for therapeutic discovery. The release comes just seven months after Biohub recruited the team behind EvolutionaryScale.
The system includes ESMC (Evolutionary Scale Modeling Cambrian), a language model trained on approximately 2.8 billion sequences drawn from across the breadth of life, including organisms adapted to extreme environments, and more than 20,000 types of proteins found in the human body. Evolutionary information encoded in ESMC is translated into atomic-resolution protein structures and interactions using the design engine and prediction model, ESMFold2.
Alex Rives, PhD, head of science at Biohub and former chief scientist at EvolutionaryScale, presented the work at this week’s “AI in Biology” symposium at Cold Spring Harbor Laboratory.
These models aim to transform the earliest stages of drug discovery by making biology more programmable. While traditional discovery workflows rely on slow and resource-intensive experimental screens to identify promising drug candidates, rational protein design guided by in silico predictions has the potential to dramatically accelerate development timelines.
"We’re at an exciting point in protein biology where accurate digital representations allow asking experimental questions at a scale that wouldn’t be possible in the laboratory," Rives told GEN Edge.
ESMC provides a foundation for modeling the sequence, structure, and function of proteins. ESMFold2 predicts the structure of proteins and biomolecular complexes. Features derived from the representations of the model capture fundamental principles of structure and function that form a compositional grammar for protein biology. [Biohub]
ESMFold2 designed high-affinity protein binders against five disease targets in cancer and immunology: receptor tyrosine kinases implicated in tumor growth (EGFR and PDGFRβ), immune checkpoints exploited by cancer cells to evade immune surveillance (PD-L1 and CTLA-4), and a regulator of immune cell signaling (CD45).
Lab-validated designs achieved hit rates ranging from 36–88% for compact mini-binders and 15–29% for antibody-derived formats, while also demonstrating nanomolar binding affinity, high specificity, and favorable stability profiles consistent with potential clinical utility. Notably, binders for PD-L1 showed therapeutic function and restored T-cell signaling in laboratory tests by blocking the same pathway as approved checkpoint therapies.
Rather than requiring multiple sequence alignments (MSAs) to build representations, ESMFold2 captures evolutionary information encoded during pretraining. The model also uses a looped transformer architecture, which allows compute to scale at inference time and avoids overfitting that can arise when training is constrained by limited experimental protein structures.
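For readers unfamiliar with the term, the general idea behind a looped architecture can be illustrated with a toy sketch: one block of weights is reused for a chosen number of passes, so the amount of computation can be raised at inference time without adding parameters. The example below is an assumption-laden illustration only. It uses a simple weight-tied residual block in place of a real transformer layer, and the names `make_block`, `apply_block`, and `looped_forward` are hypothetical; none of this is drawn from Biohub's code or reflects how ESMFold2 is actually implemented.

```python
import math
import random


def make_block(dim, seed=0):
    """Create one shared weight matrix (a hypothetical toy block, not ESMFold2's)."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(dim)
    return [[rng.gauss(0.0, scale) for _ in range(dim)] for _ in range(dim)]


def apply_block(weights, x):
    """One residual update: x + tanh(W x)."""
    return [
        xi + math.tanh(sum(w * xj for w, xj in zip(row, x)))
        for xi, row in zip(x, weights)
    ]


def looped_forward(weights, x, n_loops):
    """Apply the same block n_loops times: more loops, more compute, same parameters."""
    for _ in range(n_loops):
        x = apply_block(weights, x)
    return x


weights = make_block(dim=4)
x0 = [0.1, -0.2, 0.3, 0.0]

# The loop count is picked at inference time; the weights are identical in both runs.
shallow = looped_forward(weights, x0, n_loops=2)
deep = looped_forward(weights, x0, n_loops=8)

# The parameter count depends only on the block, not on how many times it is looped.
n_params = sum(len(row) for row in weights)
```

In this toy setting, `shallow` and `deep` come from the same 16 weights and differ only in how many passes were spent on the input, which is the sense in which compute can scale at inference time.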
In benchmarking, ESMFold2 performed favorably when compared against Chai-1 from Chai Discovery, Boltz-1 from MIT (whose developers have since launched a public benefit corporation), and AlphaFold 3 from Google DeepMind.
The models are accessible under the highly permissive Massachusetts Institute of Technology (MIT) license for both commercial and...