AI to rewire life’s interactome: Structural foundation models help to elucidate and reprogram molecular biology

Biomolecular interactions gained through evolution enable living systems to transduce signals and energies across diverse spatial and temporal scales. The ability to harness patterns from these extensive interactions will unlock vast molecular design and therapeutic development opportunities. My colleagues and I transformed this mission into computable tasks by creating a “computational microscope” of the structural interactome, built with tools from artificial intelligence (AI) and accelerated computing.

The interactions among biomolecules, such as proteins and smaller molecules, define larger biological organizations and orchestrate the fundamental processes of life. To identify these interactions and understand how they take place, we must develop a “microscope” to decode the three-dimensional structures—coordinates of atoms—that form these interactions from zoomed-in snapshots of molecular compartments. Experimental methods to determine molecular structures, such as x-ray crystallography and cryo–electron microscopy, are incredibly powerful but constrained by the slow pace of laboratory research and the months of laborious work to isolate molecular snapshots into analyzable samples. A computational microscope would overcome this bottleneck by directly synthesizing views of structures from the identity of those molecules.

AI shows potential to unify biomolecular structure prediction and design across diverse modalities

(A) Schematics of generative structure prediction for protein-ligand complexes. (B) Generative modeling enables fast sampling from molecular conformational landscapes. (C) Structural insights revealed by NeuralPLexer predictions, as exemplified by the prediction of the human KRAS G12C ligand-bound state. (D) New opportunities for structure prediction foundation models and de novo molecular design approaches.

GRAPHIC: ADAPTED FROM (5) BY A. FISHER/SCIENCE

This idea remained a moonshot for half a century until recent breakthroughs in AI-driven protein structure prediction (1). Unlike traditional simulation approaches that involve enumerating an astronomical number of hypotheses, AlphaFold2 and related AI-based structure predictors harness traces from millions of years of molecular evolution and patterns learned from experimentally determined structures. Specifically, models of evolutionary constraints—such as multiple sequence alignments (2) or protein language models (3)—are digested by specialized neural networks to enable reasoning about the three-dimensional constellation of amino acids (building blocks of proteins) with unprecedented accuracy. We have advanced this vision by developing generative machine-learning approaches to address two critical aspects beyond proteins in isolation: protein-ligand interactions and their conformational landscapes.

Biomolecules are highly dynamic and require numerous snapshots to fully capture their behaviors. Protein shapes are modulated by a vast array of small-molecule ligands and posttranslational modifications (4), which drive dynamical conformational changes crucial to the regulation of biological functions and provide key opportunities for drug discovery. These complexities challenge traditional lock-and-key computational protein-ligand interaction prediction strategies, which often assume that the protein is a rigid body. Although methods such as molecular dynamics can model binding and conformational changes, they are limited by the prohibitive cost of overcoming the barriers of slow transitions between low-lying conformational states.
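To give a feel for why slow transitions are so costly to simulate, the following is a rough Arrhenius-type estimate and is not taken from the work described here. It assumes that the waiting time for a thermally activated transition scales as exp(ΔG/kT); the function name `barrier_slowdown` and the barrier heights are hypothetical and chosen only for illustration.

```python
import math

# Molar gas constant in kcal/(mol*K), so barriers can be given in kcal/mol.
R_KCAL_PER_MOL_K = 0.0019872

def barrier_slowdown(barrier_kcal_per_mol, temperature_k=300.0):
    """Return exp(dG / kT): how much rarer a barrier crossing is than a barrierless move."""
    kt = R_KCAL_PER_MOL_K * temperature_k
    return math.exp(barrier_kcal_per_mol / kt)

# Hypothetical barrier heights (kcal/mol), for illustration only.
for barrier in (2.0, 5.0, 10.0):
    factor = barrier_slowdown(barrier)
    print(f"{barrier:4.1f} kcal/mol -> crossing is about {factor:.1e} times rarer")
```

Under this assumption, every additional 1.4 kcal/mol or so of barrier height makes a crossing roughly ten times rarer at room temperature, so a simulation has to run correspondingly longer to observe the transition even once.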

We present a generative AI strategy, NeuralPLexer, to resolve this conundrum (5) (see the figure). NeuralPLexer models the landscape of protein-ligand binding with generative diffusion (6): It starts from an initial sketch of the entire molecular complex and progressively refines the finer-grained details of the reasoned structures. The AI model generates an ensemble of conformational snapshots, based on multiple simultaneous guesses of the initial sketch, to cover the thermodynamic landscape of the biomolecule. Such a “one-shot” generation provides a pathway to bypass the sampling barriers and quickly obtain the full picture of molecular interactions with atomistic details. NeuralPLexer uses a neural network that mirrors the multiscale hierarchical organization of biomolecular complexes. It initializes predictions by leveraging inferred protein-ligand contacts and subsequently generates detailed geometric representations, all while maintaining the essential physical symmetries.
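The idea of refining many coarse initial guesses into an ensemble can be illustrated with a toy, one-dimensional reverse-diffusion sampler. This is a sketch under strong assumptions and not NeuralPLexer itself: the "landscape" is a hypothetical two-state Gaussian mixture whose score function is known analytically, whereas the real model learns its denoiser with a neural network over full atomic coordinates. All names (`MODES`, `score`, `sample_ensemble`) and parameter values are illustrative.

```python
import numpy as np

MODES = np.array([-1.0, 1.0])  # two toy "conformational states" (assumed)
MODE_STD = 0.1                 # width of each state (assumed)

def score(x, sigma):
    """Analytic score (gradient of log-density) of the mixture blurred by noise of scale sigma."""
    var = MODE_STD**2 + sigma**2
    diff = MODES[None, :] - x[:, None]              # shape (n_samples, 2)
    logits = -diff**2 / (2.0 * var)
    logits -= logits.max(axis=1, keepdims=True)     # numerical stability
    weights = np.exp(logits)
    weights /= weights.sum(axis=1, keepdims=True)   # responsibility of each state
    return (weights * diff).sum(axis=1) / var

def sample_ensemble(n_samples=2000, n_steps=200, sigma_max=5.0, sigma_min=0.01, seed=0):
    """Refine many noisy initial guesses in parallel with a variance-exploding reverse diffusion."""
    rng = np.random.default_rng(seed)
    sigmas = np.geomspace(sigma_max, sigma_min, n_steps)
    x = rng.normal(0.0, sigma_max, size=n_samples)  # coarse initial sketches
    for s_now, s_next in zip(sigmas[:-1], sigmas[1:]):
        step = s_now**2 - s_next**2                 # noise variance removed in this step
        noise = rng.normal(size=n_samples)
        x = x + step * score(x, s_now) + np.sqrt(step) * noise
    return x

ensemble = sample_ensemble()
print("fraction of samples in the right-hand state:", float((ensemble > 0).mean()))
```

Because every sample starts from an independent guess, the ensemble populates both states without any sample having to cross the barrier between them, which is the property the one-shot generation described above relies on.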

As an initial validation, we applied NeuralPLexer to predict the formation of cryptic pockets—binding sites induced by ligand binding that are absent in unbound structures. We used a dataset where small-molecule binding considerably alters protein conformations, and NeuralPLexer successfully generated conformational distributions consistent with structural experiments. On a diverse collection of enzymatic systems, the sampled conformational ensembles showed strong agreement with experimental protein conformations, as quantified by metrics such as TM-score (7) and Q-factors (8), effectively overcoming the limitations of static protein-folding models. Additionally, by assigning confidence levels to its conformational predictions, NeuralPLexer demonstrated the ability to distinguish strong binders from weak ones across a wide array of targets, despite never having been trained on affinity measurements.
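For readers unfamiliar with TM-score, the sketch below shows how such a structural similarity value can be computed for two sets of matched residue coordinates. It makes simplifying assumptions: residues are already paired one to one, and a single least-squares (Kabsch) superposition is used, whereas the full TM-score maximizes over superpositions, so this value is a lower bound on it. The function names and the synthetic coordinates are illustrative and are not the evaluation code used in the work described here.

```python
import numpy as np

def kabsch_superpose(mobile, target):
    """Rigidly superpose `mobile` onto `target` (both N x 3) by least squares."""
    mob_c = mobile - mobile.mean(axis=0)
    tgt_c = target - target.mean(axis=0)
    u, _, vt = np.linalg.svd(mob_c.T @ tgt_c)
    # Flip the last axis if needed so the result is a proper rotation, not a reflection.
    d = 1.0 if np.linalg.det(vt.T @ u.T) >= 0 else -1.0
    rotation = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    return mob_c @ rotation.T + target.mean(axis=0)

def tm_score_single_superposition(model, reference):
    """TM-score formula evaluated at one Kabsch superposition (a lower bound on the true TM-score)."""
    n = len(reference)
    if model.shape != reference.shape or n <= 21:
        raise ValueError("need matched N x 3 arrays with more than 21 residues")
    d0 = 1.24 * (n - 15) ** (1.0 / 3.0) - 1.8       # length-dependent distance scale
    aligned = kabsch_superpose(model, reference)
    dist = np.linalg.norm(aligned - reference, axis=1)
    return float(np.mean(1.0 / (1.0 + (dist / d0) ** 2)))

# Synthetic stand-in for C-alpha coordinates (not real protein data).
rng = np.random.default_rng(0)
reference = np.cumsum(rng.normal(size=(100, 3)), axis=0)
theta = 0.7
rot_z = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                  [np.sin(theta),  np.cos(theta), 0.0],
                  [0.0, 0.0, 1.0]])
moved = reference @ rot_z.T + np.array([5.0, -3.0, 2.0])   # same shape, new pose
noisy = moved + rng.normal(scale=1.0, size=moved.shape)    # perturbed shape

print("rigidly moved copy:", tm_score_single_superposition(moved, reference))
print("perturbed copy:   ", tm_score_single_superposition(noisy, reference))
```

A rigidly moved copy scores essentially 1 because the superposition removes the rotation and translation, while the perturbed copy scores lower; the length-dependent scale `d0` is what makes scores comparable across proteins of different sizes.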

We further used this strategy to gain mechanistic insight into protein functions. For example, in a ketol-acid reductoisomerase, whose catalytic mechanism was recently characterized, the model accurately captured the closure motion of the N-subdomain upon cofactor and inhibitor binding while providing hints about target self-assembly. Additionally, NeuralPLexer predictions can assist in identifying structural elements crucial for protein activation and deactivation. In the case of a G protein–coupled receptor, the model generated a conformational hypothesis that explained the receptor’s constitutive activity in the absence of ligands. These capabilities highlight NeuralPLexer’s potential as a powerful tool to unravel various molecular mechanisms underlying allosteric regulation and enzyme catalysis.

Beyond atomistic conformations, we developed a geometric deep-learning approach named OrbNet-Equi (9) to study the energetics of molecular interactions with accuracy comparable to full-precision quantum mechanical methods across main-group chemistry while being about 1000-fold faster. By combining these tools, we can resolve variations in protonation states and electronic charge and spin configurations at greater resolution to provide a comprehensive strategy to interpret and devise proton or electron transfer pathways.

Beyond structure predictions for known protein–small-molecule interactions, we enhanced NeuralPLexer with an inpainting-based approach to discover new pockets. This capability could enable us to design ligands for previously uncharacterized binding sites. Similar to AI image editing tools that paint and restyle a selected region based on its surrounding context, our method simultaneously infers ligand structure, protein sequence, and pocket-shape variations using only the backbone of the target protein. This strategy led to accurate predictions on challenging targets such as KRAS-G12C, where it achieved a markedly higher design success rate than traditional conformational search and docking algorithms.

As structural prediction foundation models achieve greater accuracy (10), future research will extend these capabilities to encompass more sophisticated interactions and arbitrary bioassembly stoichiometry. This includes protein-protein interaction interfaces stabilized by chemical modulators, a phenomenon known as induced proximity (11). Such advancements pave the way for the rational design of ligands to reprogram protein interaction networks and ultimately restore healthy cellular states. Unified interaction prediction models, together with de novo binder design frameworks, will become a powerful platform to design stabilizers of protein-protein interfaces, modulating the subcellular localization of pathogenic proteins (12) and biasing the formation of specific assembly states. This approach to rewiring cellular signaling not only holds promise for therapeutic precision but also enhances the therapeutic window by selectively targeting oligomeric states of proteins, sparing monomeric forms essential for normal cellular functions (13). These emerging capabilities will help translate the structural foundation model we have developed into versatile tools for advancing chemical biology and expediting drug discovery.

Biographies

GRAND PRIZE WINNER

Zhuoran Qiao received an undergraduate degree from Peking University and a PhD from the California Institute of Technology. He then served as a senior machine-learning scientist at Iambic Therapeutics during 2023–2024 and started at Chai Discovery as a scientist and founding team member in 2025. His research centers around physics-driven machine-learning approaches to address problems in chemistry and structural biology involving complex molecular systems. www.science.org/doi/10.1126/science.adx7802

PHOTO: COURTESY OF ZHUORAN QIAO

References and Notes

(1) J. Jumper et al., Nature 596, 583 (2021).

(2) S. Ovchinnikov et al., Science 355, 294 (2017).

(3) Z. Lin et al., Science 379, 1123 (2023).

(4) R. Nussinov, C. J. Tsai, Cell 153, 293 (2013).

(5) Z. Qiao, W. Nie, A. Vahdat, T. F. Miller III, A. Anandkumar, Nat. Mach. Intell. 6, 195 (2024).

(7) Y. Zhang, J. Skolnick, Nucleic Acids Res. 33, 2302 (2005).

(8) R. B. Best, G. Hummer, Proc. Natl. Acad. Sci. U.S.A. 102, 6732 (2005).

(9) Z. Qiao et al., Proc. Natl. Acad. Sci. U.S.A. 119, e2205221119 (2022).

(10) J. Abramson et al., Nature 630, 493 (2024).

(11) B. Z. Stanton, E. J. Chory, G. R. Crabtree, Science 359, eaao5902 (2018).

(12) C. S. C. Ng, A. Liu, B. Cui, S. M. Banik, Nature 633, 941 (2024).

(13) R. C. Sarott et al., Science 386, eadl5361 (2024).