IRCNF

AlphaFold 3 Predicted It. Now Biologists Are Using It to Find Drugs.

When DeepMind unveiled AlphaFold 2 at the CASP14 assessment in late 2020 (the full paper followed in 2021), it was widely credited with cracking a 50-year-old grand challenge in biology: predicting how a chain of amino acids folds into a three-dimensional protein structure. The scientific community called it a watershed moment. Research programs that once required years of X-ray crystallography or cryo-electron microscopy work could now be attempted computationally in hours. By 2022, the AlphaFold Protein Structure Database held predicted structures for virtually every protein in the human proteome and hundreds of millions more across life on Earth.

Then, in May 2024, DeepMind released AlphaFold 3 — and changed the question entirely.

What AlphaFold 3 Actually Added

AlphaFold 2 was exceptional at predicting protein structures in isolation. AlphaFold 3 extended that capability to the full molecular ecosystem: DNA, RNA, small molecule ligands, ions, and chemical modifications, all predicted jointly with proteins in a single unified model. This isn't a marginal improvement — it's a fundamental shift in what structure prediction can mean for drug discovery.

The architectural change was equally significant. AlphaFold 3 replaced the Evoformer trunk with a simpler Pairformer module and swapped AlphaFold 2's structure module for a diffusion module, borrowing techniques from image generation models to iteratively refine 3D atomic coordinates from noise. On the PoseBusters benchmark, a challenging test of physically plausible drug-like molecule poses in protein binding sites, DeepMind reported more than a 50% improvement over prior state-of-the-art methods. For drug hunters, that number matters: accurately predicting how a small molecule ligand docks into a protein active site is one of the oldest and most computationally demanding problems in pharmaceutical research.
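
To make the benchmark concrete: pose-prediction tests of this kind usually score a predicted ligand by its root-mean-square deviation (RMSD) from the experimentally determined pose, and count a prediction as a success when the RMSD is at or below 2 Å. The Python sketch below illustrates that scoring under simplifying assumptions: the predicted and reference structures are already superimposed on the binding pocket, atoms are listed in the same order, and symmetry-equivalent atoms are ignored. The function names are illustrative and are not taken from PoseBusters or AlphaFold 3, and the full benchmark also applies chemical and physical validity checks that are not shown here.

```python
import math


def ligand_rmsd(pred, ref):
    """RMSD in angstroms between two matched lists of (x, y, z) atom coordinates.

    Assumes both poses are already in the same frame (pocket-aligned) and that
    atom i in `pred` corresponds to atom i in `ref`.
    """
    if not pred or len(pred) != len(ref):
        raise ValueError("pred and ref must be non-empty and the same length")
    squared = sum(
        (p - r) ** 2
        for atom_pred, atom_ref in zip(pred, ref)
        for p, r in zip(atom_pred, atom_ref)
    )
    return math.sqrt(squared / len(pred))


def success_rate(rmsds, threshold=2.0):
    """Fraction of poses whose RMSD is at or below `threshold` angstroms."""
    if not rmsds:
        raise ValueError("rmsds must be non-empty")
    return sum(1 for r in rmsds if r <= threshold) / len(rmsds)


if __name__ == "__main__":
    # Made-up coordinates for a three-atom ligand, for illustration only.
    reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
    predicted = [(0.4, 0.1, 0.0), (1.9, 0.2, 0.1), (1.6, 1.9, 0.2)]
    print(f"ligand RMSD: {ligand_rmsd(predicted, reference):.2f} A")
    print(f"success rate: {success_rate([0.8, 1.7, 2.6, 4.1]):.0%}")
```

A percentage-point jump in the success rate computed this way translates directly into more usable binding poses per campaign, which is why a headline improvement on a benchmark like this draws attention from drug discovery teams.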

The Pharma Bet

The commercial implications were not lost on the industry. Isomorphic Labs, DeepMind's sister company focused on AI-driven drug discovery, announced a landmark partnership with Eli Lilly in January 2024 valued at up to $1.7 billion — one of the largest AI-pharma deals in history. The collaboration is specifically centered on applying advanced structure prediction and molecular design tools to Lilly's drug pipeline.

Recursion Pharmaceuticals has integrated AlphaFold predictions into its high-throughput biological screening platform, using structural data to prioritize which compounds to synthesize and test. At CASP16 — the biennial Critical Assessment of Structure Prediction competition held in late 2024 — AI-based methods dominated RNA structure prediction for the first time, a category where previous tools had struggled badly. The ability to accurately model RNA conformations opens doors to entirely new target classes, including the RNA-targeting drugs that have become a major area of pharmaceutical investment.

The Access Controversy

DeepMind's release strategy for AlphaFold 3 drew immediate criticism from the academic community. AlphaFold 2's code and weights had been openly released, enabling an entire ecosystem of tools. AlphaFold 3, by contrast, initially shipped with no code or weights at all: access ran through the AlphaFold Server, a web interface restricted to non-commercial use, with daily job limits and only a limited set of ligands, which made integration into open pipelines effectively impossible.

The backlash was sharp. Researchers argued that withholding a model built substantially on publicly funded science created an unfair advantage for well-resourced pharmaceutical companies. In November 2024, DeepMind released the inference code and made the model weights available on request for non-commercial academic use, though commercial restrictions remained.

The controversy accelerated the development of open alternatives. RoseTTAFold All-Atom, from the Baker lab at the University of Washington, offers joint protein-ligand-nucleic acid prediction with fully open weights. Chai-1, released by Chai Discovery in 2024, matches AlphaFold 3's performance on several benchmarks and is available under a permissive research license. Boltz-1, from MIT, provides another open implementation. Together, these tools have ensured that the research community retains access to state-of-the-art structure prediction without depending on a single corporate gatekeeper.

What the Models Still Cannot Do

Structure prediction has solved one bottleneck while leaving others intact. The most fundamental limitation is that these models predict static snapshots — a single lowest-energy conformation — rather than the dynamic ensemble of structures a protein samples at physiological temperature. Biology runs on motion: enzymes change shape to catalyze reactions, receptors flex to bind signaling molecules, intrinsically disordered proteins function precisely because they lack a fixed structure. An estimated 30 to 40 percent of the human proteome consists of intrinsically disordered regions that AlphaFold and its successors handle poorly, as reflected in low pLDDT confidence scores for those segments.
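
The pLDDT signal is easy to inspect directly. PDB-format files from the AlphaFold Database store the per-residue pLDDT score in the B-factor column, and the database's own colour scheme treats values below 50 as very low confidence, a range that often coincides with disordered segments. The Python sketch below flags those residues using the fixed-column PDB layout. It is a minimal illustration, not an official parser: the function names are hypothetical, the demo record is synthetic, and files in mmCIF format would need a different reader.

```python
def low_confidence_residues(pdb_text, cutoff=50.0):
    """Return (chain, residue_number, plddt) for residues with pLDDT below cutoff.

    Reads one value per residue from the alpha-carbon (CA) ATOM record, using
    the fixed-column PDB layout where the B-factor field holds the pLDDT.
    """
    flagged = []
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM") or line[12:16].strip() != "CA":
            continue
        plddt = float(line[60:66])  # B-factor field, columns 61-66
        if plddt < cutoff:
            flagged.append((line[21], int(line[22:26]), plddt))
    return flagged


def demo_atom_line(serial, name, res, chain, resseq, plddt):
    """Build a synthetic, column-aligned ATOM record for demonstration only."""
    x = y = z = 0.0
    return (
        f"ATOM  {serial:5d} {name:<4s} {res:>3s} {chain}{resseq:4d}    "
        f"{x:8.3f}{y:8.3f}{z:8.3f}{1.00:6.2f}{plddt:6.2f}"
    )


if __name__ == "__main__":
    # Synthetic three-residue example; the scores are invented.
    demo = "\n".join([
        demo_atom_line(1, "CA", "MET", "A", 1, 35.2),
        demo_atom_line(2, "CA", "GLY", "A", 2, 92.1),
        demo_atom_line(3, "CA", "SER", "A", 3, 48.7),
    ])
    for chain, number, score in low_confidence_residues(demo):
        print(f"chain {chain} residue {number}: pLDDT {score:.1f}")
```

A long run of flagged residues is a hint, not proof, of intrinsic disorder: low pLDDT can also reflect sparse evolutionary information or a region that only becomes ordered on binding a partner.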

Membrane proteins present a separate challenge. These proteins, which are the targets of roughly 60% of approved drugs, sit in a lipid bilayer environment that is difficult to model accurately, and their predicted structures carry higher uncertainty than those of soluble proteins. The models also struggle with large conformational changes induced by ligand binding, the kind of induced-fit dynamics that are critical to understanding drug selectivity and off-target effects.

The Open Ecosystem Beyond DeepMind

The field has matured into a rich open ecosystem. Meta's ESMFold, built on the ESM-2 protein language model, predicts structure from a single sequence without first building a multiple sequence alignment. That makes inference dramatically faster, which is useful for large-scale screening applications where speed matters more than precision. OpenFold provides a fully open reimplementation of AlphaFold 2 that researchers can retrain and fine-tune on custom datasets.

EvolutionaryScale's ESM3, released in 2024, takes a more ambitious approach: a multimodal generative model that reasons jointly over protein sequence, structure, and function. Where AlphaFold predicts structure from sequence, ESM3 can generate novel sequences designed to fold into target structures, beginning to close the loop between prediction and design.

The database infrastructure has kept pace. The Protein Data Bank now contains over 220,000 experimentally determined structures accumulated over five decades of work. The AlphaFold Database, maintained jointly by DeepMind and EMBL-EBI, has grown to over 200 million predicted structures covering most known proteins across all domains of life. This combination of experimental ground truth and computational coverage at scale has transformed what is possible in comparative structural biology.

A Bottleneck Removed, Not Biology Solved

Two years after AlphaFold 3's release, the honest assessment is this: it removed a real and significant bottleneck in the drug discovery pipeline, but it did not make drug discovery easy. Structure prediction was one of several rate-limiting steps — alongside target validation, ADMET profiling, clinical translation, and the fundamental unpredictability of human biology in vivo. Solving it with high accuracy has accelerated the early stages of structure-based drug design and opened up target classes that were previously inaccessible.

The commercial deals, the open alternatives, the database growth, and the continued push into dynamics and generative design all signal that the field is in a genuine transition. But the gap between a beautifully predicted binding pose and a drug that works in patients remains enormous — and filled with biology that no model yet knows how to predict.
