How Does AlphaFold Work? The Science Explained for Biologists

AlphaFold changed structural biology — but most explanations of how it works are written for machine learning engineers, not biologists. This article explains the core ideas in biological terms: what information AlphaFold uses, what it learned from, and why its predictions are so much better than anything that came before.

The key insight: evolution encodes structure

Every AlphaFold explanation eventually comes back to one idea: evolution is a massive experiment in protein folding, and the results of that experiment are encoded in the sequences of billions of known proteins.

Proteins that perform the same function in distantly related organisms have similar structures — even when their sequences have diverged significantly over hundreds of millions of years. Why? Because mutations that destroy the fold are selected against. Sequences that maintain the fold survive. Over evolutionary time, the sequences in any protein family carry a hidden record of which parts are structurally essential, which residues must stay in contact, and which can vary freely.

AlphaFold learned to read that record. Its training data was not a set of rules written by biologists, but the entire Protein Data Bank — over 170,000 experimentally determined structures — cross-referenced with the evolutionary information in hundreds of millions of known protein sequences. It learned, empirically, the mapping from evolutionary patterns to 3D structure.

Multiple sequence alignments — reading evolutionary memory

The first thing AlphaFold does with your input sequence is search large sequence databases — UniRef90, BFD, MGnify — for evolutionary relatives. It collects all the sequences it finds, aligns them into a multiple sequence alignment (MSA), and uses that alignment as primary input alongside the sequence itself.

Multiple sequence alignment: simplified example

Query (human)  M K T A Y I A K Q R Q I S F V K S H F S
Mouse          M K T A Y I A K Q R Q I S F V K S H F S
Zebrafish      M K T A Y V A K Q R E I S F V K A H F S
Yeast          M R T A H I S K Q K Q V T F L K N H Y T
E. coli        – – T A Y I A K Q – Q I S F – K S H – S

Each row is a homologous sequence, and a dash marks a gap. Reading down a column shows whether a position is conserved across evolution or varies. The pattern of conservation and co-variation is what AlphaFold analyzes for structural signals.

The MSA encodes two kinds of structural information. Conservation: positions that are identical or near-identical across all species are usually structurally or functionally essential, because mutating them tends to disrupt the fold or destroy activity. Co-variation: positions that vary together across species are often in physical contact in 3D, because when one mutates, the other compensates to preserve the interaction.
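As a rough illustration of the conservation signal, the sketch below scores each column of the simplified alignment above by the fraction of rows that share its most common residue. The scoring rule and the name column_conservation are assumptions made for this example. AlphaFold does not compute an explicit score like this; it learns from the raw alignment.

```python
from collections import Counter

# Rows copied from the simplified example alignment; "-" marks a gap.
MSA = {
    "Query (human)": "MKTAYIAKQRQISFVKSHFS",
    "Mouse": "MKTAYIAKQRQISFVKSHFS",
    "Zebrafish": "MKTAYVAKQREISFVKAHFS",
    "Yeast": "MRTAHISKQKQVTFLKNHYT",
    "E. coli": "--TAYIAKQ-QISF-KSH-S",
}


def column_conservation(sequences):
    """Per column: fraction of rows carrying the most common residue.

    Gaps never count as matches, so a gapped column cannot score 1.0.
    """
    n_rows = len(sequences)
    scores = []
    for column in zip(*sequences):
        residues = [aa for aa in column if aa != "-"]
        top_count = Counter(residues).most_common(1)[0][1] if residues else 0
        scores.append(top_count / n_rows)
    return scores


scores = column_conservation(list(MSA.values()))
# 1-based positions that are identical in every row of the toy alignment.
fully_conserved = [i + 1 for i, s in enumerate(scores) if s == 1.0]
print(fully_conserved)  # [3, 4, 8, 9, 14, 16, 18]
```

With only five rows the scores are coarse. Real alignments contain far more sequences, which is what makes the signal statistically meaningful.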

Why AlphaFold struggles with “orphan” proteins
Proteins with very few known homologs (sometimes called orphan proteins, or collectively the dark proteome) produce sparse MSAs with few rows. AlphaFold’s predictions for these proteins are often significantly less accurate because the evolutionary signal it depends on is weak or absent. ESMFold, which uses a protein language model trained on sequences and does not require an MSA, can handle these cases better. This is the main situation where ESMFold may outperform AlphaFold2.

Co-evolution: how contacts leave a signal

The co-evolution concept is worth dwelling on because it’s the biological heart of why AlphaFold works. Two residues that are in direct physical contact in the protein’s 3D structure tend to co-evolve — when a mutation at one position would disrupt the contact, a compensating mutation at the other position is selected to restore it.

Co-evolution as a contact signal
Original: AxxExxx (Ala–Glu contact, stable interaction)
Evolved variant: KxxDxxx (Lys–Asp, different residues, same contact)
When position 1 mutates from Ala to Lys (a longer, positively charged side chain), position 4 co-mutates from Glu to Asp (a shorter side chain that keeps its negative charge), so the two side chains still fit together and the contact is preserved. Across thousands of species, this correlated variation reveals that positions 1 and 4 are in contact, without ever measuring the structure directly.

By analyzing the statistical patterns of correlated mutations across the thousands of sequences in a deep MSA, AlphaFold can infer which pairs of residues are likely to be in contact in 3D space. These predicted contacts constrain the structure: they function like a sparse set of distance measurements that the model uses to guide the fold.
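One classic way to quantify correlated mutations is the mutual information between two alignment columns. The sketch below uses it as a stand-in: it is not AlphaFold's method (AlphaFold learns these patterns inside the network), and the toy sequences are invented so that positions 1 and 4 switch together, as in the Ala/Glu to Lys/Asp example.

```python
import math
from collections import Counter

# Hypothetical toy family: position 1 and position 4 always change together
# (A with E, or K with D); position 2 varies independently of both.
TOY_MSA = [
    "AGLEVTS",
    "AGLEVTS",
    "KGLDVTS",
    "KSLDVTS",
    "ASLEVTS",
    "KGLDVTS",
]


def mutual_information(msa, i, j):
    """Mutual information (in bits) between alignment columns i and j (0-based)."""
    n = len(msa)
    count_i = Counter(seq[i] for seq in msa)
    count_j = Counter(seq[j] for seq in msa)
    count_ij = Counter((seq[i], seq[j]) for seq in msa)
    mi = 0.0
    for (a, b), count in count_ij.items():
        p_ab = count / n
        p_a = count_i[a] / n
        p_b = count_j[b] / n
        mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi


coupled = mutual_information(TOY_MSA, 0, 3)      # positions 1 and 4
independent = mutual_information(TOY_MSA, 0, 1)  # positions 1 and 2
conserved = mutual_information(TOY_MSA, 0, 2)    # position 3 never changes
print(round(coupled, 3), round(independent, 3), round(conserved, 3))
```

The coupled pair scores one full bit, while the independent and fully conserved columns score zero. That last case is worth noting: a position that never varies carries a conservation signal but no co-variation signal.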

The Evoformer: AlphaFold’s core architecture

AlphaFold2’s central innovation is the Evoformer: a neural network module that simultaneously processes two representations, the MSA (which encodes evolutionary information) and a pair representation (a residue-by-residue matrix that encodes spatial relationships between residues).

The key idea is that these two representations are processed together, not separately. Information flows between them through attention mechanisms — mathematical operations that allow each position in the sequence to “attend to” and be influenced by every other position simultaneously, weighted by how relevant each pair is for predicting the structure.

You don’t need to understand the mathematics to grasp the biological meaning: the Evoformer is learning to ask and answer questions like “if residues 45 and 112 are co-evolving, and residues 112 and 78 are co-evolving, what does that imply about the relative positions of residues 45 and 78?” — propagating structural constraints across the entire length of the sequence in parallel.

This is repeated through 48 layers of the Evoformer. Each layer refines the representation, resolving ambiguities and propagating information until the final layer produces a detailed picture of which residues are near each other and in what geometry.
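The question about residues 45, 112, and 78 has a simple geometric core: the triangle inequality. If two pairs are each within some distance, the third pair cannot be farther apart than the sum. The sketch below propagates such upper bounds across a set of contacts. It is only the intuition, not the Evoformer itself, whose updates are learned; the 8 Å contact cutoff and the function name are assumptions for illustration.

```python
import math


def propagate_upper_bounds(residues, contacts):
    """Tighten pairwise distance upper bounds using the triangle inequality.

    contacts maps (residue_a, residue_b) to a maximum distance in angstroms.
    Uses the Floyd-Warshall pattern: try every residue k as an intermediate.
    """
    bound = {a: {b: (0.0 if a == b else math.inf) for b in residues} for a in residues}
    for (a, b), limit in contacts.items():
        bound[a][b] = bound[b][a] = min(bound[a][b], limit)
    for k in residues:
        for i in residues:
            for j in residues:
                via_k = bound[i][k] + bound[k][j]
                if via_k < bound[i][j]:
                    bound[i][j] = via_k
    return bound


CONTACT_CUTOFF = 8.0  # assumed contact distance in angstroms, for illustration only
contacts = {(45, 112): CONTACT_CUTOFF, (112, 78): CONTACT_CUTOFF}
bounds = propagate_upper_bounds([45, 78, 112], contacts)
print(bounds[45][78])  # 16.0: residues 45 and 78 can be at most 16 angstroms apart
```

No contact between 45 and 78 was ever given, yet the two known contacts already limit where those residues can sit relative to each other.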

Why “attention” is the right tool
Proteins are fundamentally about long-range relationships — residues far apart in sequence that are adjacent in 3D space, loops that communicate with active sites across the whole protein. Attention mechanisms in neural networks are specifically designed to model exactly these kinds of long-range dependencies, which is why they’re so well-suited to the protein structure prediction problem.
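For readers who want to see what “attending” means in practice, here is a minimal sketch of generic scaled dot-product self-attention in plain Python. It omits the learned projections of a real network, and the Evoformer uses more specialized attention variants than this, so treat it as the general idea only. The feature vectors are invented.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(features):
    """Each position becomes a relevance-weighted average of all positions.

    Relevance is the scaled dot product between feature vectors; a real model
    would first pass the features through learned query/key/value projections.
    """
    dim = len(features[0])
    weights, outputs = [], []
    for query in features:
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim) for key in features]
        row = softmax(scores)
        weights.append(row)
        outputs.append([sum(w * value[c] for w, value in zip(row, features)) for c in range(dim)])
    return weights, outputs


# Hypothetical 2-number feature vectors for three residues.
features = [
    [2.0, 0.0],  # residue 1
    [0.0, 2.0],  # residue 2: unrelated features
    [2.0, 0.2],  # residue 3: similar to residue 1
]
weights, outputs = self_attention(features)
print([round(w, 3) for w in weights[0]])
```

Residue 1 puts far more weight on residue 3 than on residue 2, regardless of how far apart they are in the list. That independence from sequence distance is what makes attention a good fit for long-range contacts.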

The full AlphaFold2 pipeline

1. Sequence input and MSA construction
Your protein sequence is searched against UniRef90, BFD, and MGnify. The homologous sequences found are collected and aligned into an MSA. In parallel, the PDB is searched for structural templates: experimentally determined structures of similar proteins that can provide direct geometric information.
2. Evoformer processing
The MSA and template information are fed into the 48-layer Evoformer stack. The MSA representation and the pairwise representation are updated iteratively, each informing the other. The result is a rich representation of the residue-residue relationships that constrain the structure.
3. Structure module
The Evoformer output is passed to the Structure Module, a separate neural network that directly predicts 3D coordinates for every residue. It starts with every residue placed at the same point in the same orientation, then iteratively moves and rotates the residues using the constraints learned by the Evoformer, converging on the predicted structure.
4. Recycling and refinement
The network (Evoformer plus Structure Module) is run multiple times (typically 3 “recycling” passes). The output of one pass is fed back as additional input to the next, allowing the model to resolve ambiguities iteratively. This recycling step significantly improves accuracy over a single forward pass.
5. Output: structure + confidence scores
The final output is the predicted 3D structure (atomic coordinates for every atom in the backbone and side chains) plus per-residue pLDDT confidence scores and the predicted aligned error (PAE) matrix. Typically five models are generated and ranked by confidence.
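The recycling step in the pipeline above is easiest to see as control flow. The sketch below is a toy: run_with_recycling and toy_pass are hypothetical names, and the “model” is a one-line stand-in that moves a single number halfway toward a target on each pass. It also assumes one initial pass followed by three recycles. Only the loop structure, where each pass receives the previous pass's output, reflects the idea described.

```python
def run_with_recycling(model_pass, features, n_recycles=3):
    """Run one initial pass, then n_recycles more, feeding each output back in."""
    previous = None  # nothing to recycle on the first pass
    for _ in range(n_recycles + 1):
        previous = model_pass(features, previous)
    return previous


def toy_pass(features, previous):
    """Stand-in for a full network pass: move the estimate halfway to the target."""
    estimate = 0.0 if previous is None else previous
    return estimate + 0.5 * (features["target"] - estimate)


single = run_with_recycling(toy_pass, {"target": 10.0}, n_recycles=0)
recycled = run_with_recycling(toy_pass, {"target": 10.0}, n_recycles=3)
print(single, recycled)  # 5.0 9.375
```

The toy converges for a trivial reason, but it shows why feeding the output back helps: each pass starts from a better guess than the last.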

pLDDT scores: how AlphaFold reports confidence

AlphaFold doesn’t just predict structure — it predicts how confident it is in each residue’s position. The pLDDT score (predicted Local Distance Difference Test) is a per-residue score from 0 to 100. It estimates how well the predicted position of each residue would match an experimental structure, if one existed.

Critically, low pLDDT doesn’t always mean AlphaFold made an error — it often means the residue is genuinely disordered and has no single stable position to predict. A disordered loop that flaps freely in solution will have low pLDDT because AlphaFold correctly recognized it has no well-defined structure, not because the prediction is wrong.
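To make the score less abstract, here is a sketch of the lDDT metric that pLDDT is trained to estimate, computed on Cα atoms only. For each residue it asks what fraction of its distances to nearby residues are preserved within 0.5, 1, 2, and 4 Å, considering neighbors within 15 Å in the reference. The coordinates are invented, and scaling the result to 0–100 is done here to match pLDDT.

```python
import math

THRESHOLDS = (0.5, 1.0, 2.0, 4.0)  # tolerance levels in angstroms
INCLUSION_RADIUS = 15.0            # only neighbors this close in the reference count


def per_residue_lddt(predicted, reference):
    """Per-residue lDDT on one atom per residue (e.g. C-alpha), scaled to 0-100."""
    n = len(reference)
    scores = []
    for i in range(n):
        preserved = 0
        considered = 0
        for j in range(n):
            if i == j:
                continue
            d_ref = math.dist(reference[i], reference[j])
            if d_ref >= INCLUSION_RADIUS:
                continue
            d_pred = math.dist(predicted[i], predicted[j])
            error = abs(d_pred - d_ref)
            preserved += sum(error < t for t in THRESHOLDS)
            considered += len(THRESHOLDS)
        scores.append(100.0 * preserved / considered if considered else 0.0)
    return scores


# Hypothetical reference: four C-alpha atoms in a straight line, 3.8 angstroms apart.
reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
# Hypothetical prediction: identical, except the last residue is shifted by 3 angstroms.
predicted = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 3.0, 0.0)]

scores = per_residue_lddt(predicted, reference)
print([round(s, 1) for s in scores])
```

The misplaced residue gets the lowest score, and its nearest neighbor is penalized more than distant residues. Because the metric compares distances rather than superimposed coordinates, it is local: one bad region does not drag down the score of a well-predicted domain elsewhere.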

90–100: Very high confidence. Backbone and side chains are generally reliable. Usually suitable for docking, structural analysis, or MD simulation.
70–90: Good confidence. Backbone positions reliable; some side chain uncertainty. Suitable for most downstream applications.
50–70: Low confidence. May represent a flexible or disordered region. Backbone uncertain, so treat with caution and do not rely on it for docking.
Below 50: Very low confidence. Often intrinsically disordered. Do not use for structural analysis, because the position is essentially a guess.
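The bands above translate directly into a small helper. This is a minimal sketch: the function names and example scores are made up, and because the listed ranges share their endpoints, it assumes a boundary value such as 90 belongs to the higher band.

```python
from collections import Counter

BANDS = ("very high", "good", "low", "very low")


def plddt_band(score):
    """Map a pLDDT score (0-100) to the confidence band used in this article."""
    if not 0.0 <= score <= 100.0:
        raise ValueError(f"pLDDT must be between 0 and 100, got {score}")
    if score >= 90.0:
        return "very high"
    if score >= 70.0:
        return "good"
    if score >= 50.0:
        return "low"
    return "very low"


def band_fractions(scores):
    """Fraction of residues falling into each band."""
    counts = Counter(plddt_band(s) for s in scores)
    return {band: counts[band] / len(scores) for band in BANDS}


example_scores = [96.2, 91.0, 88.4, 73.9, 61.5, 42.0, 35.7, 90.0]  # invented values
fractions = band_fractions(example_scores)
print(fractions)
```

A per-band summary like this is a quick first check on a new prediction: a model that is mostly “very low” should be treated very differently from one with a single low-confidence loop.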

AlphaFold PDB files store pLDDT in the B-factor column. To color a structure by confidence in PyMOL: spectrum b, red_white_blue, minimum=50, maximum=100. With this palette, blue regions are high confidence and red are low.
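Outside PyMOL, the same values can be read straight from the file, since the PDB format places the B-factor in fixed columns 61–66 of each ATOM record. The sketch below reads one value per residue from the Cα atoms. The sample lines are invented, and a real workflow would more likely use an established parser such as Biopython.

```python
# Invented ATOM records in fixed-column PDB layout (B-factor in columns 61-66).
SAMPLE_PDB = """\
ATOM      1  N   MET A   1      -8.901   4.127  -0.555  1.00 92.31           N
ATOM      2  CA  MET A   1      -7.642   3.521  -0.110  1.00 92.31           C
ATOM     10  CA  LYS A   2      -5.120   1.034   1.876  1.00 71.50           C
ATOM     19  CA  THR A   3      -2.455  -0.861   3.902  1.00 38.04           C
"""


def read_plddt(pdb_text):
    """Return {(chain, residue_number): pLDDT} using one C-alpha atom per residue."""
    plddt = {}
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM"):
            continue
        atom_name = line[12:16].strip()  # columns 13-16
        if atom_name != "CA":
            continue
        chain = line[21]                 # column 22
        residue_number = int(line[22:26])  # columns 23-26
        b_factor = float(line[60:66])      # columns 61-66, pLDDT in AlphaFold files
        plddt[(chain, residue_number)] = b_factor
    return plddt


plddt = read_plddt(SAMPLE_PDB)
low_confidence = [key for key, value in plddt.items() if value < 50.0]
print(plddt)
print(low_confidence)
```

Slicing by column position rather than splitting on whitespace matters here, because adjacent PDB fields can run together without a space when values are large.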

AlphaFold3: what changed and why it matters

AlphaFold2 (2021)
Evoformer architecture
  • Predicts single protein chains
  • Multimer extension for protein complexes
  • Single deterministic structure output
  • Requires deep MSA for best accuracy
  • Available free — database + ColabFold
  • Best-in-class for single-chain proteins
AlphaFold3 (2024)
Diffusion architecture
  • Predicts complexes natively — protein + DNA, RNA, ligands, ions
  • Generates multiple diverse conformations
  • Handles modified residues and covalent modifications
  • State-of-the-art for protein-ligand complexes
  • Available via web server (usage limits apply)
  • Commercial use restrictions on model weights

The shift to a diffusion-based structure generator is the most important technical change in AlphaFold3. Diffusion models, the same class used for image generation, work by learning to denoise: starting from random noise and iteratively refining toward a plausible structure. This allows AlphaFold3 to generate multiple different plausible structures for the same input, capturing conformational diversity that AlphaFold2’s single-output approach misses.
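The denoising idea can be shown with a deliberately tiny toy. In the sketch below, a “structure” is a single number with two plausible states, and the denoiser is a hand-written rule rather than a trained network, so nothing here resembles AlphaFold3's actual model. What it does show is the loop: start from noise, repeatedly step toward the denoiser's guess while adding less and less noise, and get different final answers from different random seeds.

```python
import random

MODES = (-1.0, 1.0)  # two hypothetical "conformations" of a one-number structure


def toy_denoiser(x):
    """Stand-in for a learned network: guess the clean value as the nearest mode."""
    return min(MODES, key=lambda mode: abs(x - mode))


def sample(seed, n_steps=50):
    """Generate one sample by iterative denoising from random noise."""
    rng = random.Random(seed)
    x = rng.gauss(0.0, 1.0)  # start from pure noise
    for step in range(n_steps):
        noise_scale = 1.0 - (step + 1) / n_steps  # shrinks to zero on the last step
        x += 0.2 * (toy_denoiser(x) - x)          # move toward the denoised guess
        x += rng.gauss(0.0, 0.1 * noise_scale)    # re-inject a little noise
    return x


samples = [sample(seed) for seed in range(20)]
states = {toy_denoiser(x) for x in samples}
print(sorted(states))
```

Every run lands close to one of the two states, and across seeds both states appear. That is the property described above: one input, several plausible outputs.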

For practical purposes: use AlphaFold2 for single protein chains where you need the most accurate prediction. Use AlphaFold3 when you need to predict how a protein interacts with DNA, RNA, a small molecule ligand, or another protein.

Neither model replaces experimental structure determination
AlphaFold2 and AlphaFold3 tend to return the dominant conformation seen in their training data, which is often close to the apo state. Drug targets often adopt different conformations when bound to inhibitors. Allosteric binding sites may only be visible in specific conformational states. For applications where the exact conformation matters, predicted structures are starting points for hypothesis generation, not final answers.

How AlphaFold works in one paragraph

AlphaFold2 works by extracting evolutionary information from a multiple sequence alignment of protein homologs, identifying co-evolving residue pairs as signals of spatial contacts, and using a deep neural network (Evoformer) to translate that evolutionary information into predicted 3D coordinates. It was trained on the entire Protein Data Bank — learning empirically what evolutionary patterns correspond to what structures. The pLDDT score reports per-residue confidence; high-pLDDT regions are reliable, low-pLDDT regions are probably disordered. AlphaFold3 extends this to molecular complexes using a diffusion architecture, making it the current best tool for predicting how proteins interact with ligands, DNA, and RNA.
