How to Use an AlphaFold Structure for Molecular Docking: Complete Workflow

The AlphaFold Protein Structure Database provides predicted structures for more than 200 million proteins, most of which have no experimentally determined structure. This tutorial bridges the structure prediction and docking pillars: how to go from an AlphaFold prediction to a docking-ready receptor, and which considerations make AlphaFold docking different from docking into a crystal structure.

Step 1: Download from AFDB
Step 2: Check pLDDT + PAE
Step 3: Evaluate binding site
Step 4: Prepare structure
Step 5: Handle low-conf regions
Step 6: Dock + validate

Before you start: is AlphaFold the right receptor?

AlphaFold structures are the right starting point for docking when no experimental structure exists — or when an existing crystal structure is of poor quality, incomplete, or in the wrong conformation. Before committing to an AlphaFold receptor, check two things:

Does an experimental structure exist? Search the RCSB PDB (rcsb.org) for your target. If a high-resolution crystal structure exists in an apo or relevant-conformation state, use that — experimental structures are ground truth. AlphaFold is the fallback, not the first choice.
What conformation do you need? AlphaFold returns a single ligand-free model, which is often treated as the apo (unbound) state but is not guaranteed to match any particular functional state. If your target is known to adopt a dramatically different conformation when bound to an inhibitor (as many kinases, GPCRs, and nuclear receptors do) and a homolog crystallized in that conformation exists in the PDB, consider homology modeling from that template instead.

If no experimental structure exists and AlphaFold is your best option, continue with this workflow.

The cross-pillar context

This tutorial connects the structure prediction pillar to the molecular docking pillar. If you haven’t already read the docking preparation tutorial on this site, it covers protein preparation steps in detail — protonation states, removing waters, generating the PDBQT file. This tutorial focuses on the AlphaFold-specific considerations that precede those standard steps.

Step 1 — Download from the AlphaFold Database

Get the structure and confidence files

Go to alphafold.ebi.ac.uk. Search for your protein by UniProt ID (e.g. P04637 for human TP53) or gene name. On the protein page, download two files — both are essential:

PDB file: the 3D coordinates. The pLDDT score for each residue is stored in the B-factor column.
JSON file (predicted aligned error): contains the complete PAE matrix. You need this to evaluate inter-domain confidence and to generate the PAE plot. For database models, read per-residue pLDDT from the B-factor column of the PDB file; the PAE JSON does not carry it.

If your protein isn’t in the database — for example a novel sequence or mutant — run a prediction with ColabFold first, then return to this workflow with the output PDB and JSON files.

Terminal — download via command line

# Download PDB and confidence JSON for UniProt ID P04637
wget https://alphafold.ebi.ac.uk/files/AF-P04637-F1-model_v4.pdb
wget https://alphafold.ebi.ac.uk/files/AF-P04637-F1-predicted_aligned_error_v4.json

The filename convention is AF-{UniProtID}-F{fragment}-model_v{version}.pdb. Fragment number is 1 for most proteins; longer proteins above ~2700 residues are split into overlapping fragments numbered F1, F2, etc.
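The filename convention can be captured in a small helper. This is a minimal sketch: the function name `afdb_urls` is hypothetical, and the default version of 4 simply mirrors the commands above, so check the database for the current release before relying on it.

```python
AFDB_BASE = "https://alphafold.ebi.ac.uk/files"


def afdb_urls(uniprot_id: str, fragment: int = 1, version: int = 4) -> dict:
    """Build the model and PAE download URLs for one AlphaFold DB entry.

    Follows the AF-{UniProtID}-F{fragment}-model_v{version}.pdb convention.
    The version default is an assumption taken from the commands above.
    """
    stem = f"AF-{uniprot_id}-F{fragment}"
    return {
        "model": f"{AFDB_BASE}/{stem}-model_v{version}.pdb",
        "pae": f"{AFDB_BASE}/{stem}-predicted_aligned_error_v{version}.json",
    }


urls = afdb_urls("P04637")
print(urls["model"])
print(urls["pae"])
```

The returned URLs can be passed to wget, curl, or any HTTP client.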

Step 2 — Assess pLDDT and PAE confidence

Step 2 — Critical before proceeding

Evaluate confidence in the binding site

This step determines whether it’s worth docking at all. Load the PDB in PyMOL and color by pLDDT to immediately see where confidence is high and where it isn’t:

PyMOL command line

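# Load and color by pLDDT (stored in B-factor column)
load AF-P04637-F1-model_v4.pdb
# Palette runs from red (pLDDT 50 or lower) to blue (pLDDT 100),
# matching the AlphaFold Database convention that blue means high confidence
spectrum b, red_orange_yellow_cyan_blue, minimum=50, maximum=100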
show cartoon
set cartoon_fancy_helices, 1

Now focus specifically on the predicted binding site — the region where you intend to dock. Ask: what is the pLDDT of the residues lining the binding pocket?

If the binding site residues have pLDDT > 70: the pocket geometry is reliable enough to proceed. Report the mean pLDDT of the binding site residues as a confidence metric alongside your docking results.

If the binding site has pLDDT 50–70: proceed with caution. Consider running a short MD simulation to relax the structure first, or running an ensemble docking protocol with multiple conformations.

If key binding site residues have pLDDT below 50: stop. Docking into a low-confidence binding site is not a reliable prediction — the pocket geometry is uncertain and results will be misleading.
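The three thresholds above can be checked programmatically. The sketch below reads pLDDT from the B-factor column of CA atoms and applies one possible reading of the rules: any site residue below 50 means stop, a mean above 70 means proceed, and anything else means caution. The function names and that exact mapping are assumptions, not a fixed standard.

```python
def plddt_by_residue(pdb_lines):
    """Map residue number -> pLDDT, read from the B-factor column of CA atoms."""
    scores = {}
    for line in pdb_lines:
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            scores[int(line[22:26])] = float(line[60:66])
    return scores


def site_verdict(scores, site_residues):
    """Return (mean pLDDT, lowest pLDDT, verdict) for the chosen site residues.

    Assumed interpretation of the thresholds in the text:
    any residue below 50 -> "stop"; mean above 70 -> "proceed"; else "caution".
    """
    values = [scores[r] for r in site_residues]
    mean = sum(values) / len(values)
    lowest = min(values)
    if lowest < 50:
        verdict = "stop"
    elif mean > 70:
        verdict = "proceed"
    else:
        verdict = "caution"
    return mean, lowest, verdict
```

To use it, pass the lines of the downloaded PDB file (for example `open("AF-P04637-F1-model_v4.pdb").read().splitlines()`) and the residue numbers lining your pocket.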

Check PAE for multi-domain proteins

If your binding site is at the interface between two domains, also check the inter-domain PAE. A binding site that straddles two domains can look confident per-residue (high pLDDT in both domains) while the relative domain arrangement is completely uncertain (high inter-domain PAE). Docking into an incorrect domain arrangement gives meaningless results regardless of the per-residue confidence.
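A quick way to quantify inter-domain confidence is to average the PAE over all residue pairs that span the two domains. The sketch below assumes the AlphaFold Database JSON layout (a one-element list whose entry holds `predicted_aligned_error`) and assumes residue numbering starts at 1 so that matrix index equals residue number minus one. Other tools may use different key names. Lower values (in Å) indicate a more confident relative placement of the two domains.

```python
import json


def load_pae(path):
    """Load the PAE matrix from an AlphaFold DB JSON file (assumed layout)."""
    with open(path) as handle:
        data = json.load(handle)
    entry = data[0] if isinstance(data, list) else data
    return entry["predicted_aligned_error"]


def mean_interdomain_pae(pae, domain_a, domain_b):
    """Average PAE over all residue pairs spanning two domains.

    domain_a and domain_b are iterables of 1-based residue numbers.
    PAE is not symmetric, so both directions are included.
    """
    values = []
    for i in domain_a:
        for j in domain_b:
            values.append(pae[i - 1][j - 1])
            values.append(pae[j - 1][i - 1])
    return sum(values) / len(values)
```

Compare the result with the mean PAE inside each domain: a pocket at the interface is only as trustworthy as the inter-domain value.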

Step 3 — Identify and evaluate the binding site

Locate the binding pocket and verify it’s real

With a crystal structure, the binding site is typically obvious — the co-crystallized ligand sits in it, and you center your docking grid on that ligand. With an AlphaFold structure, you must identify the binding site independently.

Option A: known binding site from literature or mutagenesis

If published mutagenesis data, biochemical studies, or structural studies on homologs identify which residues are important for binding, use those to define your docking grid box center. In PyMOL, select those residues and calculate their center of mass:

PyMOL command line

# Select known active site residues
select active_site, resi 175+248+249+273+282

# Calculate center of mass for grid box centering
centerofmass active_site
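The same centering can be done outside PyMOL, which is convenient when scripting many targets. This sketch uses the unweighted centroid of CA atoms (not PyMOL's mass-weighted center over all atoms) and derives a box size that covers every selected residue plus a margin. The function names and the 8 Å padding are illustrative assumptions.

```python
def ca_coords(pdb_lines, residues):
    """Collect CA coordinates for the requested residue numbers."""
    wanted = set(residues)
    coords = []
    for line in pdb_lines:
        if (line.startswith("ATOM") and line[12:16].strip() == "CA"
                and int(line[22:26]) in wanted):
            coords.append((float(line[30:38]), float(line[38:46]), float(line[46:54])))
    return coords


def grid_box(coords, padding=8.0):
    """Return (center, size) for a docking box around the given points.

    center: unweighted centroid of the points.
    size: per axis, twice the largest deviation from the centroid plus padding,
    so every point lies inside the box. The padding value is an assumption.
    """
    n = len(coords)
    center = tuple(sum(c[axis] for c in coords) / n for axis in range(3))
    size = tuple(
        2.0 * (max(abs(c[axis] - center[axis]) for c in coords) + padding)
        for axis in range(3)
    )
    return center, size
```

The resulting center and size map directly onto the `center_*` and `size_*` parameters of most docking engines.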

Option B: computational binding site prediction

If the binding site is unknown, use a pocket detection tool. fpocket (free, open source) is a widely used choice. It identifies putative binding pockets from a Voronoi tessellation of the protein atoms: it places alpha spheres that each touch four atoms, filters them by radius, and clusters the remaining spheres into candidate cavities:

Terminal

# Install fpocket
conda install -c conda-forge fpocket -y

# Run pocket prediction
fpocket -f AF-P04637-F1-model_v4.pdb

# Results are written to the AF-P04637-F1-model_v4_out/ directory
# Pockets are numbered by fpocket score; each entry also reports a druggability score. Examine the top 3

Examine the top-ranked pockets in PyMOL. Load the pocket PDB files output by fpocket and check whether they coincide with conserved residues, known functional sites, or high-pLDDT regions. A high-druggability pocket score in a low-pLDDT region is not useful — always cross-reference fpocket hits with the confidence map.
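The cross-referencing step can be sketched as a small filter. It assumes you have already collected, for each pocket, the residue numbers lining it (for example from the per-pocket atom files that fpocket writes in its output directory) and a residue-to-pLDDT map read from the B-factor column. The function name and the cutoff of 70 are illustrative choices.

```python
def pocket_confidence(pocket_residues, plddt, min_mean=70.0):
    """Annotate each pocket with the mean pLDDT of its lining residues.

    pocket_residues: {pocket_id: [residue numbers]} in fpocket's ranking order.
    plddt: {residue number: pLDDT}.
    Returns a list of (pocket_id, mean pLDDT, passes_cutoff) in the same order.
    """
    report = []
    for pocket_id, residues in pocket_residues.items():
        values = [plddt[r] for r in residues if r in plddt]
        if not values:
            continue  # no residues matched; skip rather than guess
        mean = sum(values) / len(values)
        report.append((pocket_id, round(mean, 1), mean >= min_mean))
    return report


pockets = {1: [10, 11, 12], 2: [40, 41]}
scores = {10: 92.0, 11: 88.0, 12: 90.0, 40: 35.0, 41: 45.0}
for pocket_id, mean, ok in pocket_confidence(pockets, scores):
    print(pocket_id, mean, "keep" if ok else "discard")
```

A pocket that ranks first in fpocket but fails the pLDDT cutoff should be deprioritized.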

Step 4 — Prepare the structure for docking

Standard preparation — with AlphaFold-specific steps

AlphaFold structures require the same preparation as crystal structures — plus additional steps specific to predicted models. Run through this sequence in PyMOL before any docking tool processing:

1. Remove the signal peptide and propeptide (if present)

AlphaFold models the full UniProt sequence, which may include signal peptides, propeptides, or transit peptides that are cleaved in the mature protein. Check the UniProt entry for your protein — the “Chain” annotation under PTM/Processing lists the exact residue range of the mature protein. Remove the non-mature regions before docking.

PyMOL — trim to mature protein (example: signal peptide residues 1–24)

remove resi 1-24
save protein_mature.pdb

2. Remove low-confidence terminal regions

N- and C-terminal disordered tails with pLDDT below 50 contribute nothing to docking and can interfere with grid box definition. Remove them:

PyMOL

# Identify and remove very low-confidence terminal residues
select low_term, (resi 1-20 or resi 390-400) and b < 50
remove low_term
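Rather than guessing the residue ranges, the disordered tails can be found from the pLDDT values themselves. This sketch walks inward from each terminus and stops at the first residue at or above the cutoff, then formats a PyMOL `remove` command. The function names are hypothetical; the cutoff of 50 follows the text.

```python
def low_confidence_tails(plddt, cutoff=50.0):
    """Return (n_tail, c_tail): consecutive terminal residues with pLDDT < cutoff.

    plddt: {residue number: pLDDT}. Internal low-confidence loops are ignored;
    only unbroken runs starting at either terminus are reported.
    """
    residues = sorted(plddt)
    n_tail = []
    for r in residues:
        if plddt[r] < cutoff:
            n_tail.append(r)
        else:
            break
    taken = set(n_tail)
    c_tail = []
    for r in reversed(residues):
        if plddt[r] < cutoff and r not in taken:
            c_tail.append(r)
        else:
            break
    c_tail.reverse()
    return n_tail, c_tail


def pymol_remove(tail):
    """Format a PyMOL remove command for a contiguous tail (empty string if none)."""
    if not tail:
        return ""
    return f"remove resi {tail[0]}-{tail[-1]}"
```

Check that trimming does not cut into residues you identified as part of the binding site in Step 3.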

3. Add hydrogens and assign protonation states

AlphaFold PDB files do not include hydrogen atoms. Add them and assign protonation states at pH 7.4 using the H++ web server (biophysics.cs.vt.edu/H++) or PROPKA before converting to PDBQT format. Pay particular attention to histidines in and around the binding site: each can be neutral (protonated on either ring nitrogen) or positively charged, and the choice significantly affects binding pocket electrostatics.
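To decide which histidines deserve a manual look, it helps to list those close to the docking box center. This is a minimal sketch: the function name and the 10 Å cutoff are assumptions, and the center is whatever you derived in Step 3.

```python
import math


def histidines_near(pdb_lines, center, cutoff=10.0):
    """Return sorted residue numbers of HIS residues with any atom within cutoff (Å) of center."""
    found = set()
    for line in pdb_lines:
        if line.startswith("ATOM") and line[17:20] == "HIS":
            xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
            if math.dist(xyz, center) <= cutoff:
                found.add(int(line[22:26]))
    return sorted(found)
```

Inspect the reported residues in the H++ or PROPKA output and confirm that the assigned state makes chemical sense for the expected ligand interactions.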

4. Energy minimization

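AlphaFold structures sometimes contain minor geometric imperfections (slightly non-ideal bond lengths, angles, or rotamers) that cause problems during PDBQT conversion or docking. A brief cleanup before conversion usually resolves these. PyMOL's sculpting mode gives a rough geometry relaxation; for a true force-field minimization, use GROMACS or a comparable MD package.

PyMOL: quick geometry cleanup with sculpting

# Core PyMOL has no built-in `optimize` command; sculpting is an approximate stand-in
load protein_mature.pdb
sculpt_activate protein_mature
# The cycle count is an arbitrary starting value; increase it if clashes remain
sculpt_iterate protein_mature, cycles=200
sculpt_deactivate protein_mature
save protein_minimized.pdb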

5. Generate the PDBQT receptor

Convert to PDBQT format using AutoDockTools or Open Babel, following the same preparation steps as for any protein receptor:

Terminal

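# prepare_receptor4.py ships with MGLTools/AutoDockTools; run it with the MGLTools interpreter (pythonsh) if your system Python is 3.x
# -A hydrogens adds hydrogens (use -A checkhydrogens to add them only when the input has none)
# -U nphs_lps merges non-polar hydrogens and lone pairs
python prepare_receptor4.py \
  -r protein_minimized.pdb \
  -o receptor.pdbqt \
  -A hydrogens \
  -U nphs_lps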

Step 5 — Handle low-confidence regions

What to do about loops and disordered regions near the binding site

AlphaFold models of most proteins contain at least some low-confidence regions. How you handle them depends on their proximity to the binding site:

Low-confidence regions far from the binding site

If pLDDT-low regions (below 70) are distant from your intended binding site — terminal regions, surface loops on the opposite face — you have three options, all acceptable: leave them as-is (simplest, doesn't affect docking if truly distant), remove them (cleaner structure, eliminates any steric artifacts), or restrain them during subsequent MD simulation. For simple docking studies, leaving distal low-confidence regions in place is fine.

Low-confidence loops adjacent to the binding site

This is the most consequential case. A flexible loop that borders the binding pocket — even one that isn't directly lining it — affects the pocket shape and volume. If AlphaFold modeled it in an arbitrary conformation (as it often does for low-pLDDT loops), that conformation may occlude or artificially open the pocket.

The best approaches, in order of rigor:

Ensemble docking: Run a short MD simulation (10–50 ns) of the AlphaFold structure, sample multiple frames where the loop adopts different conformations, and dock into each. Report results from the most populated conformation consistent with the literature.
Loop refinement: Use a loop modeling tool (Rosetta Remodel, MODELLER loopmodel) to generate an ensemble of loop conformations and dock into all of them.
Acknowledge the limitation: For simpler studies, perform docking in the AlphaFold conformation, report the pLDDT of loop residues in the methods section, and discuss the limitation explicitly. This is acceptable for hypothesis-generating work.
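When you dock into several loop conformations, a per-ligand summary makes it easy to see how sensitive the result is to the loop. The sketch below is bookkeeping only: it assumes you already have docking scores (kcal/mol, lower is better) keyed by frame and ligand, and the function name and dictionary layout are hypothetical.

```python
from statistics import mean


def summarize_ensemble(scores_by_frame):
    """Summarize docking scores across receptor conformations.

    scores_by_frame: {frame_id: {ligand_id: score}}, lower score = better.
    Returns {ligand_id: {"best", "best_frame", "mean", "spread"}}.
    """
    per_ligand = {}
    for frame, scores in scores_by_frame.items():
        for ligand, score in scores.items():
            per_ligand.setdefault(ligand, []).append((score, frame))
    summary = {}
    for ligand, pairs in per_ligand.items():
        values = [score for score, _ in pairs]
        best_score, best_frame = min(pairs)
        summary[ligand] = {
            "best": best_score,
            "best_frame": best_frame,
            "mean": mean(values),
            "spread": max(values) - min(values),
        }
    return summary
```

A large spread for a ligand signals that its predicted affinity depends heavily on the loop conformation, which is worth stating when you report the result.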

Apo conformation caveat

Even high-pLDDT binding sites can present a challenge: AlphaFold models contain no ligand, so the predicted pocket is often closer to an apo-like shape, which for many targets differs from the ligand-bound shape. If your target is known to show induced fit upon ligand binding, consider running a post-docking MD simulation to allow the protein to relax around the docked pose. This is a common way to follow up AlphaFold-based docking results.

Step 6 — Run and validate the docking

Dock and validate with extra care

Run docking with AutoDock Vina, GNINA, or your preferred engine using the prepared receptor. The docking protocol itself is identical to standard docking — but validation is more demanding when using a predicted receptor.
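For AutoDock Vina, the box from Step 3 and the receptor from Step 4 come together in a plain-text configuration file. The sketch below writes one using Vina's standard configuration keys. The helper name is hypothetical, the file names are placeholders, and the exhaustiveness and num_modes values shown are Vina's usual defaults.

```python
def vina_config(receptor, ligand, center, size, exhaustiveness=8, num_modes=9):
    """Return the text of an AutoDock Vina configuration file.

    center and size are (x, y, z) tuples in Å.
    """
    axes = ("x", "y", "z")
    lines = [f"receptor = {receptor}", f"ligand = {ligand}"]
    lines += [f"center_{a} = {v:.3f}" for a, v in zip(axes, center)]
    lines += [f"size_{a} = {v:.3f}" for a, v in zip(axes, size)]
    lines += [f"exhaustiveness = {exhaustiveness}", f"num_modes = {num_modes}"]
    return "\n".join(lines) + "\n"


text = vina_config("receptor.pdbqt", "ligand.pdbqt", (12.5, -3.0, 8.25), (22.0, 22.0, 22.0))
print(text)
```

Save the returned text to a file such as `conf.txt` and run `vina --config conf.txt --out docked.pdbqt`.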

Unlike crystal structure docking, you cannot perform standard self-docking validation (redocking the co-crystallized ligand) because no co-crystallized ligand exists. Use these alternatives instead:

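Enrichment validation with known actives: If known actives exist for your target (from ChEMBL or literature IC50 data), dock them alongside known inactives or decoys and verify that the actives rank higher. This tests whether the predicted binding site can distinguish binders from non-binders. (This is sometimes loosely called cross-docking, a term that strictly refers to docking a ligand into a different structure of the same protein.)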
Pharmacophore consistency: Check whether the top-ranked docking pose makes interactions with residues known to be important from mutagenesis or biochemical data. A pose that forms H-bonds with catalytic residues or known pharmacophore anchors is more credible than one that doesn't.
MD validation: Run a 50–100 ns MD simulation of the protein-ligand complex. A binding mode that is stable in MD and maintains key interactions throughout is significantly more credible than a docking score alone.
Comparison to homolog structures: If the binding site is conserved in a homolog with a crystal structure, dock your compound into the homolog structure and compare binding modes. Consistent binding geometry between the AlphaFold prediction and the experimental homolog structure supports the reliability of both.
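The known-actives check in the list above can be quantified with a ROC AUC: the probability that a randomly chosen active receives a better docking score than a randomly chosen inactive. The sketch below computes it by direct pairwise comparison, assuming lower (more negative) scores are better. The function name is hypothetical.

```python
def roc_auc(active_scores, inactive_scores):
    """ROC AUC for docking scores where lower means better.

    Counts the fraction of (active, inactive) pairs in which the active
    scores better; ties count as half. 0.5 is random, 1.0 is perfect.
    """
    wins = 0.0
    for a in active_scores:
        for i in inactive_scores:
            if a < i:
                wins += 1.0
            elif a == i:
                wins += 0.5
    return wins / (len(active_scores) * len(inactive_scores))


print(roc_auc([-9.1, -8.4, -7.9], [-7.0, -6.5, -8.0]))
```

An AUC near 0.5 means the receptor model does not separate binders from non-binders, whatever the individual scores look like.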

AlphaFold vs crystal structure: key differences for docking

Property	AlphaFold structure	Crystal structure
Conformation	Single ligand-free model, often apo-like; may differ from active/bound form	Can capture ligand-bound, active, or specific states
Binding site geometry	Uncertain for flexible loops; confident for well-folded cores	Experimentally determined — ground truth at given resolution
Self-docking validation	Not possible — no co-crystallized ligand	Standard validation — redock native ligand, check RMSD <2 Å
Confidence information	pLDDT + PAE — quantified per residue and per pair	B-factors indicate mobility but not prediction confidence
Availability	200M+ predicted structures, instant download	Limited to experimentally tractable proteins; more than 200,000 PDB entries across all experimental methods
Water molecules	Not modeled — no structural waters in binding site	Structural waters often present, may mediate binding
Crystal contacts / packing artifacts	None — free from crystal packing distortions	Crystal contacts can distort loops and surface residues

What to report in your methods section

When publishing docking results using an AlphaFold receptor, reviewers expect explicit documentation of the confidence assessment. A complete methods statement should include:

AlphaFold Database version and model version used (e.g. v4)
Mean pLDDT of binding site residues (e.g. "mean pLDDT of active site residues: 87.3")
PAE assessment for multi-domain proteins
Which regions were removed or handled specially (e.g. signal peptide, low-confidence loops)
What validation was used in place of self-docking (enrichment of known actives, pharmacophore consistency, MD)
Statement that no experimental structure was available and the limitation this imposes

AlphaFold docking in one paragraph

Using an AlphaFold structure for docking is more viable than ever — but requires additional confidence checks that crystal structure docking doesn't demand. Check pLDDT for the binding site residues before proceeding: above 70 is workable, below 50 is not. Check PAE for multi-domain targets. Remove signal peptides and disordered terminal regions. Minimize before PDBQT conversion. Because self-docking validation is impossible, validate with known actives, pharmacophore consistency, or MD simulation of the docked complex. Report the binding site pLDDT and your validation approach in the methods section — reviewers will ask if you don't.

Last updated on April 28, 2026
