How to Interpret Molecular Docking Results: Scores, Poses & What They Mean
You’ve run AutoDock Vina and you’re staring at a table of scores. What does −8.3 kcal/mol actually mean? Is that good? How do you know if the pose is biologically meaningful? This guide explains exactly how to read docking results — and the interpretation mistakes that most beginners make.
Understanding binding affinity scores (kcal/mol)
AutoDock Vina outputs binding affinity as a number in kcal/mol — kilocalories per mole. This is an estimate of the binding free energy: how much free energy is released when the ligand binds to the protein. The more negative the number, the more favorable the predicted binding.
Thermodynamically, the relationship is straightforward: a more negative ΔG means the bound state is more stable relative to the unbound state. In practice, Vina’s scores are a computed approximation of this quantity, not a direct measurement — which is why you should always treat them as estimates, not facts.
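To build intuition for the scale, the standard relation ΔG = RT ln Kd turns a free energy into a dissociation constant. The sketch below is for orientation only, assuming 298.15 K and treating the Vina score as if it were a true ΔG, which it is not. The function names are illustrative and not part of Vina.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def delta_g_to_kd(delta_g_kcal: float, temperature_k: float = 298.15) -> float:
    """Convert a binding free energy (kcal/mol) to a dissociation constant (mol/L).

    Uses delta_G = R * T * ln(Kd), so Kd = exp(delta_G / (R * T)).
    """
    return math.exp(delta_g_kcal / (R_KCAL * temperature_k))


def kcal_per_tenfold(temperature_k: float = 298.15) -> float:
    """Free-energy difference (kcal/mol) that corresponds to a 10x change in Kd."""
    return R_KCAL * temperature_k * math.log(10.0)


if __name__ == "__main__":
    for score in (-6.0, -8.3, -10.0):
        print(f"{score:6.1f} kcal/mol -> Kd of about {delta_g_to_kd(score):.2e} M")
    print(f"10x in Kd is about {kcal_per_tenfold():.2f} kcal/mol")
```

At room temperature a tenfold change in Kd corresponds to about 1.36 kcal/mol, so a scoring error of 1 to 2 kcal/mol already spans one to two orders of magnitude in predicted affinity.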
Typical score ranges are useful guidelines, not hard cutoffs. What counts as a “good” score depends on the target. Enzymes with deep, enclosed binding pockets tend to produce stronger scores than proteins with shallow or open binding sites. A score of −8 kcal/mol is impressive for a GPCR and unremarkable for a protease.
Scores vs. experimental IC50 — how well do they correlate?
The honest answer is: moderately. Across diverse compound datasets, the correlation between Vina scores and experimental binding affinities (Kd, Ki, or IC50) is roughly r = 0.5–0.6. That’s real signal — docking genuinely enriches hit rates — but it’s far from predictive. A compound scoring −10 kcal/mol may experimentally bind worse than one scoring −8, especially if the −10 compound has a problematic pose or unusually favorable electrostatic terms that the scoring function overweights.
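If you have experimental data for a set of docked compounds, you can check the correlation on your own target rather than relying on a literature average. Here is a minimal sketch of the Pearson coefficient using only the standard library. The function name is illustrative, and it assumes you have already converted the experimental values to a log or free-energy scale (for example pKd, or ΔG from Kd), since raw concentrations are not linear in energy.

```python
import math
from typing import Sequence


def pearson_r(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return cov / math.sqrt(var_x * var_y)
```

Call it as `pearson_r(vina_scores, experimental_dg)`, where both lists hold one value per compound in the same order. With only a handful of compounds the coefficient is very noisy, so treat it as a sanity check rather than a benchmark.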
Use scores for ranking and filtering. Do not use them to predict experimental IC50 values.
What the RMSD columns mean
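After the affinity column, Vina prints two more columns: rmsd l.b. and rmsd u.b. These are lower bound and upper bound estimates of how different each pose is from the top-ranked pose (mode 1). The upper bound matches every atom with itself in the other pose; the lower bound lets each atom match the nearest atom of the same element, which makes it tolerant of symmetric groups.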
RMSD stands for Root Mean Square Deviation, a measure of atomic displacement between two structures. An RMSD of 1.0 Å means the atoms in one pose are, in a root-mean-square sense, 1 Å away from their positions in the reference pose.
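For reference, this is the standard definition, with N the number of atoms compared and the two vectors giving the position of atom i in the pose and in the reference:

```latex
\mathrm{RMSD} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left\lVert \mathbf{r}_i^{\mathrm{pose}} - \mathbf{r}_i^{\mathrm{ref}} \right\rVert^{2}}
```

Because the displacements are squared before averaging, a few badly misplaced atoms raise the RMSD more than they would raise a plain mean distance.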
A practical rule: if modes 2 and 3 both have a low RMSD relative to mode 1 (l.b. < 2 Å), the search has likely converged: it found a genuine energy minimum and sampled it repeatedly. This is a good sign. If the 9 default modes have wildly different RMSDs and scores spread across the full reported range (3 kcal/mol by default), the search hasn’t converged; increase exhaustiveness and re-run.
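The convergence check above can be scripted. Vina writes each mode's affinity and both RMSD values into the output PDBQT as a `REMARK VINA RESULT:` line, so a few lines of Python can pull the table back out. This is a minimal sketch: the function names, the 2 Å cutoff, and the choice of the top three modes are illustrative assumptions taken from the rule of thumb above, not Vina settings.

```python
from typing import List, NamedTuple


class VinaMode(NamedTuple):
    mode: int        # 1-based rank in the output file
    affinity: float  # kcal/mol
    rmsd_lb: float   # lower-bound RMSD to mode 1, in angstroms
    rmsd_ub: float   # upper-bound RMSD to mode 1, in angstroms


def parse_vina_results(pdbqt_text: str) -> List[VinaMode]:
    """Collect the REMARK VINA RESULT lines from a docked PDBQT file."""
    modes: List[VinaMode] = []
    for line in pdbqt_text.splitlines():
        if line.startswith("REMARK VINA RESULT:"):
            fields = line.split(":", 1)[1].split()
            affinity, lb, ub = (float(value) for value in fields[:3])
            modes.append(VinaMode(len(modes) + 1, affinity, lb, ub))
    return modes


def looks_converged(modes: List[VinaMode], top_n: int = 3,
                    rmsd_cutoff: float = 2.0) -> bool:
    """True if the top_n modes all lie within rmsd_cutoff (l.b.) of mode 1."""
    top = modes[:top_n]
    return len(top) == top_n and all(m.rmsd_lb < rmsd_cutoff for m in top)


def score_spread(modes: List[VinaMode]) -> float:
    """Difference in kcal/mol between the worst and best reported affinity."""
    affinities = [m.affinity for m in modes]
    return max(affinities) - min(affinities)
```

For example, read `output/docked.pdbqt` with `open(...).read()`, pass the text to `parse_vina_results`, and then call `looks_converged` on the result. A `False` is a prompt to look at the poses and consider a higher exhaustiveness, not a verdict on its own.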
Analyzing poses in PyMOL
A score is a number. A pose is a 3D structure. You need both to make a judgment about whether a docking result is meaningful. Here is a systematic workflow for analyzing poses visually in PyMOL after a Vina run.
1. Load receptor and poses
Load both files and set up the display. The docked output PDBQT contains all modes as separate MODEL entries; PyMOL loads them as states of a single object.
load receptor/5KIR_receptor.pdbqt, receptor
load output/docked.pdbqt, poses
hide everything
show cartoon, receptor
show sticks, poses
zoom poses
2. Check whether the ligand is inside the binding pocket
This is the first and most important check. Use the state slider at the bottom of the PyMOL window to cycle through poses. For each top pose, confirm that the ligand sits inside the cavity: not floating above the surface, not partially buried in the protein backbone, not at a symmetry-related site on the protein exterior.
3. Identify contacts with key binding site residues
Show the residues within 5 Å of the ligand as sticks, and use the distance tool to measure interactions. Hydrogen bonds should be 2.5–3.5 Å between donor and acceptor heavy atoms. Hydrophobic contacts should be 3.5–5.0 Å. Compare the contacts you see to published mutagenesis or structural data for your target.
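# Show binding site residues: whole receptor residues with any atom within 5 Å of the ligand
select binding_site, byres (receptor within 5 of poses)
show sticks, binding_site
# Measure a specific potential hydrogen bond
# (resi 120 and the atom names are placeholders; substitute the atoms from your own system)
distance hbond1, poses and name O, receptor and resi 120 and name NH2
# Find all receptor residues in contact with the ligand (within 4 Å)
select contacts, byres (receptor within 4 of poses)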
4. Check for steric clashes
If the ligand overlaps with protein atoms (visible as interpenetrating sticks), the preparation went wrong somewhere. This should not happen with a properly prepared receptor. If you see it, check for unknown atom types in your PDBQT file and re-prepare.
5. Compare the top 3 poses
Don’t just look at mode 1. Use the state controls to examine modes 1, 2, and 3. If they all show the same binding geometry with minor variations, you have convergent, trustworthy results. If mode 2 shows the ligand flipped or displaced, decide which pose is more chemically reasonable based on the interactions it makes.
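The visual checks in steps 3 to 5 can be backed up with numbers computed straight from the PDBQT files. The following is a rough sketch, not a replacement for looking at the structure. It assumes standard fixed-column ATOM/HETATM records with the AutoDock atom type in columns 78–79. All function names and the 4.0 Å contact and 2.0 Å clash thresholds are illustrative choices, and the pose RMSD matches atoms by file order with no symmetry correction.

```python
import math
from typing import Dict, List, NamedTuple, Tuple

HYDROGEN_TYPES = {"H", "HD", "HS"}  # AutoDock hydrogen atom types
ResidueKey = Tuple[str, str, int]   # (chain, residue name, residue number)


class Atom(NamedTuple):
    name: str
    resn: str
    chain: str
    resi: int
    xyz: Tuple[float, float, float]
    is_h: bool


def parse_models(pdbqt_text: str) -> List[List[Atom]]:
    """Return one atom list per MODEL (a single list if the file has no MODEL records)."""
    models: List[List[Atom]] = []
    current: List[Atom] = []
    for line in pdbqt_text.splitlines():
        record = line[:6].strip()
        if record in ("ATOM", "HETATM"):
            resi = line[22:26].strip()
            current.append(Atom(
                name=line[12:16].strip(),
                resn=line[17:20].strip(),
                chain=line[21:22].strip(),
                resi=int(resi) if resi.lstrip("-").isdigit() else 0,
                xyz=(float(line[30:38]), float(line[38:46]), float(line[46:54])),
                is_h=line[77:79].strip() in HYDROGEN_TYPES,
            ))
        elif record == "ENDMDL":
            models.append(current)
            current = []
    if current:
        models.append(current)
    return models


def contacts(receptor: List[Atom], ligand: List[Atom],
             cutoff: float = 4.0) -> Dict[ResidueKey, float]:
    """Map each receptor residue within cutoff to its closest heavy-atom distance."""
    closest: Dict[ResidueKey, float] = {}
    for r in receptor:
        if r.is_h:
            continue
        for l in ligand:
            if l.is_h:
                continue
            d = math.dist(r.xyz, l.xyz)
            if d <= cutoff:
                key = (r.chain, r.resn, r.resi)
                closest[key] = min(d, closest.get(key, d))
    return closest


def clashes(closest: Dict[ResidueKey, float],
            threshold: float = 2.0) -> Dict[ResidueKey, float]:
    """Residues whose closest heavy-atom distance falls below the clash threshold."""
    return {key: d for key, d in closest.items() if d < threshold}


def pose_rmsd(pose_a: List[Atom], pose_b: List[Atom]) -> float:
    """Heavy-atom RMSD between two poses, atoms matched by their order in the file."""
    if len(pose_a) != len(pose_b):
        raise ValueError("poses must contain the same atoms in the same order")
    pairs = [(a.xyz, b.xyz) for a, b in zip(pose_a, pose_b) if not a.is_h]
    if not pairs:
        raise ValueError("no heavy atoms to compare")
    return math.sqrt(sum(math.dist(p, q) ** 2 for p, q in pairs) / len(pairs))
```

With the files from step 1, `parse_models` on the text of `receptor/5KIR_receptor.pdbqt` gives a single atom list (take element 0), and on `output/docked.pdbqt` it gives one list per mode. Run `contacts` and `clashes` for each of the top three modes, and `pose_rmsd` between modes 1 and 2 and between modes 1 and 3, then compare what the numbers say with what you see in PyMOL.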
What a good docking result looks like
There is no single number that defines a good result. A good docking result is a combination of a reasonable score, a chemically sensible pose, and consistency between poses. Here’s what to look for and what raises red flags.
Common interpretation mistakes
- Reporting only the top score without inspecting the pose
The most common mistake in docking papers. A score of −10 kcal/mol means nothing if the ligand is half outside the binding pocket or making no chemically meaningful contacts. The score is a hypothesis; the pose is evidence.
Always show the binding pose in your figures. Always describe which residues the ligand contacts and compare to known pharmacophore data.
- Comparing scores across different targets
A score of −9 kcal/mol against kinase A does not mean the same thing as −9 kcal/mol against kinase B. Binding site size, polarity, and geometry all affect absolute scores. Rankings within one target are meaningful; cross-target comparisons of raw scores are not.
When comparing across targets, use normalized metrics or report enrichment factors from benchmark sets rather than raw kcal/mol values.
- Treating mode 1 as definitively correct
Vina ranks poses by predicted energy, not by biological correctness. The top-ranked pose is the most energetically favorable according to a flawed scoring function. The actual binding mode may be mode 2 or 3, especially for flexible ligands or targets with known induced-fit behavior.
Inspect the top 3 poses. If literature SAR data exists, use it to select the most chemically reasonable pose rather than defaulting to mode 1.
- Skipping self-docking validation before reporting results
If you have not confirmed that your protocol can reproduce a known crystal structure pose (RMSD < 2.0 Å), you have no basis for trusting your results on novel ligands. This is not optional for publication-quality work.
Always run self-docking validation on the co-crystallized ligand before reporting any docking results. Report the validation RMSD in your methods section.
- Confusing docking score with biological activity
Docking scores predict binding affinity, not biological activity. A compound can bind tightly and be inactive (wrong binding mode, wrong mechanism, poor cell permeability). A compound can have a mediocre docking score and be an excellent drug candidate for reasons docking cannot assess.
Frame docking results as hypotheses about binding affinity. Biological activity requires experimental validation. Never claim activity based on docking alone.
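For the cross-target comparison point above, the enrichment factor is one metric that does not depend on the absolute score scale. It asks how much more often known actives appear in the top-ranked fraction of a benchmark set than they would by chance. Below is a minimal sketch of the usual definition. The function names and the 1% default are illustrative, and the actives and decoys have to come from a benchmark set you supply.

```python
from typing import Dict, List, Sequence, Set


def rank_by_score(scores: Dict[str, float], actives: Set[str]) -> List[bool]:
    """Order compounds from best (most negative) score to worst and flag the actives."""
    ordered = sorted(scores, key=lambda name: scores[name])
    return [name in actives for name in ordered]


def enrichment_factor(ranked_is_active: Sequence[bool],
                      top_fraction: float = 0.01) -> float:
    """EF = (actives in top / size of top) / (all actives / all compounds)."""
    n_total = len(ranked_is_active)
    n_actives = sum(ranked_is_active)
    if n_total == 0 or n_actives == 0:
        raise ValueError("need a non-empty ranking with at least one active")
    if not 0 < top_fraction <= 1:
        raise ValueError("top_fraction must be in (0, 1]")
    n_top = max(1, int(round(n_total * top_fraction)))
    hits_in_top = sum(ranked_is_active[:n_top])
    return (hits_in_top / n_top) / (n_actives / n_total)
```

An enrichment factor of 1 means the ranking is no better than random selection. Values computed at the same fraction on comparable benchmark sets can be compared between targets in a way that raw kcal/mol values cannot.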
The right way to report docking results
Good docking analysis tells a story: here is the score, here is the pose, here are the specific contacts the ligand makes, here is why those contacts are biologically meaningful, and here is how this result was validated. A score alone is never enough.
The best docking figures show the ligand inside the binding pocket as sticks, with the key interacting residues labeled, hydrogen bonds drawn as dashed lines, and a caption that includes both the binding affinity score and the validation RMSD. That level of detail signals to reviewers that you understand what docking can and cannot tell you.
The three-question test for any docking result
Before trusting a docking result, ask three questions. First: is the pose inside the binding pocket and making chemically reasonable contacts with known key residues? Second: are the top poses convergent — do modes 1, 2, and 3 show consistent binding geometry? Third: did your protocol pass self-docking validation on a known co-crystallized ligand? If the answer to any of these is no, the result needs more work before it can be reported or acted on.