How to Interpret Molecular Docking Results: Scores, Poses & What They Mean

You’ve run AutoDock Vina and you’re staring at a table of scores. What does −8.3 kcal/mol actually mean? Is that good? How do you know if the pose is biologically meaningful? This guide explains exactly how to read docking results — and the interpretation mistakes that most beginners make.

Understanding binding affinity scores (kcal/mol)

AutoDock Vina outputs binding affinity as a number in kcal/mol — kilocalories per mole. This is an estimate of the binding free energy: how much free energy is released when the ligand binds to the protein. The more negative the number, the more favorable the predicted binding.

Thermodynamically, the relationship is straightforward: a more negative ΔG means the bound state is more stable relative to the unbound state. In practice, Vina’s scores are a computed approximation of this quantity, not a direct measurement — which is why you should always treat them as estimates, not facts.
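To see what the numbers would mean if they were exact, you can convert a score into a dissociation constant via ΔG = RT·ln(Kd). This is a back-of-the-envelope sketch, not part of Vina — it treats the score as a true free energy, which, as noted above, it is not:

```python
import math

R = 0.001987  # gas constant, kcal/(mol·K)
T = 298.15    # room temperature, K

def kd_from_dg(dg_kcal_mol: float) -> float:
    """Convert a binding free energy (kcal/mol) into an equilibrium
    dissociation constant Kd (molar), using dG = RT * ln(Kd)."""
    return math.exp(dg_kcal_mol / (R * T))

# Taking the -8.3 kcal/mol score literally gives a sub-micromolar Kd
# (roughly 8e-7 M, i.e. ~0.8 uM) -- an optimistic upper bound at best.
print(f"{kd_from_dg(-8.3):.2e} M")
```

One useful intuition falls out of the math: at 298 K, every ~1.4 kcal/mol corresponds to roughly a tenfold change in Kd, which is why a 2-unit score difference matters far more than a 0.2-unit one.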

Binding affinity score reference — AutoDock Vina (kcal/mol)

  −11 to −14   Very strong — rare, worth prioritizing
  −9 to −11    Strong — solid hit in most contexts
  −7 to −9     Moderate — evaluate pose carefully
  Above −7     Weak — unlikely useful hit

These ranges are useful guidelines, not hard cutoffs. What counts as a “good” score depends on the target. Enzymes with deep, enclosed binding pockets tend to produce stronger scores than proteins with shallow or open binding sites. A score of −8 kcal/mol is impressive for a GPCR and unremarkable for a protease.
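If you want to triage a large screen, the guideline bands above can be encoded as a trivial helper. A sketch only — as the text stresses, the right cutoffs shift with the target, so treat these thresholds as placeholders:

```python
def classify_vina_score(score: float) -> str:
    """Map a Vina score (kcal/mol) onto the rough guideline bands
    above. Target-dependent -- adjust the thresholds for your system."""
    if score <= -11.0:
        return "very strong"
    if score <= -9.0:
        return "strong"
    if score <= -7.0:
        return "moderate"
    return "weak"

print(classify_vina_score(-8.3))  # moderate
```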

Scores are not comparable across different targets
If compound A scores −10 kcal/mol against protein X and compound B scores −10 kcal/mol against protein Y, this tells you nothing about which compound binds more tightly in absolute terms. Scores are only meaningful for ranking compounds against the same target. Never use them to compare affinities across different proteins.

Scores vs. experimental IC50 — how well do they correlate?

The honest answer is: moderately. Across diverse compound datasets, the correlation between Vina scores and experimental binding affinities (Kd, Ki, or IC50) is roughly r = 0.5–0.6. That’s real signal — docking genuinely enriches hit rates — but it’s far from predictive. A compound scoring −10 kcal/mol may experimentally bind worse than one scoring −8, especially if the −10 compound has a problematic pose or unusually favorable electrostatic terms that the scoring function overweights.

Use scores for ranking and filtering. Do not use them to predict experimental IC50 values.
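In code, "ranking and filtering" against a single target is as simple as a sort and a cutoff. The compound IDs and scores below are made up for illustration:

```python
# Hypothetical screening results for ONE target: (compound ID, Vina score).
hits = [("cmpd_007", -9.4), ("cmpd_013", -6.8),
        ("cmpd_021", -10.1), ("cmpd_002", -8.2)]

# Rank by score (most negative first), then keep only candidates past
# an illustrative -8 kcal/mol cutoff for visual pose inspection.
ranked = sorted(hits, key=lambda pair: pair[1])
shortlist = [cid for cid, score in ranked if score <= -8.0]
print(shortlist)  # ['cmpd_021', 'cmpd_007', 'cmpd_002']
```

The output is an ordering, not a set of affinity predictions — every compound on the shortlist still needs its pose inspected.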

What the RMSD columns mean

After the affinity column, Vina prints two more columns: rmsd l.b. and rmsd u.b. These are lower bound and upper bound estimates of how different each pose is from the top-ranked pose (mode 1).

RMSD stands for Root Mean Square Deviation — a measure of atomic displacement between two structures. An RMSD of 1.0 Å means the atoms in one pose are, on average, 1 Å away from their positions in the reference pose.
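The definition is simple enough to compute directly. A minimal sketch with toy coordinates — note there is no superposition step, because Vina's output poses already share the receptor's coordinate frame:

```python
import math

def rmsd(coords_a, coords_b):
    """Root mean square deviation (in the same units as the input,
    here angstroms) between two equal-length lists of (x, y, z) atom
    coordinates, assumed to be in the same reference frame."""
    assert len(coords_a) == len(coords_b), "atom counts must match"
    sq_sum = sum((ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
                 for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b))
    return math.sqrt(sq_sum / len(coords_a))

# Two toy 3-atom "poses": every atom shifted by 1 A along x -> RMSD = 1.0 A
pose1 = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
pose2 = [(1.0, 0.0, 0.0), (2.5, 0.0, 0.0), (4.0, 0.0, 0.0)]
print(rmsd(pose1, pose2))  # 1.0
```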

  < 2 Å    Similar pose — this pose and mode 1 share essentially the same binding geometry. The algorithm is converging on the same solution.
  2–4 Å    Different orientation — meaningfully different from mode 1 and worth inspecting visually. Could be an alternate binding mode or a non-specific pose.
  > 4 Å    Distinct pose — a substantially different pose. Usually indicates the ligand is sampling a different region of the binding site — or has escaped outside it entirely.

A practical rule: if modes 1, 2, and 3 all have low RMSD relative to each other (l.b. < 2 Å), the algorithm has converged — it found a genuine energy minimum and sampled it repeatedly. This is a good sign. If all 9 modes have wildly different RMSDs and scores spread across 3+ kcal/mol, the search hasn’t converged — increase exhaustiveness and re-run.
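That practical rule is easy to automate. The sketch below parses result rows in the shape Vina prints (mode, affinity, rmsd l.b., rmsd u.b.) and applies the two checks; the cutoffs and the sample rows are illustrative, not Vina defaults:

```python
def is_converged(table_lines, rmsd_cutoff=2.0, max_spread=3.0):
    """Heuristic convergence check: modes 1-3 should share geometry
    (rmsd l.b. vs. mode 1 below rmsd_cutoff) AND the reported scores
    should span less than max_spread kcal/mol."""
    rows = []
    for line in table_lines:
        parts = line.split()
        if len(parts) == 4 and parts[0].isdigit():
            # (affinity, rmsd l.b.) -- Vina sorts best score first
            rows.append((float(parts[1]), float(parts[2])))
    if len(rows) < 3:
        return False
    top3_same_pose = all(lb < rmsd_cutoff for _, lb in rows[:3])
    score_spread = rows[-1][0] - rows[0][0]
    return top3_same_pose and score_spread < max_spread

sample = [
    "   1       -9.1      0.000      0.000",
    "   2       -9.0      1.1        1.8",
    "   3       -8.8      1.6        2.4",
    "   4       -7.5      5.2        8.0",
]
print(is_converged(sample))  # True
```

A False result is the cue from the text above: increase exhaustiveness and re-run before interpreting anything.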

RMSD in validation vs. RMSD in the output table
These are two different uses of RMSD. The output table RMSDs compare docked poses to each other. Validation RMSD compares your top docked pose to the experimentally determined crystal pose — the number used to judge whether your docking protocol is working. The validation threshold is 2.0 Å. The output table RMSDs have no pass/fail threshold.

Analyzing poses in PyMOL

A score is a number. A pose is a 3D structure. You need both to make a judgment about whether a docking result is meaningful. Here is a systematic workflow for analyzing poses visually in PyMOL after a Vina run.

  • Step 1: Load receptor and poses
    Load both files and set up the display. The docked output PDBQT contains all modes as separate MODEL entries — PyMOL loads them as states of a single object.
PyMOL command line
load receptor/5KIR_receptor.pdbqt, receptor
load output/docked.pdbqt, poses
hide everything
show cartoon, receptor
show sticks, poses
zoom poses
  • Step 2: Check whether the ligand is inside the binding pocket
    This is the first and most important check. Use the state slider at the bottom of the PyMOL window to cycle through poses. For each top pose, confirm that the ligand sits inside the cavity — not floating above the surface, not partially buried in the protein backbone, not at a symmetry-related site on the protein exterior.
  • Step 3: Identify contacts with key binding site residues
    Show the residues within 5 Å of the ligand as sticks, and use the distance tool to measure interactions. Hydrogen bonds should be 2.5–3.5 Å between donor and acceptor heavy atoms. Hydrophobic contacts should be 3.5–5.0 Å. Compare the contacts you see to published mutagenesis or structural data for your target.
PyMOL command line
# Show binding site residues
select binding_site, receptor and (byres poses expand 5)
show sticks, binding_site

# Measure a specific potential hydrogen bond
distance hbond1, poses and name O, receptor and resi 120 and name NH2

# Find all contacts within 4 Å
select contacts, receptor and (byres poses around 4)
  • Step 4: Check for steric clashes
    If the ligand overlaps with protein atoms (visible as interpenetrating sticks), the preparation went wrong somewhere. This should not happen with a properly prepared receptor — if you see it, check for unknown atom types in your PDBQT file and re-prepare.
  • Step 5: Compare the top 3 poses
    Don't just look at mode 1. Use the state controls to examine modes 1, 2, and 3. If they all show the same binding geometry with minor variations, you have convergent, trustworthy results. If mode 2 shows the ligand flipped or displaced, decide which pose is more chemically reasonable based on the interactions it makes.
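A programmatic companion to Step 2: before opening PyMOL, you can verify that every ligand atom of a pose lies inside the grid box you gave Vina. The box center and size below are hypothetical, not taken from any real 5KIR setup:

```python
def ligand_inside_box(ligand_coords, center, size):
    """True if every ligand atom (x, y, z) lies within the docking grid
    box, given as Vina-style center and size in angstroms. Atoms outside
    the box you defined deserve a closer look in PyMOL."""
    for (x, y, z) in ligand_coords:
        for value, c, s in ((x, center[0], size[0]),
                            (y, center[1], size[1]),
                            (z, center[2], size[2])):
            if abs(value - c) > s / 2.0:
                return False
    return True

# Hypothetical 20 A cubic box around an active site
center, size = (23.0, 0.5, 27.0), (20.0, 20.0, 20.0)
print(ligand_inside_box([(25.0, 2.0, 30.0)], center, size))  # True
print(ligand_inside_box([(25.0, 2.0, 45.0)], center, size))  # False
```

Passing this check does not mean the pose is good — it only rules out the grossest failure mode (the ligand escaping the search volume) before visual inspection.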

What a good docking result looks like

There is no single number that defines a good result. A good docking result is a combination of a reasonable score, a chemically sensible pose, and consistency between poses. Here’s what to look for and what raises red flags.

Good sign: Ligand fully inside the pocket. The entire ligand sits within the binding cavity with no atoms extending into solvent.
Red flag: Ligand on the protein surface. The ligand docked to an exterior surface region — the grid box was too large or incorrectly centered.

Good sign: H-bonds to known key residues. The pose makes hydrogen bonds to residues known from mutagenesis or other structural data to be important for binding.
Red flag: No contacts with key residues. Good score but the ligand makes no meaningful contacts with the pharmacophore — likely a false positive driven by scoring function artifacts.

Good sign: Modes 1–3 are convergent. Top poses are similar to each other (low inter-mode RMSD) — the algorithm found a stable energy minimum and sampled it consistently.
Red flag: All poses look completely different. Non-convergent sampling suggests the search space is too large or exhaustiveness is too low. Increase exhaustiveness to 16–32 and re-run.

Good sign: Score consistent with target class. The score is in the expected range for your target type — not suspiciously better or worse than known binders from the literature.
Red flag: Implausibly strong score. Scores below −13 kcal/mol for a drug-like molecule are rare and often indicate a preparation artifact, a too-small grid box concentrating the search, or a known problem compound (e.g. PAINS).

Common interpretation mistakes

  • Reporting only the top score without inspecting the pose
    The most common mistake in docking papers. A score of −10 kcal/mol means nothing if the ligand is half outside the binding pocket or making no chemically meaningful contacts. The score is a hypothesis; the pose is evidence.
    Always show the binding pose in your figures. Always describe which residues the ligand contacts and compare to known pharmacophore data.
  • Comparing scores across different targets
    A score of −9 kcal/mol against kinase A does not mean the same thing as −9 kcal/mol against kinase B. Binding site size, polarity, and geometry all affect absolute scores. Rankings within one target are meaningful; cross-target comparisons of raw scores are not.
    When comparing across targets, use normalized metrics or report enrichment factors from benchmark sets rather than raw kcal/mol values.
  • Treating mode 1 as definitively correct
    Vina ranks poses by predicted energy, not by biological correctness. The top-ranked pose is the most energetically favorable according to a flawed scoring function. The actual binding mode may be mode 2 or 3, especially for flexible ligands or targets with known induced-fit behavior.
    Inspect the top 3 poses. If literature SAR data exists, use it to select the most chemically reasonable pose rather than defaulting to mode 1.
  • Skipping self-docking validation before reporting results
    If you have not confirmed that your protocol can reproduce a known crystal structure pose (RMSD < 2.0 Å), you have no basis for trusting your results on novel ligands. This is not optional for publication-quality work.
    Always run self-docking validation on the co-crystallized ligand before reporting any docking results. Report the validation RMSD in your methods section.
  • Confusing docking score with biological activity
    Docking scores predict binding affinity, not biological activity. A compound can bind tightly and be inactive (wrong binding mode, wrong mechanism, poor cell permeability). A compound can have a mediocre docking score and be an excellent drug candidate for reasons docking cannot assess.
    Frame docking results as hypotheses about binding affinity. Biological activity requires experimental validation. Never claim activity based on docking alone.
How to report docking results in a paper
The methods section should include: the software version and scoring function used, the source and resolution of the receptor structure, how the receptor was prepared (including protonation state handling), the grid box center and dimensions, the exhaustiveness setting, how many poses were generated, and the self-docking validation RMSD. Reviewers increasingly expect all of this.

The right way to report docking results

Good docking analysis tells a story: here is the score, here is the pose, here are the specific contacts the ligand makes, here is why those contacts are biologically meaningful, and here is how this result was validated. A score alone is never enough.

The best docking figures show the ligand inside the binding pocket as sticks, with the key interacting residues labeled, hydrogen bonds drawn as dashed lines, and a caption that includes both the binding affinity score and the validation RMSD. That level of detail signals to reviewers that you understand what docking can and cannot tell you.

The three-question test for any docking result

Before trusting a docking result, ask three questions. First: is the pose inside the binding pocket and making chemically reasonable contacts with known key residues? Second: are the top poses convergent — do modes 1, 2, and 3 show consistent binding geometry? Third: did your protocol pass self-docking validation on a known co-crystallized ligand? If the answer to any of these is no, the result needs more work before it can be reported or acted on.
