10 Common Molecular Docking Mistakes (And How to Avoid Every One)

Most bad docking results don’t come from the algorithm — they come from avoidable preparation and interpretation errors that produce wrong answers without any error message. Here are the ten mistakes that appear most often in troubleshooting forums, paper review comments, and lab group postmortems, with exact fixes for each.

Severity: Critical (invalidates results); High (significantly degrades accuracy); Medium (easy to miss, worth fixing)
01
Critical
Wrong grid box — too large, too small, or miscentered

The grid box tells Vina where to search for binding poses. Get it wrong and you’re searching the wrong region of the protein — or searching such a large region that the algorithm spreads its sampling across irrelevant surface area and misses the real binding site entirely. This is the most common source of suspiciously bad scores with no error message.

A box that’s too small will clip the ligand, preventing it from sampling its full conformational space. A box that’s too large wastes exhaustiveness budget on empty space. A miscentered box will dock the ligand to the wrong site altogether — and if the score looks reasonable, you may not even notice.

The fix
Always derive the box center from the centroid of a co-crystallized ligand in the binding site — use PyMOL’s centerofmass command on the native ligand selection. Size the box so it extends 8–10 Å beyond the largest dimension of your ligand. After docking, verify in PyMOL that the top pose sits inside the pocket before reporting any results.
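As a rough cross-check on the box you set up by hand, the center and size can also be computed directly from the native ligand's coordinates. The sketch below is illustrative only: the function names are hypothetical, it uses an unweighted geometric centroid (which can differ slightly from a mass-weighted center), and it assumes a cubic box whose edge is the ligand's largest dimension plus a padding of 8 Å.

```python
def parse_ligand_coords(pdb_text, resname):
    """Return (x, y, z) tuples for HETATM atoms with the given residue name."""
    coords = []
    for line in pdb_text.splitlines():
        if line.startswith("HETATM") and line[17:20].strip() == resname:
            # PDB fixed columns (1-indexed): x = 31-38, y = 39-46, z = 47-54
            coords.append(
                (float(line[30:38]), float(line[38:46]), float(line[46:54]))
            )
    return coords


def grid_box(coords, padding=8.0):
    """Return (center, size) for a cubic box around the ligand.

    center: unweighted centroid of the ligand atoms
    size:   largest ligand extent along any axis plus `padding` (assumed rule)
    """
    if not coords:
        raise ValueError("no ligand atoms found")
    axes = list(zip(*coords))
    center = tuple(sum(axis) / len(axis) for axis in axes)
    edge = max(max(axis) - min(axis) for axis in axes) + padding
    return center, (edge, edge, edge)
```

The returned values map onto Vina's center_x/center_y/center_z and size_x/size_y/size_z settings. Visual confirmation in PyMOL is still the final check.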
02
Critical
Skipping protein preparation — docking a raw PDB file

A raw PDB file downloaded from the RCSB cannot go directly into AutoDock Vina. It is missing hydrogens (which X-ray crystallography usually cannot resolve), often contains waters and co-crystallized ligands that will block your binding site, and lacks the AutoDock atom types and partial-charge column that the PDBQT format requires. (Vina's own scoring function ignores the charge values, whereas AutoDock4 uses them; both depend on correct hydrogens and atom types.) Docking against an unprepared receptor produces scores that are numerically plausible but physically meaningless.

This happens more than you’d expect — often when someone is quickly testing a hypothesis and plans to “do it properly later.” The later rarely comes, and the results get presented.

The fix
Always run the full preparation pipeline: remove waters (remove resn HOH in PyMOL), remove co-crystallized ligands and crystallization artifacts, add polar hydrogens, assign Gasteiger charges, and generate a valid PDBQT. Check for ? atom types in the output — any present means preparation is incomplete.
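The atom-type check at the end of the pipeline is easy to automate. The following is a minimal sketch, assuming the AutoDock atom type is the last whitespace-separated field on each ATOM/HETATM line of the PDBQT file. The KNOWN_TYPES set is an illustrative, non-exhaustive assumption and should be extended for your system.

```python
# Atom types commonly seen in AutoDock PDBQT files. Illustrative and
# NOT exhaustive: extend this set for unusual elements in your receptor.
KNOWN_TYPES = {
    "C", "A", "N", "NA", "OA", "S", "SA", "P", "H", "HD",
    "F", "Cl", "CL", "Br", "BR", "I",
    "Zn", "ZN", "Fe", "FE", "Mg", "MG", "Mn", "MN", "Ca", "CA",
}


def find_unknown_atom_types(pdbqt_text):
    """Return (line_number, atom_type) for every unrecognized atom type."""
    problems = []
    for lineno, line in enumerate(pdbqt_text.splitlines(), start=1):
        if line.startswith(("ATOM", "HETATM")):
            fields = line.split()
            atom_type = fields[-1]  # assumed: type is the final field
            if atom_type not in KNOWN_TYPES:
                problems.append((lineno, atom_type))
    return problems
```

An empty list means no suspicious types were found. Any entry, including a literal ?, points at a line worth inspecting before docking.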
03
Critical
Ignoring protonation states of active site residues

AutoDockTools assigns protonation states using simple default rules that are correct for most surface residues but frequently wrong for active site residues — especially histidines, which can be neutral (HIE or HID) or positively charged (HIP) depending on local electrostatic environment and pH. A misprotonated catalytic histidine can reverse the sign of a key hydrogen bond and systematically misrank every compound you dock.

Aspartate and glutamate residues in buried, hydrophobic environments are also frequently misprotonated by default tools. This is one of the highest-impact improvements you can make to a docking protocol, and one of the most consistently skipped.

The fix
Use H++ (biophysics.cs.vt.edu/H++) or PROPKA to predict protonation states at pH 7.4 before adding hydrogens. Check the literature for your specific target: many enzymes have well-characterized active site protonation states. If you must use AutoDockTools defaults, at minimum check every histidine in or near the binding site manually.
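To build the list of histidines that need a manual look, a short script can flag every HIS residue with an atom near the binding site center. This is a sketch under stated assumptions: the function name and the 8 Å cutoff are hypothetical choices, and the center is whatever point you used for the grid box.

```python
import math


def histidines_near(pdb_text, center, cutoff=8.0):
    """Return sorted (chain, residue_number) pairs for HIS residues that
    have at least one atom within `cutoff` angstroms of `center`."""
    hits = set()
    for line in pdb_text.splitlines():
        if line.startswith("ATOM") and line[17:20] == "HIS":
            xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
            if math.dist(xyz, center) <= cutoff:
                hits.add((line[21], int(line[22:26])))
    return sorted(hits)
```

Each residue returned is a candidate for comparison against the H++ or PROPKA prediction. Note that a prepared structure may already use names such as HID, HIE, or HIP, which this sketch does not match.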
04
High
Keeping alternate conformations in the receptor

PDB files sometimes contain alternate conformations for residues with significant positional disorder — indicated by an “A” or “B” in column 17 of the ATOM records. When both are present, AutoDockTools can produce duplicate atoms or assign incorrect atom types, resulting in a PDBQT file that looks valid but produces distorted scoring. Active site residues with alternates are particularly damaging because they directly affect where and how ligands bind.

The fix
In PyMOL, remove alternate conformations before preparation by running three commands in order: remove not (alt ''+A), then alter all, alt='', then sort. Alternatively, add deleteAltB to the cleanup list passed to prepare_receptor4.py with the -U option (for example, -U nphs_lps_waters_nonstdres_deleteAltB). Keep only the primary conformation (A) unless you have a specific reason to use the alternate.
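If PyMOL is not at hand, the same cleanup can be approximated on the PDB text itself. The sketch below is a hypothetical helper, not part of any toolkit: it keeps atoms whose alternate-location field (column 17) is blank or A, drops the rest, and blanks the field on the atoms it keeps.

```python
def keep_primary_altloc(pdb_text, keep="A"):
    """Drop alternate conformers other than `keep` and clear the altLoc field."""
    out = []
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM", "ANISOU")) and len(line) > 16:
            alt = line[16]  # column 17 (1-indexed) holds the altLoc flag
            if alt not in (" ", keep):
                continue  # skip B, C, ... conformers
            line = line[:16] + " " + line[17:]
        out.append(line)
    return "\n".join(out)
```

This does not renumber atom serials or rescale occupancies, so treat it as a quick filter and confirm the result visually.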
05
High
Removing the wrong heteroatoms — or not removing enough

Two opposite errors are common here. The first is removing everything labeled HETATM, including catalytic metal ions that are biologically essential — a zinc ion in a metalloprotease or a heme iron in a cytochrome P450 is not a crystallization artifact. Removing it collapses the binding site and produces nonsensical poses. The second error is not removing enough — leaving behind glycerol (GOL), sulfate (SO4), polyethylene glycol (PEG), or other cryoprotectants that occupy the binding site and block the ligand from docking correctly.

The fix
Before removing any heteroatom, look it up. Check the PDB entry’s ligand list and cross-reference with the literature. Remove: HOH, SO4, GOL, EDO, PEG, and other known crystallization additives. Keep: biologically relevant metals, cofactors (FAD, NAD, heme), and any ligand you’re deliberately keeping as part of the binding site. When in doubt, ask: is this molecule present in vivo?
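A quick inventory of the HETATM records makes the "look it up" step harder to skip. The sketch below counts atoms per heteroatom residue name and attaches a provisional verdict. The REMOVE set comes from the additives listed above. The KEEP set is an illustrative assumption using common PDB residue codes, and anything it labels "likely keep" still needs a literature check.

```python
from collections import Counter

REMOVE = {"HOH", "SO4", "GOL", "EDO", "PEG"}    # additives named in the text
KEEP = {"ZN", "MG", "FE", "HEM", "FAD", "NAD"}  # illustrative, not exhaustive


def hetatm_inventory(pdb_text):
    """Map each HETATM residue name to (atom_count, provisional_verdict)."""
    counts = Counter(
        line[17:20].strip()
        for line in pdb_text.splitlines()
        if line.startswith("HETATM")
    )
    report = {}
    for resname, n_atoms in sorted(counts.items()):
        if resname in REMOVE:
            verdict = "remove"
        elif resname in KEEP:
            verdict = "likely keep"
        else:
            verdict = "look up"
        report[resname] = (n_atoms, verdict)
    return report
```

Anything marked "look up" should be resolved against the PDB entry's ligand list before the file goes into preparation.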
06
High
Poor ligand preparation — wrong charges, bad 3D geometry

A ligand PDBQT generated from a 2D structure without proper 3D conformer generation, or with incorrect partial charges, will dock poorly regardless of how well the receptor is prepared. Common problems include flat ring systems that should be nonplanar, incorrect ionization states (docking a carboxylic acid as neutral at pH 7.4 instead of deprotonated), and missing or incorrect rotatable bond definitions that prevent the ligand from exploring its true conformational flexibility.

The fix
Always generate 3D conformers from a proper SMILES or SDF source using Open Babel with --gen3d and -p 7.4 flags to set correct protonation at physiological pH. Assign Gasteiger charges explicitly with --partialcharge gasteiger. For publication-quality work, verify the 3D geometry visually in PyMOL — rings should be non-planar where expected, bond lengths and angles should look chemically reasonable.
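When preparing more than a handful of ligands, it helps to wrap the Open Babel call so every compound gets the same flags. The following is a minimal sketch that assumes the obabel executable is installed and on PATH; the function names and the file names in the usage note are hypothetical.

```python
import shutil
import subprocess


def obabel_command(input_path, output_path, ph=7.4):
    """Build the obabel argument list using the flags described above."""
    return [
        "obabel", input_path,
        "-O", output_path,
        "--gen3d",                       # generate 3D coordinates
        "-p", str(ph),                   # add hydrogens appropriate for this pH
        "--partialcharge", "gasteiger",  # assign Gasteiger charges
    ]


def prepare_ligand(input_path, output_path, ph=7.4):
    """Run obabel and raise if it is missing or exits with an error."""
    if shutil.which("obabel") is None:
        raise RuntimeError("obabel not found on PATH")
    subprocess.run(obabel_command(input_path, output_path, ph), check=True)
```

For example, prepare_ligand("ligand.smi", "ligand.pdbqt") would write a PDBQT file; the geometry still deserves a visual check afterwards.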
07
High
Running with exhaustiveness too low

The default exhaustiveness = 8 is fine for initial exploration but is often insufficient for publication-quality results, especially for flexible ligands with many rotatable bonds or for targets with complex binding sites. Low exhaustiveness means the search algorithm may not have adequately sampled conformational space: the top-ranked pose may not be the true energy minimum, just the best pose found in a limited search. Results from under-sampled searches are poorly reproducible: run the same job twice with different random seeds and you may get meaningfully different scores.

The fix
For publication, use exhaustiveness = 16 at minimum; 32 for flexible ligands (>8 rotatable bonds) or difficult targets. Test reproducibility by running the same docking three times with different random seeds — if top scores vary by more than 0.5 kcal/mol between runs, increase exhaustiveness until they converge. Report your exhaustiveness setting in the methods section.
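The three-seed reproducibility test is simple to script. The sketch below writes Vina config text with an explicit exhaustiveness and seed, and checks whether the top scores from repeated runs agree within the 0.5 kcal/mol tolerance suggested above. The function names, file names, and example scores are hypothetical; running Vina itself is left to you.

```python
def vina_config(receptor, ligand, center, size, exhaustiveness, seed, out):
    """Return Vina config file text (one `key = value` per line)."""
    cx, cy, cz = center
    sx, sy, sz = size
    lines = [
        f"receptor = {receptor}",
        f"ligand = {ligand}",
        f"center_x = {cx}", f"center_y = {cy}", f"center_z = {cz}",
        f"size_x = {sx}", f"size_y = {sy}", f"size_z = {sz}",
        f"exhaustiveness = {exhaustiveness}",
        f"seed = {seed}",
        f"out = {out}",
    ]
    return "\n".join(lines) + "\n"


def scores_converged(top_scores, tolerance=0.5):
    """Return (converged, spread) for top scores from repeated runs."""
    if len(top_scores) < 2:
        raise ValueError("need at least two runs to compare")
    spread = max(top_scores) - min(top_scores)
    return spread <= tolerance, spread
```

If scores_converged reports a spread above the tolerance, raise exhaustiveness and repeat the three runs before trusting the ranking.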
08
Medium
Only inspecting mode 1 — ignoring alternate poses

Vina's mode 1 is the highest-scoring pose, not necessarily the most biologically meaningful one. The scoring function is an approximation: it can favor poses that maximize steric and hydrophobic contacts but are geometrically wrong over poses with correct hydrogen bonding geometry that score slightly lower. For flexible ligands, the correct binding mode sometimes appears at mode 2 or 3, with mode 1 representing a false energy minimum that looks good numerically but makes no chemical sense.

The fix
Always visually inspect at least the top 3 poses for every ligand you intend to report. If you have existing SAR data, mutagenesis results, or pharmacophore information for your target, use it to select the most chemically reasonable pose rather than defaulting to mode 1. A pose that places a known critical pharmacophore feature in contact with a key residue is more trustworthy than one that merely has the best score.
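Before opening the poses in PyMOL, it is worth listing how close the top modes are in score. Vina writes one "REMARK VINA RESULT:" line per mode in its output PDBQT, carrying the score and two RMSD bounds relative to the best mode. The sketch below, with a hypothetical function name, pulls out the first few.

```python
def parse_vina_modes(pdbqt_text, top_n=3):
    """Return the first `top_n` modes as dicts with score and RMSD bounds."""
    modes = []
    for line in pdbqt_text.splitlines():
        if line.startswith("REMARK VINA RESULT:"):
            # Fields: REMARK VINA RESULT: <score> <rmsd_lb> <rmsd_ub>
            fields = line.split()
            modes.append({
                "mode": len(modes) + 1,
                "score": float(fields[3]),
                "rmsd_lb": float(fields[4]),
                "rmsd_ub": float(fields[5]),
            })
    return modes[:top_n]
```

When modes 2 and 3 score within a few tenths of mode 1 but have large RMSD values, the ranking between them is not meaningful on score alone, and chemical plausibility should decide.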
09
Medium
Over-trusting the score — treating kcal/mol as experimental Kd

The most seductive mistake in docking. A compound scores −11.4 kcal/mol and suddenly it’s being described as a “potent inhibitor” in a draft paper. Vina scores are estimates of binding free energy with a correlation of roughly r = 0.5–0.6 against experimental affinities across diverse compound sets. A compound scoring −11 may experimentally bind at micromolar affinity. A compound scoring −7 may be your most potent hit. The score is useful for ranking within a campaign; it is not a substitute for experimental measurement.

The fix
Frame docking results as hypotheses. Use scores for ranking and filtering — not for predicting IC50, Kd, or activity. In papers and presentations, always pair a docking score with a qualifier: “predicted binding affinity” or “estimated ΔG” — never raw kcal/mol values presented as if they were experimental data. Biological activity claims require experimental validation, full stop.
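A little arithmetic shows why a score should not be read as an affinity. Under the standard relation ΔG = RT ln(Kd), taken here at an assumed 298.15 K, a small error in ΔG becomes a large multiplicative error in Kd. The sketch below uses hypothetical function names and is only meant to illustrate that sensitivity.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal/(mol*K)


def kd_from_dg(dg_kcal, temp_k=298.15):
    """Dissociation constant (molar) implied by a binding free energy."""
    return math.exp(dg_kcal / (R_KCAL * temp_k))


def fold_change_in_kd(ddg_kcal, temp_k=298.15):
    """Multiplicative change in Kd caused by a free-energy error of ddg_kcal."""
    return math.exp(abs(ddg_kcal) / (R_KCAL * temp_k))
```

Taken at face value, −11.4 kcal/mol would imply a Kd of a few nanomolar, but an error of just 2 kcal/mol (chosen here for illustration) shifts the implied Kd roughly 30-fold. That is why the score supports ranking, not potency claims.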
10
Critical
Skipping self-docking validation before reporting results

This is the mistake that reviewers catch most reliably — and the one that most consistently indicates a protocol that cannot be trusted. Self-docking validation (redocking the co-crystallized ligand back into the prepared receptor and comparing the result to the crystal structure) is the standard check that your preparation workflow is producing meaningful results. If your protocol cannot reproduce a known experimental pose to within 2.0 Å RMSD, there is no basis for trusting what it tells you about novel compounds.

Journals in computational chemistry and structural biology increasingly require this validation to be reported. Skipping it doesn’t just weaken your paper — it means you may be acting on results from a broken protocol without knowing it.

The fix
Before running any novel ligands, redock the co-crystallized ligand from your PDB structure using exactly the same receptor preparation and grid box you’ll use for your campaign. Calculate RMSD between the top-ranked pose and the crystal pose. If RMSD < 2.0 Å, your protocol is validated. If not, diagnose and fix the preparation before proceeding. Report the validation RMSD in your methods section.
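The RMSD itself is a short calculation once the two poses are expressed as matched coordinate lists. The sketch below assumes heavy atoms only, the same atom order in both lists, and no superposition (the redocked pose and the crystal pose already share the receptor's coordinate frame). It applies no symmetry correction, so symmetric ligands can report an inflated value. The function names are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two matched lists of (x, y, z)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal length")
    total = sum(math.dist(p, q) ** 2 for p, q in zip(coords_a, coords_b))
    return math.sqrt(total / len(coords_a))


def passes_redocking(coords_crystal, coords_docked, threshold=2.0):
    """Return (passed, rmsd_value) using the 2.0 angstrom criterion."""
    value = rmsd(coords_crystal, coords_docked)
    return value < threshold, value
```

The value returned by passes_redocking is the number to report in the methods section, along with which pose (normally mode 1) it was computed for.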

Quick reference: all 10 mistakes at a glance

Mistake | Severity | Core fix
Wrong grid box | Critical | Center on co-crystallized ligand centroid; verify pose in PyMOL
Skipping protein prep | Critical | Full pipeline: remove waters/ligands, add H, assign charges, generate PDBQT
Wrong protonation states | Critical | Use H++ or PROPKA at pH 7.4; check active site His manually
Alternate conformations | High | Remove alt B with deleteAltB or PyMOL commands
Wrong heteroatoms removed | High | Look up each HETATM before removing; keep biological metals/cofactors
Bad ligand preparation | High | Open Babel with --gen3d -p 7.4 --partialcharge gasteiger
Exhaustiveness too low | High | Use 16 minimum; 32 for flexible ligands; test reproducibility
Only checking mode 1 | Medium | Inspect top 3 poses; use SAR data to select biologically best pose
Over-trusting scores | Medium | Scores rank compounds; they do not predict experimental activity
No self-docking validation | Critical | Redock co-crystallized ligand; require RMSD < 2.0 Å before proceeding

The pattern behind most of these mistakes
Eight of the ten mistakes above share a common root: they produce plausible-looking numerical output with no error message. Docking will run to completion and return scores regardless of how badly the preparation went. The algorithm has no way to tell you that your protein is missing hydrogens or your grid box is in the wrong place. The only defense is a rigorous, checklist-driven workflow — and self-docking validation as the final gate.

The one-paragraph version

Docking is only as trustworthy as the preparation that precedes it. A correctly run search algorithm on a badly prepared receptor produces wrong answers that look right — and that’s a more dangerous failure mode than an obvious error. Build a preparation checklist, run self-docking validation on every new receptor, inspect your poses visually before trusting any score, and report your validation RMSD. Do those four things and you’ll avoid most of what trips up beginners and gets papers sent back from reviewers.
