GNINA Tutorial: Deep Learning Molecular Docking as an Alternative to AutoDock Vina
GNINA does everything AutoDock Vina does, but replaces the scoring function with a convolutional neural network trained on millions of protein-ligand poses. The result is consistently better pose prediction with an almost identical workflow. This tutorial shows you exactly how to make the switch.
What GNINA is and why it exists
GNINA (pronounced "NEE-na") is a molecular docking program developed by David Koes and colleagues at the University of Pittsburgh. It is a fork of smina, which is itself a fork of AutoDock Vina; the GNINA 1.0 release was published in 2021 and the project has been actively maintained since.
GNINA emerged from a straightforward observation: AutoDock Vina's search algorithm is good, but its scoring function, a hybrid empirical/knowledge-based function, was designed over a decade ago and doesn't capture the full complexity of protein-ligand interactions. Modern deep learning, trained on the wealth of protein-ligand complexes now available in the PDB (and on docking datasets derived from it), can do better.
The solution GNINA implements is elegant: keep Vina’s iterated local search algorithm exactly as-is, and replace only the scoring function with a convolutional neural network (CNN) that has learned to evaluate binding poses directly from 3D molecular structure. Same search, smarter scorer.
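That division of labor can be sketched in a few lines of Python. This is a deliberately toy illustration, not GNINA's actual code: the point is that the search loop takes the scoring function as a parameter, so the scorer can be swapped without touching the search. Here the "pose" is a single number and the "energy surface" a parabola.

```python
import random

def iterated_local_search(initial_pose, perturb, local_opt, score_fn, n_iters=50):
    """Generic iterated local search: the scorer is a plug-in parameter."""
    best = local_opt(initial_pose, score_fn)
    for _ in range(n_iters):
        # Perturb the current best pose, then locally re-optimize it
        candidate = local_opt(perturb(best), score_fn)
        if score_fn(candidate) < score_fn(best):  # lower score = better pose
            best = candidate
    return best

def score(x):
    # Toy "energy surface": parabola with minimum -8.0 at x = 2.0
    return (x - 2.0) ** 2 - 8.0

def perturb(x):
    return x + random.uniform(-1.0, 1.0)

def local_opt(x, f, step=0.1):
    # Greedy descent in fixed steps
    for _ in range(100):
        for dx in (-step, step):
            if f(x + dx) < f(x):
                x += dx
    return x

random.seed(0)
best = iterated_local_search(0.0, perturb, local_opt, score)
print(round(score(best), 2))  # prints -8.0: the search found the minimum
```

Swapping `score` for a different function changes the ranking but not the search, which is exactly the relationship between Vina's empirical scorer and GNINA's CNN.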
How it differs from AutoDock Vina
The architectural difference is significant even if the user-facing workflow is nearly identical.
| AutoDock Vina | GNINA |
|---|---|
| Iterated local search algorithm | Same iterated local search algorithm |
| Explores conformational space | Explores conformational space identically |
| Hybrid empirical scoring function | CNN scoring function |
| Hand-crafted features (H-bonds, VdW, hydrophobic) | Features learned from protein-ligand structures |
| Output: affinity in kcal/mol | Outputs: Vina-style affinity + CNN pose score + CNN affinity |
In head-to-head benchmarks on the CASF-2016 dataset, a standard benchmark for evaluating docking programs, GNINA achieves roughly 10–15 percentage points higher success rate in reproducing crystal poses (RMSD < 2 Å) compared to standard Vina. For virtual screening enrichment, the improvement is similarly consistent across diverse target classes.
Installation
GNINA is distributed as a pre-compiled Linux binary, so no compilation is required. It runs natively on Linux and on Windows via WSL2. macOS support exists but is less well-maintained; Linux is strongly preferred.
wget https://github.com/gnina/gnina/releases/latest/download/gnina
chmod +x gnina
Move the binary somewhere on your PATH so you can run gnina from any directory:
# If you have sudo access:
sudo mv gnina /usr/local/bin/gnina
# Without sudo (e.g. on HPC clusters):
mkdir -p ~/bin
mv gnina ~/bin/gnina
echo 'export PATH="$HOME/bin:$PATH"' >> ~/.bashrc
source ~/.bashrc
Verify the installation:
gnina --version
Built Jun 10 2024
Using CUDA: yes (device 0: NVIDIA RTX 3080)
If CUDA is detected, GNINA will use GPU acceleration automatically. If not, it falls back to CPU — still functional, just slower.
Running your first dock with GNINA
Here is the lede that most tutorials bury: the switch from Vina to GNINA is a one-word change on the command line. If you have a working Vina setup, you already have a working GNINA setup.
Using the same COX-2 receptor and config file from the previous tutorials:
# AutoDock Vina:
vina --config config.txt

# GNINA:
gnina --config config.txt
Same config file. Same receptor PDBQT. Same ligand PDBQT. Same output format. The only difference is the binary you’re calling.
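If you no longer have the config file from the earlier tutorials handy, a Vina-format config.txt looks like this; the paths and box values below match the example command later in this tutorial, so substitute your own:

```
receptor = receptor/5KIR_receptor.pdbqt
ligand = ligand/ibuprofen.pdbqt
center_x = 15.234
center_y = -8.441
center_z = 22.178
size_x = 20
size_y = 20
size_z = 20
exhaustiveness = 8
num_modes = 9
out = output/gnina_docked.sdf
```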
GNINA also accepts all the same flags as Vina directly on the command line without a config file:
gnina \
--receptor receptor/5KIR_receptor.pdbqt \
--ligand ligand/ibuprofen.pdbqt \
--center_x 15.234 --center_y -8.441 --center_z 22.178 \
--size_x 20 --size_y 20 --size_z 20 \
--exhaustiveness 8 \
--num_modes 9 \
--out output/gnina_docked.sdf \
--log output/gnina.log
Note: GNINA’s preferred output format is SDF rather than PDBQT — this is actually more convenient since SDF files open directly in most molecular visualization tools without conversion.
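Because the output is a standard SDF file, quick sanity checks need no special tooling. Each molecule record in an SDF file ends with a `$$$$` terminator line, so counting the poses GNINA wrote is a one-liner (the filename is just the one from the example command above):

```python
from pathlib import Path

def count_sdf_poses(sdf_path):
    """Count docked poses in an SDF file by counting record terminators."""
    # Each molecule record in an SDF file ends with a line containing "$$$$"
    return sum(1 for line in Path(sdf_path).read_text().splitlines()
               if line.strip() == "$$$$")

# Example: count_sdf_poses("output/gnina_docked.sdf")
```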
Understanding GNINA’s output scores
GNINA outputs three scores per pose rather than Vina’s one. Understanding what each means is important for using them correctly.
For most use cases: use CNN score to select which pose to report, and CNN affinity to rank compounds in a virtual screening campaign. The Vina score is there for reference — you don’t need to act on it unless you’re running a comparison study.
mode |  affinity  | CNN pose | CNN affinity
     | (kcal/mol) |  score   |  (kcal/mol)
-----+------------+----------+-------------
   1 |   -8.71    |  0.842   |    -8.91
   2 |   -8.43    |  0.781   |    -8.55
   3 |   -8.11    |  0.694   |    -8.22
   4 |   -7.88    |  0.612   |    -7.94
   5 |   -7.54    |  0.543   |    -7.71
The CNN pose score ranges from 0 to 1 and is the primary quality indicator for pose selection.
When to use GNINA vs Vina
The answer is almost always GNINA for new projects — but there are a few cases where Vina is still the right choice.
| Situation | Use Vina | Use GNINA |
|---|---|---|
| Learning docking for the first time | Better documentation, larger community | — |
| Single-ligand docking, accuracy matters | — | Better pose prediction |
| Virtual screening, GPU available | — | Faster + more accurate |
| Virtual screening, CPU only, large library | Faster per compound | — |
| Reproducing a published Vina result | For exact reproducibility | — |
| Hit validation after initial screening | — | Better confidence in top poses |
| Difficult targets (induced fit, flexible loops) | — | CNN generalizes better |
| AlphaFold-predicted structures | — | CNN less sensitive to minor structural errors |
The one case where you should definitely not default to GNINA is when you’re learning. Start with Vina — its error messages are clearer, the community troubleshooting resources are more extensive, and the one-score output is simpler to interpret. Once you understand what docking is doing and have a working protocol, switching to GNINA is a five-second change that immediately improves your results.
Using GNINA for virtual screening
GNINA slots into the virtual screening pipeline from the previous tutorial with one change: replace vina with gnina in the subprocess call, and update the score parser to extract the CNN affinity column instead of the Vina affinity.
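The subprocess change itself is one token. A minimal sketch, assuming your pipeline builds its command as an argument list (the helper name `gnina_command` is illustrative, and your run_vs.py from the previous tutorial may structure this differently):

```python
import subprocess

def gnina_command(ligand_pdbqt, out_sdf, config="config.txt"):
    # Identical to the Vina pipeline except for the program name
    return ["gnina", "--config", config,
            "--ligand", str(ligand_pdbqt),
            "--out", str(out_sdf)]

def dock_one(ligand_pdbqt, out_sdf):
    # check=False so one failed dock does not abort the whole screen
    return subprocess.run(gnina_command(ligand_pdbqt, out_sdf),
                          capture_output=True, text=True, check=False)
```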
Then add these score parsers to your run_vs.py script:
# GNINA writes scores to the SDF file properties
# Parse CNN affinity from SDF output
import re
from pathlib import Path

def parse_gnina_score(output_sdf):
    try:
        text = Path(output_sdf).read_text()
        match = re.search(r"> +<CNNaffinity>\s*\n([-\d.]+)", text)
        return float(match.group(1)) if match else None
    except OSError:
        return None

# Also grab CNN score for pose quality filtering
def parse_gnina_cnn_score(output_sdf):
    try:
        text = Path(output_sdf).read_text()
        match = re.search(r"> +<CNNscore>\s*\n([\d.]+)", text)
        return float(match.group(1)) if match else None
    except OSError:
        return None
Also add a CNN score filter to your hit selection: compounds with a CNN score below 0.5 — even if their CNN affinity score looks good — are flagged as low-confidence poses worth deprioritizing.
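Putting the parsers together, hit selection becomes a filter-and-sort. A minimal sketch, assuming each result is a (name, cnn_affinity, cnn_score) tuple assembled from the functions above; the names, numbers, and 0.5 cutoff below are illustrative:

```python
CNN_SCORE_CUTOFF = 0.5  # below this, treat the pose as low-confidence

def select_hits(results, top_n=10):
    """Keep confidently-posed compounds, then rank by CNN affinity.

    results: iterable of (name, cnn_affinity, cnn_score) tuples, where
    a more negative cnn_affinity means predicted tighter binding.
    """
    confident = [r for r in results
                 if r[2] is not None and r[2] >= CNN_SCORE_CUTOFF]
    # Most negative affinity (tightest predicted binding) first
    confident.sort(key=lambda r: r[1])
    return confident[:top_n]

# Toy example with made-up numbers:
demo = [
    ("cmpd_a", -9.1, 0.84),
    ("cmpd_b", -9.5, 0.31),   # good affinity but low-confidence pose
    ("cmpd_c", -8.2, 0.71),
]
print([name for name, _, _ in select_hits(demo)])  # ['cmpd_a', 'cmpd_c']
```

Note that cmpd_b is dropped despite having the best affinity: its pose is too uncertain to trust the affinity prediction.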
The upgrade in one sentence
GNINA is AutoDock Vina with a neural network scoring function trained on a huge corpus of protein-ligand structures: same workflow, same input files, same config format, and consistently better poses and better virtual screening enrichment, for free, with a one-word change to your command line.
For new projects, there's no reason not to use it. For existing Vina workflows, switching takes thirty seconds, and the main downside is slower performance on CPU-only machines. The accuracy improvement is real and well-documented.