GNINA Tutorial: Deep Learning Molecular Docking as an Alternative to AutoDock Vina

GNINA does everything AutoDock Vina does, and adds a scoring function built on convolutional neural networks trained on large sets of protein-ligand structures. In the published benchmarks the result is better pose prediction with an almost identical workflow. This tutorial shows you exactly how to make the switch.

What GNINA is and why it exists

GNINA (pronounced “NEE-na”) is a molecular docking program developed at the University of Pittsburgh by David Koes and colleagues. It is a fork of smina, which is itself a fork of AutoDock Vina. Version 1.0 was published in 2021, and the project has been actively maintained since.

GNINA emerged from a straightforward observation: AutoDock Vina’s search algorithm is good, but its scoring function, a hybrid empirical/knowledge-based function published in 2010, relies on a small set of hand-crafted terms and doesn’t capture the full complexity of protein-ligand interactions. Deep learning models trained on the protein-ligand complexes now available in the PDB can do better.

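The solution GNINA implements is elegant: keep Vina’s iterated local search algorithm, and add a convolutional neural network (CNN) that has learned to evaluate binding poses directly from 3D molecular structure. By default the Vina scoring function still guides the search, and the CNN rescores and re-ranks the final poses; the --cnn_scoring option (none, rescore, refinement, all) controls how much of the pipeline the CNN takes over. Same search, smarter scorer.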

Key paper
McNutt et al. (2021) “GNINA 1.0: molecular docking with deep learning”, Journal of Cheminformatics. If you use GNINA in a publication, cite this paper. Its benchmark data is also the best starting point for comparing GNINA and Vina pose prediction accuracy, keeping in mind that it was produced by the GNINA authors.

How it differs from AutoDock Vina

The architectural difference is significant even if the user-facing workflow is nearly identical.

AutoDock Vina: traditional pipeline
  • Iterated local search algorithm
  • Explores conformational space
  • Hybrid empirical scoring function
  • Hand-crafted features (H-bonds, VdW, hydrophobic)
  • Outputs: affinity in kcal/mol

GNINA: deep learning pipeline (only the scoring step changes)
  • Same iterated local search algorithm
  • Explores conformational space identically
  • CNN scoring function
  • Features learned from protein-ligand structures derived from the PDB
  • Outputs: CNN pose score + CNN affinity + Vina affinity

In the benchmarks reported by McNutt et al. (2021), with a defined binding site, the fraction of targets whose top-ranked pose falls within 2 Å RMSD of the crystal pose rises from 58% with Vina scoring to 73% with GNINA’s default CNN ensemble in redocking, and from 27% to 37% in cross-docking. Improvements in virtual screening enrichment have also been reported, although the size of the gain varies by target.

Feature | AutoDock Vina | GNINA
Scoring | Empirical / knowledge-based hybrid | CNN ensemble (Vina scoring still available)
Pose accuracy (top pose within 2 Å, redocking, McNutt et al. 2021) | Good (~58%) | Better (~73%)
Speed (CPU) | Fast | Slower (CNN inference overhead)
Speed (GPU) | Not in the official release (separate Vina-GPU projects exist) | Native CUDA support, significant speedup
Input format | PDBQT | PDBQT (same files work; other formats are read via Open Babel)
Output scores | Affinity (kcal/mol) | CNN pose score + CNN affinity + Vina affinity
Cost | Free / open source | Free / open source
Config file | Standard Vina format | Same key = value format

Installation

GNINA distributes as a pre-compiled Linux binary — no compilation required. It runs natively on Linux and on Windows via WSL2. macOS support exists but is less well-maintained; Linux is strongly preferred.

1. Download the latest binary
Get the most recent release from the GNINA GitHub releases page. The binary is a single self-contained file:
Terminal (Linux / WSL)
wget https://github.com/gnina/gnina/releases/latest/download/gnina
chmod +x gnina
2. Move it somewhere on your PATH
So you can call gnina from any directory:
Terminal
# If you have sudo access:
sudo mv gnina /usr/local/bin/gnina

# Without sudo (e.g. on HPC clusters):
mkdir -p ~/bin
mv gnina ~/bin/gnina
echo 'export PATH="$HOME/bin:$PATH"' >> ~/.bashrc
source ~/.bashrc
3. Verify the installation
Check that GNINA runs and reports its version:
Terminal
gnina --version
Example output (exact text varies by version, build, and hardware)
gnina 1.1
Built Jun 10 2024
Using CUDA: yes (device 0: NVIDIA RTX 3080)

If CUDA is detected, GNINA will use GPU acceleration automatically. If not, it falls back to CPU — still functional, just slower.
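If you want to run the same check from a script (for example at the top of a screening job), a minimal Python sketch could look like the following. The helper name gnina_version is hypothetical, and the code only assumes that the binary is called gnina and accepts --version as shown above; it does not try to interpret the version text.

```python
import shutil
import subprocess


def gnina_version(binary="gnina"):
    """Return the first line printed by `<binary> --version`, or None if unavailable."""
    path = shutil.which(binary)  # look the binary up on PATH
    if path is None:
        return None
    result = subprocess.run(
        [path, "--version"], capture_output=True, text=True, timeout=60
    )
    # Some tools print version info to stderr, so fall back to it if stdout is empty.
    lines = (result.stdout or result.stderr).strip().splitlines()
    return lines[0] if lines else None
```

Calling print(gnina_version() or "gnina not found on PATH") before a long run gives an early, readable failure instead of a crash halfway through a library.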

No CUDA GPU? Still worth using GNINA
CPU-only GNINA is slower than GPU GNINA but still produces better poses than Vina. For single-ligand docking or small libraries on a laptop, the extra runtime per compound is typically under a minute — a worthwhile tradeoff for the accuracy improvement.

Running your first dock with GNINA

This is the section most tutorials bury the lead on: the switch from Vina to GNINA is a one-word change on the command line. If you have a working Vina setup, you already have a working GNINA setup.

Using the same COX-2 receptor and config file from the previous tutorials:

Terminal — Vina command (what you already know)
vina --config config.txt
Terminal — GNINA command (the upgrade)
gnina --config config.txt

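Same config file. Same receptor PDBQT. Same ligand PDBQT. The only difference is the binary you’re calling. The output format follows the file extension you give for out, so a .pdbqt path keeps Vina-style output and a .sdf path writes SDF.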
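If you generate config files from a script rather than by hand, the sketch below builds the same key = value text that both programs read. The function names config_text and write_config are hypothetical, and the file paths and box values are simply the COX-2 example used in this tutorial.

```python
from pathlib import Path


def config_text(receptor, ligand, center, size, exhaustiveness=8, num_modes=9):
    """Build Vina-style config text: one 'key = value' pair per line."""
    settings = {
        "receptor": receptor,
        "ligand": ligand,
        "center_x": center[0],
        "center_y": center[1],
        "center_z": center[2],
        "size_x": size[0],
        "size_y": size[1],
        "size_z": size[2],
        "exhaustiveness": exhaustiveness,
        "num_modes": num_modes,
    }
    return "".join(f"{key} = {value}\n" for key, value in settings.items())


def write_config(path, **kwargs):
    """Write the config text to `path` and return the path."""
    Path(path).write_text(config_text(**kwargs))
    return Path(path)


example = config_text(
    "receptor/5KIR_receptor.pdbqt",
    "ligand/ibuprofen.pdbqt",
    center=(15.234, -8.441, 22.178),
    size=(20, 20, 20),
)
```

The resulting file can be passed unchanged to either vina --config or gnina --config.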

GNINA also accepts Vina’s core docking flags directly on the command line without a config file:

Terminal — direct flags (no config file)
gnina \
  --receptor receptor/5KIR_receptor.pdbqt \
  --ligand ligand/ibuprofen.pdbqt \
  --center_x 15.234 --center_y -8.441 --center_z 22.178 \
  --size_x 20 --size_y 20 --size_z 20 \
  --exhaustiveness 8 \
  --num_modes 9 \
  --out output/gnina_docked.sdf \
  --log output/gnina.log

Note: GNINA’s preferred output format is SDF rather than PDBQT — this is actually more convenient since SDF files open directly in most molecular visualization tools without conversion.

GNINA is slower than Vina on CPU — plan accordingly
On CPU-only hardware, GNINA typically runs 2–4× slower per compound than Vina due to the CNN inference step. On a GPU this reverses — GNINA is often faster than Vina because the GPU handles CNN inference in parallel. If you’re on a laptop without a GPU, budget extra time or reduce exhaustiveness to 4 for initial screening.
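To turn the slowdown factor into a rough time budget, a back-of-the-envelope calculation is enough. The sketch below is only arithmetic: the library size and the 30 seconds per compound for Vina are made-up inputs for illustration, not measurements, and the function name screening_hours is hypothetical.

```python
def screening_hours(n_compounds, vina_seconds_per_compound, slowdown, n_parallel=1):
    """Estimate wall-clock hours for a screen, given a slowdown factor relative to Vina."""
    total_seconds = n_compounds * vina_seconds_per_compound * slowdown / n_parallel
    return total_seconds / 3600.0


# Hypothetical inputs: 1,000 compounds, 30 s per compound with Vina on one CPU job.
for slowdown in (2, 4):
    hours = screening_hours(1000, 30, slowdown)
    print(f"{slowdown}x slower than Vina: about {hours:.1f} h")
```

Measuring the real per-compound time on a handful of ligands first, then plugging that in, gives a far better estimate than any generic figure.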

Understanding GNINA’s output scores

GNINA outputs three scores per pose rather than Vina’s one. Understanding what each means is important for using them correctly.

CNN score
The CNN’s pose score: a value between 0 and 1 estimating the probability that the pose is close to the true binding mode (within 2 Å RMSD). Higher is better. This is GNINA’s primary ranking metric, the one most predictive of pose quality, and the default sort order for output poses. Use this for pose selection. In SDF output it is stored as CNNscore.
CNN affinity
The CNN’s predicted binding affinity in pK units (the negative log10 of the binding constant in molar), so 6 corresponds to roughly 1 µM and 9 to roughly 1 nM. Higher = stronger predicted binding, which is the opposite sign convention to Vina’s kcal/mol score. Use this for compound ranking in virtual screening, but treat it as a ranking signal rather than a calibrated prediction of experimental Kd. In SDF output it is stored as CNNaffinity.
Vina score
The traditional Vina scoring function output in kcal/mol (more negative = better), included for backward compatibility and comparison. GNINA also uses this during its search process. Useful if you want to compare directly to a Vina run on the same system. In SDF output it is stored as minimizedAffinity.

For most use cases: use CNN score to select which pose to report, and CNN affinity to rank compounds in a virtual screening campaign. The Vina score is there for reference — you don’t need to act on it unless you’re running a comparison study.

Terminal: GNINA output (illustrative values; CNN affinity is in pK units, higher is better)
mode |  affinity  |    CNN     |   CNN
     | (kcal/mol) | pose score | affinity
-----+------------+------------+----------
    1       -8.71       0.842       8.91
    2       -8.43       0.781       8.55
    3       -8.11       0.694       8.22
    4       -7.88       0.612       7.94
    5       -7.54       0.543       7.71

GNINA outputs three columns per pose. The CNN score (0–1) is the primary quality indicator for pose selection.
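When the output is written as SDF, the same three scores travel with each pose as data fields, so pose selection can be scripted. The sketch below is one way to do it with the standard library only. It assumes the field names minimizedAffinity, CNNscore and CNNaffinity, and the function names read_pose_scores and best_pose are hypothetical.

```python
import re


def read_pose_scores(sdf_text):
    """Return one dict of numeric data fields per pose in a multi-record SDF string."""
    poses = []
    for record in sdf_text.split("$$$$"):  # "$$$$" terminates each SDF record
        fields = re.findall(
            r"^>\s*<([^>]+)>[^\n]*\n([^\n]+)", record, flags=re.MULTILINE
        )
        scores = {}
        for name, value in fields:
            try:
                scores[name] = float(value)
            except ValueError:
                continue  # skip non-numeric fields
        if "CNNscore" in scores:
            scores["pose"] = len(poses) + 1  # 1-based position in the file
            poses.append(scores)
    return poses


def best_pose(poses):
    """Pick the pose with the highest CNN pose score."""
    return max(poses, key=lambda p: p["CNNscore"])
```

Typical use would be best_pose(read_pose_scores(Path("output/gnina_docked.sdf").read_text())). A cheminformatics toolkit would be the more robust choice for anything beyond reading scores.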

When to use GNINA vs Vina

The answer is almost always GNINA for new projects — but there are a few cases where Vina is still the right choice.

Situation | Use Vina | Use GNINA
Learning docking for the first time | Better documentation, larger community |
Single-ligand docking, accuracy matters | | Better pose prediction
Virtual screening, GPU available | | Faster + more accurate
Virtual screening, CPU only, large library | Faster per compound |
Reproducing a published Vina result | For exact reproducibility |
Hit validation after initial screening | | Better confidence in top poses
Difficult targets (induced fit, flexible loops) | | CNN generalizes better
AlphaFold-predicted structures | | CNN less sensitive to minor structural errors

The one case where you should definitely not default to GNINA is when you’re learning. Start with Vina — its error messages are clearer, the community troubleshooting resources are more extensive, and the one-score output is simpler to interpret. Once you understand what docking is doing and have a working protocol, switching to GNINA is a five-second change that immediately improves your results.

Using GNINA for virtual screening

GNINA slots into the virtual screening pipeline from the previous tutorial with one change: replace vina with gnina in the subprocess call, and update the score parser to extract the CNN affinity column instead of the Vina affinity.

Update the score parsing that dock_one relies on in your run_vs.py script:

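Updated parse_score for GNINA SDF output
import re
from pathlib import Path

# GNINA writes per-pose scores into the SDF data fields.
# Both parsers return the value from the first record, i.e. the top-ranked pose.
def _first_sdf_value(output_sdf, tag):
    """Return the first numeric value stored under an SDF tag, or None."""
    try:
        text = Path(output_sdf).read_text()
    except OSError:
        return None  # file missing or unreadable, e.g. the docking run failed
    match = re.search(
        rf">\s*<{tag}>[^\n]*\n\s*([-+]?\d*\.?\d+(?:[eE][-+]?\d+)?)", text
    )
    return float(match.group(1)) if match else None

# CNN affinity (pK units, higher = stronger predicted binding)
def parse_gnina_score(output_sdf):
    return _first_sdf_value(output_sdf, "CNNaffinity")

# Also grab CNN score (0 to 1) for pose quality filtering
def parse_gnina_cnn_score(output_sdf):
    return _first_sdf_value(output_sdf, "CNNscore")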
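For the dock_one side of the change, a minimal sketch might look like this. The original dock_one from the previous tutorial is not shown here, so the signature, the build_gnina_cmd helper and the _gnina.sdf naming are all assumptions; the flags are the ones used in the direct-flags example earlier in this tutorial.

```python
import subprocess
from pathlib import Path


def build_gnina_cmd(receptor, ligand, out_sdf, center, size,
                    exhaustiveness=8, num_modes=9, binary="gnina"):
    """Assemble the gnina command line as a list of arguments."""
    cmd = [binary, "--receptor", str(receptor), "--ligand", str(ligand)]
    for axis, c, s in zip("xyz", center, size):
        cmd += [f"--center_{axis}", str(c), f"--size_{axis}", str(s)]
    cmd += ["--exhaustiveness", str(exhaustiveness),
            "--num_modes", str(num_modes),
            "--out", str(out_sdf)]
    return cmd


def dock_one(receptor, ligand, out_dir, center, size, **kwargs):
    """Dock one ligand; return the output SDF path, or None if the run failed."""
    out_sdf = Path(out_dir) / f"{Path(ligand).stem}_gnina.sdf"
    cmd = build_gnina_cmd(receptor, ligand, out_sdf, center, size, **kwargs)
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode != 0 or not out_sdf.exists():
        return None
    return out_sdf
```

Passing the arguments as a list rather than a shell string avoids quoting problems with ligand file names that contain spaces or special characters.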

Also add a CNN score filter to your hit selection: compounds whose top pose has a CNN score below about 0.5, even if their CNN affinity looks good, should be flagged as low-confidence poses and deprioritized.

The two-filter approach for GNINA virtual screening
Apply a CNN affinity cutoff first (e.g. ≥ 6.0 in pK units, roughly 1 µM or better) to get your initial hit list, then apply a CNN score filter (e.g. ≥ 0.6 for a stricter list) to remove low-confidence poses within that list. Filtering on both scores guards against compounds that rank well on affinity only because of an implausible pose. Treat both thresholds as starting points and tune them against known actives for your target.
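The two-stage filter can be sketched in a few lines. This assumes each result is a dict with hypothetical keys name, cnn_affinity and cnn_score, that CNN affinity is expressed in pK units where higher means stronger predicted binding, and that the default thresholds are placeholders to be tuned.

```python
def two_stage_filter(results, min_affinity=6.0, min_cnn_score=0.6):
    """Keep compounds that pass the affinity cutoff, then the pose-confidence cutoff."""
    # Stage 1: predicted affinity (pK units, higher is better)
    by_affinity = [r for r in results if r["cnn_affinity"] >= min_affinity]
    # Stage 2: drop hits whose top pose the CNN considers unreliable
    confident = [r for r in by_affinity if r["cnn_score"] >= min_cnn_score]
    # Rank the survivors, strongest predicted binder first
    return sorted(confident, key=lambda r: r["cnn_affinity"], reverse=True)


# Made-up example records, for illustration only
results = [
    {"name": "cmpd_A", "cnn_affinity": 7.2, "cnn_score": 0.81},
    {"name": "cmpd_B", "cnn_affinity": 7.9, "cnn_score": 0.35},
    {"name": "cmpd_C", "cnn_affinity": 5.1, "cnn_score": 0.90},
    {"name": "cmpd_D", "cnn_affinity": 6.4, "cnn_score": 0.66},
]
hits = two_stage_filter(results)
```

Here cmpd_B is dropped despite having the best predicted affinity, because its pose score falls below the confidence threshold.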

The upgrade in one sentence

GNINA is AutoDock Vina with a neural network scoring function trained on protein-ligand structures from the Protein Data Bank: it uses the same workflow, the same input files, and the same config format, and in published benchmarks it produces better poses and better virtual screening enrichment, for free, with a one-word change to your command line.

For new projects, there’s no reason not to use it. For existing Vina workflows, switching takes thirty seconds and the only downside is slightly slower CPU performance. The accuracy improvement is real and well-documented.
