GNINA Tutorial: Deep Learning Molecular Docking as an Alternative to AutoDock Vina

15 min read Intermediate

GNINA does everything AutoDock Vina does, but scores poses with a convolutional neural network trained on a large set of protein-ligand structures. The result is better pose prediction in published benchmarks, with an almost identical workflow. This tutorial shows you exactly how to make the switch.

What GNINA is and why it exists

GNINA (pronounced “NEE-na”) is a molecular docking program developed at the University of Pittsburgh by David Koes and colleagues. It is a fork of smina, which is itself a fork of AutoDock Vina. Version 1.0 was published in 2021, and the project has been actively maintained since.

GNINA emerged from a straightforward observation: AutoDock Vina’s search algorithm is good, but its scoring function, a hybrid empirical/knowledge-based function published in 2010, doesn’t capture the full complexity of protein-ligand interactions. Modern deep learning, trained on the large body of experimentally determined protein-ligand complexes derived from the PDB, can do better.

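The solution GNINA implements is elegant: keep Vina’s iterated local search algorithm, and add a convolutional neural network (CNN) that has learned to evaluate binding poses directly from 3D molecular structure. By default the search itself still runs on the Vina-style scoring function, and the CNN rescores and re-ranks the final poses; the --cnn_scoring option controls how much of the pipeline the CNN takes over. Same search, smarter scorer.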

Key paper

McNutt et al. (2021) “GNINA 1.0: molecular docking with deep learning”, Journal of Cheminformatics. If you use GNINA in a publication, cite this paper. Its benchmarks are also the primary published comparison of GNINA vs Vina pose prediction accuracy, though note that they come from GNINA’s own authors rather than an independent group.

How it differs from AutoDock Vina

The architectural difference is significant even if the user-facing workflow is nearly identical.

AutoDock Vina (traditional pipeline)

Iterated local search algorithm
Explores conformational space
Hybrid empirical scoring function
Hand-crafted features (H-bonds, VdW, hydrophobic)
Outputs: affinity in kcal/mol

GNINA (deep learning pipeline; only the scoring step is replaced)

Same iterated local search algorithm
Explores conformational space identically
CNN scoring function
Features learned from a large training set of protein-ligand poses
Outputs: CNN score + CNN affinity + Vina affinity

In head-to-head pose-prediction benchmarks such as CASF-2016, GNINA is reported to achieve roughly 10–15 percentage points higher success rate in reproducing crystal poses (RMSD < 2 Å) compared to standard Vina. For reference, the GNINA 1.0 paper reports the top-ranked pose success rate rising from 58% to 73% for redocking with a defined binding pocket. For virtual screening enrichment, improvements are also reported, but they vary more from target to target.
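The success criterion used in these benchmarks (RMSD < 2 Å to the crystal pose) can be sketched in a few lines. This is a minimal illustration, not the benchmark's actual code: it assumes both poses list the same heavy atoms in the same order, and it applies no symmetry correction, which real evaluation tools normally do. The function names are hypothetical.

```python
import math

def rmsd(coords_a, coords_b):
    """RMSD between two poses given as equally ordered lists of (x, y, z) tuples."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and the same length")
    squared = sum(
        (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
        for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))

def success_rate(rmsd_values, cutoff=2.0):
    """Fraction of top-ranked poses whose RMSD to the crystal pose is below cutoff."""
    return sum(r < cutoff for r in rmsd_values) / len(rmsd_values)
```

A benchmark "success rate" is then just success_rate applied to the top-ranked pose of every complex in the test set.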

Feature	AutoDock Vina	GNINA
Scoring	Empirical / knowledge-based hybrid	CNN trained on PDB-derived structures
Pose accuracy	Good (~50–55% within 2 Å, CASF-2016)	Better (~63–68% within 2 Å)
Speed (CPU)	Fast	Slower (CNN inference overhead)
Speed (GPU)	Not in the official release (separate Vina-GPU projects exist)	Native, significant speedup
Input format	PDBQT	PDBQT (identical); also PDB, SDF, MOL2 via Open Babel
Output scores	Affinity (kcal/mol)	CNN score + CNN affinity + Vina affinity
Cost	Free / open source	Free / open source
Config file	Standard Vina format	Identical to Vina

Installation

GNINA distributes as a pre-compiled Linux binary, so no compilation is required. It runs natively on Linux and on Windows via WSL2. The pre-built binary targets Linux, so on macOS the practical route is a Linux VM or container; Linux is strongly preferred.

Download the latest binary

Get the most recent release from the GNINA GitHub releases page. The binary is a single self-contained file:

Terminal (Linux / WSL)

wget https://github.com/gnina/gnina/releases/latest/download/gnina
chmod +x gnina

Move it somewhere on your PATH

So you can call gnina from any directory:

Terminal

# If you have sudo access:
sudo mv gnina /usr/local/bin/gnina

# Without sudo (e.g. on HPC clusters):
mkdir -p ~/bin
mv gnina ~/bin/gnina
echo 'export PATH="$HOME/bin:$PATH"' >> ~/.bashrc
source ~/.bashrc

Verify the installation

Check that GNINA runs and reports its version:

Terminal

gnina --version

Terminal (example output; exact text varies by version and hardware)

gnina 1.1
Built Jun 10 2024
Using CUDA: yes (device 0: NVIDIA RTX 3080)

If CUDA is detected, GNINA will use GPU acceleration automatically. If not, it falls back to CPU — still functional, just slower.
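If you drive docking from a script, the same check can be done programmatically before a long run starts. The sketch below uses only the Python standard library; the function names are hypothetical, and it assumes the binary is called gnina and responds to --version as shown above.

```python
import shutil
import subprocess

def find_gnina(binary="gnina"):
    """Return the absolute path of the binary if it is on PATH, else None."""
    return shutil.which(binary)

def gnina_version(binary="gnina"):
    """Return whatever text `<binary> --version` prints, or None if not installed."""
    path = find_gnina(binary)
    if path is None:
        return None
    result = subprocess.run(
        [path, "--version"], capture_output=True, text=True, check=False
    )
    # Some builds print to stderr, so fall back to it when stdout is empty.
    return (result.stdout or result.stderr).strip()
```

Calling gnina_version() at the top of a screening script and stopping when it returns None gives a clearer failure than a "command not found" buried in a batch log.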

No CUDA GPU? Still worth using GNINA

CPU-only GNINA is slower than GPU GNINA but still produces better poses than Vina. For single-ligand docking or small libraries on a laptop, the extra runtime per compound is typically under a minute — a worthwhile tradeoff for the accuracy improvement.

Running your first dock with GNINA

This is the section most tutorials bury the lede on: the switch from Vina to GNINA is a one-word change on the command line. If you have a working Vina setup, you already have a working GNINA setup.

Using the same COX-2 receptor and config file from the previous tutorials:

Terminal — Vina command (what you already know)

vina --config config.txt

Terminal — GNINA command (the upgrade)

gnina --config config.txt

Same config file. Same receptor PDBQT. Same ligand PDBQT. The only difference is the binary you’re calling. The output format follows the file extension you give to --out, so a .pdbqt output file stays PDBQT.

GNINA also accepts the core Vina flags directly on the command line without a config file:

Terminal — direct flags (no config file)

gnina \
  --receptor receptor/5KIR_receptor.pdbqt \
  --ligand ligand/ibuprofen.pdbqt \
  --center_x 15.234 --center_y -8.441 --center_z 22.178 \
  --size_x 20 --size_y 20 --size_z 20 \
  --exhaustiveness 8 \
  --num_modes 9 \
  --out output/gnina_docked.sdf \
  --log output/gnina.log

Note: GNINA’s preferred output format is SDF rather than PDBQT — this is actually more convenient since SDF files open directly in most molecular visualization tools without conversion.
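For scripted runs, the same command can be assembled in Python. This is a minimal sketch under stated assumptions: build_gnina_command and run_gnina are hypothetical helper names, and the flags are exactly the ones shown in the command above.

```python
import subprocess

def build_gnina_command(receptor, ligand, center, size, out_path,
                        exhaustiveness=8, num_modes=9, binary="gnina"):
    """Build the argument list for one docking run (nothing is executed here)."""
    cx, cy, cz = center
    sx, sy, sz = size
    return [
        binary,
        "--receptor", str(receptor),
        "--ligand", str(ligand),
        "--center_x", str(cx), "--center_y", str(cy), "--center_z", str(cz),
        "--size_x", str(sx), "--size_y", str(sy), "--size_z", str(sz),
        "--exhaustiveness", str(exhaustiveness),
        "--num_modes", str(num_modes),
        "--out", str(out_path),
    ]

def run_gnina(cmd, timeout=None):
    """Run the command; raises CalledProcessError if gnina exits non-zero."""
    return subprocess.run(cmd, capture_output=True, text=True,
                          check=True, timeout=timeout)
```

Passing the arguments as a list rather than a single shell string avoids quoting problems with negative coordinates and paths containing spaces.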

GNINA is slower than Vina on CPU — plan accordingly

On CPU-only hardware, GNINA typically runs 2–4× slower per compound than Vina due to the CNN inference step. On a GPU most of this overhead disappears, because the GPU handles CNN inference in parallel. If you’re on a laptop without a GPU, budget extra time or reduce exhaustiveness to 4 for initial screening.
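To turn that slowdown into a planning number, a back-of-envelope estimate is enough. The helper below is hypothetical; the default slowdown of 3.0 is simply the midpoint of the 2–4× range quoted above, and the per-compound Vina time is something you would measure on your own hardware.

```python
def estimate_hours(n_compounds, vina_seconds_per_compound,
                   slowdown=3.0, n_workers=1):
    """Rough wall-clock hours for a CPU-only GNINA screen.

    slowdown: assumed GNINA/Vina runtime ratio on CPU (2-4x per the text).
    n_workers: number of docking jobs running in parallel.
    """
    total_seconds = n_compounds * vina_seconds_per_compound * slowdown
    return total_seconds / n_workers / 3600
```

For example, estimate_hours(1000, 60, slowdown=3.0, n_workers=4) models 1,000 compounds that each take a minute in Vina, spread over four parallel jobs.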

Understanding GNINA’s output scores

GNINA outputs three scores per pose rather than Vina’s one. Understanding what each means is important for using them correctly.

CNN score

The CNN’s pose-quality output: a value between 0 and 1 estimating how likely the pose is to be a good one, meaning close to the true binding mode (low RMSD). It is a statement about the pose, not about whether the compound binds. Higher is better. This is GNINA’s primary ranking metric and the one most predictive of pose quality. Use this for pose selection. In SDF output it appears as CNNscore.

CNN affinity

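The CNN’s estimate of binding affinity, reported in pK units (the negative base-10 logarithm of the dissociation constant) rather than kcal/mol. Unlike Vina’s score, higher = stronger predicted binding; a value of 6 corresponds to roughly 1 µM. Use this for compound ranking in virtual screening, but treat it as a relative ranking signal rather than a calibrated prediction of experimental affinity. In SDF output it appears as CNNaffinity.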

Vina score

The traditional Vina scoring function output in kcal/mol (more negative = stronger), included for backward compatibility and comparison. GNINA also uses this during its search process. Useful if you want to compare directly to a Vina run on the same system. In SDF output it appears as minimizedAffinity.

For most use cases: use CNN score to select which pose to report, and CNN affinity to rank compounds in a virtual screening campaign. The Vina score is there for reference — you don’t need to act on it unless you’re running a comparison study.
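As a sketch of that rule in code, the snippet below reads the per-pose data fields from a GNINA SDF file and picks the pose with the highest CNN score. It assumes the fields are named CNNscore, CNNaffinity and minimizedAffinity and that records are separated by the standard SDF delimiter; parse_poses and best_pose are hypothetical helper names.

```python
import re

# Matches an SDF data header line such as ">  <CNNscore>" plus the value line after it.
TAG = re.compile(r"^>[ \t]*<(\w+)>.*\n(.*)$", re.MULTILINE)
SCORE_TAGS = ("CNNscore", "CNNaffinity", "minimizedAffinity")

def parse_poses(sdf_text):
    """Return one dict of scores per pose, in file order."""
    poses = []
    for index, record in enumerate(sdf_text.split("$$$$")):
        fields = dict(TAG.findall(record))
        scores = {tag: float(fields[tag]) for tag in SCORE_TAGS if tag in fields}
        if scores:  # skip the empty chunk after the final delimiter
            scores["pose"] = index + 1
            poses.append(scores)
    return poses

def best_pose(poses):
    """Pose with the highest CNN score, i.e. the one to report."""
    return max(poses, key=lambda p: p["CNNscore"])
```

Typical use would be best_pose(parse_poses(Path("output/gnina_docked.sdf").read_text())), after which the CNNaffinity of that pose is the value to carry into compound ranking.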

Figure: GNINA terminal output. GNINA outputs three score columns per pose. The CNN score (0–1) is the primary quality indicator for pose selection.

When to use GNINA vs Vina

The answer is almost always GNINA for new projects — but there are a few cases where Vina is still the right choice.

Situation	Use Vina	Use GNINA
Learning docking for the first time	Better documentation, larger community	—
Single-ligand docking, accuracy matters	—	Better pose prediction
Virtual screening, GPU available	—	Faster + more accurate
Virtual screening, CPU only, large library	Faster per compound	—
Reproducing a published Vina result	For exact reproducibility	—
Hit validation after initial screening	—	Better confidence in top poses
Difficult targets (induced fit, flexible loops)	—	CNN generalizes better
AlphaFold-predicted structures	—	CNN less sensitive to minor structural errors

The one case where you should definitely not default to GNINA is when you’re learning. Start with Vina — its error messages are clearer, the community troubleshooting resources are more extensive, and the one-score output is simpler to interpret. Once you understand what docking is doing and have a working protocol, switching to GNINA is a five-second change that immediately improves your results.

Using GNINA for virtual screening

GNINA slots into the virtual screening pipeline from the previous tutorial with one change: replace vina with gnina in the subprocess call, and update the score parser to extract the CNN affinity column instead of the Vina affinity.

Update the dock_one function in your run_vs.py script:

Updated parse_score for GNINA SDF output

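import re
from pathlib import Path

# GNINA writes per-pose scores to the SDF file as data fields
# (e.g. CNNscore, CNNaffinity). This helper reads the first occurrence,
# which belongs to the first pose in the file.
def _first_sdf_field(output_sdf, tag):
    """Return the first value of an SDF data field as a float, or None."""
    try:
        text = Path(output_sdf).read_text()
    except OSError:
        return None  # file missing or unreadable, e.g. docking failed
    match = re.search(rf">\s*<{re.escape(tag)}>[^\n]*\n([-\d.]+)", text)
    return float(match.group(1)) if match else None

# Parse CNN affinity from SDF output (pK units: higher is stronger)
def parse_gnina_score(output_sdf):
    return _first_sdf_field(output_sdf, "CNNaffinity")

# Also grab CNN score (0-1) for pose quality filtering
def parse_gnina_cnn_score(output_sdf):
    return _first_sdf_field(output_sdf, "CNNscore")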

Also add a CNN score filter to your hit selection: compounds with a CNN score below 0.5 — even if their CNN affinity score looks good — are flagged as low-confidence poses worth deprioritizing.

The two-filter approach for GNINA virtual screening

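Apply a CNN affinity cutoff first to get your initial hit list, then apply a CNN score filter (≥ 0.6) to remove low-confidence poses within that list. Because CNN affinity is reported in pK units, the cutoff is a minimum rather than a maximum: for example CNN affinity ≥ 6, about 1 µM predicted, as a starting point to tune per target. The aim of the two stages is to keep the compounds predicted to bind strongly while discarding those whose pose the network does not trust.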
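A minimal sketch of the two-stage filter follows. It assumes each screening result is a dict with cnn_affinity and cnn_score keys (hypothetical names for the values parsed from the SDF output), and that CNN affinity is in GNINA's pK units, where higher means stronger predicted binding. The default thresholds are illustrative starting points, not recommended values.

```python
def two_stage_filter(results, min_affinity=6.0, min_cnn_score=0.6):
    """Keep compounds that pass the affinity cutoff, then the pose-confidence cutoff.

    results: list of dicts with 'cnn_affinity' (pK, higher is stronger)
             and 'cnn_score' (0-1 pose confidence).
    """
    # Stage 1: predicted affinity at or above the cutoff.
    stage_one = [r for r in results if r["cnn_affinity"] >= min_affinity]
    # Stage 2: drop hits whose pose the CNN does not trust.
    stage_two = [r for r in stage_one if r["cnn_score"] >= min_cnn_score]
    # Strongest predicted binders first.
    return sorted(stage_two, key=lambda r: r["cnn_affinity"], reverse=True)
```

Keeping the two thresholds as separate parameters makes it easy to check how sensitive the final hit list is to each one.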

The upgrade in one sentence

GNINA is AutoDock Vina with a neural network scoring function trained on a large set of protein-ligand structures derived from the Protein Data Bank: it uses the same workflow, the same input files, and the same config format, and in published benchmarks it produces better poses and better virtual screening enrichment, for free, with a one-word change to your command line.

For new projects, there’s no reason not to use it. For existing Vina workflows, switching takes thirty seconds and the only downside is slightly slower CPU performance. The accuracy improvement is real and well-documented.
