Protein-Ligand Molecular Dynamics Simulation in GROMACS: Parameterize, Run and Analyze

Protein-Ligand Molecular Dynamics Simulation in GROMACS: Parameterize, Run and Analyze

Protein-only MD is the foundation. Protein-ligand MD is where it gets useful for drug discovery — validating docking results, characterizing binding stability, and calculating binding free energies. The extra complexity is almost entirely in Step 1: getting force field parameters for the small molecule. This tutorial covers the whole pipeline.

Step 1
Parameterize ligand
Step 2
Prep protein
Step 3
Merge topologies
Step 4
Solvate + ions
Step 5
Equilibrate + run
Step 6
Analyze

What’s different about protein-ligand MD

The GROMACS workflow you learned in the main tutorial covers protein-only simulations well. Protein-ligand MD adds one major complication: the ligand needs its own force field parameters.

Standard protein force fields (AMBER, CHARMM, GROMOS) define parameters for the 20 canonical amino acids and common modifications. They say nothing about an arbitrary drug molecule. Before you can simulate a protein-ligand complex, you need bond lengths, angles, torsional parameters, and partial charges for every atom in your ligand — and these must be compatible with your protein force field.

This tutorial uses the AMBER/GAFF2 combination: AMBER ff19SB for the protein, GAFF2 for the ligand. The parameterization tool is ACPYPE (Antechamber Python Parser Interface), which runs Antechamber and GAFF2 internally and outputs GROMACS-compatible files.

Working example
This tutorial uses COX-2 (PDB: 5KIR) with ibuprofen as the ligand — the same system from the docking tutorial series. If you’ve already downloaded and prepared these files from those tutorials, you can reuse them here.

Step 1 — Parameterize the ligand with ACPYPE

1
Step 1 — The critical step
Generate GAFF2 parameters for your ligand

Install ACPYPE via conda if you don’t have it:

Terminal
conda activate md
conda install -c conda-forge acpype ambertools -y

You need a 3D ligand structure as input. Download ibuprofen from PubChem (CID 3672) as a 3D SDF file, then convert to MOL2 or PDB format using Open Babel:

Terminal
# Download ibuprofen 3D structure from PubChem
wget "https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/CID/3672/SDF?record_type=3d" \
  -O ibuprofen.sdf

# Convert to PDB with correct protonation at pH 7.4
obabel ibuprofen.sdf -O ibuprofen.pdb -p 7.4 --gen3d

Now run ACPYPE to generate GROMACS-compatible GAFF2 parameters. The -n flag sets the net charge — check this carefully for your ligand (ibuprofen is neutral at most simulation conditions):

Terminal
acpype \
  -i ibuprofen.pdb \
  -b IBP \
  -n 0 \
  -a gaff2 \
  -c bcc

# -b IBP = residue name (3 chars) — avoid MOL, LIG — use something unique
# -n 0   = net charge (0 = neutral)
# -a gaff2 = use GAFF2 force field
# -c bcc  = AM1-BCC charge method (standard for drug-like molecules)

ACPYPE creates a directory called IBP.acpype/. The key files you need:

IBP.acpype/
├── IBP_GMX.itp ← ligand topology (force field parameters)
├── IBP_GMX.gro ← ligand coordinates in GROMACS format
└── posre_IBP.itp ← position restraints for equilibration
Check the net charge carefully
AM1-BCC charges are calculated on the input geometry. If your ligand has an unusual protonation state or a formal charge (e.g. a carboxylate that should be deprotonated at pH 7.4), verify the net charge before running. A system with non-zero charge that you think is neutral will give wrong electrostatics. Use grep "qtot" IBP.acpype/IBP_GMX.itp | tail -5 to check the cumulative charge sum in the generated ITP.

Step 2 — Prepare the protein

2
Step 2
Prepare the protein receptor

Protein preparation for a complex follows the same steps as protein-only MD, with one important difference: you keep the docked ligand coordinates from your docking run rather than removing it and re-docking it.

Start from your best docked pose (from AutoDock Vina or GNINA). Extract the protein and ligand into separate PDB files:

PyMOL command line
# Load docked complex
load 5KIR_receptor.pdbqt, receptor
load docked_pose1.pdbqt, ligand

# Save protein only (no waters, no ligand)
save protein_only.pdb, receptor

# Save ligand only
save ligand_pose.pdb, ligand

Generate the protein topology with AMBER ff19SB:

Terminal
gmx pdb2gmx \
  -f protein_only.pdb \
  -o protein.gro \
  -p topol.top \
  -water tip3p \
  -ff amber99sb-ildn \
  -ignh
Force field must match ligand parameters
This tutorial uses AMBER99SB-ILDN (close to ff19SB and widely available in GROMACS) with GAFF2 ligand parameters. If you use CHARMM36 for the protein, you need CGenFF parameters for the ligand instead of GAFF2. Never mix AMBER protein parameters with CGenFF ligand parameters or vice versa — the charge models and van der Waals parameters are not compatible.

Step 3 — Combine protein and ligand into one system

3
Step 3 — The topology merge
Combine protein and ligand coordinates and topologies

This is the most manual step in the workflow. You need to merge two coordinate files into one GRO file, and edit the topology to include the ligand force field.

Protein
protein.gro + topol.top
AMBER ff19SB parameters
+
Ligand
IBP_GMX.gro + IBP_GMX.itp
GAFF2 parameters
Complex
complex.gro + topol.top
Ready for solvation

Merge the coordinate files

Terminal
# Get atom counts from each file
head -2 protein.gro          # line 2 = atom count
head -2 IBP.acpype/IBP_GMX.gro

# Combine: take protein atoms + ligand atoms, update total count + box
# Python script to merge GRO files reliably:
python3 - << 'EOF'
with open('protein.gro') as f:
    plines = f.readlines()

with open('IBP.acpype/IBP_GMX.gro') as f:
    llines = f.readlines()

# Atom lines: protein[2:-1], ligand[2:-1]
p_atoms = plines[2:-1]
l_atoms = llines[2:-1]

total = len(p_atoms) + len(l_atoms)
box_line = plines[-1]  # use protein box line (will resize in editconf)

with open('complex.gro', 'w') as f:
    f.write('Complex protein-ligand system\n')
    f.write(f' {total}\n')
    f.writelines(p_atoms)
    f.writelines(l_atoms)
    f.write(box_line)

print(f'complex.gro written: {total} atoms')
EOF

Edit the topology to include the ligand

Open topol.top in a text editor. You need to make three additions:

Addition 1: After the last #include line at the top of the file, add a line to include the ligand force field:

topol.top — add after existing includes
; Ligand force field parameters
#include "IBP.acpype/IBP_GMX.itp"
#include "IBP.acpype/posre_IBP.itp"

Addition 2: At the end of the [ molecules ] section, add the ligand molecule. The name must exactly match the residue name in your ACPYPE output:

topol.top — [ molecules ] section
[ molecules ]
; Compound        #mols
Protein_chain_A      1
IBP                  1    ← add this line
Molecule order must match GRO file order
The order of molecules in the [ molecules ] section of topol.top must exactly match the order atoms appear in the GRO file. If protein atoms come first in complex.gro, Protein must come first in [ molecules ]. If the order doesn’t match, GROMACS will exit with a topology error during grompp. This is the most common source of errors in this step.

Step 4 — Solvate, add ions, and minimize

4
Step 4
Solvate the complex and minimize

From here, the workflow is essentially identical to protein-only MD. Define the box, solvate, add ions, and minimize:

Terminal
# Define simulation box — same as protein-only
gmx editconf \
  -f complex.gro \
  -o box.gro \
  -c -d 1.0 -bt dodecahedron

# Solvate
gmx solvate \
  -cp box.gro \
  -cs tip3p.gro \
  -o solvated.gro \
  -p topol.top

# Add ions
gmx grompp -f ions.mdp -c solvated.gro -p topol.top -o ions.tpr -maxwarn 1
echo "SOL" | gmx genion \
  -s ions.tpr \
  -o system.gro \
  -p topol.top \
  -pname NA -nname CL \
  -neutral -conc 0.15

# Energy minimization
gmx grompp -f minim.mdp -c system.gro -p topol.top -o em.tpr
gmx mdrun -v -deffnm em

If you see an error during grompp like Inconsistent box type or Fatal error: number of coordinates in topology does not match, the most likely cause is a mismatch between the atom count in complex.gro and the molecule list in topol.top. Recheck that the merged GRO has the correct total atom count on line 2.

Step 5 — Equilibrate and run production MD

5
Step 5
Equilibrate and run — with ligand position restraints

The equilibration MDP files are nearly identical to protein-only MD, with one change: include position restraints on both the protein AND the ligand during NVT and NPT equilibration. Add the ligand restraint include to your MDP file:

nvt.mdp and npt.mdp — add this define line
; Restrain both protein and ligand during equilibration
define = -DPOSRES -DPOSRES_IBP

The -DPOSRES_IBP flag activates the ligand position restraints from posre_IBP.itp. Without this, the ligand may drift out of the binding site during the early violent phase of heating before the system has equilibrated around it.

Terminal
# NVT — temperature equilibration
gmx grompp -f nvt.mdp -c em.gro -r em.gro -p topol.top -o nvt.tpr
gmx mdrun -v -deffnm nvt

# NPT — pressure equilibration
gmx grompp -f npt.mdp -c nvt.gro -r nvt.gro -t nvt.cpt -p topol.top -o npt.tpr
gmx mdrun -v -deffnm npt

# Production MD — remove all restraints
# Use same md.mdp as protein-only tutorial (no define = -DPOSRES)
gmx grompp -f md.mdp -c npt.gro -t npt.cpt -p topol.top -o md.tpr
gmx mdrun -v -deffnm md

Step 6 — Analyze binding stability

6
Step 6
Characterize binding stability from the trajectory

After centering the trajectory (gmx trjconv -pbc mol -center, selecting Protein_IBP as center and System as output), run these three key analyses:

Ligand RMSD
gmx rms
How much the ligand moves relative to its starting docked position. A stable binder stays below 0.2 nm. Above 0.5 nm means it has essentially left the binding site.
H-bond occupancy
gmx hbond
Which hydrogen bonds between ligand and protein residues are present and for what fraction of the simulation. Core contacts should have >80% occupancy.
Binding contacts
gmx mindist
Minimum distance between the ligand and each key binding site residue over time. Identifies which contacts are maintained and which are transient.

Ligand RMSD

Terminal
# Fit to protein backbone, measure ligand displacement
gmx rms \
  -s md.tpr \
  -f md_centered.xtc \
  -o rmsd_ligand.xvg \
  -tu ns
# Fit group: Backbone
# RMSD group: IBP (or your ligand group name)

Protein-ligand hydrogen bonds

Terminal
gmx hbond \
  -s md.tpr \
  -f md_centered.xtc \
  -num hbond_pl.xvg \
  -hbm hbmap.xpm
# Group 1: Protein
# Group 2: IBP

# Convert XPM to readable format
gmx xpm2ps -f hbmap.xpm -o hbmap.eps -by 1

Key residue contact distances

Terminal
# Create index group for key residues (e.g. ARG120, TYR355 for COX-2)
gmx make_ndx -f md.tpr -o contacts.ndx
# In the prompt: create groups for specific residues
# "r 120" selects residue 120, then name the group

# Measure minimum distance between ligand and each key residue
gmx mindist \
  -s md.tpr \
  -f md_centered.xtc \
  -n contacts.ndx \
  -od mindist_arg120.xvg \
  -tu ns

Common problems and fixes

These errors appear in the vast majority of protein-ligand MD setups:

  • Fatal error: atomtype not found in parameters — The ligand ITP uses atom types not in your force field. Check that you included the GAFF2 atom type definitions from ACPYPE. Add #include "IBP.acpype/IBP_GMX.itp" before the protein include, not after.
  • Fatal error: number of coordinates in topology (N) does not match coordinates in coordinate file (M) — The atom count in complex.gro (line 2) doesn’t match the sum of atoms across all molecules in [ molecules ]. Recount — protein atoms + ligand atoms must equal the GRO header number.
  • Ligand RMSD increases monotonically and exceeds 0.5 nm — The ligand has left the binding site. Check: (1) were ligand position restraints active during both NVT and NPT equilibration? (2) Is the ligand net charge correct? A charged ligand in a neutral site will be repelled. (3) Was the docked pose actually inside the binding site — visualize the starting complex in PyMOL before running.
  • ACPYPE fails: cannot determine atom types — The input geometry has problems. Make sure you used a proper 3D structure (not a 2D SMILES-derived flat structure). Run obabel -i sdf ibuprofen.sdf -o pdb -O ibuprofen.pdb --gen3d to ensure proper 3D geometry is generated.
  • ACPYPE ran successfully — IBP_GMX.itp and IBP_GMX.gro generated
  • Ligand net charge verified — matches expected value from structure
  • complex.gro atom count on line 2 matches protein + ligand atoms
  • IBP_GMX.itp included in topol.top before protein topology
  • IBP appears in [ molecules ] section of topol.top
  • Both -DPOSRES and -DPOSRES_IBP active during NVT and NPT
  • Ligand RMSD plateaus below 0.3 nm in production — ligand stayed bound

The key difference in one sentence

Protein-ligand MD is protein-only MD with one extra step at the beginning — parameterizing the ligand with ACPYPE — and one extra care item throughout: ensuring the force field for the ligand is compatible with the force field for the protein. Get those two things right and the rest of the pipeline is identical. The result is a trajectory that tells you whether the docked pose is stable, which specific contacts maintain binding, and — when combined with MM-GBSA — how tightly the ligand is predicted to bind.

Last updated on

Similar Posts

Leave a Reply

Your email address will not be published. Required fields are marked *