How Does Molecular Dynamics Simulation Work? Force Fields, Timesteps & Thermostats Explained

How Does Molecular Dynamics Simulation Work? Force Fields, Timesteps & Thermostats Explained

Understanding how MD simulation works under the hood isn’t just academic — it determines which settings you choose, what can go wrong, and how to interpret your results. This article explains every core component clearly, without assuming a physics background.

The core simulation loop

At its heart, an MD simulation is a loop — a calculation that runs millions of times. Each pass through the loop represents one timestep of physical time. Here is what happens in every iteration:

  1. Calculate the force on every atom from every neighboring atom using the force field
  2. Use that force to compute the new velocity and position of each atom using Newton’s second law
  3. Save the new coordinates to the trajectory file at the specified output frequency
  4. Advance the simulation clock by one timestep and repeat

For a protein system of 100,000 atoms simulated for 100 nanoseconds at a 2-femtosecond timestep, this loop runs 50 million times. The entire trajectory — every saved snapshot — is the raw data you’ll analyze. Everything else in MD simulation setup is in service of making this loop run correctly, stably, and in conditions that represent real biology.

Force fields: the rulebook for atomic interactions

A force field is the set of mathematical equations and parameters that describes how atoms in the system interact with each other. Given the positions of all atoms, it computes the potential energy of the system — and the force on each atom is the negative gradient of that energy with respect to position. Atoms move to lower their potential energy, just as a ball rolls downhill.

Force field energy terms fall into two categories: bonded and non-bonded.

Bonded terms
Interactions within connected atoms
Describe energy changes as bonds stretch and compress, bond angles bend, and dihedral angles rotate. These terms keep covalent geometry correct — bonds at their equilibrium lengths, angles near their optimal values.
Bond stretch · Angle bend · Dihedral torsion · Improper torsion
Non-bonded terms
Interactions between non-connected atoms
Describe electrostatic interactions (attraction between opposite charges, repulsion between like charges) and van der Waals interactions (short-range attraction at optimal distance, steep repulsion at close range). These dominate binding free energies.
Coulomb electrostatics · Lennard-Jones (van der Waals)

The major biomolecular force fields — AMBER, CHARMM, GROMOS, and OPLS-AA — all use this same general framework but differ in their parameterization: the specific values for equilibrium bond lengths, force constants, partial charges, and van der Waals radii. These parameters are derived from a combination of quantum mechanical calculations and fitting to experimental data. No force field is perfect, and the choice of force field is one of the most consequential decisions in setting up a simulation.

Force field and water model must match
Every force field is designed to work with a specific water model. AMBER force fields are typically paired with TIP3P water; CHARMM force fields with TIP3P or CHARMM-specific TIP3P variants. Using a mismatched combination — say, CHARMM36 protein parameters with AMBER-parameterized water — introduces systematic errors that are difficult to detect but can meaningfully affect your results.

Newton’s second law: turning forces into motion

Once forces are calculated, MD uses Newton’s second law of motion to determine how each atom moves. The relationship is straightforward:

F = m · a
F
Force
Calculated by the force field from current atomic positions (units: kJ/mol/nm)
m
Mass
Atomic mass — heavier atoms accelerate more slowly than lighter ones
a
Acceleration
Change in velocity per unit time — used to update position at the next timestep

From the acceleration, the integrator algorithm (usually the leap-frog or Verlet scheme) calculates the new velocity and then the new position for each atom. This is purely classical mechanics — MD does not account for quantum effects like zero-point energy, tunneling, or electronic structure. For the overwhelming majority of structural biology questions this is an entirely adequate approximation, but it’s a fundamental limitation to keep in mind.

The timestep: why 2 femtoseconds?

The timestep is the amount of simulated time that advances with each iteration. It must be small enough that the numerical integration remains stable — if it’s too large, atoms can overshoot their equilibrium positions, forces become enormous, and the simulation either crashes with an error or, worse, produces physically nonsensical trajectories without crashing.

The practical upper limit on the timestep is set by the fastest atomic motions in the system. In biomolecular simulations, the fastest motion is the vibration of bonds involving hydrogen — specifically X-H bonds (where X is carbon, nitrogen, or oxygen), which vibrate with a period of roughly 10 femtoseconds. The Nyquist sampling criterion requires the timestep to be substantially smaller than the period of the fastest motion being simulated.

The standard solution is to constrain X-H bonds — fix their lengths using algorithms like LINCS or SHAKE, effectively removing those fast vibrations from the dynamics. With X-H bonds constrained, the fastest remaining motions are slower heavy-atom vibrations, and a 2 femtosecond timestep is stable. Without constraints, 1 femtosecond or less is needed.

The “simulation exploded” error
If your simulation crashes with an error like Fatal error: Stepsize too small or velocities become nan, the most common cause is a timestep that’s too large for the forces present — usually because of a steric clash in the starting structure that wasn’t resolved during energy minimization. The fix is almost always to go back and run energy minimization more thoroughly before starting dynamics.

The simulation box

MD simulations don’t run in infinite space — the system is placed in a finite simulation box, typically containing the protein, water molecules, and ions. Box choice involves two decisions: shape and size.

Shape: The most common box shapes for protein simulations are the cubic box and the truncated octahedron. The truncated octahedron is more efficient — for a given minimum protein-to-edge distance, it contains about 71% of the water molecules a cubic box would require, reducing computational cost without affecting the physics. For membrane simulations, rectangular (orthorhombic) boxes are standard to accommodate the flat bilayer geometry.

Size: The box must be large enough that the protein never comes within interaction distance of its own periodic image — a minimum of 10 Å from any protein atom to the box edge is the conventional standard, corresponding to the typical non-bonded cutoff distance. Violating this creates artificial self-interactions that corrupt the simulation.

Periodic boundary conditions

Placing a protein in a finite water box creates an artificial problem: water molecules at the box edge would experience a vacuum interface — nothing like the bulk solvent environment of a real cell. Periodic boundary conditions (PBC) solve this by making the simulation box tile infinitely in all directions. When an atom exits the box through one face, it re-enters from the opposite face. The protein effectively exists in an infinite, periodic medium.

Periodic boundary conditions — 2D representation
image
image
image
image
REAL BOX
image
image
image
image
The real simulation box is surrounded by identical periodic images in all directions. Atoms exiting one face instantly re-enter from the opposite face — the system has no edges and no vacuum interface.

PBC introduces its own artifact: the minimum image convention. When calculating non-bonded interactions, each atom interacts only with the nearest image of every other atom — either in the real box or one of the periodic copies. This is why the box must be large enough that no atom interacts with the periodic image of itself or its covalently bonded neighbors.

Wrapping and centering the trajectory
Because of PBC, molecules in raw trajectory files often appear to “jump” across the box boundary or become fragmented as parts exit one face and re-enter from the opposite. Before analysis, trajectories must be “unwrapped” and the protein centered in the box. GROMACS provides gmx trjconv for this — it’s one of the first post-processing steps and skipping it produces visual artifacts and incorrect RMSD calculations.

Thermostats and barostats

Uncontrolled Newtonian dynamics conserves total energy, not temperature — the temperature of the system fluctuates as kinetic and potential energy exchange. Real biological experiments happen at constant temperature and pressure, so MD needs coupling algorithms to maintain these conditions.

A thermostat controls temperature by rescaling atomic velocities or coupling the system to a heat bath. A barostat controls pressure by periodically adjusting the simulation box volume. The choice of algorithm affects both accuracy and which thermodynamic ensemble the simulation represents.

AlgorithmTypeRecommended forNotes
V-rescale Thermostat Equilibration Production Stochastic velocity rescaling — correct canonical ensemble, good for general use
Nosé-Hoover Thermostat Production Rigorous NVT ensemble — preferred for production runs where thermodynamic accuracy matters
Berendsen Thermostat + Barostat Equilibration only Fast equilibration but incorrect ensemble — never use for production or free energy calculations
Parrinello-Rahman Barostat Production Correct NPT ensemble — the standard barostat for production MD in GROMACS
Monte Carlo Barostat Production Used in OpenMM; correct NPT ensemble; efficient for systems with slow pressure fluctuations

The Berendsen thermostat/barostat deserves special mention because it’s one of the most common sources of subtle errors in published MD studies. It equilibrates systems quickly and stably — which is why it’s appropriate for equilibration — but it artificially suppresses fluctuations and does not produce a correct thermodynamic ensemble. Using it for production runs gives the wrong heat capacity, compressibility, and free energy — errors that are invisible unless you’re specifically checking for them.

Ensembles: NVT vs NPT

A thermodynamic ensemble defines which quantities are held constant during the simulation. The two ensembles you’ll use in every MD workflow are:

NVT
Canonical ensemble
Constant Number of particles, Volume, and Temperature. Box size is fixed; temperature is controlled by a thermostat. Pressure is not controlled and will fluctuate.
→ Use for: temperature equilibration
NPT
Isothermal-isobaric ensemble
Constant Number of particles, Pressure, and Temperature. Both temperature and pressure are controlled. Box volume fluctuates to maintain target pressure. System density equilibrates.
→ Use for: density equilibration and all production runs

The standard MD workflow runs NVT first — to equilibrate temperature — then switches to NPT — to equilibrate density before production. This two-stage approach ensures the system reaches a stable thermodynamic state before any data collection begins. Running production MD in NVT (fixed volume) without prior NPT equilibration means your system density may not match real physiological conditions — a subtle but meaningful error.

The mechanism in one paragraph

MD simulation works by repeatedly calculating forces on every atom using a force field, applying Newton’s second law to update positions and velocities, and advancing the clock by a tiny timestep. The protein is placed in a periodic water box so there are no artificial edges. A thermostat and barostat keep temperature and pressure at physiological values. The choice of force field, timestep, box size, coupling algorithms, and ensemble all affect the quality of the simulation — and understanding what each does is what allows you to set them correctly and interpret what goes wrong when they don’t.

Last updated on

Similar Posts

Leave a Reply

Your email address will not be published. Required fields are marked *