Homology Modeling Tutorial: Swiss-Model and MODELLER for Beginners


Homology modeling has been the workhorse of structural biology for three decades — and despite AlphaFold’s emergence, it remains the method of choice in several important situations. This tutorial explains what it is, when to use it, and how to run it using two of the most widely used tools: Swiss-Model and MODELLER.

What homology modeling is

Homology modeling — also called comparative modeling — predicts the 3D structure of a protein (the target) by using the known experimental structure of a related protein (the template) as a guide. The underlying assumption is evolutionary: proteins that share significant sequence identity almost certainly share a similar fold. If two proteins are 50% identical in sequence, their backbone structures typically superimpose within 1–2 Å.

The method works in three stages. First, find one or more structurally characterized proteins with detectable similarity to your target. Second, align the target sequence to the template sequence, mapping each residue of the target to a position in the template structure. Third, use that alignment to build a 3D model: copy the template backbone, rebuild divergent regions (loops) using sampling algorithms, and optimize side chains and overall geometry by energy minimization.

The quality of the result depends almost entirely on two things: the quality and resolution of the template, and the sequence identity between target and template. Both are checkable before you run a single calculation.
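As a rough illustration of that pre-check, the sketch below computes percent identity and target coverage from a pairwise alignment. The function name and the toy sequences are made up for this example, and identity is counted only over columns where both sequences have a residue, which is one of several conventions in use. Alignment tools may report slightly different numbers.

```python
def identity_and_coverage(target_aln: str, template_aln: str, gap: str = "-"):
    """Return (percent identity, percent coverage) for two aligned sequences.

    Identity is counted over columns where both sequences have a residue.
    Coverage is the share of target residues paired with a template residue.
    """
    if len(target_aln) != len(template_aln):
        raise ValueError("aligned sequences must have equal length")

    target_len = paired = matches = 0
    for t, s in zip(target_aln.upper(), template_aln.upper()):
        if t == gap:
            continue  # insertion in the template: no target residue here
        target_len += 1
        if s != gap:
            paired += 1
            if t == s:
                matches += 1

    if target_len == 0 or paired == 0:
        return 0.0, 0.0
    return 100.0 * matches / paired, 100.0 * paired / target_len


# Toy alignment, invented for illustration only
identity, coverage = identity_and_coverage("MKT-AYIAKQR", "MKSLAY--KQR")
print(f"identity {identity:.1f}%, coverage {coverage:.1f}%")
```

Template resolution is not derivable from the alignment; it comes from the PDB entry itself.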

When to use homology modeling vs AlphaFold

Use AlphaFold when

AlphaFold (via ColabFold or AFDB)

No structural template exists in the PDB
Sequence identity to known structures is below 30%
You need a quick, no-template prediction
The protein is in the AlphaFold Database already
You want confidence scores (pLDDT, PAE)
Large-scale proteome-wide prediction

Use homology modeling when

Swiss-Model or MODELLER

You need a specific conformation (active, ligand-bound)
A high-identity template (>50%) exists in a relevant state
You want explicit control over template selection
Modeling a mutant using the wild-type structure
Comparing with previous homology modeling literature
MHC/antibody modeling with specialized templates

The most practically important case where homology modeling beats AlphaFold is conformation control. AlphaFold returns its own best estimate of the structure from sequence alone, and you cannot directly ask it for a particular functional state. If you want a model in the active conformation, or in the conformation observed when a specific ligand is bound, and a close homolog crystallized in that conformation exists in the PDB, homology modeling with that template lets you build the model in exactly that state.

How sequence identity affects model quality

Sequence identity between target and template is the single most predictive factor of homology model quality. Before choosing a template, check the identity — it determines whether the model will be worth building.

> 50%

Excellent. Backbone typically within about 1–2 Å RMSD of the true structure, an accuracy often compared to that of a medium-resolution crystal structure. Side chains reliable in the core; some variation at surface positions. Suitable for docking, MD, and structure-based drug design.

30–50%

Good. Backbone generally correct; some loop regions may be poorly modeled. Core side chains reliable; surface side chains less so. Useful for most structural analyses with appropriate caveats.

20–30%

Moderate — twilight zone. Template and target may share a fold but alignment is uncertain. Model errors can be large, especially in loops. Validate carefully with multiple templates and quality checks. AlphaFold likely performs better in this range.

< 20%

Unreliable. Below this threshold, sequence similarity is no longer a reliable indicator of structural similarity. Do not build homology models at this identity level — use AlphaFold or abandon template-based approaches.
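The scale above can be turned into a small lookup, shown here as a sketch. The function name is hypothetical, and because the tiers share their boundary values (30% and 50% each appear in two ranges), the assignment of exact boundary values to a tier is an assumption of this example.

```python
def identity_tier(percent_identity: float) -> str:
    """Map target-template sequence identity (in percent) to the tiers above."""
    if not 0.0 <= percent_identity <= 100.0:
        raise ValueError("identity must be between 0 and 100")
    if percent_identity > 50.0:
        return "excellent"
    if percent_identity >= 30.0:   # assumption: exactly 50% falls in this tier
        return "good"
    if percent_identity >= 20.0:   # assumption: exactly 20% counts as twilight zone
        return "moderate (twilight zone)"
    return "unreliable"


for value in (58.0, 35.0, 25.0, 12.0):
    print(value, identity_tier(value))
```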

Running Swiss-Model: step by step

Swiss-Model (swissmodel.expasy.org) is the world’s most widely used homology modeling server. It is fully automated — you provide a sequence, it finds templates, aligns, builds, and evaluates a model in a single submission. No installation, no command line, no coding required.

Step 1

Submit your sequence

Go to swissmodel.expasy.org. Click Start Modelling. Paste your protein sequence in FASTA format into the target sequence field. Optionally enter a project name, something descriptive so you can find the result later. Click Build Model.

Swiss-Model will automatically search the PDB for structural templates, select the best one based on sequence identity and coverage, build the alignment, and generate a model. For a typical protein (200–400 residues), this takes 2–10 minutes. You’ll receive an email when it’s done if you register, or you can keep the browser tab open.

Step 2

Review the template selection

The results page shows the template(s) selected, their PDB IDs, sequence identity, and coverage. This is the first thing to check: is the template appropriate for your question?

Sequence identity — ideally above 40%; check the scale above
Coverage — what fraction of your sequence is covered by the template? Low coverage means large regions will be modeled without a template (loop modeling), which is much less accurate
Template resolution — prefer templates at 2.5 Å or better; lower resolution templates have larger coordinate uncertainties
Template conformation — is the template in the state you need (apo, holo, active, inactive)? Click the PDB ID to check in the RCSB
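If you are comparing several candidate templates, the four checks above can be written down as a simple screen. This is only a sketch: the class, the function, and the example entries (TMPL1, TMPL2) are hypothetical, the identity and resolution cutoffs come from the checklist, and the 80% coverage cutoff is an assumption of this example rather than a Swiss-Model rule.

```python
from dataclasses import dataclass


@dataclass
class TemplateCandidate:
    pdb_id: str
    identity: float    # percent sequence identity to the target
    coverage: float    # percent of target residues covered by the template
    resolution: float  # in Å; lower is better
    state: str         # e.g. "apo", "holo", "active", "inactive"


def template_warnings(t, wanted_state, min_identity=40.0,
                      min_coverage=80.0, max_resolution=2.5):
    """Return a list of human-readable concerns for one candidate template."""
    warnings = []
    if t.identity < min_identity:
        warnings.append(f"identity {t.identity:.0f}% is below {min_identity:.0f}%")
    if t.coverage < min_coverage:  # cutoff is an assumption of this sketch
        warnings.append(f"coverage {t.coverage:.0f}% is below {min_coverage:.0f}%")
    if t.resolution > max_resolution:
        warnings.append(f"resolution {t.resolution:.1f} Å is worse than {max_resolution:.1f} Å")
    if t.state != wanted_state:
        warnings.append(f"template is {t.state}, but {wanted_state} is needed")
    return warnings


# Hypothetical candidates, not real PDB entries
good = TemplateCandidate("TMPL1", identity=58.0, coverage=94.0, resolution=2.1, state="holo")
weak = TemplateCandidate("TMPL2", identity=32.0, coverage=61.0, resolution=3.2, state="apo")

for candidate in (good, weak):
    print(candidate.pdb_id, template_warnings(candidate, wanted_state="holo") or "no concerns")
```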

If the auto-selected template is not appropriate, use the Template Selection tab to search for and manually specify a better template. This is one of Swiss-Model’s key advantages over fully automated pipelines: you can override the template choice.

Step 3

Evaluate the model quality

Swiss-Model automatically evaluates the model in two ways: a QMEAN quality score and a Ramachandran plot. Both appear on the results page alongside a 3D viewer.

The QMEAN Z-score compares your model to experimentally determined structures of similar size. A score near 0 means the model quality is consistent with real crystal structures of that size. Scores below −4 indicate poor model quality and should prompt investigation — wrong template, poor alignment, or a difficult loop region.

The Ramachandran plot shows backbone dihedral angle distributions. In a good model, over 95% of residues should fall in the favored regions (dark blue area). Residues in outlier regions (red) have strained geometry and may indicate modeling errors at those positions.
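For readers who want to see what sits behind each point on the plot, here is a minimal sketch of the dihedral-angle calculation. Phi is the dihedral over the atoms C(i-1), N(i), CA(i), C(i), and psi is the dihedral over N(i), CA(i), C(i), N(i+1). The function name and the test coordinates are made up for illustration; deciding whether an angle pair falls in a favored region requires reference distributions, which validation tools supply and this sketch does not.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees defined by four points (x, y, z)."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = (b1[0] / norm, b1[1] / norm, b1[2] / norm)
    # Project b0 and b2 onto the plane perpendicular to the central bond
    d0, d2 = _dot(b0, b1), _dot(b2, b1)
    v = (b0[0] - d0 * b1[0], b0[1] - d0 * b1[1], b0[2] - d0 * b1[2])
    w = (b2[0] - d2 * b1[0], b2[1] - d2 * b1[1], b2[2] - d2 * b1[2])
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


# Idealized geometry, for illustration only
angle = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
print(f"{angle:.1f} degrees")
```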

Download the model PDB file from the Download section for use in docking, MD simulation, or further analysis.

Running MODELLER: step by step

MODELLER is a command-line program developed at UCSF that gives you full control over every step of the homology modeling process. It’s more complex than Swiss-Model but offers capabilities Swiss-Model doesn’t: multi-template modeling, loop refinement protocols, and model optimization you can tune for your specific application. It’s free for academic use.

Install MODELLER via conda

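The simplest installation is via conda, using the salilab channel: conda install -c salilab modeller. You’ll need to register for a free academic license key at salilab.org/modeller and supply it to the installation, for example by setting the KEY_MODELLER environment variable before running the install command. The conda package handles the rest.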

The core MODELLER workflow requires three things: your target sequence in FASTA format, the template PDB file, and an alignment file in PIR format mapping target residues to template residues. Swiss-Model (or HHpred for more sensitive alignments) can generate this alignment for you.
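To make the PIR requirement concrete, the sketch below writes a two-entry alignment file by hand. The layout (a ">P1;code" line, a header line of ten colon-separated fields, then the sequence terminated by "*") follows the MODELLER alignment format as commonly documented, but treat the details as an assumption and check them against the MODELLER manual. The helper name, the toy sequences, and the residue range are invented for illustration; 4XYZ and target_protein are the placeholder names used in this tutorial.

```python
def pir_record(code: str, header_fields: list, aligned_seq: str, width: int = 75) -> str:
    """Format one PIR entry: '>P1;code', a 10-field header, sequence ending in '*'."""
    if len(header_fields) != 10:
        raise ValueError("the PIR header line needs exactly 10 fields")
    seq = aligned_seq + "*"
    lines = [seq[i:i + width] for i in range(0, len(seq), width)]
    return "\n".join([f">P1;{code}", ":".join(header_fields)] + lines) + "\n"


# Template entry: type, file code, first residue, chain, last residue, chain,
# then protein name, source, resolution, R-factor (left empty in this sketch)
template = pir_record(
    "4XYZ",
    ["structureX", "4XYZ", "1", "A", "9", "A", "", "", "", ""],
    "MKSLAY--KQR",
)

# Target entry: only the type and the code are filled in
target = pir_record(
    "target_protein",
    ["sequence", "target_protein", "", "", "", "", "", "", "", ""],
    "MKT-AYIAKQR",
)

with open("alignment.ali", "w") as handle:
    handle.write(template + "\n" + target)

print(template + "\n" + target)
```

The entry codes must match the knowns and sequence arguments passed to MODELLER, and both aligned sequences must have the same length including gaps.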

MODELLER — core script

Build a model with MODELLER Python API

Save this script as model.py in your working directory alongside the template PDB and alignment file:

model.py

# Basic MODELLER homology modeling script
# Replace TARGET, TEMPLATE, and ALIGNMENT values with yours
from modeller import *
from modeller.automodel import *

env = Environ()
env.io.atom_files_directory = ['.', '../templates']

# AutoModel: standard homology modeling
a = AutoModel(
    env,
    alnfile  = 'alignment.ali',   # PIR alignment file
    knowns   = '4XYZ',            # template code; must match the alignment entry and PDB file name
    sequence = 'target_protein',  # target sequence name in alignment
    assess_methods = (assess.DOPE,
                      assess.GA341)
)

# Generate 5 models — more models = better sampling
a.starting_model = 1
a.ending_model   = 5

a.make()

# Select best model by DOPE score (most negative = best)
ok_models = [m for m in a.outputs if m['failure'] is None]
key = 'DOPE score'
ok_models.sort(key=lambda model: model[key])
m = ok_models[0]
print(f"Best model: {m['name']} — DOPE: {m[key]:.3f}")

Terminal

python model.py 2>&1 | tee modeller.log

MODELLER generates the requested number of models (5 in the script above), calculates DOPE scores for each, and outputs them to the terminal. The best model is the one with the lowest (most negative) DOPE score.

The alignment is the most critical input

MODELLER builds exactly what the alignment tells it to. A misaligned residue produces a modeling error at that position — and MODELLER won’t warn you. Always inspect your alignment visually before running MODELLER, particularly around insertions and deletions. Use HHpred (toolkit.tuebingen.mpg.de/tools/hhpred) for sensitive alignments when sequence identity is below 40%; it’s significantly more accurate than standard BLAST for structurally similar but sequentially divergent proteins.
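One cheap aid to that visual inspection is to list where the insertions and deletions fall, so you know which columns to look at first. The sketch below does this for a single aligned sequence; the function name and the toy sequences are hypothetical, and it only locates gaps, it does not judge whether they are placed sensibly.

```python
def gap_runs(aligned_seq: str, gap: str = "-"):
    """Return (start, end) alignment columns, 1-based and inclusive, of each gap run."""
    runs = []
    start = None
    for col, ch in enumerate(aligned_seq, start=1):
        if ch == gap:
            if start is None:
                start = col          # a new gap run begins here
        elif start is not None:
            runs.append((start, col - 1))
            start = None
    if start is not None:            # gap run extends to the end of the alignment
        runs.append((start, len(aligned_seq)))
    return runs


# Toy alignment, invented for illustration only
target_gaps = gap_runs("MKT-AYIAKQR")
template_gaps = gap_runs("MKSLAY--KQR")
print("gaps in target (template has extra residues):", target_gaps)
print("gaps in template (target residues without coordinates):", template_gaps)
```

Gaps in the template row mark target residues that MODELLER must build without template coordinates, so those columns deserve the closest look.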

Evaluating model quality: DOPE, QMEAN and MolProbity

No homology model should be used for research without a quality assessment. Three complementary metrics cover different aspects of model quality:

DOPE Score

Discrete Optimized Protein Energy

A statistical energy function that scores how “protein-like” the modeled structure is. Calculated per residue, so you can identify which specific regions are poorly modeled. Used by MODELLER to rank multiple models.

More negative = better. No absolute threshold — compare models against each other. A DOPE profile plot identifies problem loops as local peaks.
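Once you have a per-residue profile as a plain list of numbers, locating the local peaks is straightforward. The sketch below smooths the profile with a moving average and reports positions that are higher than both neighbors. The function names and the profile values are invented for illustration, and the window size is an arbitrary choice; extracting the profile from MODELLER itself is not shown here.

```python
def smooth(values, window=3):
    """Centered moving average; the window is truncated at both ends."""
    half = window // 2
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - half): i + half + 1]
        out.append(sum(chunk) / len(chunk))
    return out


def local_peaks(values):
    """Return 1-based positions whose value is strictly higher than both neighbors."""
    return [i + 1 for i in range(1, len(values) - 1)
            if values[i] > values[i - 1] and values[i] > values[i + 1]]


# Hypothetical per-residue energies; less negative means less favorable
profile = [-0.05, -0.05, -0.04, -0.01, -0.04, -0.05, -0.05]
peaks = local_peaks(smooth(profile, window=3))
print("residues to inspect:", peaks)
```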

QMEAN Z-score

Qualitative Model Energy ANalysis

Compares the model against a reference set of experimentally determined structures of similar size. Provides a global quality estimate normalized to what is expected for a real structure of that size.

Near 0 = good. Below −2: investigate. Below −4: significant quality problems. Available from the Swiss-Model server.

After DOPE and QMEAN, run the model through MolProbity (molprobity.biochem.duke.edu) for stereochemical validation. MolProbity checks Ramachandran statistics, rotamer quality, bond lengths and angles, and steric clashes. A publication-quality homology model should have:

Ramachandran favored > 95%, outliers < 0.5%
Rotamer outliers < 1%
MolProbity clashscore below 20 (ideally below 10)
MolProbity score below 2.0
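The checklist above can be applied mechanically once you have copied the numbers out of a MolProbity report. In this sketch the dictionary keys, the function name, and the example values are hypothetical; it does not parse MolProbity output, and the thresholds are simply the ones listed above.

```python
# (comparison, limit) pairs taken from the checklist above
THRESHOLDS = {
    "rama_favored": (">", 95.0),      # percent
    "rama_outliers": ("<", 0.5),      # percent
    "rotamer_outliers": ("<", 1.0),   # percent
    "clashscore": ("<", 20.0),
    "molprobity_score": ("<", 2.0),
}


def failed_criteria(metrics: dict) -> list:
    """Return the names of criteria that are missing or not satisfied."""
    failed = []
    for name, (op, limit) in THRESHOLDS.items():
        value = metrics.get(name)
        if value is None:
            failed.append(f"{name}: not reported")
        elif (op == ">" and not value > limit) or (op == "<" and not value < limit):
            failed.append(f"{name}: {value} (needs {op} {limit})")
    return failed


# Hypothetical report values, typed in by hand
report = {
    "rama_favored": 97.3,
    "rama_outliers": 0.2,
    "rotamer_outliers": 0.8,
    "clashscore": 7.2,
    "molprobity_score": 1.6,
}
problems = failed_criteria(report)
print(problems or "all listed criteria met")
```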

Also try ProSA for Z-score visualization

ProSA (prosa.services.came.sbg.ac.at) is a free web server that calculates its own knowledge-based Z-score (distinct from the QMEAN Z-score) and plots energy per residue, making it easy to visually identify problematic loop regions that warrant refinement or closer scrutiny before use.

What to report in a methods section

A complete homology modeling methods section should include: the target sequence source (UniProt accession), template selection method (automated Swiss-Model vs manual), the selected template PDB ID and chain, sequence identity and coverage to the target, how many models were generated, which model was selected and by what criterion (DOPE score, QMEAN), and the stereochemical validation results from MolProbity.

Example methods statement

“A homology model of [protein] was generated using MODELLER 10.4 with the crystal structure of [homolog] (PDB: 4XYZ, chain A, 2.1 Å resolution) as template, selected based on 58% sequence identity and 94% coverage of the target sequence. Five models were generated and ranked by DOPE score; the best-scoring model was selected for further analysis. Model quality was assessed using QMEAN (Z-score = −0.8) and MolProbity (clashscore 7.2, Ramachandran favored 97.3%, outliers 0.2%).”

Homology modeling in one paragraph

Homology modeling builds a 3D protein model by copying the backbone from a structurally characterized relative and optimizing the sequence-specific differences. It works best when sequence identity to the template exceeds 30%, and it remains the preferred method when you need a model in a specific conformation — active, ligand-bound, or mutant — that AlphaFold cannot access from sequence alone. Swiss-Model is the fastest entry point: paste your sequence, get a model and quality scores in minutes. MODELLER gives you full control for publication-quality work. Always validate with DOPE, QMEAN, and MolProbity before using any model for docking or MD simulation.
