Homology Modeling Tutorial: Swiss-Model and MODELLER for Beginners
Homology modeling has been the workhorse of structural biology for three decades — and despite AlphaFold’s emergence, it remains the method of choice in several important situations. This tutorial explains what it is, when to use it, and how to run it using two of the most widely used tools: Swiss-Model and MODELLER.
What homology modeling is
Homology modeling — also called comparative modeling — predicts the 3D structure of a protein (the target) by using the known experimental structure of a related protein (the template) as a guide. The underlying assumption is evolutionary: proteins that share significant sequence identity almost certainly share a similar fold. If two proteins are 50% identical in sequence, their backbone structures typically superimpose within 1–2 Å.
The method works in three stages. First, find one or more structurally characterized proteins with detectable similarity to your target. Second, align the target sequence to the template sequence, mapping each residue of the target to a position in the template structure. Third, use that alignment to build a 3D model: copy the template backbone, rebuild divergent regions (loops) using sampling algorithms, and optimize side chains and overall geometry by energy minimization.
The quality of the result depends almost entirely on two things: the quality and resolution of the template, and the sequence identity between target and template. Both are checkable before you run a single calculation.
When to use homology modeling vs AlphaFold
Use AlphaFold when:
- No structural template exists in the PDB
- Sequence identity to known structures is below 30%
- You need a quick, no-template prediction
- The protein is in the AlphaFold Database already
- You want confidence scores (pLDDT, PAE)
- Large-scale proteome-wide prediction
Use homology modeling when:
- You need a specific conformation (active, ligand-bound)
- A high-identity template (>50%) exists in a relevant state
- You want explicit control over template selection
- Modeling a mutant using the wild-type structure
- Comparing with previous homology modeling literature
- MHC/antibody modeling with specialized templates
The most practically important case where homology modeling beats AlphaFold is conformation control. AlphaFold returns its best estimate of a single conformation, typically resembling the apo (unbound) state, and gives you no direct way to request a different one. If you want a model in the active conformation, or in the conformation observed when a specific ligand is bound, and a close homolog crystallized in that conformation exists in the PDB, homology modeling on that template gives you a model in that state.
How sequence identity affects model quality
Sequence identity between target and template is the single most predictive factor of homology model quality. Before choosing a template, check the identity — it determines whether the model will be worth building.
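As a minimal sketch of how that identity check might be done by hand, the function below computes percent identity and target coverage from a pairwise alignment given as two equal-length gapped strings. The function name, the gap character, and the choice of denominator (columns where both sequences have a residue) are assumptions made for illustration; tools normalize identity in different ways, so the numbers may not match a server's report exactly.

```python
def identity_and_coverage(target_aln: str, template_aln: str, gap: str = "-"):
    """Return (percent identity, percent target coverage) for a pairwise alignment.

    Assumed convention (not taken from any specific tool):
      identity = identical columns / columns where both sequences have a residue
      coverage = columns where both have a residue / number of target residues
    """
    if len(target_aln) != len(template_aln):
        raise ValueError("aligned sequences must have the same length")

    aligned = 0      # columns with a residue in both sequences
    identical = 0    # aligned columns where the residues match
    target_len = 0   # ungapped length of the target

    for t, k in zip(target_aln.upper(), template_aln.upper()):
        if t != gap:
            target_len += 1
        if t != gap and k != gap:
            aligned += 1
            if t == k:
                identical += 1

    if aligned == 0 or target_len == 0:
        return 0.0, 0.0
    return 100.0 * identical / aligned, 100.0 * aligned / target_len


# Toy alignment, invented for illustration only
identity, coverage = identity_and_coverage("MKT-AYIAK", "MKTQAYLA-")
print(f"identity {identity:.1f}%, coverage {coverage:.1f}%")
```

Reporting coverage alongside identity matters because a high identity over a short aligned stretch says little about the rest of the target.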
Running Swiss-Model: step by step
Swiss-Model (swissmodel.expasy.org) is the world’s most widely used homology modeling server. It is fully automated — you provide a sequence, it finds templates, aligns, builds, and evaluates a model in a single submission. No installation, no command line, no coding required.
Go to swissmodel.expasy.org. Click Start Modelling. Paste your protein sequence in FASTA format into the target sequence field. Optionally paste in a project name — something descriptive so you can find the result later. Click Build Model.
Swiss-Model will automatically search the PDB for structural templates, select the best one based on sequence identity and coverage, build the alignment, and generate a model. For a typical protein (200–400 residues), this takes 2–10 minutes. You’ll receive an email when it’s done if you register, or you can keep the browser tab open.
The results page shows the template(s) selected, their PDB IDs, sequence identity, and coverage. This is the first thing to check: is the template appropriate for your question?
- Sequence identity: ideally above 40% (see "How sequence identity affects model quality" above)
- Coverage: what fraction of your sequence is covered by the template? Low coverage means large regions will be modeled without a template (loop modeling), which is much less accurate
- Template resolution: prefer templates at 2.5 Å or better; lower-resolution templates have larger coordinate uncertainties
- Template conformation: is the template in the state you need (apo, holo, active, inactive)? Click the PDB ID to check in the RCSB
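The four checks above can be sketched as a small screening helper. This is an illustration only: the `TemplateCandidate` class and `template_warnings` function are hypothetical names, the values are typed in by hand from the results page, and the 80% coverage cutoff is an assumed placeholder (the identity and resolution cutoffs are the ones given in the list).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TemplateCandidate:
    """Values copied by hand from a template report (hypothetical structure)."""
    pdb_id: str
    identity_pct: float
    coverage_pct: float
    resolution_A: float
    state: str  # e.g. "apo", "holo", "active", "inactive"


def template_warnings(
    t: TemplateCandidate,
    wanted_state: Optional[str] = None,
    min_identity: float = 40.0,      # from the checklist
    min_coverage: float = 80.0,      # assumed cutoff, adjust to your project
    max_resolution_A: float = 2.5,   # from the checklist
) -> List[str]:
    """Return human-readable warnings; an empty list means no check failed."""
    warnings = []
    if t.identity_pct < min_identity:
        warnings.append(f"{t.pdb_id}: identity {t.identity_pct:.0f}% is below {min_identity:.0f}%")
    if t.coverage_pct < min_coverage:
        warnings.append(f"{t.pdb_id}: coverage {t.coverage_pct:.0f}% is below {min_coverage:.0f}%")
    if t.resolution_A > max_resolution_A:
        warnings.append(f"{t.pdb_id}: resolution {t.resolution_A:.1f} A is worse than {max_resolution_A:.1f} A")
    if wanted_state is not None and t.state != wanted_state:
        warnings.append(f"{t.pdb_id}: template is {t.state}, but {wanted_state} is needed")
    return warnings


# Placeholder template with invented numbers
candidate = TemplateCandidate("4XYZ", identity_pct=35.0, coverage_pct=92.0,
                              resolution_A=3.1, state="apo")
for line in template_warnings(candidate, wanted_state="holo"):
    print(line)
```

A warning is a prompt to look closer, not an automatic rejection; a lower-identity template in the right conformation can still be the better choice for some questions.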
If the auto-selected template is not appropriate, use the Template Selection tab to search for and manually specify a better template. This is one of Swiss-Model’s key advantages over fully automated pipelines: you can override the template choice.
Swiss-Model automatically evaluates the model with two quality checks: the QMEAN score and the Ramachandran plot. Both appear on the results page alongside a 3D viewer.
The QMEAN Z-score compares your model to experimentally determined structures of similar size. A score near 0 means the model quality is consistent with real crystal structures of that size. Scores below −4 indicate poor model quality and should prompt investigation — wrong template, poor alignment, or a difficult loop region.
The Ramachandran plot shows backbone dihedral angle distributions. In a good model, over 95% of residues should fall in the favored regions (dark blue area). Residues in outlier regions (red) have strained geometry and may indicate modeling errors at those positions.
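A minimal sketch of how those two readouts might be summarized in code, assuming you have already noted the QMEAN Z-score and a per-residue Ramachandran classification (for example, read off a validation report). The function name and the "favored"/"allowed"/"outlier" labels are hypothetical; the −4 and 95% cutoffs are the ones quoted above.

```python
from collections import Counter
from typing import Dict, List


def summarize_model_quality(qmean_z: float, rama_classes: List[str]) -> Dict[str, object]:
    """Flag a model using the QMEAN and Ramachandran rules of thumb.

    rama_classes holds one label per residue: "favored", "allowed", or "outlier".
    """
    counts = Counter(rama_classes)
    total = len(rama_classes)
    favored_pct = 100.0 * counts["favored"] / total if total else 0.0
    # 1-based positions in the input list, so they can be inspected in a viewer
    outlier_positions = [i + 1 for i, c in enumerate(rama_classes) if c == "outlier"]
    return {
        "qmean_flagged": qmean_z < -4.0,   # below -4: investigate template and alignment
        "favored_pct": favored_pct,
        "rama_ok": favored_pct > 95.0,     # over 95% favored expected in a good model
        "outlier_positions": outlier_positions,
    }


# Invented example: 100 residues with a single outlier at the last position
classes = ["favored"] * 97 + ["allowed"] * 2 + ["outlier"]
report = summarize_model_quality(-1.3, classes)
print(report)
```

Listing the outlier positions is the useful part: a single outlier in a surface loop is a very different problem from one in the binding site.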
Download the model PDB file from the Download section for use in docking, MD simulation, or further analysis.
Running MODELLER: step by step
MODELLER is a command-line program developed at UCSF that gives you full control over every step of the homology modeling process. It’s more complex than Swiss-Model but offers capabilities Swiss-Model doesn’t: multi-template modeling, loop refinement protocols, and model optimization you can tune for your specific application. It’s free for academic use.
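Install MODELLER with conda from the salilab channel (it is not distributed through conda-forge):
conda install -c salilab modeller
You’ll need to register for a free academic license key at salilab.org/modeller and supply it to the installation, for example by setting the KEY_MODELLER environment variable before running the install command. The conda package handles the rest.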
The core MODELLER workflow requires three things: your target sequence in FASTA format, the template PDB file, and an alignment file in PIR format mapping target residues to template residues. Swiss-Model (or HHpred for more sensitive alignments) can generate this alignment for you.
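If you need to assemble the PIR file yourself, the sketch below shows one way to format it from two already-aligned, gapped sequences. The `pir_entry` helper is hypothetical, the sequences and residue numbers are invented, and the resolution and R-factor placeholders follow values commonly seen in MODELLER example files; check the MODELLER manual for the full list of accepted field values.

```python
def pir_entry(kind, code, aligned_seq, first_res="", first_chain="",
              last_res="", last_chain="", width=75):
    """Format one PIR alignment entry (hypothetical helper, not part of MODELLER).

    kind is "structureX" for an X-ray template or "sequence" for the target.
    The header line has ten colon-separated fields; protein name and source
    are left blank here.
    """
    if kind not in ("sequence", "structureX"):
        raise ValueError("this sketch only handles 'sequence' and 'structureX'")
    # Placeholder resolution / R-factor values (assumed, see lead-in)
    resolution, r_factor = ("-1.00", "-1.00") if kind == "structureX" else ("0.00", "0.00")
    fields = [kind, code, first_res, first_chain, last_res, last_chain,
              "", "", resolution, r_factor]
    body = aligned_seq + "*"  # PIR sequences end with an asterisk
    seq_lines = [body[i:i + width] for i in range(0, len(body), width)]
    return "\n".join([f">P1;{code}", ":".join(fields)] + seq_lines) + "\n"


# Invented toy alignment: template chain A, residues 1-8, and the target
alignment = (
    pir_entry("structureX", "4XYZ", "MKTQAYLA-", "1", "A", "8", "A")
    + pir_entry("sequence", "target_protein", "MKT-AYIAK")
)
print(alignment)
```

The codes after `>P1;` are the names MODELLER matches against its `knowns` and `sequence` arguments, so they must agree with whatever the modeling script uses. Writing the string to `alignment.ali` would give a file of the kind the workflow expects.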
Save this script as model.py in your working directory alongside the template PDB and alignment file:
# Basic MODELLER homology modeling script
# Replace the alignment file, template code, and target name with yours
from modeller import *
from modeller.automodel import *

env = Environ()
# Directories searched for the template PDB file
env.io.atom_files_directory = ['.', '../templates']

# AutoModel: standard homology modeling
a = AutoModel(
    env,
    alnfile='alignment.ali',        # PIR alignment file
    knowns='4XYZ',                  # template code, as named in the alignment
    sequence='target_protein',      # target sequence name in the alignment
    assess_methods=(assess.DOPE,    # score every model with DOPE and GA341
                    assess.GA341),
)

# Generate 5 models; more models means better sampling
a.starting_model = 1
a.ending_model = 5
a.make()

# Select the best model by DOPE score (most negative = best)
ok_models = [m for m in a.outputs if m['failure'] is None]
if not ok_models:
    raise SystemExit("No models were built successfully; check the MODELLER output")
key = 'DOPE score'
ok_models.sort(key=lambda model: model[key])
best = ok_models[0]
print(f"Best model: {best['name']} (DOPE score: {best[key]:.3f})")
Run the script and keep a copy of the output in a log file:
python model.py 2>&1 | tee modeller.log
MODELLER generates the requested number of models (5 in the script above), writes each one as a PDB file named after the target sequence (for example target_protein.B99990001.pdb), and reports a DOPE score for each in the terminal output. The best model is the one with the lowest (most negative) DOPE score.
Evaluating model quality: DOPE, QMEAN and MolProbity
No homology model should be used for research without a quality assessment. Three complementary metrics cover different aspects of model quality: DOPE, QMEAN, and MolProbity.
After DOPE and QMEAN, run the model through MolProbity (molprobity.biochem.duke.edu) for stereochemical validation. MolProbity checks Ramachandran statistics, rotamer quality, bond lengths and angles, and steric clashes. A publication-quality homology model should have:
- Ramachandran favored > 95%, outliers < 0.5%
- Rotamer outliers < 1%
- MolProbity clashscore below 20 (ideally below 10)
- MolProbity score below 2.0
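The thresholds above can be applied as a simple checklist. The sketch below assumes you have copied the summary numbers from a MolProbity report into a dictionary by hand; the key names and the `failed_criteria` function are hypothetical, and only the cutoffs come from the list.

```python
def failed_criteria(metrics):
    """Return the names of the publication-quality criteria that a model fails.

    metrics maps hypothetical key names to values copied from a validation report.
    """
    checks = {
        "rama_favored_pct": lambda v: v > 95.0,     # Ramachandran favored > 95%
        "rama_outlier_pct": lambda v: v < 0.5,      # Ramachandran outliers < 0.5%
        "rotamer_outlier_pct": lambda v: v < 1.0,   # rotamer outliers < 1%
        "clashscore": lambda v: v < 20.0,           # clashscore below 20
        "molprobity_score": lambda v: v < 2.0,      # MolProbity score below 2.0
    }
    return [name for name, passes in checks.items() if not passes(metrics[name])]


# Invented numbers for illustration
model_metrics = {
    "rama_favored_pct": 96.4,
    "rama_outlier_pct": 0.8,
    "rotamer_outlier_pct": 0.6,
    "clashscore": 12.3,
    "molprobity_score": 1.9,
}
print(failed_criteria(model_metrics))
```

Returning the failing criteria by name, rather than a single pass/fail flag, makes it clear which part of the model needs refinement before use.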
What to report in a methods section
A complete homology modeling methods section should include: the target sequence source (UniProt accession), template selection method (automated Swiss-Model vs manual), the selected template PDB ID and chain, sequence identity and coverage to the target, how many models were generated, which model was selected and by what criterion (DOPE score, QMEAN), and the stereochemical validation results from MolProbity.
Homology modeling in one paragraph
Homology modeling builds a 3D protein model by copying the backbone from a structurally characterized relative and optimizing the sequence-specific differences. It works best when sequence identity to the template exceeds 30%, and it remains the preferred method when you need a model in a specific conformation — active, ligand-bound, or mutant — that AlphaFold cannot access from sequence alone. Swiss-Model is the fastest entry point: paste your sequence, get a model and quality scores in minutes. MODELLER gives you full control for publication-quality work. Always validate with DOPE, QMEAN, and MolProbity before using any model for docking or MD simulation.