How to Set Up a Python Environment for Structural Biology (2026)

Getting your Python environment right is the unglamorous prerequisite to everything else. Do it once properly — with conda, a dedicated environment, and all the right packages — and you’ll rarely hit a dependency conflict or a “module not found” error mid-analysis.

Why conda, not pip

Python packages can be installed with either pip (Python’s standard package installer) or conda (a cross-platform package and environment manager). For structural biology, conda is strongly preferred for two reasons.

First, structural biology packages like MDAnalysis have compiled C and C++ extensions that pip may build from source when no pre-built wheel matches your platform — requiring a working compiler toolchain on your machine, which is not always present and frequently causes cryptic error messages. Conda installs pre-compiled binary packages that work immediately on any supported platform.

Second, conda manages full environment isolation. A dedicated structbio environment has its own Python interpreter and package set, completely separate from any other Python on your machine. This means BioPython’s dependencies can’t conflict with your web scraping project’s dependencies, and you can always recreate the environment from scratch if something breaks.
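As a sketch of what that isolation buys you in practice (assuming conda is already installed), environments can be listed, deleted, and rebuilt freely without touching the rest of your system:

```shell
# See every conda environment on this machine
conda env list

# If an environment ever breaks, delete it and rebuild from scratch:
#   conda env remove -n structbio
#   conda create -n structbio python=3.10 -y
```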

Miniconda vs Anaconda
Anaconda is the full distribution — 3 GB, pre-loads hundreds of packages. Miniconda is the minimal installer — around 100 MB, gives you conda and Python, nothing else. Use Miniconda. You’ll install exactly what you need for structural biology without gigabytes of unrelated packages slowing down your environment solves.

Installing Miniconda

macOS
Download the macOS installer (Intel or Apple Silicon) from the Miniconda page (docs.conda.io/en/latest/miniconda.html). Run the .pkg installer and follow the prompts.
Windows
Download the Windows .exe installer from the same page. During setup, leave “Add to PATH” unchecked — use Anaconda Prompt instead.
Linux
Download the .sh installer and run it in a terminal. Works on any distribution — Ubuntu, CentOS, Rocky Linux.
Linux / macOS — terminal
# Download and run the Miniconda installer (Linux example)
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh

# Follow prompts — accept license, confirm install location
# When asked "Do you wish to initialize Miniconda3?" → yes

# Restart terminal, then verify
conda --version
# Expected output:
conda 24.x.x
Apple Silicon Macs — download the right installer
There are two macOS Miniconda installers: one for Intel (x86_64) and one for Apple Silicon (arm64 / M1/M2/M3). Download the arm64 version for any Mac with an M-series chip. Installing the Intel version on Apple Silicon causes subtle performance issues and occasional package incompatibilities with conda-forge builds.
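One quick way to check which installer you need, and afterwards to confirm the Python you installed is a native build:

```shell
# Check this Mac's CPU architecture
uname -m
# arm64  → Apple Silicon: use the arm64 installer
# x86_64 → Intel: use the x86_64 installer

# After installing, confirm the Python build matches the hardware
python3 -c "import platform; print(platform.machine())"
```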

Creating the structbio environment

Once Miniconda is installed, create a dedicated environment for all structural biology work. Using a separate environment means you can always return to a clean working state by recreating it, and you’ll never break your base conda installation by installing conflicting packages.

All platforms
# Create the environment with Python 3.10
conda create -n structbio python=3.10 -y

# Activate it — do this at the start of every work session
conda activate structbio

# Your prompt changes to show the active environment:
(structbio) $

# Deactivate when done
conda deactivate
Always activate before working
Every time you open a new terminal session, run conda activate structbio before running any Python code. If you see (base) in your prompt instead of (structbio), you’re working in the base environment and none of the structural biology packages are available. Add the activation command to your shell config (~/.zshrc on macOS or ~/.bashrc on Linux) if you want it to activate automatically.
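A quick sketch of both checks mentioned above (the auto-activation line is optional; use ~/.bashrc instead of ~/.zshrc on Linux):

```shell
# Show all environments — the active one is marked with an asterisk
conda info --envs

# Optional: auto-activate structbio in every new shell (zsh example)
echo 'conda activate structbio' >> ~/.zshrc
```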

Installing the essential packages

With the environment active, install all structural biology packages in a single conda command. Installing everything at once lets conda resolve the full dependency graph correctly — doing it piecemeal can lead to conflicts.

All platforms — with structbio active
# Install all essential packages in one command
conda install -c conda-forge \
    biopython \
    mdanalysis \
    numpy \
    pandas \
    matplotlib \
    scipy \
    jupyter \
    -y

# This takes 3–10 minutes depending on connection speed
# conda-forge has the most up-to-date builds of all these packages
Package | What it does | Used for
biopython | Structure and sequence analysis | Parsing PDB files, protein properties, RMSD, AlphaFold structures
mdanalysis | MD trajectory analysis | Loading trajectories, RMSD/RMSF, H-bonds, per-frame analysis
numpy | Numerical arrays and math | Coordinate arrays, distance matrices, mathematical operations
pandas | Tabular data and DataFrames | Docking result tables, RMSF summaries, data filtering and export
matplotlib | Plotting and visualization | RMSD plots, RMSF bar charts, score distributions
scipy | Scientific algorithms | Clustering, statistics, signal processing on trajectory data
jupyter | Interactive notebooks | Exploratory analysis, inline plots, sharing workflows

Setting up Jupyter notebooks

Jupyter notebooks let you run Python code in interactive cells and see plots inline — ideal for exploratory analysis where you want to try different selections or visualization parameters without rerunning the entire script. For automated pipelines that run on servers or process large datasets, plain .py scripts are more appropriate.

Terminal — with structbio active
# Launch Jupyter — opens in your default browser
jupyter notebook

# Or use JupyterLab (more modern interface)
conda install -c conda-forge jupyterlab -y
jupyter lab

# To run on a remote server (HPC cluster) without a browser:
jupyter notebook --no-browser --port=8888
# Then SSH tunnel from your local machine:
# ssh -L 8888:localhost:8888 username@cluster.university.edu
# Then open http://localhost:8888 in your local browser

When Jupyter opens, create a new notebook with the Python 3 kernel. The kernel should automatically use your structbio environment if you launched Jupyter from within it. If you see “kernel not found” or imports fail, confirm that you activated the environment before running jupyter notebook.
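One quick way to confirm the kernel is using the right interpreter is to print its path from a notebook cell — it should point inside the structbio environment:

```python
import sys

# The interpreter path should contain "envs/structbio" if the
# kernel is running from the dedicated environment, e.g.
# /home/you/miniconda3/envs/structbio/bin/python
print(sys.executable)
print(sys.prefix)
```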

Working on an HPC cluster?
Most university HPC clusters don’t allow running Jupyter directly in a browser from a login node. The standard workflow is to request an interactive compute node (srun --pty bash on SLURM systems), activate your conda environment there, start Jupyter with --no-browser, then set up the SSH tunnel described above. Your university’s HPC documentation should have cluster-specific instructions for port forwarding.
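That workflow looks roughly like this on a SLURM cluster — partition names, time limits, and hostnames below are placeholders, so check your own cluster’s documentation:

```shell
# 1. On the login node: request an interactive compute node
#    (time limit is an example)
srun --pty --time=02:00:00 bash

# 2. On the compute node: activate the environment and start Jupyter
conda activate structbio
jupyter notebook --no-browser --port=8888

# 3. On your LOCAL machine: tunnel through the login node to the
#    compute node (replace node123 with the compute node's hostname)
ssh -L 8888:node123:8888 username@cluster.university.edu
```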

Verifying the installation

Run these checks in a new Python session or Jupyter notebook cell to confirm everything is installed and working before starting your first analysis:

Python / Jupyter cell
import Bio
import MDAnalysis as mda
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

print(f"BioPython:    {Bio.__version__}")
print(f"MDAnalysis:   {mda.__version__}")
print(f"NumPy:        {np.__version__}")
print(f"pandas:       {pd.__version__}")

# Quick functional test — parse a structure from the PDB
from Bio.PDB import PDBList, PDBParser

pdbl = PDBList()
# With pdir=".", the file is saved as ./pdb1ubq.ent
pdbl.retrieve_pdb_file("1ubq", file_type="pdb", pdir=".")

parser = PDBParser(QUIET=True)
structure = parser.get_structure("ubq", "./pdb1ubq.ent")
chain = structure[0]["A"]
# Count only standard amino-acid residues — the chain also contains waters
n_res = sum(1 for res in chain if res.id[0] == " ")
print(f"Ubiquitin: {n_res} residues loaded")
# Expected output:
Ubiquitin: 76 residues loaded
Check | Expected result | If it fails
import Bio | No error | Run conda install -c conda-forge biopython again
import MDAnalysis | No error | Run conda install -c conda-forge mdanalysis again
Structure test | Ubiquitin: 76 residues loaded | Check internet connection; PDBList downloads from RCSB
jupyter notebook | Browser opens at localhost:8888 | Run conda install -c conda-forge jupyter again

Optional: adding PyMOL to the same environment

If you also do PyMOL visualization work, installing it in the same structbio environment means your analysis scripts can call PyMOL commands directly — loading a structure in BioPython, filtering it, then passing it to PyMOL for figure generation in one pipeline:

Terminal — with structbio active
# Add open-source PyMOL to the same environment
conda install -c conda-forge pymol-open-source -y

# Verify
python -c "from pymol import cmd; print('PyMOL available in structbio')"

With all three — BioPython, MDAnalysis, and PyMOL — in a single environment, you can write scripts that span the full structural biology workflow: download an AlphaFold structure with BioPython, analyze the trajectory with MDAnalysis, and generate a publication figure with PyMOL, all in one Python file.

Save your environment spec for reproducibility
Once your environment is set up and working, export its full specification so you (or a collaborator) can recreate it exactly: conda env export > structbio_environment.yml. Recreate from the file later with: conda env create -f structbio_environment.yml. Add this YAML file to your project’s git repository — it’s the equivalent of a requirements.txt but with full dependency pinning.
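The export commands from the tip above, plus one refinement worth knowing: conda’s --from-history flag exports only the packages you explicitly requested, which tends to be more portable across operating systems than a full pin:

```shell
# Exact pin of every package and build (precise, but platform-specific)
conda env export > structbio_environment.yml

# Only the packages you explicitly asked for — more portable across OSes
conda env export --from-history > structbio_portable.yml

# Recreate on another machine
conda env create -f structbio_environment.yml
```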

Environment setup in one paragraph

Install Miniconda (not Anaconda), create a dedicated structbio environment with Python 3.10, and install BioPython, MDAnalysis, NumPy, pandas, matplotlib, SciPy, and Jupyter in one conda command using the conda-forge channel. Activate the environment with conda activate structbio at the start of every session. Verify the install by importing all packages and loading a test structure. Export the environment spec with conda env export and commit it to your project repository. Do this once and every tutorial in this pillar works from the same clean, reproducible foundation.
