Introduction to Biopython

Biopython for Computer-Aided Drug Design (CADD) I

What You Need to Know Before Starting

This tutorial is written for someone who:

Has never used Biopython before
May not know what a FASTA file is
Wants to do CADD (docking, structure modeling, molecular dynamics) and needs the right protein sequence

What is Biopython?

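Biopython is an open-source Python toolkit for computational biology. It provides biology-aware tools for working with sequences and for reading and writing common bioinformatics file formats such as FASTA, GenBank, and PDB/mmCIF.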

In CADD, most structure-based drug design workflows need both a protein sequence (FASTA) and a protein structure (PDB/mmCIF).
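For readers who have not seen a FASTA file before, here is a minimal sketch of what one looks like and how Biopython might read it. The record name and the sequence are made up for illustration and do not correspond to a real protein. The text is parsed from an in-memory string so that no file on disk is assumed.

```python
from io import StringIO

from Bio import SeqIO

# Hypothetical FASTA text: one header line starting with ">",
# followed by one or more lines of sequence letters.
fasta_text = """>example_protein hypothetical demo record
MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ
APILSRVGDGTQDNLSGAEKAVQVKVKALPDAQ
"""

# SeqIO.parse yields one SeqRecord per FASTA entry; take the first one.
record = next(SeqIO.parse(StringIO(fasta_text), "fasta"))

seq_id = record.id            # first word of the header line
description = record.description  # the full header line without ">"
seq_length = len(record.seq)  # sequence lines are joined into one sequence

print("ID:", seq_id)
print("Description:", description)
print("Length:", seq_length)
```

To read a real file instead, the same call should accept a file path in place of the StringIO object, for example SeqIO.parse("my_protein.fasta", "fasta"), where the filename is a placeholder.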

If the sequence or structure used is wrong (wrong isoform, truncated construct, wrong chain), the following errors can occur:

Think of the concept like this: