aalexmmaldonado
Computational Biology
(BIOSC 1540)
Oct 22, 2024
Lecture 14:
Molecular system representations
THF is needed for
Disrupting THF production has a cascading effect on essential cellular processes, primarily affecting DNA and RNA synthesis and amino acid metabolism
This is a useful process for drug design
Dihydrofolate reductase (DHFR) is a crucial enzyme that produces THF from dihydrofolate (DHF)
DHF + NADPH → THF + NADP(+)
DHF
NADPH
(We will use this protein for our project)
DHFR has been extensively studied as an antibiotic (e.g., trimethoprim) and cancer (e.g., methotrexate) target
What would happen if a patient with a bacterial infection is prescribed a drug loosely targeting DHFR?
The patient could have deleterious side effects
Both proteins have high structural similarity, even around the active site
Bacterial and human DHFR have similar structures, but their dynamics are different
Outcome: We need to ensure drugs only bind to bacterial proteins by exploiting dynamic insights
MD simulations will explore various low-energy conformations that are, hopefully, similar to reality
Knowing conformations unique to bacteria allows us to design a small molecule that competitively inhibits DHFR
If our starting structure is very far away from our desired equilibrium, our simulations will take longer
For example, we would have to wait for the protein to fold to study its dynamics
Experimental structures offer the best option for their accuracy
PDB contains experimentally determined structures for thousands of proteins
General resolution preference: X-ray, Cryo-EM, NMR
Proteins can exist in different functional conformations: active vs. inactive state, bound to ligands or unbound
Functional state
Higher B-factors suggest more uncertainty in atom positions, which might make that part of the structure less reliable
B-factors
Flexible loops or disordered regions are often missing from the structure
Completeness
Resolution
The resolution of a structure refers to how well the atomic positions are determined
Tip: A resolution below 2.0 Å is generally preferred for high-quality simulations.
Factor | 7D4L | 4NX6 | 4KJK | 4NX7 |
---|---|---|---|---|
Resolution (Å) | 1.60 | 1.35 | 1.35 | 1.15 |
Temperature (K) | 298 | 298 | 298 | 100 |
R-free | 0.196 | 0.190 | 0.166 | 0.170 |
Clashscore | 2 | 5 | 8 | 12 |
Ramachandran outliers | 0 | 0 | 0 | 0 |
Rotamer outliers | 1 | 2 | 1 | 5 |
Here are some example structural characteristics with the best value in bold
7D4L is a good choice
Resolution and R-free are comparable, and few clashes are highly desirable
Factor | 7D4L | 4KJK |
---|---|---|
Resolution (Å) | 1.60 | 1.35 |
Temperature (K) | 298 | 298 |
R-free | 0.196 | 0.166 |
Clashscore | 2 | 8 |
Ramachandran outliers | 0 | 0 |
Rotamer outliers | 1 | 1 |
Alpha-carbon RMSD is 0.141 Å (indicating high similarity)
Either structure would provide comparable results if simulation protocols are appropriate
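The alpha-carbon RMSD above summarizes how far apart matching atoms are in two structures. A minimal sketch, assuming the two coordinate sets are already superimposed and listed in the same atom order (real comparisons first need an alignment step, which is omitted here). The function name and the toy coordinates are illustrative only and are not taken from 7D4L or 4KJK.

```python
import numpy as np


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two (N, 3) coordinate arrays."""
    a = np.asarray(coords_a, dtype=float)
    b = np.asarray(coords_b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("Both structures need the same number of atoms")
    squared_distances = ((a - b) ** 2).sum(axis=1)  # one value per atom
    return float(np.sqrt(squared_distances.mean()))


# Toy example: three made-up alpha-carbon positions, then the same
# positions shifted by 0.1 along x.
ref = np.array([[0.0, 0.0, 0.0], [3.8, 0.0, 0.0], [7.6, 0.0, 0.0]])
shifted = ref + np.array([0.1, 0.0, 0.0])
toy_rmsd = rmsd(ref, shifted)
print(f"RMSD = {toy_rmsd:.3f}")
```

A uniform shift of 0.1 gives an RMSD of 0.1, which is a quick sanity check for the function.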
It’s essential to fix chain breaks and missing loops before simulation
8UCX is missing residues 17 and 18
Missing atoms or residues can be added using modeling software like Modeller
Pink: 8UCX
Grey: Modeller
Dashed lines often indicate missing atoms
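Before rebuilding anything, it helps to know which residues are missing. A minimal sketch, assuming we already have the list of residue numbers observed in one chain. The numbering below (1 to 30 with 17 and 18 removed) is a made-up stand-in that mimics the 8UCX gap, and the function name is hypothetical.

```python
def find_missing_residues(residue_numbers):
    """Return residue numbers absent between the first and last observed residue."""
    present = sorted(set(residue_numbers))
    missing = []
    for previous, following in zip(present, present[1:]):
        # Any jump larger than 1 in the numbering is an internal gap.
        missing.extend(range(previous + 1, following))
    return missing


# Made-up chain numbering with an internal gap at residues 17 and 18.
observed = [n for n in range(1, 31) if n not in (17, 18)]
gaps = find_missing_residues(observed)
print(gaps)
```

This only catches internal gaps. Missing residues at the termini and residues with insertion codes need a comparison against the full sequence.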
Many PDB structures contain ligands, ions, or crystallization agents that are not physiologically relevant
These can distort the protein's behavior in a simulated biological environment if not removed
Manganese (II)
Mercaptoethanol
Water molecules
Ligands
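Removing these components can be sketched as a filter on PDB record types. This is a minimal sketch, assuming that everything stored as a HETATM record should be dropped. That assumption is not always valid: functionally important ions, waters, and modified residues can also be stored as HETATM. The toy records below are made up for illustration.

```python
def keep_protein_atoms(pdb_text):
    """Keep ATOM, TER, and END records; drop HETATM and everything else."""
    kept = [
        line
        for line in pdb_text.splitlines()
        if line.startswith(("ATOM", "TER", "END"))
    ]
    return "\n".join(kept)


# Made-up records: two protein atoms, one manganese ion, one water.
toy_pdb = """
ATOM      1  N   MET A   1      11.000  12.000  13.000  1.00 20.00           N
ATOM      2  CA  MET A   1      12.000  12.500  13.500  1.00 20.00           C
HETATM  900 MN    MN A 201      20.000  21.000  22.000  1.00 30.00          MN
HETATM  901  O   HOH A 301      25.000  26.000  27.000  1.00 35.00           O
END
"""
cleaned = keep_protein_atoms(toy_pdb)
print(cleaned)
```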
Experimental structures often cannot resolve hydrogens, so we need to add them ourselves
Histidine (His, H): pKa ~6.0
Protonation switching around pH 6 - 7
Cysteine (Cys, C): pKa ~8.3
Could form disulfide bonds in oxidizing environments
Affects interactions like salt bridges and hydrogen bonds
Glu's protonation state affects electrostatic interactions.
Can form ionic bonds with negatively charged residues
Hydrogen bonding and in enzyme active sites
Protonation states of amino acids affect the charge distribution, which influences electrostatic interactions during the simulation
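The link between pKa and protonation state can be made concrete with the Henderson-Hasselbalch relation. A minimal sketch, using the model pKa values listed above for His and Cys. Inside a folded protein the effective pKa can shift with the local environment, so these numbers are only a first estimate.

```python
def fraction_protonated(pka, ph):
    """Fraction of a titratable group that is protonated at a given pH."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


# Model side-chain pKa values from the slide.
side_chain_pka = {"His": 6.0, "Cys": 8.3}

for residue, pka in side_chain_pka.items():
    for ph in (6.0, 7.0):
        frac = fraction_protonated(pka, ph)
        print(f"{residue} at pH {ph}: {frac:.2f} protonated")
```

When pH equals pKa, half of the population is protonated, which is why histidine is so sensitive around pH 6 to 7.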
Ions
Potassium, Sodium, Calcium, Magnesium, Iron, Zinc, Copper, Manganese, Phosphate, Chloride, Bicarbonate, Sulfate, Citrate, ATP, ADP, AMP, . . .
Molecules
Glucose, pyruvate, lactate, amino acids, fatty acids, nucleotides, NADH, FADH2, citrate, oxaloacetate, biotin, riboflavin, coenzyme A, ubiquinone, . . .
Proteins
Glycolytic enzymes, TCA cycle enzymes, DNA/RNA polymerases, kinases, phosphatases, G-proteins, heat shock proteins, molecular motors, transcription factors, transcription regulators, ribosomes, proteasomes, . . .
Organelles
Mitochondria, endoplasmic reticulum, Golgi apparatus, lysosomes, peroxisomes, vacuoles, endosomes, ribosomes, centrosomes, . . .
Cytoskeleton
Actin, profilin, cofilin, myosin, keratins, vimentin, neurofilaments, tubulin, . . .
Membranes
Phospholipid bilayer with embedded proteins, cholesterol, glycoproteins, glycolipids, . . .
and more
What biological or chemical components are crucial for modeling the dynamics of a protein in the cytosol?
We must balance computational feasibility with biological realism
Starting structure for simulating Cu(I) binding to Cys147 and 204 in roGFP2 with Na+ and Cl- counterions
(Actually used in my research.)
For this simulation, we would have to apply a force to keep the molecules in this box
Water molecules and proteins would bounce off these walls in an unphysical manner (i.e., edge effects)
A protein in vivo or in vitro will have plenty of space to move around
We could make the box very large, but this would dramatically increase the cost
Periodic boundary conditions (PBC) solve this issue
Think Pac-Man: If he crosses the right side of the map, he reappears on the left
We (virtually) place exact copies of our system in all directions
Atoms that cross the box edge reappear on the other side, so there are no edge effects
Image atoms in adjacent boxes are used to calculate interactions across the boundaries
The minimum image convention (MIC) ensures that an atom in the primary box only interacts with the closest image of another atom
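Both ideas fit in a few lines for a cubic box. A minimal sketch, assuming a cubic box of side `box_length` with its origin at one corner. The function names are illustrative, and real MD engines also handle non-cubic boxes.

```python
import numpy as np


def wrap(positions, box_length):
    """Put atoms that left the box back in on the opposite side (Pac-Man style)."""
    return np.mod(np.asarray(positions, dtype=float), box_length)


def minimum_image_distance(r_i, r_j, box_length):
    """Distance from atom i to the closest periodic image of atom j."""
    delta = np.asarray(r_j, dtype=float) - np.asarray(r_i, dtype=float)
    # Shift each component by a whole number of box lengths so it is as short as possible.
    delta -= box_length * np.round(delta / box_length)
    return float(np.linalg.norm(delta))


box = 10.0
wrapped = wrap([10.5, -0.5, 3.0], box)

# Two atoms near opposite faces of the box are close through the boundary.
d = minimum_image_distance([0.5, 5.0, 5.0], [9.5, 5.0, 5.0], box)
print(wrapped, d)
```

The two atoms are 9 apart inside the primary box but only 1 apart through the boundary, and the MIC picks the shorter separation.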
1. Generate structures and use quantum chemistry to compute energy and forces
2. Optimize force field parameters until they reproduce the quantum chemistry dataset
3. Run MD simulations and predict experimental data (e.g., NMR, Raman spectroscopy, solvation energies, etc.)
4. Continue to optimize force field parameters to minimize quantum chemistry and simulation prediction errors
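Steps 1 and 2 can be illustrated with the simplest possible case: fitting one harmonic bond term, E = 0.5 k (r - r0)^2, to reference energies. This is a minimal sketch and not how production force fields are built. The "reference" energies are synthetic values generated from a known k and r0 as a stand-in for quantum chemistry data, and all names and numbers are hypothetical.

```python
import numpy as np


def fit_harmonic_bond(distances, energies):
    """Fit E = 0.5 * k * (r - r0)**2 by least squares on a quadratic in r."""
    a, b, _c = np.polyfit(distances, energies, deg=2)
    k = 2.0 * a           # the r**2 coefficient is 0.5 * k
    r0 = -b / (2.0 * a)   # the r coefficient is -k * r0
    return k, r0


# Step 1 (stand-in): synthetic reference energies, NOT real quantum chemistry.
true_k, true_r0 = 300.0, 1.5
r = np.linspace(1.3, 1.7, 9)
reference_energy = 0.5 * true_k * (r - true_r0) ** 2

# Step 2: optimize parameters until they reproduce the reference data.
fit_k, fit_r0 = fit_harmonic_bond(r, reference_energy)
print(f"k = {fit_k:.2f}, r0 = {fit_r0:.3f}")
```

Because the synthetic data are exactly harmonic, the fit recovers the input parameters. Real reference data are not, which is one reason steps 3 and 4 are needed.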
Force fields are not inherently compatible with each other
Example: Simulating a DNA-binding protein
Suppose my protein force field was fit to:
Suppose my DNA force field was fit to:
Simulations would be unreliable because the force fields are incompatible with each other
Examples:
A topology file contains information on atom types, bonds, angles, dihedrals, and non-bonded interactions based on the chosen force field
Essentially tells the program which force field parameters to use where
%VERSION VERSION_STAMP = V0001.000 DATE = 01/20/24 21:37:10
%FLAG TITLE
%FORMAT(20a4)
default_name
%FLAG POINTERS
%FORMAT(10I8)
33582 19 31714 1852 4022 2505 8139 7890 0 0
59724 10270 1852 2505 7890 89 205 205 47 0
0 0 0 0 0 0 0 1 36 0
0
%FLAG ATOM_NAME
%FORMAT(20a4)
N H1 H2 H3 CA HA CB HB2 HB3 CG HG2 HG3 SD CE HE1 HE2 HE3 C O N
H CA HA CB HB2 HB3 OG HG C O N H CA HA CB HB2 HB3 CG HG2 HG3
CD HD2 HD3 CE HE2 HE3 NZ HZ1 HZ2 HZ3 C O N H CA HA2 HA3 C O N
H CA HA CB HB2 HB3 CG HG2 HG3 CD OE1 OE2 C O N H CA HA CB HB2
HB3 CG HG2 HG3 CD OE1 OE2 C O N H CA HA CB HB2 HB3 CG HG CD1 HD11
HD12HD13CD2 HD21HD22HD23C O N H CA HA CB HB2 HB3 CG CD1 HD1 CE1 HE1
CZ HZ CE2 HE2 CD2 HD2 C O N H CA HA CB HB CG2 HG21HG22HG23OG1 HG1
C O N H CA HA2 HA3 C O N H CA HA CB HB CG1 HG11HG12HG13CG2
HG21HG22HG23C O N H CA HA CB HB CG1 HG11HG12HG13CG2 HG21HG22HG23C
O N CD HD2 HD3 CG HG2 HG3 CB HB2 HB3 CA HA C O N H CA HA CB
HB CG2 HG21HG22HG23CG1 HG12HG13CD1 HD11HD12HD13C O N H CA HA CB HB2
HB3 CG HG CD1 HD11HD12HD13CD2 HD21HD22HD23C O N H CA HA CB HB CG1
HG11HG12HG13CG2 HG21HG22HG23C O N H CA HA CB HB2 HB3 CG HG2 HG3 CD
OE1 OE2 C O N H CA HA CB HB2 HB3 CG HG CD1 HD11HD12HD13CD2 HD21HD22
HD23C O N H CA HA CB HB2 HB3 CG OD1 OD2 C O N H CA HA2 HA3
C O N H CA HA CB HB2 HB3 CG OD1 OD2 C O N H CA HA CB HB
CG1 HG11HG12HG13CG2 HG21HG22HG23C O N H CA HA CB HB2 HB3 CG OD1 ND2
HD21HD22C O N H CA HA2 HA3 C O N H CA HA CB HB2 HB3 CG ND1
HD1 CE1 HE1 NE2 CD2 HD2 C O N H CA HA CB HB2 HB3 CG HG2 HG3 CD HD2
HD3 CE HE2 HE3 NZ HZ1 HZ2 HZ3 C O N H CA HA CB HB2 HB3 CG CD1 HD1
CE1 HE1 CZ HZ CE2 HE2 CD2 HD2 C O N H CA HA CB HB2 HB3 OG HG C
O N H CA HA CB HB CG1 HG11HG12HG13CG2 HG21HG22HG23C O N H CA
HA CB HB2 HB3 OG HG C O N H CA HA2 HA3 C O N H CA HA CB
HB2 HB3 CG HG2 HG3 CD OE1 OE2 C O N H CA HA2 HA3 C O N H CA
HA CB HB2 HB3 CG HG2 HG3 CD OE1 OE2 C O N H CA HA2 HA3 C O N
H CA HA CB HB2 HB3 CG OD1 OD2 C O N H CA HA CB HB1 HB2 HB3 C
O N H CA HA CB HB CG2 HG21HG22HG23OG1 HG1 C O N H CA HA CB
HB2 HB3 CG CD1 HD1 CE1 HE1 CZ OH HH CE2 HE2 CD2 HD2 C O N H CA HA2
HA3 C O N H CA HA CB HB2 HB3 CG HG2 HG3 CD HD2 HD3 CE HE2 HE3 NZ
HZ1 HZ2 HZ3 C O N H CA HA CB HB2 HB3 CG HG CD1 HD11HD12HD13CD2 HD21
HD22HD23C O N H CA HA CB HB CG2 HG21HG22HG23OG1 HG1 C O N H
CA HA CB HB2 HB3 CG HG CD1 HD11HD12HD13CD2 HD21HD22HD23C O N H CA
HA CB HB2 HB3 CG HG2 HG3 CD HD2 HD3 CE HE2 HE3 NZ HZ1 HZ2 HZ3 C O N
H CA HA CB HB2 HB3 CG CD1 HD1 CE1 HE1 CZ HZ CE2 HE2 CD2 HD2 C O N
H CA HA CB HB CG2 HG21HG22HG23CG1 HG12HG13CD1 HD11HD12HD13C O N H
CA HA CB HB2 HB3 OG HG C O N H CA HA CB HB CG2 HG21HG22HG23OG1
HG1 C O N H CA HA CB HB CG2 HG21HG22HG23OG1 HG1 C O N H CA
HA2 HA3 C O N H CA HA CB HB2 HB3 CG HG2 HG3 CD HD2 HD3 CE HE2 HE3
NZ HZ1 HZ2 HZ3 C O N H CA HA CB HB2 HB3 CG HG CD1 HD11HD12HD13CD2
HD21HD22HD23C O N CD HD2 HD3 CG HG2 HG3 CB HB2 HB3 CA HA C O N
H CA HA CB HB CG1 HG11HG12HG13CG2 HG21HG22HG23C O N CD HD2 HD3 CG
HG2 HG3 CB HB2 HB3 CA HA C O N H CA HA CB HB2 HB3 CG CD1 HD1 NE1
HE1 CE2 CZ2 HZ2 CH2 HH2 CZ3 HZ3 CE3 HE3 CD2 C O N CD HD2 HD3 CG HG2 HG3
CB HB2 HB3 CA HA C O N H CA HA CB HB CG2 HG21HG22HG23OG1 HG1 C
O N H CA HA CB HB2 HB3 CG HG CD1 HD11HD12HD13CD2 HD21HD22HD23C O
N H CA HA CB HB CG1 HG11HG12HG13CG2 HG21HG22HG23C O N H CA HA
CB HB CG2 HG21HG22HG23OG1 HG1 C O N H CA HA CB HB CG2 HG21HG22HG23
OG1 HG1 C O N H CA HA CB HB2 HB3 CG HG CD1 HD11HD12HD13CD2 HD21HD22
HD23C O CD2 CE2 CZ CG2 CD1 CE1 CB2 CA2 C2 H10 H12 H11 H9 H8 OH O2 N2
N3 C1 CA3 CA1 H13 H14 C3 O3 N1 H1 H2 CB1 CG1 H5 H6 H7 H3 OG1 H4 N
H CA HA CB HB CG1 HG11HG12HG13CG2 HG21HG22HG23C O N H CA HA CB
HB2 HB3 CG HG2 HG3 CD OE1 NE2 HE21HE22C O N H CA HA CB HB2 HB3 SG
HG C O N H CA HA CB HB2 HB3 CG CD1 HD1 CE1 HE1 CZ HZ CE2 HE2 CD2
HD2 C O N H CA HA CB HB2 HB3 OG HG C O N H CA HA CB HB2
HB3 CG HG2 HG3 CD HD2 HD3 NE HE CZ NH1 HH11HH12NH2 HH21HH22C O N H
CA HA CB HB2 HB3 CG CD1 HD1 CE1 HE1 CZ OH HH CE2 HE2 CD2 HD2 C O N
CD HD2 HD3 CG HG2 HG3 CB HB2 HB3 CA HA C O N H CA HA CB HB2 HB3
CG OD1 OD2 C O N H CA HA CB HB2 HB3 CG ND1 HD1 CE1 HE1 NE2 CD2 HD2
C O N H CA HA CB HB2 HB3 CG HG2 HG3 SD CE HE1 HE2 HE3 C O N
H CA HA CB HB2 HB3 CG HG2 HG3 CD HD2 HD3 CE HE2 HE3 NZ HZ1 HZ2 HZ3 C
O N H CA HA CB HB2 HB3 CG HG2 HG3 CD HD2 HD3 NE HE CZ NH1 HH11HH12
NH2 HH21HH22C O N H CA HA CB HB2 HB3 CG ND1 CE1 HE1 NE2 HE2 CD2 HD2
C O N H CA HA CB HB2 HB3 CG OD1 OD2 C O N H CA HA CB HB2
HB3 CG CD1 HD1 CE1 HE1 CZ HZ CE2 HE2 CD2 HD2 C O N H CA HA CB HB2
HB3 CG CD1 HD1 CE1 HE1 CZ HZ CE2 HE2 CD2 HD2 C O N H CA HA CB HB2
HB3 CG HG2 HG3 CD HD2 HD3 CE HE2 HE3 NZ HZ1 HZ2 HZ3 C O N H CA HA
CB HB2 HB3 OG HG C O N H CA HA CB HB1 HB2 HB3 C O N H CA
HA CB HB2 HB3 CG HG2 HG3 SD CE HE1 HE2 HE3 C O N CD HD2 HD3 CG HG2
HG3 CB HB2 HB3 CA HA C O N H CA HA CB HB2 HB3 CG HG2 HG3 CD OE1
OE2 C O N H CA HA2 HA3 C O N H CA HA CB HB2 HB3 CG CD1 HD1
CE1 HE1 CZ OH HH CE2 HE2 CD2 HD2 C O N H CA HA CB HB CG1 HG11HG12
HG13CG2 HG21HG22HG23C O N H CA HA CB HB2 HB3 CG HG2 HG3 CD OE1 NE2
HE21HE22C O N H CA HA CB HB2 HB3 CG HG2 HG3 CD OE1 OE2 C O N
H CA HA CB HB2 HB3 CG HG2 HG3 CD HD2 HD3 NE HE CZ NH1 HH11HH12NH2 HH21
HH22C O N H CA HA CB HB CG2 HG21HG22HG23OG1 HG1 C O N H CA
HA CB HB CG2 HG21HG22HG23CG1 HG12HG13CD1 HD11HD12HD13C O N H CA HA
CB HB2 HB3 CG CD1 HD1 CE1 HE1 CZ HZ CE2 HE2 CD2 HD2 C O N H CA HA
CB HB2 HB3 CG CD1 HD1 CE1 HE1 CZ HZ CE2 HE2 CD2 HD2 C O N H CA HA
CB HB2 HB3 CG HG2 HG3 CD HD2 HD3 CE HE2 HE3 NZ HZ1 HZ2 HZ3 C O N H
CA HA CB HB2 HB3 CG OD1 OD2 C O N H CA HA CB HB2 HB3 CG OD1 OD2
C O N H CA HA2 HA3 C O N H CA HA CB HB2 HB3 CG OD1 ND2 HD21
HD22C O N H CA HA CB HB2 HB3 CG CD1 HD1 CE1 HE1 CZ OH HH CE2 HE2
CD2 HD2 C O N H CA HA CB HB2 HB3 CG HG2 HG3 CD HD2 HD3 CE HE2 HE3
NZ HZ1 HZ2 HZ3 C O N H CA HA CB HB CG2 HG21HG22HG23OG1 HG1 C O
We never actually look at these files
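We rarely read these files by hand, but a short script shows how they are organized: each `%FLAG` line starts a named section, followed by a `%FORMAT` line and the data. A minimal sketch that pulls out the `POINTERS` section, whose first entry is the number of atoms. It splits on whitespace, which works for this integer section but not for fixed-width text sections such as `ATOM_NAME`, where names can run together. The function name is hypothetical, and an established parser such as ParmEd is the safer choice for real work.

```python
def read_flag_values(prmtop_text, flag):
    """Collect the whitespace-separated entries listed under one %FLAG section."""
    values = []
    in_section = False
    for line in prmtop_text.splitlines():
        if line.startswith("%FLAG"):
            in_section = line.split()[1] == flag
            continue
        if line.startswith("%"):  # %VERSION, %FORMAT, and similar header lines
            continue
        if in_section:
            values.extend(line.split())
    return values


# Start of the topology file shown above.
excerpt = """%VERSION VERSION_STAMP = V0001.000 DATE = 01/20/24 21:37:10
%FLAG TITLE
%FORMAT(20a4)
default_name
%FLAG POINTERS
%FORMAT(10I8)
33582 19 31714 1852 4022 2505 8139 7890 0 0
59724 10270 1852 2505 7890 89 205 205 47 0
0 0 0 0 0 0 0 1 36 0
0
"""
pointers = [int(v) for v in read_flag_values(excerpt, "POINTERS")]
n_atoms = pointers[0]
print(f"Number of atoms: {n_atoms}")
```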
Non-standard residues or ligands are not always included in standard force field parameter sets
These require additional parameterization to ensure proper interactions in the simulation
Example: GFP chromophore
Energy minimization adjusts the initial structure to remove unfavorable atom positions and steric clashes that could cause instability during simulations
Without minimization, high-energy configurations may lead to unrealistic results or early failures in the molecular dynamics simulation
Steric clashes occur when atoms are too close together, resulting in excessively high energy
Energy minimization gently adjusts the structure to lower the system’s energy
Unphysical
Physical
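The idea can be shown with a toy system: two atoms interacting through a Lennard-Jones potential, started too close together. This is a minimal sketch of steepest descent on a single distance, assuming reduced units (epsilon = sigma = 1) and a capped step size so the structure moves gently. MD packages minimize all atomic coordinates under the full force field, and the parameter values here are illustrative.

```python
def lj_energy(r, epsilon=1.0, sigma=1.0):
    """Lennard-Jones energy of two atoms separated by distance r."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 ** 2 - sr6)


def lj_gradient(r, epsilon=1.0, sigma=1.0):
    """Derivative of the Lennard-Jones energy with respect to r."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (-12.0 * sr6 ** 2 + 6.0 * sr6) / r


def minimize(r, learning_rate=0.01, max_step=0.05, n_steps=200):
    """Steepest descent with a capped step so large forces cannot overshoot."""
    for _ in range(n_steps):
        step = -learning_rate * lj_gradient(r)
        step = max(-max_step, min(max_step, step))
        r += step
    return r


r_start = 0.9  # steric clash: atoms closer than sigma
r_final = minimize(r_start)
print(f"start: r = {r_start:.3f}, E = {lj_energy(r_start):.3f}")
print(f"final: r = {r_final:.3f}, E = {lj_energy(r_final):.3f}")
```

The clashing start has a large positive energy, and minimization relaxes the pair to the bottom of the well at r = 2^(1/6) sigma.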
Today
Lecture 14:
Molecular system representations
Thursday
Lecture 15:
Atomistic insights