Computational Biology Seminar
(BIOSC 1630)
Sep 11, 2024
Lecture 03:
Paper 1
Disclaimer: Some of the physics and explanations here are oversimplified and leave out nuance. This is done to help students digest the material.
Describe the basic stages of drug discovery and explain the role of computational methods in modern drug design
Many use biochemical, in vitro, and in vivo assays to identify drug targets
These data are relevant but costly
Reducing this cost is a high priority (and we are making good progress)
Using computational power to expand our search space for novel compounds
Highly interdisciplinary:
Search through tons of molecules to find a few that show promise
Need to assess
Structure-based
Ligand-based
Identify the main types of molecular forces and explain how they relate to binding affinity and free energy.
We can model this as a reversible protein-ligand binding
Too much binding: Potential toxicity and long-term effects
Too little binding: No effect
However, it is much harder to identify drugs that bind enough
Thermodynamics
Kinetics
Computing either one (the thermodynamics or the kinetics of binding) is sufficient for now
We usually start with free energy change
Kinetic calculations are numerically sensitive and require long simulations
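To make the link between binding affinity and free energy concrete, here is a minimal sketch that converts a dissociation constant into a standard binding free energy. The dissociation constant Kd, the 1 M standard concentration, and the function name are assumptions for illustration; they do not appear on the slides.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def binding_free_energy(kd_molar, temperature=298.15):
    """Standard binding free energy (kcal/mol) from a dissociation constant.

    Uses dG = R * T * ln(Kd / c0), assuming a standard concentration c0 = 1 M.
    """
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    standard_concentration = 1.0  # mol/L (assumed convention)
    return R_KCAL * temperature * math.log(kd_molar / standard_concentration)


# Millimolar, micromolar, and nanomolar binders
for kd in (1e-3, 1e-6, 1e-9):
    print(f"Kd = {kd:.0e} M -> dG = {binding_free_energy(kd):.2f} kcal/mol")
```

Tighter binding (smaller Kd) gives a more negative free energy, which is why the free energy change is a convenient first target.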
Enthalpy: Accounts for energetic interactions
Entropy: How much conformational flexibility changes
Typically, we don't calculate enthalpy and entropy separately; just straight free energy
If the molecular connectivity does not change, intramolecular interactions stay consistent
We can then focus on intermolecular (i.e., non-covalent) interactions
Changes in these interactions contribute to enthalpic changes
(Desolvation is another, but not important right now.)
Electron density
Hard spheres
Instead of modeling the wave function, we model atoms as hard spheres
Quantum particles behave like both a "wave" and a "particle"
Electrons are neither; they are something else
An (over)simplification is to think of electrons as a swirling, charged dust cloud
Thus, our electron clouds will distort based on "unequal sharing of electrons"
Here, Si and O have different electronic properties
(We call this "electronegativity")
We call these dipoles because they vary about the radius and z axis
Quadrupoles have additional variation
These contributions are called "polarization", and they are not always included because of cost
There is an unequal distribution of electron density in rings
Edge-to-face
Displaced
Face-to-face
Explain basic concepts of statistical thermodynamics, including ensemble averages and the relationship between microscopic properties and macroscopic observables.
Great!
Now we just need to compute these differences
Let's focus on the enthalpic contributions of the ligand
Remember that these are free energies in solution!
The issue is, there is more than one conformation
We can go through and compute all interactions using a force field (discussed later)
We can compute the ligand's free energy by computing the mean of all conformations
Well, no. We have an issue.
(This collection of conformations is called an ensemble.)
Suppose some conformations have really high energy
If we use a simple mean, then these conformations have equal weight to low energy conformations
Molecules spend more time in low-energy conformations, so they should have a larger contribution to the average
So, we can compute a weighted average by
What is the weight for each conformation?
This is called the Boltzmann weight
k_B is the Boltzmann constant
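A minimal sketch of the Boltzmann-weighted average, assuming the standard form w_i = exp(-E_i / (k_B T)) / sum_j exp(-E_j / (k_B T)). The energies, the temperature, and the function names are hypothetical and chosen only for illustration.

```python
import math

K_B = 1.987204e-3  # Boltzmann constant in molar units, kcal/(mol*K)


def boltzmann_weights(energies, temperature=298.15):
    """Normalized Boltzmann weights for conformer energies in kcal/mol."""
    e_min = min(energies)  # shifting by the minimum avoids overflow
    factors = [math.exp(-(e - e_min) / (K_B * temperature)) for e in energies]
    total = sum(factors)
    return [f / total for f in factors]


def ensemble_average(values, energies, temperature=298.15):
    """Boltzmann-weighted average of a per-conformation property."""
    weights = boltzmann_weights(energies, temperature)
    return sum(w * v for w, v in zip(weights, values))


# Hypothetical conformer energies (kcal/mol)
energies = [0.0, 1.0, 5.0]
weights = boltzmann_weights(energies)
simple_mean = sum(energies) / len(energies)
weighted_mean = ensemble_average(energies, energies)
print("weights:", [round(w, 4) for w in weights])
print("simple mean:", simple_mean, "weighted mean:", round(weighted_mean, 3))
```

The high-energy conformation barely contributes to the weighted mean, while it pulls the simple mean up considerably.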
Identify important degrees of freedom
Scan along each angle with a step size of N degrees
Remove structures with high strain
How many different conformations would we have in this molecule if we scanned only dihedrals every 45 degrees?
8 dihedrals (numbered 1 to 8 in the figure)
8 angles per dihedral (360 / 45)
8 × 8 × 8 × 8 × 8 × 8 × 8 × 8 = 16,777,216
That's a lot of structures, and many of them will clash!
We almost never do a systematic search in practice without some precautions against the combinatorial explosion
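The count above can be reproduced with a short sketch. The function names are hypothetical; the grid generator is lazy so the full 8-dihedral grid is never held in memory.

```python
from itertools import product


def count_conformations(n_dihedrals, step_degrees):
    """Number of grid points when every dihedral is scanned independently."""
    angles_per_dihedral = 360 // step_degrees
    return angles_per_dihedral ** n_dihedrals


def dihedral_grid(n_dihedrals, step_degrees):
    """Lazily yield every combination of dihedral angles (in degrees)."""
    angles = range(0, 360, step_degrees)
    return product(angles, repeat=n_dihedrals)


total = count_conformations(8, 45)
print(total)

# A 2-dihedral grid is small enough to enumerate explicitly
small_grid = list(dihedral_grid(2, 45))
print(len(small_grid), small_grid[:3])
```

Each extra dihedral multiplies the count by 8 at this step size, which is why filtering out strained structures early matters.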
High energy conformations will have a small weight, so we can get close enough if we just identify low energy conformations
It's much easier to run molecular simulations to "sample" low energy geometries
For high accuracy, you still need high energy conformations
Explain the basic principles of molecular simulations.
Suppose we have 3D coordinates of atoms in our system
These atoms exert forces on each other
Using Newton's equation of motion, we can predict their velocity
Now, we move the atoms the distance they would travel in one femtosecond
Then we repeat
By running these simulations correctly, you can sample low energy conformations
Most simulations (in my experience) run for hundreds of nanoseconds
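The loop described above (compute forces, update velocities, move the atoms, repeat) can be sketched for a single particle on a spring. This uses the velocity Verlet integrator in reduced units, which is an assumption; the slides do not name a specific integrator, and all parameter values are illustrative.

```python
def harmonic_force(x, k=1.0, x0=0.0):
    """Force on a particle attached to a spring (reduced units)."""
    return -k * (x - x0)


def velocity_verlet(x, v, mass, dt, n_steps, force=harmonic_force):
    """Propagate one particle in 1D; returns the positions and final velocity."""
    trajectory = [x]
    f = force(x)
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / mass   # half-step velocity update
        x = x + dt * v_half                # move the particle
        f = force(x)                       # recompute the force at the new position
        v = v_half + 0.5 * dt * f / mass   # finish the velocity update
        trajectory.append(x)
    return trajectory, v


def total_energy(x, v, mass=1.0, k=1.0):
    """Kinetic plus potential energy of the spring system."""
    return 0.5 * mass * v ** 2 + 0.5 * k * x ** 2


traj, v_final = velocity_verlet(x=1.0, v=0.0, mass=1.0, dt=0.01, n_steps=1000)
e_start = total_energy(1.0, 0.0)
e_end = total_energy(traj[-1], v_final)
print(f"energy drift: {abs(e_end - e_start):.2e}")
```

A small, stable energy drift is one sign that the time step is short enough; real simulations apply the same loop to every atom in three dimensions.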
Explain the basic principles of force fields.
Quantum mechanics is the most accurate, but at a steep computational cost
Iteratively optimize orbital shapes until you minimize energies
Many, many intensive integrals
30 atoms can take hours
Instead, we use analytical expressions to approximate quantum chemical forces
Analytical functions (i.e., typical equations) are way faster to compute
H2 energies along a bond scan
What do you think the curve would look like?
Energies are computed with CISD/aug-cc-pVTZ
Do we care about all bond lengths?
No
Unless we are breaking bonds, we only care about the minimum
Note that we shifted the minimum to be at zero
The factor of 1/2 is optional (it can be absorbed into the force constant)
Exponentials are significantly slower to calculate
Simple timing test showed Morse potentials are at least 1.5 to 2.0 times slower
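For comparison, here is a sketch of the two bond functions being discussed: a harmonic potential and a Morse potential. The parameter values are illustrative, roughly on the scale of H2, and are not fitted to the scan shown on the slides. The harmonic force constant is chosen to match the Morse curvature at the minimum.

```python
import math


def harmonic_energy(r, k, r0):
    """Harmonic bond energy with the minimum shifted to zero."""
    return 0.5 * k * (r - r0) ** 2


def morse_energy(r, d_e, a, r0):
    """Morse bond energy; approaches the well depth d_e at large r."""
    return d_e * (1.0 - math.exp(-a * (r - r0))) ** 2


# Illustrative parameters (assumed, not fitted)
R0 = 0.74                # equilibrium bond length, Angstrom
D_E = 109.0              # well depth, kcal/mol
A = 1.9                  # width parameter, 1/Angstrom
K = 2.0 * D_E * A ** 2   # harmonic force constant matching the Morse curvature

for r in (0.70, 0.74, 0.80, 1.50):
    print(f"r = {r:.2f}  harmonic = {harmonic_energy(r, K, R0):8.2f}  "
          f"morse = {morse_energy(r, D_E, A, R0):8.2f}")
```

Near the minimum the two curves agree closely, which is why the cheaper harmonic form is usually enough when bonds are not breaking.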
Balance of cost and accuracy
Systematic evaluations can develop transferable, generalized parameters
Energies are computed with CISD/cc-pVTZ
Aside: We have been ignoring the zero-point vibrational energy
The lowest energy of a dimer is not at the bottom of the energy function
However, this is computationally intensive to account for
Molecules will still vibrate at 0 Kelvin (and we can never get to 0 Kelvin)
Energies are computed with CISD/cc-pVTZ
Energies are computed with MP2/cc-pVTZ
Even if we could, QC is not perfect
Different "levels of theory" give you different accuracy
If you fit a classical force field just to QC, you will get "okay" accuracy
We cannot run accurate QC on whole proteins, so we have to chunk it into amino acid interactions
How do we get accurate, useable force fields?
Use experimental data
Experimental Data for Protein Force Field Fitting:
This is why we have different force fields. Different labs focus on specific criteria and have opinions on what is important
Compare and contrast Free Energy Perturbation (FEP) and Thermodynamic Integration (TI), including their advantages and limitations.
This is a theoretically valid way, but is not practical
Binding energies are small (e.g., ~10 kcal/mol)
Absolute free energies are very large (e.g., thousands of kcal/mol)
Sampling is largely uncorrelated
This has several advantages:
More relevant conformational sampling
Can run independent simulations in parallel
Focuses on taking differences with smaller numbers
This technique is generally called an alchemical simulation
An alchemical parameter value of one means interactions are normal; zero means all intermolecular interactions are turned off
Intramolecular interactions are left alone
Our non-covalent interactions:
Electrostatic and van der Waals interactions
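A minimal sketch of these two non-covalent terms and how an alchemical parameter could scale them. The Coulomb and 12-6 Lennard-Jones forms are common force field choices but are assumptions here, as is the simple linear scaling; production codes typically use soft-core potentials instead. All names are hypothetical.

```python
COULOMB_CONSTANT = 332.0637  # kcal*Angstrom/(mol*e^2)


def coulomb_energy(q_i, q_j, r):
    """Electrostatic energy between two point charges (charges in e, r in Angstrom)."""
    return COULOMB_CONSTANT * q_i * q_j / r


def lennard_jones_energy(r, epsilon, sigma):
    """12-6 Lennard-Jones energy as a simple van der Waals model."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 ** 2 - sr6)


def scaled_nonbonded_energy(lam, q_i, q_j, r, epsilon, sigma):
    """Intermolecular energy scaled linearly by the alchemical parameter lam."""
    full = coulomb_energy(q_i, q_j, r) + lennard_jones_energy(r, epsilon, sigma)
    return lam * full


# Hypothetical atom pair: opposite partial charges, 3.5 Angstrom apart
for lam in (0.0, 0.5, 1.0):
    e = scaled_nonbonded_energy(lam, 0.4, -0.4, 3.5, epsilon=0.15, sigma=3.2)
    print(f"lambda = {lam:.1f}  energy = {e:.3f} kcal/mol")
```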
We can integrate over these small free energy changes
In TI, we run a simulation with one alchemical parameter value
In FEP, we run fewer simulations but calculate the energy of other alchemical parameters at the same time
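The difference between the two estimators can be sketched as follows. TI here uses trapezoidal integration of the average dU/dlambda from each simulation, and FEP uses the exponential-averaging (Zwanzig) formula on energy differences evaluated at a neighboring parameter value. Both are standard textbook forms, but the function names and input numbers are assumptions for illustration.

```python
import math

K_B = 1.987204e-3  # Boltzmann constant in molar units, kcal/(mol*K)


def ti_free_energy(lambdas, mean_du_dlambda):
    """Thermodynamic integration: trapezoidal integral of <dU/dlambda>."""
    total = 0.0
    for i in range(len(lambdas) - 1):
        width = lambdas[i + 1] - lambdas[i]
        total += 0.5 * width * (mean_du_dlambda[i] + mean_du_dlambda[i + 1])
    return total


def fep_free_energy(delta_u_samples, temperature=298.15):
    """Free energy perturbation: dG = -kT * ln < exp(-dU / kT) >."""
    kt = K_B * temperature
    boltzmann_factors = [math.exp(-du / kt) for du in delta_u_samples]
    return -kt * math.log(sum(boltzmann_factors) / len(boltzmann_factors))


# Hypothetical inputs: one <dU/dlambda> per TI simulation,
# and per-frame energy differences to a neighboring lambda for FEP
print(ti_free_energy([0.0, 0.5, 1.0], [4.0, 2.0, 1.0]))
print(fep_free_energy([0.8, 1.1, 0.9, 1.3]))
```

In TI each simulation contributes only its own average, while in FEP the frames from one simulation are re-evaluated at other parameter values.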
Explain how replica exchange methods enhance sampling in molecular simulations and their application in free energy calculations.
Similar to a reaction, our simulations have to overcome energetic barriers to sample conformations
For vanilla molecular simulations, all we can do is sit and wait for the simulation to end up in that conformation
We artificially add energy to conformations we have seen before
This reduces the energetic barrier to get to high energy conformations
After our simulations, we can remove this bias
This is called metadynamics
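A minimal sketch of the bias idea, assuming the common choice of Gaussian "hills" deposited along a single coordinate. The hill height, width, and visited values are hypothetical.

```python
import math


def metadynamics_bias(s, centers, height=0.1, width=0.2):
    """Bias energy at coordinate s: a sum of Gaussians at previously visited values."""
    return sum(
        height * math.exp(-((s - c) ** 2) / (2.0 * width ** 2))
        for c in centers
    )


# Hypothetical coordinate values the simulation has already visited
visited = [1.0, 1.05, 0.95, 1.0]

print(metadynamics_bias(1.0, visited))  # large bias where we have already been
print(metadynamics_bias(3.0, visited))  # almost no bias in unexplored regions
```

Because the bias grows where the simulation lingers, already-sampled conformations become less favorable and the system is pushed toward new ones.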
For reactions, we can use bond lengths as coordinates
What coordinate can we use for protein and ligand conformations?
However, reducing all of these conformations to one number is a massive oversimplification
By cycling the temperature, we can help escape local minima
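Assuming the temperature cycling refers to temperature replica exchange (as named in the learning objective), swaps between two replicas are usually accepted with a Metropolis criterion. The sketch below uses that standard criterion; the energies and temperatures are hypothetical.

```python
import math

K_B = 1.987204e-3  # Boltzmann constant in molar units, kcal/(mol*K)


def swap_probability(energy_i, temp_i, energy_j, temp_j):
    """Metropolis acceptance probability for exchanging two replicas.

    p = min(1, exp[(beta_i - beta_j) * (E_i - E_j)]) with beta = 1 / (k_B * T).
    """
    beta_i = 1.0 / (K_B * temp_i)
    beta_j = 1.0 / (K_B * temp_j)
    exponent = (beta_i - beta_j) * (energy_i - energy_j)
    return 1.0 if exponent >= 0 else math.exp(exponent)


# The hot replica found a lower energy: the swap is always accepted
print(swap_probability(-90.0, 300.0, -100.0, 320.0))
# The cold replica already has the lower energy: the swap is accepted only sometimes
print(swap_probability(-100.0, 300.0, -90.0, 320.0))
```

Low-energy conformations discovered at high temperature are handed down to the low-temperature replica, which is how the method helps escape local minima.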
Today
Lecture 03: Paper 01 methods
Next Wednesday
Lecture 04: Paper 01 discussion