Computational Biology
(BIOSC 1540)
Oct 24, 2024
Lecture 15:
Ensembles and atomistic insights
Number of Particles: Biological systems contain billions of atoms interacting simultaneously
Thermal Motion: Atoms and molecules are in constant motion due to thermal energy
Uncertainty and Variability: Exact positions and velocities of particles are inherently uncertain
Microscopic level: Individual atoms and molecules
Macroscopic level: Bulk properties from collective behavior
Atomistic systems are stochastic, so measurable properties are computed as averages
Statistical mechanics: Uses statistical methods to relate microscopic properties to macroscopic observables
Relevance to biology: Helps in understanding the dynamics of proteins, DNA, and other biomolecules
A macrostate specifies the temperature, pressure, volume, and number of particles of a molecular system
Changing any one of these values changes the macrostate
Example: Methanol and water
Composition: 70% methanol and 30% water by volume
Temperature: 25 °C
Pressure: 1.01325 bar
Volume: 100 mL
His148 in GFP stabilizes the anionic chromophore through a hydrogen bond
Let's use MD simulations to compute hydrogen bond length and energy
How would you approach this?
Our macrostate: roGFP2 in water, with 150 micromolar NaCl at 300 K and 1 atm
An ensemble is the collection of all possible microstates of a single macrostate
A microstate is a unique configuration defined by the positions and velocities of all particles
Here is the MD trajectory, with a mean of 3.155 Å
What is wrong with this?
The MD simulation is extremely short
Our previous MD simulation was very short
Longer simulations provide better sampling of microstates and their probabilities
More accurate hydrogen bond distance estimate!
Remember: Multiple microstates (i.e., configurations) can have the same distance
We measure the ensemble probability of observing a microstate with a given value
The expected value of the ensemble is computed as a probability-weighted mean
Note: Our denominator will always be 1 because our probabilities are normalized; we are not using the actual partition function
2.946 Å
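The weighted mean can be sketched in a few lines of Python. This is a minimal illustration, not the analysis used for the slides: the bin centers and frame counts below are hypothetical values, and the names `expected_value` and `probabilities_from_counts` are made up for this example.

```python
def probabilities_from_counts(counts):
    """Convert frame counts per bin into normalized probabilities."""
    total = sum(counts)
    return [c / total for c in counts]


def expected_value(values, probabilities):
    """Weighted mean: sum(x_i * p_i) / sum(p_i)."""
    numerator = sum(x * p for x, p in zip(values, probabilities))
    denominator = sum(probabilities)  # equals 1 for normalized probabilities
    return numerator / denominator


# Hypothetical histogram of hydrogen bond distances (bin centers in Å).
distances = [2.8, 2.9, 3.0, 3.1]
counts = [20, 50, 20, 10]  # hypothetical number of frames in each bin

probs = probabilities_from_counts(counts)
mean_distance = expected_value(distances, probs)
print(f"Expected distance: {mean_distance:.3f} Å")
```

Because several microstates can share the same distance, the probabilities are attached to distance values rather than to individual frames.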
Microcanonical Ensemble (NVE):
Fixed Number of particles (N), Volume (V), and Energy (E)
Most common
Canonical Ensemble (NVT):
Fixed Number of particles (N), Volume (V), and Temperature (T)
Isothermal-Isobaric Ensemble (NPT):
Fixed Number of particles (N), Pressure (P), and Temperature (T)
Here is a plot of simulation temperature during a 500 ps MD simulation at 300 K
Is there something wrong with the simulation?
Talk with your neighbors on what could be wrong with the simulation
Remember: Macrostate observables are ensemble averages
True: Something is wrong
False: Nothing is wrong
The instantaneous temperature of microstates will fluctuate, but the ensemble average should be constant
There should be no net flow of energy
300 K
500 K
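Ensemble average kinetic energy per particle: $\langle KE \rangle = \frac{3}{2} k_B T$
$k_B$: Boltzmann constant
$T$: Temperature in Kelvin
Note: 3/2 comes from the three degrees of freedom (x, y, z), each contributing $\frac{1}{2} k_B T$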
Not every particle has the same velocity; speeds generally follow the Maxwell-Boltzmann distribution, which depends on the mass of each particle and the velocity magnitude
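The link between kinetic energy and temperature can be sketched as a small calculation of the instantaneous temperature of one microstate. This is an illustrative sketch in SI units; it assumes 3N degrees of freedom (no bond constraints, no center-of-mass correction), and the argon-like mass and the velocities are hypothetical.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K


def kinetic_energy(masses, velocities):
    """Total kinetic energy: sum of (1/2) * m * |v|^2 over all particles."""
    return sum(
        0.5 * m * (vx * vx + vy * vy + vz * vz)
        for m, (vx, vy, vz) in zip(masses, velocities)
    )


def instantaneous_temperature(masses, velocities):
    """Invert KE = (3/2) * N * k_B * T for T (assumes 3N degrees of freedom)."""
    n_particles = len(masses)
    return 2.0 * kinetic_energy(masses, velocities) / (3.0 * n_particles * K_B)


# Hypothetical two-particle system with argon-like masses (kg).
mass = 6.6335e-26
v = math.sqrt(K_B * 300.0 / mass)  # velocity component chosen to give 300 K
masses = [mass, mass]
velocities = [(v, v, v), (-v, -v, -v)]

temperature = instantaneous_temperature(masses, velocities)
print(f"Instantaneous temperature: {temperature:.1f} K")
```

In a real trajectory this value fluctuates from frame to frame; only its average over many microstates should match the target temperature.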
Berendsen thermostat: Adjusts the velocities of all particles uniformly based on the current temperature and the target temperature
Velocity scaling factor: The new velocity is computed by scaling the current velocity slowly/carefully based on the temperature deviation
This prevents abrupt changes that could destabilize the simulation
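A minimal sketch of this kind of velocity scaling, using the commonly quoted Berendsen form $\lambda = \sqrt{1 + \frac{\Delta t}{\tau}\left(\frac{T_0}{T} - 1\right)}$. The symbols are assumptions not defined on the slides: $\Delta t$ is the time step, $\tau$ is the coupling time, $T_0$ is the target temperature, and $T$ is the current temperature. The numerical values are hypothetical.

```python
import math


def berendsen_scaling_factor(current_temp, target_temp, dt, tau):
    """Velocity scaling factor; a larger tau gives weaker, gentler coupling."""
    return math.sqrt(1.0 + (dt / tau) * (target_temp / current_temp - 1.0))


def rescale_velocities(velocities, scale):
    """Scale every velocity component by the same factor."""
    return [tuple(scale * component for component in v) for v in velocities]


# Hypothetical state: the system is too hot (330 K) relative to a 300 K target.
lam = berendsen_scaling_factor(current_temp=330.0, target_temp=300.0, dt=0.002, tau=0.1)
velocities = [(1.0, 0.0, -2.0), (0.5, 0.5, 0.5)]  # arbitrary units
new_velocities = rescale_velocities(velocities, lam)
print(f"Scaling factor: {lam:.5f}")
```

Since kinetic energy scales with velocity squared, one step changes the temperature by a factor of $\lambda^2$. With $\tau$ much larger than $\Delta t$, each step removes only a small fraction of the deviation.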
Simple velocity scaling does not generate a true canonical (NVT) ensemble; it cannot reproduce realistic temperature fluctuations
The Berendsen thermostat inaccurately models thermal energy transfer via particle collisions
Momenta scaling provides realistic kinetic energy and thus temperature control
This is the principle behind the Nosé-Hoover thermostat
If two particles of different masses collide, will their velocities scale in the same way?
No
Momenta adjustment:
This heat bath allows thermal energy to flow in and out of our simulation
"Friction" coupling constant:
The thermostat "mass" is a coupling parameter that controls thermostat responsiveness
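The two Nosé-Hoover pieces can be sketched using the standard textbook equations $\frac{d\mathbf{p}_i}{dt} = \mathbf{F}_i - \xi \mathbf{p}_i$ and $\frac{d\xi}{dt} = \frac{1}{Q}\left(\sum_i \frac{\mathbf{p}_i^2}{m_i} - 3 N k_B T_0\right)$. The symbols $\xi$ (friction), $Q$ (thermostat "mass"), and $T_0$ (target temperature) are assumptions here, since the slide equations did not survive extraction. The masses and momenta below are hypothetical.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K


def friction_rate(masses, momenta, target_temp, q_mass):
    """d(xi)/dt: positive when the system is hotter than the target."""
    twice_kinetic = sum(
        (px * px + py * py + pz * pz) / m for m, (px, py, pz) in zip(masses, momenta)
    )
    degrees_of_freedom = 3 * len(masses)
    return (twice_kinetic - degrees_of_freedom * K_B * target_temp) / q_mass


def momentum_rates(forces, momenta, xi):
    """dp_i/dt = F_i - xi * p_i for every particle."""
    return [
        tuple(f - xi * p for f, p in zip(force, momentum))
        for force, momentum in zip(forces, momenta)
    ]


# Hypothetical particles of different masses, both exactly at 300 K.
masses = [1.0e-26, 2.0e-26]
momenta = [(math.sqrt(m * K_B * 300.0),) * 3 for m in masses]

rate_at_target = friction_rate(masses, momenta, target_temp=300.0, q_mass=1.0)
hot_momenta = [tuple(2.0 * p for p in mom) for mom in momenta]
rate_when_hot = friction_rate(masses, hot_momenta, target_temp=300.0, q_mass=1.0)
```

The friction term acts on momenta, so a heavy and a light particle with the same velocity receive different corrections. A larger `q_mass` makes the friction respond more slowly to temperature deviations.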
A barostat adjusts the volume of the simulation box to achieve and maintain the target pressure
Pressure has two contributions:
Ideal gas term: Represents the thermal energy of an ideal gas
Assumes (1) non-interacting particles and (2) elastic collisions
Pressure is directly proportional to density and temperature
Virial term: Corrects the ideal gas pressure for a real gas
Accounts for intermolecular forces
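The ideal gas and virial contributions can be sketched with the common pairwise form $P = \frac{N k_B T}{V} + \frac{1}{3V} \sum_{i<j} \mathbf{r}_{ij} \cdot \mathbf{F}_{ij}$. This is an assumed textbook expression in SI units, not necessarily the one on the slide; $\mathbf{r}_{ij}$ is the separation vector and $\mathbf{F}_{ij}$ the force on particle $i$ from particle $j$. The pair data are hypothetical.

```python
K_B = 1.380649e-23  # Boltzmann constant in J/K


def ideal_gas_pressure(n_particles, temperature, volume):
    """Thermal term: N * k_B * T / V (non-interacting particles)."""
    return n_particles * K_B * temperature / volume


def virial_pressure(pair_separations, pair_forces, volume):
    """Interaction term: sum over pairs of (r_ij . F_ij) / (3 V)."""
    virial = sum(
        rx * fx + ry * fy + rz * fz
        for (rx, ry, rz), (fx, fy, fz) in zip(pair_separations, pair_forces)
    )
    return virial / (3.0 * volume)


# One mole of ideal gas at 0 °C in 22.414 L gives roughly 1 atm (101325 Pa).
p_ideal = ideal_gas_pressure(6.02214076e23, 273.15, 0.022414)

# Hypothetical attractive pair: the force points against the separation vector,
# so the virial term lowers the pressure.
p_virial = virial_pressure([(1.0, 0.0, 0.0)], [(-3.0, 0.0, 0.0)], volume=1.0)
print(f"Ideal: {p_ideal:.0f} Pa, virial correction: {p_virial:.1f} Pa")
```

Attractive interactions give a negative virial term and repulsive interactions a positive one.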
Same concept as the Berendsen thermostat: Scale the box volume based on the pressure difference from the target
Atomic positions are scaled with the box size
Velocities are not affected
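A minimal sketch of box scaling in the style of a Berendsen barostat, using the commonly quoted factor $\mu = \left[1 - \beta \frac{\Delta t}{\tau_P}(P_0 - P)\right]^{1/3}$. The symbols are assumptions not given on the slides: $\beta$ is the isothermal compressibility, $\tau_P$ the pressure coupling time, and $P_0$ the target pressure. All numbers are hypothetical.

```python
def berendsen_box_scale(current_pressure, target_pressure, dt, tau_p, compressibility):
    """Length scaling factor; greater than 1 expands the box."""
    volume_factor = 1.0 - compressibility * (dt / tau_p) * (target_pressure - current_pressure)
    return volume_factor ** (1.0 / 3.0)


def scale_system(box_lengths, positions, mu):
    """Scale box edges and atomic positions together; velocities are untouched."""
    new_box = tuple(mu * length for length in box_lengths)
    new_positions = [tuple(mu * c for c in p) for p in positions]
    return new_box, new_positions


# Hypothetical state: pressure is too high (100 bar) relative to a 1 bar target.
mu = berendsen_box_scale(
    current_pressure=100.0,   # bar
    target_pressure=1.0,      # bar
    dt=0.002,                 # ps
    tau_p=1.0,                # ps
    compressibility=4.5e-5,   # 1/bar, roughly that of water
)
box, positions = scale_system((50.0, 50.0, 50.0), [(10.0, 20.0, 30.0)], mu)
print(f"Scaling factor: {mu:.8f}, new box edge: {box[0]:.5f} Å")
```

When the pressure is above the target, the factor is slightly greater than 1, so the box expands and the pressure relaxes toward the target over many steps.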
Remember: Starting structures often come from experiments performed under conditions that do not match our simulation
Once our macrostate variable(s) reach steady state, we are sampling valid microstates
Remember: Ensemble averages improve with more simulation time by sampling additional microstates
"Replicates" do not exist as they do in experimental biology and chemistry
Which simulation protocol provides better sampling of microstates?
Assume: Each simulation starts with the same structure, but different initial velocities
Option 1
Three simulations of 500 ns each
Option 2
One simulation of 1,500 ns
Option 1 is correct
Suppose my simulation starts here on my potential energy surface (PES)
Suppose the initial velocities send it in this direction
There is a chance that it never samples this minimum
Multiple simulations with random velocities reduce this chance
RMSD measures the overall change in the structure during a simulation, tracking deviations from the starting conformation
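$\mathrm{RMSD}(t) = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \lVert \mathbf{r}_i(t) - \mathbf{r}_i^{\mathrm{ref}} \rVert^2}$
The difference between the coordinates represents the displacement of atom $i$ from its reference position at time $t$
$N$: Number of atoms to compare
$\mathbf{r}_i(t)$: Position of atom $i$ at time $t$
$\mathbf{r}_i^{\mathrm{ref}}$: Reference position of atom $i$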
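RMSD for a single frame can be sketched in plain Python as the square root of the mean squared displacement from the reference. The sketch assumes the frame has already been superimposed (aligned) onto the reference, which real analysis tools usually do first; the coordinates are hypothetical.

```python
import math


def rmsd(coords, reference):
    """Root-mean-square deviation between two sets of (x, y, z) coordinates."""
    if len(coords) != len(reference):
        raise ValueError("Both structures need the same number of atoms")
    squared_displacement = sum(
        (x - xr) ** 2 + (y - yr) ** 2 + (z - zr) ** 2
        for (x, y, z), (xr, yr, zr) in zip(coords, reference)
    )
    return math.sqrt(squared_displacement / len(coords))


# Hypothetical two-atom structure; every atom moved 1 Å along z.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
frame = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]

value = rmsd(frame, reference)
print(f"RMSD: {value:.3f} Å")
```

Evaluating this for every frame of a trajectory gives the RMSD-versus-time curve.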
RMSF identifies regions of flexibility in the protein by calculating the fluctuation of each atom or residue
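$\mathrm{RMSF}_i = \sqrt{\frac{1}{N_t} \sum_{t=1}^{N_t} \lVert \mathbf{r}_i(t) - \langle \mathbf{r}_i \rangle \rVert^2}$
This measures how much the atom is fluctuating around its mean, not relative to a reference structure
$N_t$: Total number of time frames
$\mathbf{r}_i(t)$: Position of atom $i$ at time $t$
$\langle \mathbf{r}_i \rangle$: Average position of atom $i$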
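RMSF can be sketched the same way, except each atom is compared to its own time-averaged position and one value is returned per atom. The sketch assumes all frames are already aligned to a common frame of reference; the tiny trajectory is hypothetical.

```python
import math


def rmsf(trajectory):
    """Per-atom fluctuation; trajectory is a list of frames of (x, y, z) tuples."""
    n_frames = len(trajectory)
    n_atoms = len(trajectory[0])
    fluctuations = []
    for i in range(n_atoms):
        # Average position of atom i over all frames.
        mean = tuple(
            sum(frame[i][k] for frame in trajectory) / n_frames for k in range(3)
        )
        # Mean squared distance of atom i from its own average position.
        mean_square = sum(
            sum((frame[i][k] - mean[k]) ** 2 for k in range(3))
            for frame in trajectory
        ) / n_frames
        fluctuations.append(math.sqrt(mean_square))
    return fluctuations


# Hypothetical trajectory: atom 0 is rigid, atom 1 oscillates by 1 Å along x.
trajectory = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (-1.0, 0.0, 0.0)],
]
per_atom = rmsf(trajectory)
print(per_atom)
```

Atoms with large values mark flexible regions, while rigid atoms stay near zero.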
Regions in red have high flexibility
LID is an ATP binding domain
NMP is an AMP binding domain
Adenylate kinase (AdK), a phosphotransferase enzyme
PMF represents the effective potential that governs the behavior of a system along a collective variable
A collective variable defines the progress of an interaction or molecular reaction
Common collective variables include distances between atoms, bond angles, or dihedral angles.
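A PMF along a collective variable is often estimated from the sampled probability density through $F(x) = -k_B T \ln P(x)$. The sketch below applies that relation to a hypothetical histogram; the slides do not state which estimator was used, so treat the function name, the probabilities, and the choice to shift the minimum to zero as assumptions.

```python
import math

R_KCAL = 0.0019872041  # gas constant in kcal/(mol*K)


def pmf_from_probabilities(probabilities, temperature=300.0):
    """Free energy per bin in kcal/mol, shifted so the lowest bin is zero."""
    kt = R_KCAL * temperature
    # Bins that were never sampled get an infinite free energy.
    free_energies = [
        -kt * math.log(p) if p > 0 else math.inf for p in probabilities
    ]
    lowest = min(free_energies)
    return [f - lowest for f in free_energies]


# Hypothetical probabilities for four bins of a collective variable.
probs = [0.5, 0.25, 0.25, 0.0]
pmf = pmf_from_probabilities(probs)
print([round(f, 3) for f in pmf])
```

Bins with high probability correspond to low free energy. Unsampled bins show up as infinite, which is one reason adequate sampling matters.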
This 1D PMF comes from 1500 ns of roGFP2 simulation data
| System | ΔG [kcal/mol] |
|---|---|
| Reduced | -0.559 |
| Oxidized | -1.329 |
| Cu(I) | -0.282 |
His148 in GFP stabilizes the anionic chromophore
Probability density
Energy
Today
Lecture 15:
Ensembles and atomistic insights
Tuesday
Lecture 16:
Structure-based drug design