Computational Biology
(BIOSC 1540)
Oct 31, 2024
Lecture 17:
Docking and virtual screening
Alchemical simulations compute free energy changes by gradually transforming one molecule into another
Importance in drug discovery: Highly precise, offering detailed insights into binding affinities essential for drug design
Atomistic forces: Computes forces for all atoms in proteins, ligands, cofactors, ions, solvents for millions of structures
Detailed sampling: Captures a wide range of conformations, which adds more dimensions to the calculation
Alchemical parameters: Simulations must be performed at various alchemical parameters
Okay, so how long is this really?
~10,000 CPU hours
(For context, most supercomputers have ~24 cores per $30,000 node.)
Remember: Chemical space is unfathomably large and the role of computation is to virtually test as many compounds as possible
Objective: Directly predict binding affinity from protein and ligand structures with high accuracy and minimal computational resources.
We can carefully simplify our methodology to improve speed with (hopefully) minimal impact to accuracy
What are some ideas?
Avoid sampling all microstates and determine one "optimal" protein-ligand structure
Using this bound structure, predict a "score" that is correlated to binding affinity
This is called docking
Significance of Protein Conformation in Docking
Docking still considers the protein structure, but we only select one
Experimental Methods
Computational Techniques
Extract representative structures using clustering algorithms.
Identify conformations with open or closed binding sites.
Role in Binding: Structured water molecules can mediate interactions between the protein and ligand.
Handling Water in Docking
Inclusion Criteria: Retain water molecules that are conserved across multiple crystal structures.
The binding pocket is the specific region where a ligand interacts with a protein
Accurate identification of binding pockets is essential for successful docking and virtual screening.
Terminology
Protein Surface Characteristics
Alpha Shape Theory: Uses Delaunay triangulation and alpha complexes to define cavities.
Cryptic sites are hidden in the unbound structure and require conformational changes to become apparent.
Strategies
Case Study: Identification of allosteric sites in Hsp90
Search strategies
Identify important degrees of freedom
Scan along each angle with a step size of a N degrees
Remove structures with high strain
How many different conformations would we have in this molecule if we scanned only dihedrals every 45 degrees?
8 dihedrals
1
2
3
4
5
6
7
8
8 angles
8 × 8 × 8 × 8 × 8 × 8 × 8 × 8 = 16,777,216
That's a lot of structures, and many of them will clash!
We almost never do a systematic search in practice without some precautions to combinatorics
Monte Carlo
Steps:
Allows us to sample efficiently
Use pre-generated libraries
Physics-based methods using force-field like methods
Recently, machine learning (e.g., graph neural networks) have been gaining traction
Overview of ERK2:
Role in Disease:
AlphaFold 2 Prediction
Lecture 17:
Docking and virtual screening
Today
Thursday
Lecture 18:
Ligand-based drug design
Objective: Directly predict binding affinity from protein and ligand structures with high accuracy and minimal computational resources.
Why Empirical Data Matters
Thermal shift assays
Data-driven approaches leverage large datasets and machine learning algorithms to make predictions, identify patterns, and generate models without relying solely on physical or chemical principles.
These methods harness the power of available biological data to accelerate research and discovery processes.