EVOLVE
Single and Multi-Objective Genetic Algorithm for Molecular Design
Outline
Introduction to Evolutionary Algorithms (EAs)
Introduction to Single Objective Genetic Algorithms (SOGAs)
- Genetic Representation
- Algorithm Overview
Selection
Crossover
Mutation
Introduction to Multi-Objective Genetic Algorithms (MOGAs)
- Algorithm Overview
Applications
Future Work
Evolutionary Algorithms
Use mechanisms inspired by biological evolution, i.e.
- Reproduction
- Mutation
- Recombination
- Natural Selection
EA Application
Travelling Salesman Problem
Good combinatorial / permutational problem solvers!
Biology
"One general law, leading to the advancement of all organic beings, namely, multiply, vary, let the strongest live and the weakest die."
-Charles Darwin
Single Objective Genetic Algorithm
Each solution contains a "chromosome" which fully defines it in terms of the property to be optimized
- Crossover Operator - Reproduction, Recombination
- Mutation Operator - Mutation
- Selection Operator - Natural Selection
Algorithm Overview
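The selection, crossover and mutation operators combine into one generational loop. The sketch below is a minimal illustration, not the EVOLVE implementation: the function name `run_ga`, the real-valued chromosome, the gene range of -5 to 5, the binary tournament and the elitism step are all assumptions chosen to keep the example short. Fitness is minimised here, as with an energy.

```python
import random

def run_ga(fitness, n_genes, pop_size=30, generations=40,
           p_crossover=0.6, p_mutation=0.3, seed=0):
    """Minimise `fitness` over real-valued chromosomes (illustrative only)."""
    rng = random.Random(seed)
    population = [[rng.uniform(-5.0, 5.0) for _ in range(n_genes)]
                  for _ in range(pop_size)]
    best = min(population, key=fitness)
    history = [fitness(best)]
    for _ in range(generations):
        # Selection: binary tournaments fill the mating pool.
        pool = [min(rng.sample(population, 2), key=fitness)
                for _ in range(pop_size)]
        # Crossover: pair up the pool and exchange genes with a coin flip.
        children = []
        for mum, dad in zip(pool[0::2], pool[1::2]):
            a, b = list(mum), list(dad)
            if rng.random() < p_crossover:
                for i in range(n_genes):
                    if rng.random() < 0.5:
                        a[i], b[i] = b[i], a[i]
            children += [a, b]
        # Mutation: perturb one randomly chosen gene of some children.
        for child in children:
            if rng.random() < p_mutation:
                i = rng.randrange(n_genes)
                child[i] += rng.gauss(0.0, 0.5)
        # Elitism (an assumption of this sketch): keep the best solution so far.
        children[0] = list(best)
        population = children
        best = min(population, key=fitness)
        history.append(fitness(best))
    return best, history
```

For example, `run_ga(lambda x: sum(g * g for g in x), n_genes=3)` returns the best chromosome found and the best fitness per generation; because of the elitism step that history never increases.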
Natural Selection
SELECTION
- Operator designed to select the parents of the next generation of candidate solutions
- Chance of being selected is proportional to fitness
- Creates a "mating pool"
- Essential operator to reach near-global minimum
- Many operators available
Roulette
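A minimal sketch of roulette-wheel selection, where each individual owns a slice of the wheel proportional to its fitness and the wheel is spun once per parent. The name `roulette_select` is hypothetical, and the sketch assumes non-negative fitness values where larger is better; an energy to be minimised would first need to be mapped onto such a scale.

```python
import random

def roulette_select(population, fitnesses, n_parents, rng=random):
    """Draw n_parents individuals with probability proportional to fitness."""
    total = sum(fitnesses)
    if total <= 0:
        raise ValueError("roulette selection needs a positive total fitness")
    pool = []
    for _ in range(n_parents):
        spin = rng.random() * total      # one spin of the wheel per parent
        cumulative = 0.0
        for individual, fitness in zip(population, fitnesses):
            cumulative += fitness
            if spin < cumulative:
                pool.append(individual)
                break
        else:
            pool.append(population[-1])  # guard against floating-point round-off
    return pool
```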
Stochastic Universal Sampling
- Same as roulette wheel, but the wheel is only spun once
- n equally spaced pointers are placed around the wheel
- A random starting position is generated
- Move around the wheel in equidistant steps, sampling at every point visited
- Less bias
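The single-spin procedure can be sketched as follows. The name `sus_select` is hypothetical, and the same assumption as for roulette selection applies: fitness values are non-negative and larger is better.

```python
import random

def sus_select(population, fitnesses, n_parents, rng=random):
    """Stochastic universal sampling: one random offset, equidistant pointers."""
    total = sum(fitnesses)
    if total <= 0:
        raise ValueError("sampling needs a positive total fitness")
    step = total / n_parents
    start = rng.random() * step          # the only random number drawn
    pool = []
    index = 0
    cumulative = fitnesses[0]
    for k in range(n_parents):
        pointer = start + k * step
        # Advance to the individual whose slice of the wheel holds the pointer.
        while pointer >= cumulative and index < len(population) - 1:
            index += 1
            cumulative += fitnesses[index]
        pool.append(population[index])
    return pool
```

With fitnesses 3 and 1 and four parents, the first individual always receives three slots and the second one slot, whatever the starting position. Roulette selection only achieves this split on average.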
Tournament
- Individual solutions compete in duels to enter the mating pool
- Repeated until mating pool filled
- User defines the number of individuals in each fight
- Encourages diversity
- May not give true representation of fitness ranking in mating pool
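A minimal sketch of tournament selection. The name `tournament_select` and the parameter `tournament_size` are hypothetical, and larger fitness is assumed to win; for a minimised energy, `max` would become `min`.

```python
import random

def tournament_select(population, fitnesses, n_parents,
                      tournament_size=2, rng=random):
    """Fill the mating pool with the winners of random tournaments."""
    pool = []
    while len(pool) < n_parents:
        # Pick distinct contenders at random; the fittest one wins.
        contenders = rng.sample(range(len(population)), tournament_size)
        winner = max(contenders, key=lambda i: fitnesses[i])
        pool.append(population[winner])
    return pool
```

Small tournaments give weak individuals a real chance of entering the pool, which preserves diversity. A tournament the size of the whole population always returns the single best individual.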
Truncation
- Solutions in the population are sorted according to fitness (or some other performance criterion)
- Allocate S copies to each of the top N/S individuals, where N is the population size
- Fast convergence
- Problems with diversity
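A minimal sketch of truncation selection, with the hypothetical name `truncation_select` and `copies` standing for S. Larger fitness is assumed to be better, and N is assumed to be divisible by S.

```python
def truncation_select(population, fitnesses, copies):
    """Give `copies` slots to each of the top N/copies individuals."""
    n = len(population)
    ranked = sorted(range(n), key=lambda i: fitnesses[i], reverse=True)
    n_survivors = n // copies
    pool = []
    for i in ranked[:n_survivors]:
        pool.extend([population[i]] * copies)
    return pool
```

Everything below the cut-off is discarded outright, which explains both the fast convergence and the loss of diversity.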
Reproduction - Recombination
Crossover
- Solutions in mating pool perform pairwise recombination/reproduction
- Essential operator to reach near-global minimum
- A number of operators available
- Recommended crossover probability = 50-70%
Single / Two Point
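A minimal sketch of both operators on list chromosomes of equal length. The function names are hypothetical. The single-point version needs at least two genes and the two-point version at least three.

```python
import random

def one_point_crossover(parent_a, parent_b, rng=random):
    """Swap the tails of two chromosomes after one random cut point."""
    point = rng.randint(1, len(parent_a) - 1)   # cut strictly inside the chromosome
    child_a = parent_a[:point] + parent_b[point:]
    child_b = parent_b[:point] + parent_a[point:]
    return child_a, child_b

def two_point_crossover(parent_a, parent_b, rng=random):
    """Swap the segment between two distinct random cut points."""
    lo, hi = sorted(rng.sample(range(1, len(parent_a)), 2))
    child_a = parent_a[:lo] + parent_b[lo:hi] + parent_a[hi:]
    child_b = parent_b[:lo] + parent_a[lo:hi] + parent_b[hi:]
    return child_a, child_b
```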
Uniform
- A coin is flipped for each gene
- The probability of per-gene exchange is chosen in the input file
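A minimal sketch of uniform crossover. The name `uniform_crossover` and the parameter `swap_probability` (standing in for the per-gene exchange probability read from the input file) are hypothetical.

```python
import random

def uniform_crossover(parent_a, parent_b, swap_probability=0.5, rng=random):
    """Exchange each gene independently with the given probability."""
    child_a, child_b = list(parent_a), list(parent_b)
    for i in range(len(child_a)):
        if rng.random() < swap_probability:     # the per-gene coin flip
            child_a[i], child_b[i] = child_b[i], child_a[i]
    return child_a, child_b
```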
Simulated Binary
- Polynomial distribution used
- "Width" of envelope defines how close children are to parents
- "eta" set in input file
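A minimal sketch of simulated binary crossover in its commonly published form, which may differ in detail from the variant used in EVOLVE. The name `sbx_crossover` is hypothetical, every gene is recombined, and gene bounds are not enforced. A spread factor beta is drawn from a polynomial distribution controlled by eta, and a larger eta keeps the children closer to their parents.

```python
import random

def sbx_crossover(parent_a, parent_b, eta=2.0, rng=random):
    """Simulated binary crossover on real-valued genes (no bounds handling)."""
    child_a, child_b = [], []
    for x, y in zip(parent_a, parent_b):
        u = rng.random()
        # Spread factor beta follows a polynomial distribution set by eta.
        if u <= 0.5:
            beta = (2.0 * u) ** (1.0 / (eta + 1.0))
        else:
            beta = (1.0 / (2.0 * (1.0 - u))) ** (1.0 / (eta + 1.0))
        child_a.append(0.5 * ((1.0 + beta) * x + (1.0 - beta) * y))
        child_b.append(0.5 * ((1.0 - beta) * x + (1.0 + beta) * y))
    return child_a, child_b
```

For each gene the two children are placed symmetrically about the parents' midpoint, so the children's mean equals the parents' mean.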
Self-Adapting Simulated Binary
- Same as simulated binary crossover
- eta is modified each generation depending on the children's performance
- Found to be very good in a number of complex fitness landscapes
- Work in progress!
MUTATION
- Operator to produce genetic mutations in the chromosome to modify a solution
- Necessary to escape local minima
- Affects convergence
- Recommended (total) probability = 20-50%
- A number of operators available
Selective
Randomly select one of the genes and replace it with a random value sampled uniformly within that gene's specified range
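A minimal sketch of this operator. The name `selective_mutation` is hypothetical, and `gene_ranges` is assumed to be a list of (low, high) pairs, one per gene.

```python
import random

def selective_mutation(chromosome, gene_ranges, rng=random):
    """Replace one randomly chosen gene with a uniform draw from its range."""
    mutant = list(chromosome)               # leave the parent untouched
    i = rng.randrange(len(mutant))
    low, high = gene_ranges[i]
    mutant[i] = rng.uniform(low, high)
    return mutant
```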
Genewise
To each gene, add a value sampled from a normal distribution with a user-specified standard deviation
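A minimal sketch of this operator. The name `genewise_mutation` is hypothetical, `sigma` stands for the user-specified standard deviation, and the normal distribution is assumed to be centred on zero. Gene ranges are not enforced here.

```python
import random

def genewise_mutation(chromosome, sigma, rng=random):
    """Add zero-mean Gaussian noise with standard deviation sigma to every gene."""
    return [gene + rng.gauss(0.0, sigma) for gene in chromosome]
```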
Polynomial
Based on simulated binary crossover
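A minimal sketch of polynomial mutation in its basic published form, which may differ from the variant used in EVOLVE. The name `polynomial_mutation` is hypothetical. As assumptions of this sketch, every gene is perturbed and results are clipped to the gene range. As in simulated binary crossover, a larger eta keeps the mutant closer to its parent.

```python
import random

def polynomial_mutation(chromosome, gene_ranges, eta=20.0, rng=random):
    """Perturb each gene by a polynomially distributed fraction of its range."""
    mutant = []
    for gene, (low, high) in zip(chromosome, gene_ranges):
        u = rng.random()
        # delta lies in [-1, 1) and is concentrated near 0 for large eta.
        if u < 0.5:
            delta = (2.0 * u) ** (1.0 / (eta + 1.0)) - 1.0
        else:
            delta = 1.0 - (2.0 * (1.0 - u)) ** (1.0 / (eta + 1.0))
        value = gene + delta * (high - low)
        mutant.append(min(max(value, low), high))   # clip to the gene range
    return mutant
```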
MULTI-OBJECTIVE GENETIC ALGORITHM
- With more than one objective, neither of two solutions need be better than the other
- Generates "Pareto fronts" of solutions, with each front having an associated "rank"
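The notion of "better" that underlies the fronts is Pareto dominance. A minimal sketch, assuming every objective is minimised and using the hypothetical name `dominates`:

```python
def dominates(a, b):
    """True if objective vector a Pareto-dominates b (all objectives minimised)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better
```

For the objective vectors (1, 3) and (2, 2), neither dominates the other, so both can sit on the same front.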
MOGA - Pareto Front
MOGA ALGORITHM
Decide which solutions should reproduce, and which should enter the next generation
MOGA ALGORITHM
Solution: use non-dominated sorting and crowding comparison
Procedure:
- Sort the organisms into Pareto fronts
- Crowding distance is a metric of how crowded each solution is within a particular front
- Rank takes precedence, followed by crowding comparison
- The algorithm prefers less crowded solutions to ensure diversity
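The procedure above can be sketched as follows, in the style of NSGA-II. All function names are hypothetical, every objective is assumed to be minimised, and the simple repeated-scan sort is chosen for readability rather than speed.

```python
def dominates(a, b):
    """True if a is no worse than b everywhere and strictly better somewhere."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))

def sort_into_fronts(objectives):
    """Return lists of indices; front 0 is non-dominated, front 1 is next, etc."""
    remaining = set(range(len(objectives)))
    fronts = []
    while remaining:
        front = [i for i in remaining
                 if not any(dominates(objectives[j], objectives[i])
                            for j in remaining)]
        fronts.append(sorted(front))
        remaining -= set(front)
    return fronts

def crowding_distance(objectives, front):
    """Sum, over objectives, of the normalised gap between each solution's neighbours."""
    distance = {i: 0.0 for i in front}
    for m in range(len(objectives[front[0]])):
        ordered = sorted(front, key=lambda i: objectives[i][m])
        span = objectives[ordered[-1]][m] - objectives[ordered[0]][m]
        # Boundary solutions are always kept: give them infinite distance.
        distance[ordered[0]] = distance[ordered[-1]] = float("inf")
        if span == 0:
            continue
        for prev, cur, nxt in zip(ordered, ordered[1:], ordered[2:]):
            distance[cur] += (objectives[nxt][m] - objectives[prev][m]) / span
    return distance

def crowded_better(i, j, rank, distance):
    """Lower rank wins; within the same rank the less crowded solution wins."""
    return rank[i] < rank[j] or (rank[i] == rank[j] and distance[i] > distance[j])
```

A larger crowding distance means fewer close neighbours on the front, so preferring it spreads the surviving solutions along the front.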
Comparison against Existing Code
- Unified and generalized molecular editing through OpenBabel
- Vastly improved computation time (3-7 days vs 5 months, project dependent)
- Modular, bounded fitness functions
- Greater control over mutation and crossover operators
- A number of bug fixes
- Better memory management
- Now possible to link in other optimization techniques, e.g. particle swarm optimization, in "one-pot"
Test Case / Applications
- Optimize a small poly-ALA protein (20 AAs) against a complete rotamer library (100+ structures) for all 20 amino acids at different dielectrics
- Fitness Function - Classical MM Minimized Energy
Results
- Lowest energy structure found thus far: pure poly-TRP protein
- Fitness: -37.6 kcal/mol
- Sequence: W04 W04 W04 W03 W04 W04 W05 W04
- Many other low energy structures found
Future Applications / Developments
Code
- Scripting interface for user-defined fitness functions
- Structures for approximate fitness functions
- Parallelisation for highly distributed / GPU architectures
Near Future Applications
- Optimize the molecular structure of an organic dye for use in solar cells
EVOLVE - Single/Multi-objective Genetic Algorithm for Molecular Design
By Nick Browning