September 1, 2022
aalexmmaldonado
Keith Group
Nuclear power
Molten salts
Nuclear Reactor by Olena Panasovska from Noun Project
Batteries
Electrolyte by M. Oki Orlando from Noun Project
Charge carriers and electrolytes
Park, C.; et al. J. Power Sources 2018, 373, 70-78. DOI: 10.1016/j.jpowsour.2017.10.081
Lv, X.; et al. Chem. Phys. Lett. 2018, 706, 237-242. DOI: 10.1016/j.cplett.2018.06.005
Fuel production
Solvation plays an important role
Catalysts
Ma, C.; et al. ACS Catal. 2012, 2, 1500-1506. DOI: 10.1021/cs300350b
Computational modeling
Cost
AIMD
Classical MD
Implicit/explicit
Implicit
Confidence
Cost
Screening approach
Promising candidates
Search space
without experimental data
Solvation treatments
Confidence requires explicit (MD) methods
Goal
Lv, X.; et al. Chem. Phys. Lett. 2018, 706, 237-242. DOI: 10.1016/j.cplett.2018.06.005
Let's model a molten salt
DFT
MP2
CCSD(T)
Pro: Fast
Con: Parameters
Classical potential
Quantum chemistry
Pro: Accurate
Con: Cost
We need a new method
Confidence
2x atoms
128x cost
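The 128x figure is what formal O(N^7) scaling of CCSD(T) gives when the system size doubles (2^7 = 128). The snippet below is a minimal sketch assuming idealized power-law scaling; the exponents are the usual textbook values, and the dictionary and function names are illustrative only.

```python
# Formal scaling exponents (idealized textbook values; real codes deviate
# depending on implementation, basis set, and system size).
FORMAL_SCALING = {"DFT": 3, "MP2": 5, "CCSD(T)": 7}


def cost_multiplier(size_factor: float, exponent: int) -> float:
    """Relative cost increase when the system grows by `size_factor`."""
    return size_factor ** exponent


for method, exponent in FORMAL_SCALING.items():
    print(f"{method}: 2x atoms -> {cost_multiplier(2, exponent):.0f}x cost")
```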
Structure
ML potential
Energy and forces
Quantum chemistry
Calculate total energies with QC
Training a typical ML potential
Sample tens of thousands of configurations
Approximation: atomic contributions can reproduce total energy
Examples: DeePMD, GAP, SchNet, PhysNet, ANI, . . .
Known
Learned
with a local descriptor
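The atomic-contribution approximation can be written compactly as follows. The symbols are assumptions introduced here for illustration: N is the number of atoms, D_i is the local descriptor of atom i, and E_i is the learned atomic energy.

```latex
E_{\text{total}}(\mathbf{R}) \approx \sum_{i=1}^{N} E_i\left(\mathbf{D}_i\right)
```

Only the total energy on the left is known from quantum chemistry; the atomic terms on the right are learned.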
Training with a global descriptor
Local
Global
Encodes each atom
Encodes entire structure
increased data efficiency
Many descriptors and parameters
Single descriptor
Training on forces provides more information about the geometry and energy relationship
Chmiela, S.; et al. Sci. Adv. 2017, 3 (5), e1603015. DOI: 10.1126/sciadv.1603015
Gradient-domain machine learning (GDML)
Better interpolation
Global descriptor + Training on forces = requires 1 000 structures instead of 10 000+
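Forces are the negative gradient of the energy, so each structure supplies 3N force components in addition to one energy. The sketch below illustrates this with a toy harmonic bond (a hypothetical stand-in, not the GDML model) and checks the analytic forces against finite differences.

```python
import numpy as np


def energy(positions: np.ndarray, k: float = 1.0, r0: float = 1.0) -> float:
    """Toy harmonic bond energy between two atoms (illustrative only)."""
    r = np.linalg.norm(positions[1] - positions[0])
    return 0.5 * k * (r - r0) ** 2


def forces(positions: np.ndarray, k: float = 1.0, r0: float = 1.0) -> np.ndarray:
    """Analytic forces, F = -dE/dR, for the toy potential."""
    bond = positions[1] - positions[0]
    r = np.linalg.norm(bond)
    force_on_atom_1 = -k * (r - r0) * bond / r
    return np.array([-force_on_atom_1, force_on_atom_1])


def numerical_forces(positions: np.ndarray, h: float = 1e-6) -> np.ndarray:
    """Central finite-difference estimate of -dE/dR."""
    result = np.zeros_like(positions)
    for idx in np.ndindex(positions.shape):
        plus, minus = positions.copy(), positions.copy()
        plus[idx] += h
        minus[idx] -= h
        result[idx] = -(energy(plus) - energy(minus)) / (2 * h)
    return result


positions = np.array([[0.0, 0.0, 0.0], [1.3, 0.2, -0.1]])
assert np.allclose(forces(positions), numerical_forces(positions), atol=1e-6)
print("labels per structure: 1 energy,", positions.size, "force components")
```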
System size is still a limiting factor
Global
Fewer structures enable higher levels of theory
CCSD(T)
CCSD(T)
Local
No
Tons of sampling
CCSD(T)
What we want
What we can afford
Descriptor
Size transferable?
CCSD(T)
How can we make GDML potentials size transferable?
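Many-body expansion of a trimer | Energy |
---|---|
1-body (monomers) | -76.31270, -76.31251, -76.31273 |
1-body sum | -228.93794 (remaining: -0.02504) |
2-body (dimers) | -0.00831, -0.00705, -0.00700 |
1+2 body | -228.96031 (remaining: -0.00267) |
3-body | -0.00267 |
Total | -228.96298 |
Figure legend: Add energy, Remove energy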
All energies are in Hartrees
MBE: the total energy of a system is equal to the sum of all n-body interactions
Truncate
Unlocks size transferability for highly accurate methods
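The trimer numbers from the many-body expansion example can be checked directly. This is a minimal sketch; the variable and function names are illustrative, and the energies (in Hartree) are copied from the slide, so results agree with the displayed values only to within rounding of the last digit.

```python
# Values in Hartree, copied from the trimer example.
one_body = [-76.31270, -76.31251, -76.31273]  # monomer energies
two_body = [-0.00831, -0.00705, -0.00700]     # pairwise (2-body) contributions
total_energy = -228.96298                     # full trimer calculation


def truncated_mbe(contributions_by_order: dict, max_order: int) -> float:
    """Sum all n-body contributions up to and including `max_order`."""
    return sum(
        sum(terms)
        for order, terms in contributions_by_order.items()
        if order <= max_order
    )


contributions = {1: one_body, 2: two_body}
e_1body = truncated_mbe(contributions, max_order=1)
e_12body = truncated_mbe(contributions, max_order=2)
three_body = total_energy - e_12body  # what the 3-body term must supply

print(f"1-body sum:        {e_1body:.5f}")
print(f"1+2-body sum:      {e_12body:.5f}")
print(f"3-body remainder:  {three_body:.5f}")
```

Truncating after the 2-body term leaves only a few millihartree unaccounted for, which is why a truncated expansion is attractive.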
CCSD(T)
Training a many-body GDML (mbGDML) potential
Sample a thousand configurations
Calculate n-body energy (+ forces) with QC
Calculate total energies with QC
Known
Known
Learned
Reproduce physical n-body energies
Approximation: atomic contributions can reproduce total energy
Incorporates more physics into our ML potentials
Sample tens of thousands of configurations
Unique opportunity with GDML accuracy and efficiency
If successful
Modeling three common solvents
Water (H2O)
Acetonitrile (MeCN)
Methanol (MeOH)
Training set
1 000 structures
(instead of 10 000+)
Sampling
n-body structures from GFN2-xTB simulations
Level of theory
MP2/def2-TZVP
ORCA v4.2.0
Isomer rankings
Which tetramer (4mer) has the lowest energy (i.e., global minimum)?
Requires accurate relative energies
Isomer #1
Isomer #2
Isomer #3
Tetramer rankings
System | Energy RMSE [kcal/mol] | Force RMSE [kcal/(mol Å)] |
---|---|---|
(H2O)4 | 0.82 (0.20) | 0.78 (0.07) |
(MeCN)4 | 0.29 (0.07) | 0.16 (0.01) |
(MeOH)4 | 1.37 (0.34) | 1.49 (0.06) |
Values in parentheses: energy per monomer, force per atom
Many-body GDML accurately captures relative energies
Info: We desire methods with less than 2 kcal/mol error
Does mbGDML scale to larger systems?
System | Energy Error [kcal/mol] | Force RMSE [kcal/(mol Å)] |
---|---|---|
(H2O)16 | 4.01 (0.25) | 1.12 (0.02) |
(MeCN)16 | 0.28 (0.02) | 0.35 (0.004) |
(MeOH)16 | 5.56 (0.35) | 1.79 (0.02) |
Consistent normalized errors
Size transferable
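The normalized errors are the total errors divided by the number of monomers. The sketch below recomputes them from the tabulated tetramer and 16-mer energy errors; the dictionary layout and function name are illustrative, and note that the tetramer values are RMSEs while the 16-mer values are reported as errors.

```python
# Energy errors in kcal/mol, copied from the two tables (cluster size -> error).
energy_error = {
    "H2O": {4: 0.82, 16: 4.01},
    "MeCN": {4: 0.29, 16: 0.28},
    "MeOH": {4: 1.37, 16: 5.56},
}


def per_monomer(error: float, n_monomers: int) -> float:
    """Normalize a total energy error by the number of monomers."""
    return error / n_monomers


for solvent, by_size in energy_error.items():
    normalized = {n: per_monomer(err, n) for n, err in by_size.items()}
    summary = ", ".join(f"{n}-mer: {value:.2f}" for n, value in normalized.items())
    print(f"{solvent}: {summary} kcal/mol per monomer")
```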
Radial distribution function (RDF)
What we want
Plot: g(r) versus r
Tells us if we are getting the correct liquid structure
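As a rough illustration of how g(r) is obtained, the sketch below histograms pair distances for a single frame of identical particles in a cubic periodic box and normalizes by the ideal-gas expectation. The function name, the cubic box, and the single-species setup are assumptions for illustration, not the analysis used for the simulations here.

```python
import numpy as np


def radial_distribution(positions, box_length, n_bins=50, r_max=None):
    """g(r) for one frame of identical particles in a cubic periodic box."""
    n = len(positions)
    r_max = box_length / 2 if r_max is None else r_max

    # Pair displacement vectors with the minimum-image convention.
    delta = positions[:, None, :] - positions[None, :, :]
    delta -= box_length * np.round(delta / box_length)
    distances = np.linalg.norm(delta, axis=-1)[np.triu_indices(n, k=1)]

    counts, edges = np.histogram(distances, bins=n_bins, range=(0.0, r_max))
    shell_volumes = 4.0 / 3.0 * np.pi * (edges[1:] ** 3 - edges[:-1] ** 3)

    # Expected number of unique pairs per shell for uniformly random particles.
    ideal_counts = 0.5 * n * (n - 1) * shell_volumes / box_length**3
    r = 0.5 * (edges[1:] + edges[:-1])
    return r, counts / ideal_counts


# Uniformly random points have no structure, so g(r) fluctuates around 1.
rng = np.random.default_rng(0)
ideal_gas = rng.uniform(0.0, 10.0, size=(500, 3))
r, g = radial_distribution(ideal_gas, box_length=10.0)
print(f"mean g(r) over the outer bins: {g[-10:].mean():.2f}")
```

For a liquid, peaks in g(r) mark the solvation shells, which is what gets compared against reference data.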
Many-body GDML accurately captures liquid structure
137 H2O molecules
67 MeCN molecules
61 MeOH molecules
Reminder: We have only trained on clusters with up to three molecules
Time for 10 ps MeCN simulation:
mbGDML 19 hours
MP2 23 762 years
20 ps NVT MD simulations; 1 fs time step; Berendsen thermostat at 298 K
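For context on the simulation settings, a Berendsen thermostat rescales velocities each step by a factor that nudges the instantaneous temperature toward the target. The sketch below uses the standard Berendsen scaling formula; the coupling time `tau_fs` is an assumed placeholder because it is not stated here, and the function name is illustrative.

```python
import math

TIME_STEP_FS = 1.0        # 1 fs time step
SIMULATION_PS = 20.0      # 20 ps NVT run
TARGET_TEMPERATURE = 298.0  # K


def berendsen_scale(current_temperature: float,
                    target_temperature: float = TARGET_TEMPERATURE,
                    dt_fs: float = TIME_STEP_FS,
                    tau_fs: float = 100.0) -> float:
    """Velocity scaling factor; tau_fs is an assumed coupling time."""
    ratio = target_temperature / current_temperature
    return math.sqrt(1.0 + (dt_fs / tau_fs) * (ratio - 1.0))


n_steps = round(SIMULATION_PS * 1000.0 / TIME_STEP_FS)
print(f"number of MD steps: {n_steps}")
print(f"scale factor at 350 K: {berendsen_scale(350.0):.5f}")  # below 1, cools
print(f"scale factor at 250 K: {berendsen_scale(250.0):.5f}")  # above 1, heats
```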
Explicit solvent modeling without experimental data
Comparison of classical, ab initio, ML, and mbGDML potentials on training, speed, accuracy, and scaling (rated from poor to excellent)
InSiDe ChEMS group
Dr. John Keith
Dr. Yasemin Basdogan
Dr. Charles Griego
Dr. Emily Eikey
Lingyan Zhao
Barbaro Zulueta
Chinmay Mhatre
Dominick Filonowich
Funding
Office of the Provost