Dashing Through Search Spaces
in the Physical Sciences
PHD DEFENCE
University of Oslo — 7 October 2022
Jeriek Van den Abeele
16 orders of magnitude separate the Planck scale and the weak scale!
The Standard Model is incomplete.
- Extreme sensitivity to UV contributions indicates a real problem, manifest in radiative corrections to the Higgs mass
- One solution, supersymmetry, connects fermions and bosons — and introduces new particles and opportunities to explain more
- Upgrading supersymmetry to a local symmetry leads to gravitinos (supergravity)

[CERN/C. David]
Global fits address the need for a consistent comparison of BSM theories to all relevant experimental data
Challenge:
Scanning increasingly high-dimensional parameter spaces with varying phenomenology
Exploration of a combined likelihood function:
$\mathcal{L} = \mathcal{L}_{\rm collider} \times \mathcal{L}_{\rm Higgs} \times \mathcal{L}_{\rm DM} \times \mathcal{L}_{\rm EWPO} \times \mathcal{L}_{\rm flavour} \times \ldots$
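In practice, a product of likelihoods like this is evaluated as a sum of log-likelihood terms, with any invalid component vetoing the parameter point. A minimal sketch (the component functions and observable names here are hypothetical, not GAMBIT's interface):

```python
import math

def combined_loglike(params, components):
    """Sum log-likelihood contributions; one invalid component vetoes the point."""
    total = 0.0
    for loglike in components:  # e.g. collider, Higgs, DM, EWPO, flavour terms
        ll = loglike(params)
        if not math.isfinite(ll):
            return -math.inf  # point excluded by this constraint
        total += ll
    return total

# Toy components: Gaussian log-likelihoods for two hypothetical observables
gauss = lambda x, mu, sigma: -0.5 * ((x - mu) / sigma) ** 2
components = [lambda p: gauss(p["mh"], 125.1, 0.2),
              lambda p: gauss(p["omega"], 0.12, 0.001)]

ll = combined_loglike({"mh": 125.0, "omega": 0.119}, components)
```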
Supersymmetry is turning out to be hard to find ...

High-dimensional search spaces are non-intuitive
Most of the volume lies in the extremities!
Adaptive sampling techniques are essential for search space exploration: differential evolution, nested sampling, genetic algorithms, ...
Imagine scanning the central 90% of each parameter range really well, at great cost.
In a 7-dimensional parameter space, you have still only covered 47.8% of the volume.
In a 19-dimensional parameter space, only 13.5%. And even a grid with just 2 points per dimension takes $2^{19} = 524{,}288$ evaluations.
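These volume fractions follow directly from raising the per-axis coverage to the number of dimensions; a quick check:

```python
# Fraction of a d-dimensional hypercube covered by the central 90% of each axis
def central_coverage(d, frac=0.9):
    return frac ** d

coverage_7d = central_coverage(7)    # ~ 0.478
coverage_19d = central_coverage(19)  # ~ 0.135
grid_points = 2 ** 19                # 524288 evaluations for 2 points/dimension
```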
Quick cross-section prediction with Gaussian processes
based on work with A. Buckley, A. Kvellestad, A. Raklev, P. Scott, J. V. Sparre, and I. A. Vazquez-Holm
Global fits need quick, but sufficiently accurate theory predictions
BSM scans today easily require $\sim 10^7$ samples or more.
Higher-order BSM production cross-sections and theoretical uncertainties make a significant difference!

[GAMBIT, 1705.07919]
CMSSM

[hep-ph/9610490]

Existing higher-order evaluation tools are insufficient for large MSSM scans
- Prospino/MadGraph: full calculation, minutes/hours per point
- N(N)LL-fast: fast grid interpolation, but only degenerate squark masses
$pp \to \tilde{g}\tilde{g},\ \tilde{g}\tilde{q}_i,\ \tilde{q}_i\tilde{q}_j,\ \tilde{q}_i\tilde{q}_j^*,\ \tilde{b}_i\tilde{b}_i^*,\ \tilde{t}_i\tilde{t}_i^*$ at $\sqrt{s} = 13$ TeV
xsec 1.0 performs Gaussian process regression for all strong SUSY cross-sections in the MSSM-24 for the LHC

- New Python tool for NLO cross-section predictions within seconds
A. Buckley, A. Kvellestad, A. Raklev, P. Scott, J. V. Sparre, JVDA, I. A. Vazquez-Holm
- Pointwise estimates of PDF, $\alpha_s$ and scale uncertainties, in addition to the subdominant regression uncertainty
- Future expansions: other processes, more CoM energies, higher-order corrections, ...
- Trained with Prospino results
The Bayesian Way: quantifying beliefs with probability
Start from a prior distribution over all functions with the estimated smoothness; conditioning on data yields a posterior distribution over functions with an updated mean m(x).
After optimising correlation length scales, we obtain a posterior by conditioning on known data.

Make posterior predictions at new points by computing correlations to known points.
Posterior predictive distributions are Gaussian too!


[Liu+, 1806.00720]
Distributed Gaussian Processes
For standard GP regression, training scales as $O(n^3)$, prediction as $O(n^2)$.
Divide-and-conquer approach for dealing with large datasets:
- Make local predictions with smaller data subsets
- Compute a weighted average of these predictions
The exact weighting procedure is important to ensure
- Smooth final predictions
- Valid regression uncertainties
"Generalized Robust Bayesian Committee Machine"
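The weighted averaging can be sketched with the robust-BCM aggregation rule, where each expert's weight is its entropy gain over the prior (a simplified sketch of the committee-machine idea, not the exact generalized variant used here; variable names are illustrative):

```python
import numpy as np

def rbcm_combine(mus, vars_, prior_var):
    """Robust Bayesian Committee Machine aggregation of M expert predictions.

    mus, vars_: each expert's posterior mean/variance at one test point.
    prior_var:  GP prior variance at that point.
    Weights beta_k: differential entropy between prior and expert posterior,
    so confident experts (small variance) contribute more.
    """
    mus = np.asarray(mus, float)
    vars_ = np.asarray(vars_, float)
    beta = 0.5 * np.log(prior_var / vars_)
    prec = np.sum(beta / vars_) + (1.0 - np.sum(beta)) / prior_var
    var = 1.0 / prec
    mu = var * np.sum(beta * mus / vars_)
    return mu, var

mu, var = rbcm_combine([1.0, 1.0], [0.5, 0.5], prior_var=1.0)
```

The correction term `(1 - sum(beta)) / prior_var` is what keeps the combined variance valid when experts are uninformative, falling back to the prior.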
GP regularisation
Numerical errors may arise in the inversion of the covariance matrix, leading to negative predictive variances.
- Add a white-noise component to the kernel
- Can compute the minimal value that reduces the condition number sufficiently
- Can model the discrepancy between the GP model and the latent function, e.g., regarding assumptions of stationarity and differentiability
[Mohammadi+, 1602.00853]
- Avoid a large S/N ratio by adding a likelihood penalty term guiding the hyperparameter optimisation
- Nearly noiseless data is problematic: artificially increasing the noise level may be necessary, degrading the model a bit
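A common heuristic for the white-noise fix is to add the smallest "jitter" to the diagonal that makes the Cholesky factorisation succeed (a generic sketch of this standard trick, not the specific minimal-value computation of the paper):

```python
import numpy as np

def cholesky_with_jitter(K, max_tries=8, jitter0=1e-10):
    """Cholesky-factorise a covariance matrix, adding the smallest
    white-noise jitter (relative to the mean diagonal) that makes it
    numerically positive definite."""
    jitter = jitter0 * np.mean(np.diag(K))
    for _ in range(max_tries):
        try:
            return np.linalg.cholesky(K + jitter * np.eye(len(K)))
        except np.linalg.LinAlgError:
            jitter *= 10.0  # escalate until factorisation succeeds
    raise np.linalg.LinAlgError("matrix not positive definite even with jitter")

# A singular (rank-1) covariance matrix: plain Cholesky would fail
L = cholesky_with_jitter(np.ones((3, 3)))
```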
Gluino-gluino production example

Gluino-squark


Squark-squark

Goal
Fast estimate of SUSY (strong) production cross-sections at NLO, and uncertainties from
- regression itself
- renormalisation scale
- PDF variation
- $\alpha_s$ variation
Processes
$pp \to \tilde{g}\tilde{g},\ \tilde{g}\tilde{q}_i,\ \tilde{q}_i\tilde{q}_j,\ \tilde{q}_i\tilde{q}_j^*,\ \tilde{b}_i\tilde{b}_i^*,\ \tilde{t}_i\tilde{t}_i^*$ at $\sqrt{s} = 13$ TeV
Method
Pre-trained, distributed Gaussian processes
Interface
Python tool with command-line interface

Scanning for gravitinos:
Gravitino dark matter despite high reheating temperatures
based on work with J. Heisig, J. Kersten and I. Strümke
Gravitino trouble
Given R-parity, the LSP is stable and a dark matter candidate.
In a gravitino LSP scenario:
- Planck-suppressed widths make the NLSP long-lived and a danger to BBN
- Thermal leptogenesis needs $T_R \gtrsim 10^9$ GeV to match the baryon asymmetry
- Now, for an MSSM NLSP, thermal freeze-out (not $T_R$!) determines the abundance
- $\Omega^{\rm th}_{\rm NLSP}$ is controlled by MSSM parameters; if it is low, the BBN impact is minimal!
The heavy Higgs funnel
Parameter region where a neutralino NLSP dominantly annihilates via resonant heavy Higgs bosons: $2 m_{\tilde{\chi}_1^0} \approx m_{H^0/A^0}$

- Requires a wino-higgsino NLSP
- Opens a window facilitating a tiny NLSP freeze-out abundance
- Avoids injecting too much energy into BBN
- Minimises the non-thermal gravitino abundance contribution from NLSP decays
Guided scan strategy
Construct a likelihood $\mathcal{L}_{\rm scan} = \mathcal{L}_{\rm relic\ density} \times \mathcal{L}_{\rm BBN} \times \mathcal{L}_{\rm collider} \times \mathcal{L}_{T_R}^{\rm fake}$.
Nested sampling with MultiNest to examine region with highest Lscan:
- Focusing on funnel region and guided towards high TR
- 7 parameters varied in the scan: $T_R, m_{\tilde{G}}, M_1, M_2, M_3, \mu, A_0$
- Fixed $\tan\beta = 10$ and $m_{\rm scalars} = 15$ TeV
External dependencies:
- SOFTSUSY (spectrum generation)
- micrOMEGAs (thermal NLSP abundance, chargino production cross-sections)
- SModelS, CheckMATE [ROOT, Delphes, MadGraph, Pythia, HepMC] (collider checks)
Preliminary results



Scanning for gravitinos:
Light gravitinos hiding at colliders
based on work with the GAMBIT Collaboration
SUSY with a light gravitino: LHC impact
EWMSSM: MSSM with only the electroweakinos ($\tilde{\chi}_i^0, \tilde{\chi}_i^\pm$) not decoupled
GEWMSSM: EWMSSM + nearly massless gravitino LSP
- The lightest EWino can decay, changing the collider phenomenology
- 4D parameter space: $M_1, M_2, \mu$ and $\tan\beta$
- Gravitino mass fixed to 1 eV, so the lightest EWino decays promptly
- Scan with ColliderBit, using differential evolution (Diver)



Profile likelihood ratio shows preference for higgsinos near 200 GeV
(tiny excesses in searches for MET+leptons/jets)
Preliminary results

Profile likelihood ratio, with likelihood capped at SM expectation (no signal)
- Besides higgsinos, also light/lonely bino NLSP (low production cross-section)
- Large part of parameter space excluded (mostly due to photon+MET searches)
Illuminating molecular optimisation
based on work with J. Verhellen
Developing new drugs is an expensive process, typically taking 10 to 15 years.
Simple rules of thumb provide only limited guidance in small-molecule drug design.
AI techniques, leveraging increased computational power and data availability, promise to speed it up.
Graph-based Elite Patch Illumination (GB-EPI) is a new illumination algorithm, based on MAP-Elites from soft robot design.

Traditional graph-based genetic algorithm
Frequent stagnation!


Graph-based Elite Patch Illumination
Explicitly enforcing diversity in chosen feature space!
GB-EPI illuminates search spaces: it reveals how interesting features affect performance, and finds optima in each region


Benchmarks show that the quality-diversity approach boosts speed and success rate
Closing remarks
- Supersymmetry is not dead.
- With increasing data and processing power, AI techniques can accelerate search space exploration.
- Looking beyond field boundaries can be fun!
Thank you!



Backup slides
Let's make some draws from this prior distribution.
At each input point, we obtain a distribution of possible function values (prior to looking at data).
A Gaussian process sets up an infinite number of correlated Gaussians, one at each parameter point.
GP regularisation
Sometimes, a curious problem arises: negative predictive variances!
It is due to numerical errors when computing the inverse of the covariance matrix K. When K contains many training points, there is a good chance that some of them are similar: nearly equal columns make K ill-conditioned. One or more eigenvalues $\lambda_i$ are close to zero, and K can no longer be inverted reliably.
The number of significant digits lost is roughly the $\log_{10}$ of the condition number $\kappa$. This becomes problematic when $\kappa \gtrsim 10^8$. In the worst-case scenario, $\kappa$ grows with the number of points and the squared signal-to-noise ratio, $\kappa \sim n\,(\sigma_f/\sigma_n)^2$.
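The digits-lost rule of thumb is easy to see numerically: a noiseless squared-exponential kernel over closely spaced points is badly conditioned, and a small white-noise term tames it (an illustrative sketch; the grid and length scale are arbitrary choices):

```python
import numpy as np

# 40 closely spaced training points with a noiseless SE kernel: ill-conditioned K
x = np.linspace(0, 1, 40)
K = np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2 / 0.5 ** 2)
kappa = np.linalg.cond(K)          # digits lost in inversion ~ log10(kappa)

K_reg = K + 1e-6 * np.eye(len(x))  # white-noise regularisation of the diagonal
kappa_reg = np.linalg.cond(K_reg)  # condition number drops dramatically
```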
Squark-antisquark

Different kernels lead to different function spaces to marginalise over.
- Squared Exponential kernel
- Matérn-3/2 kernel
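The two kernels shown differ in the smoothness of the functions they generate: the squared exponential gives infinitely differentiable draws, Matérn-3/2 only once-differentiable ones. Their standard forms, as functions of the distance r:

```python
import numpy as np

def squared_exponential(r, ell):
    # Infinitely differentiable sample functions: very smooth
    return np.exp(-0.5 * (r / ell) ** 2)

def matern32(r, ell):
    # Once-differentiable sample functions: more tolerant of rough targets
    a = np.sqrt(3.0) * np.abs(r) / ell
    return (1.0 + a) * np.exp(-a)
```

Both equal 1 at zero distance and decay monotonically; the Matérn-3/2 kernel has heavier tails and a kink-free but less smooth prior.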
Workflow (XSEC)
- Generating data: random sampling → SUSY spectrum → cross-sections
- Training GPs: compute covariances between training points, optimise kernel hyperparameters
- GP predictions: input parameters → linear algebra → cross-section estimates
Training scales as $O(n^3)$, prediction as $O(n^2)$

A balancing act
Random sampling with different priors, directly in mass space: balancing evaluation speed against sample coverage, given the need to cover a large parameter space.
Distributed Gaussian processes
Some linear algebra
Regression problem, with 'measurement' noise:
$y = f(x) + \varepsilon$, $\varepsilon \sim \mathcal{N}(0, \sigma_n^2)$ → infer $f$, given data $D = \{X, y\}$
Assume a covariance structure expressed by a kernel function (a signal kernel plus a white-noise kernel), and consider the data $y = [y_1, y_2, \ldots]$ at inputs $X = [x_1, x_2, \ldots]$ as a sample from a multivariate Gaussian distribution.
Training: optimise kernel hyperparameters by maximising the marginal likelihood.
Posterior predictive distribution at a new point $x_*$: a Gaussian whose mean and variance follow from the covariances between $x_*$ and the training points (with an implicit integration over points not in $X$).
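The linear algebra above fits in a few lines: factorise the noisy training covariance once, then read off the predictive mean and variance from the test-train covariances (a textbook sketch with a toy kernel and data, not the xsec implementation):

```python
import numpy as np

def gp_posterior(X, y, Xs, kernel, sigma_n):
    """Posterior predictive mean and variance of a zero-mean GP at test
    points Xs, after conditioning on noisy training data (X, y)."""
    K = kernel(X[:, None], X[None, :]) + sigma_n ** 2 * np.eye(len(X))
    Ks = kernel(Xs[:, None], X[None, :])    # test-train covariances
    Kss = kernel(Xs[:, None], Xs[None, :])  # test-test covariances
    L = np.linalg.cholesky(K)               # stable alternative to inverting K
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    mean = Ks @ alpha
    v = np.linalg.solve(L, Ks.T)
    var = np.diag(Kss - v.T @ v)
    return mean, var

# Toy 1D example with a squared-exponential kernel (length scale 0.3)
se = lambda a, b: np.exp(-0.5 * (a - b) ** 2 / 0.3 ** 2)
X = np.array([0.0, 0.5, 1.0])
y = np.sin(3 * X)
mean, var = gp_posterior(X, y, X, se, sigma_n=1e-3)
```

At the training points themselves, the posterior mean reproduces the data and the predictive variance collapses towards the noise level.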

Scale-dependence of LO/NLO
[Beenakker+, hep-ph/9610490]
How to get more stuff than anti-stuff?
To succeed, Big Bang nucleosynthesis requires $(n_B - n_{\bar{B}})/n_\gamma \sim 10^{-9}$.
The Sakharov conditions for generating a baryon asymmetry dynamically:
- Baryon number violation (of course)
- C and CP violation (else: same rate for matter and anti-matter creation)
- Departure from thermal equilibrium (else: same abundance of particles and anti-particles, given CPT)
Not satisfied in the Standard Model.
Baryogenesis via thermal leptogenesis provides a minimal realisation, only requiring heavy right-handed neutrinos Ni (→mν via see-saw mechanism):
out-of-eq. CP-violating N decays at T∼mN cause lepton asymmetry
baryon asymmetry
SM sphalerons
Gravitino trouble
Given R-parity, the LSP is stable and a dark matter candidate.
In a neutralino LSP scenario with $m_{\tilde{G}} \sim m_{\rm SUSY}$:
- Thermal scattering during reheating produces gravitinos with $\Omega^{\rm th}_{\tilde{G}} \propto T_R$
- Thermal leptogenesis needs $T_R \gtrsim 10^9$ GeV, to match the observed baryon asymmetry
- Due to $M_{\rm Pl}$-suppressed couplings, the gravitino easily becomes long-lived: $\tau_{\tilde{G}} \sim 10^7\,{\rm s}\,\left(100\ {\rm GeV}/m_{\tilde{G}}\right)^3$
- Overabundant, delayed gravitino decays disrupt BBN, excluding $T_R \gtrsim 10^5$ GeV!
So ... why not try a gravitino LSP (with a neutralino NLSP)?
Gravitino production
The gravitino relic density should match the observed $\Omega_{\rm DM} h^2 = 0.1199 \pm 0.0022$.
No thermal equilibrium for gravitinos in the early universe, due to superweak couplings (unless very light, but then no longer a DM candidate, due to Lyman-α).
So there is no standard mechanism to lower the gravitino abundance: instead, gradual build-up!
- UV-dominated freeze-in from thermal scatterings at $T \sim T_R$, from processes like $g + g \to \tilde{g} + \tilde{G}$:
$\Omega^{\rm UV}_{\tilde{G}} h^2 = \left(\frac{100\ {\rm GeV}}{m_{\tilde{G}}}\right)\left(\frac{T_R}{10^{10}\ {\rm GeV}}\right)\left[\sum_{i=1}^{3}\omega_i\, g_i^2\left(1+\frac{M_i^2}{3 m_{\tilde{G}}^2}\right)\ln\!\left(\frac{k_i}{g_i}\right)+0.00319\, y_t^2\left(1+\frac{A_t^2}{3 m_{\tilde{G}}^2}\right)\right]$
(accounting for 1-loop running of $g_i, M_i, y_t, A_t$)
- IR-dominated freeze-in from decays of the heaviest sparticles
- SuperWIMP contribution from decays of the thermal neutralino NLSP: $\Omega^{\rm SW}_{\tilde{G}} h^2 = \frac{m_{\tilde{G}}}{m_{\tilde{\chi}}}\,\Omega^{\rm th}_{\tilde{\chi}} h^2$
- (Direct inflaton decays to gravitinos)
NLSP decays
Relevant decay channels:
$\tilde{\chi}_1^0 \to \tilde{G} + \gamma$
$\tilde{\chi}_1^0 \to \tilde{G} + Z$
$\tilde{\chi}_1^0 \to \tilde{G} + (\gamma/Z)^* \to \tilde{G} + f\bar{f}$ for $f = u, d, s, c, b, t, e, \mu, \tau$
$\tilde{\chi}_1^0 \to \tilde{G} + h^0 \to \tilde{G} + XY$ for $XY = \mu^+\mu^-, \tau^+\tau^-, c\bar{c}, b\bar{b}, gg, \gamma\gamma, Z\gamma, ZZ, W^+W^-$
Crucially, the NLSP lifetime behaves as $\tau_{\tilde{\chi}} \propto M_P^2\, m_{\tilde{G}}^2 / m_{\tilde{\chi}}^5$!
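The steep mass dependence of this scaling law is worth making explicit: rescaling the two masses changes the lifetime by the ratio below (a trivial illustration of the proportionality only; absolute lifetimes need the full prefactor):

```python
def nlsp_lifetime_ratio(mG_ratio, mchi_ratio):
    """Relative change of the NLSP lifetime, tau ~ M_P^2 m_G^2 / m_chi^5,
    when the gravitino and NLSP masses are rescaled by the given factors."""
    return mG_ratio ** 2 / mchi_ratio ** 5

# Doubling the NLSP mass at fixed gravitino mass shortens the lifetime 32-fold
ratio = nlsp_lifetime_ratio(1.0, 2.0)
```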
BBN constraints
Lifetime/abundance limits for a generic particle decaying into $u\bar{u}, b\bar{b}, t\bar{t}, gg, e^+e^-, \tau^+\tau^-, \gamma\gamma, W^+W^-$, and thus injecting energy into the primordial plasma


arXiv:1709.01211
(constraint regimes: p/n conversion, hadrodissociation, photodissociation)
Collider constraints
Last step, due to computational expense, split into 5 components:
- Quick veto with a lower bound on EWino masses, an upper bound on lifetimes, and select gluino/neutralino exclusion limits from an ATLAS search [1712.02332]
- Quick veto on the chargino production cross-section from the CMS disappearing-track search [1804.07321]
- SModelS determines simplified-model topologies and tests them against LHC limits
- CheckMATE performs event generation, detector simulation and signal-region cuts, and applies LHC limits