Non-prompt background estimation in same-charge \(W^{\pm}W^{\pm}\) scattering
Sebastian Ordoñez
jsordonezs@unal.edu.co
Supervisors: Diego A. Milanés (UNAL)
Joany Manjarrés (TUD)


Outline
- Introduction
  - Motivation
  - Theoretical Foundation
  - Background estimation
- Data-driven Fake Factor Method
  - Non-prompt Background
  - Fake Factor Extraction
  - Dilepton Control Region
  - Validation Region
- Current status
  - Analysis Framework
  - References
Motivation
- The \(W^{\pm}W^{\pm}jj\) final state has the largest ratio of electroweak to strong production cross section among the VBS diboson processes.
- Misidentified leptons are the second-largest background in \(W^{\pm}W^{\pm} jj-EW\) scattering.
- The largest experimental uncertainty in the above publication comes from the data-driven estimation of the non-prompt background.


- Vector boson scattering (VBS), \(VV\longrightarrow VV\) with \(V=W\) or \(Z\), is an important process for studying the mechanism of electroweak symmetry breaking as well as physics beyond the Standard Model (BSM).
- In 2017, the ATLAS and CMS Collaborations announced the first observation of VBS: https://arxiv.org/pdf/1906.03203.pdf
Motivation
Consequently, it is important to reduce the large uncertainty of this data-driven estimation of the non-prompt background in the ongoing analysis, which uses data at a collision energy of \(13\,\text{TeV}\) with an integrated luminosity of \(138.7\,\text{fb}^{-1}\).


Theoretical Foundation
- The scattering of vector bosons is extremely sensitive to the exact properties of the electroweak symmetry breaking (EWSB) by the Higgs mechanism.

- Production can proceed through electroweak vertices only, \(W^{\pm}W^{\pm} jj-EW\), or through diagrams that include at least one strong vertex, \(W^{\pm}W^{\pm} jj-QCD\).
VBS is expected to proceed via several processes, including the self-interaction of four gauge bosons as well as the exchange of a Higgs boson.
Theoretical Foundation


\(VV jj-EW\) without any scattering of the vector bosons, with fermionic decays of the bosons.
\(VV jj-QCD\) with fermionic decays of the vector bosons.
Final states studied for \(W^{\pm}W^{\pm} jj-EW\) consist of two same-charge leptons, the two corresponding neutrinos, and two jets close to the beam axis originating from the initial quarks.
Background Estimation
Data-driven background estimation typically relies on statistically independent control (CR), validation (VR) and signal (SR) regions.
- SR: regions of phase space defined through selections on kinematic variables and enriched in the potential signal of interest.
- CR: regions enriched in non-prompt background, whose yields are extrapolated to the SR using a fake factor.
- VR: regions of phase space between the CR and the SR, where the extrapolation is verified.
[Diagram: CR, VR and SR; the fake factor (FF) is tested in the VR ("FF test") and applied in the SR ("FF apply").]
\(W^{\pm}W^{\pm} jj-EW\) Signal Region
Object Selection: requirements on kinematic properties of objects measured inside of the ATLAS detector to select the leptons and jets that are of relevance for the analysis.


Object selection for Ana and Veto electrons
Object selection for Ana and Veto muons
\(W^{\pm}W^{\pm} jj-EW\) Signal Region
Event Selection: requirements on the whole event, which discard entire events rather than single objects.

Data-driven Matrix Method
- Prompt leptons: originating from \(W^{\pm}\) or \(Z\) decays.
- Non-prompt leptons: coming from other sources, e.g. hadron decays, and faking a signal lepton (the main issue).
In order to estimate the non-prompt background, we introduce Non-Ana leptons.
- Non-Ana leptons: kinematically close to Ana leptons but more likely to be non-prompt.

Data-driven Matrix Method

Object selection for Ana, Non-Ana and Veto electrons
Fake Factor Method
At MC truth level, leptons are either prompt (P) or non-prompt (F), so we can have:
- \(N_{PP}\): Events with two prompt leptons.
- \(N_{PF}\), \(N_{FP}\): Events with one non-prompt and one prompt lepton.
- \(N_{FF}\): Events with two non-prompt leptons.
At detector level we deal with Ana (A) and Non-Ana (N) leptons:
- \(N_{AA}\): Events with two Ana leptons.
- \(N_{AN}\), \(N_{NA}\): Events with one Non-Ana and one Ana lepton.
- \(N_{NN}\): Events with two Non-Ana leptons.
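The detector-level bookkeeping above can be sketched as a simple counter. This is a minimal illustration only: the function name `categorize` and the toy event list are hypothetical and do not come from the analysis framework.

```python
from collections import Counter


def categorize(lep1_is_ana: bool, lep2_is_ana: bool) -> str:
    """Return 'AA', 'AN', 'NA' or 'NN' for a dilepton event."""
    first = "A" if lep1_is_ana else "N"
    second = "A" if lep2_is_ana else "N"
    return first + second


# Hypothetical toy events: (first lepton is Ana, second lepton is Ana).
toy_events = [(True, True), (True, False), (False, True), (True, True), (False, False)]

# Count how many events fall into each detector-level category.
counts = Counter(categorize(l1, l2) for l1, l2 in toy_events)
print(dict(counts))
```

The same counting applied to prompt MC events would give the \(N^{prompt}\) yields that are subtracted later.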
Fake Factor Method
\(N_{fake}\) is the number of non-prompt events in the \(W^{\pm}W^{\pm} jj-EW \) signal region.
The truth-level and detector-level counts are related through \(\epsilon_{i}\) and \(f_{i}\), where \(\epsilon_{i}\) (\(f_{i}\)) is the probability that a prompt (non-prompt) lepton with index \(i\) passes the Ana requirements.
- \(N_{AN}\), \(N_{NA}\) and \(N_{NN}\) are measured in data in the SR.
- \(N_{AN}^{prompt}\), \(N_{NA}^{prompt}\) and \(N_{NN}^{prompt}\) are simulated by MC.
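The relation between truth-level and detector-level counts can be written as a \(4\times4\) efficiency matrix. The form below is the standard matrix-method expression and is an assumption here, not a copy of the slide: it takes \(\bar{\epsilon}_{i} = 1-\epsilon_{i}\) and \(\bar{f}_{i} = 1-f_{i}\) (every selected lepton is either Ana or Non-Ana) and reads \(N_{PF}\) as "first lepton prompt, second lepton non-prompt".

```latex
\[
\begin{pmatrix} N_{AA} \\ N_{AN} \\ N_{NA} \\ N_{NN} \end{pmatrix}
=
\begin{pmatrix}
\epsilon_{1}\epsilon_{2} & \epsilon_{1}f_{2} & f_{1}\epsilon_{2} & f_{1}f_{2} \\
\epsilon_{1}\bar{\epsilon}_{2} & \epsilon_{1}\bar{f}_{2} & f_{1}\bar{\epsilon}_{2} & f_{1}\bar{f}_{2} \\
\bar{\epsilon}_{1}\epsilon_{2} & \bar{\epsilon}_{1}f_{2} & \bar{f}_{1}\epsilon_{2} & \bar{f}_{1}f_{2} \\
\bar{\epsilon}_{1}\bar{\epsilon}_{2} & \bar{\epsilon}_{1}\bar{f}_{2} & \bar{f}_{1}\bar{\epsilon}_{2} & \bar{f}_{1}\bar{f}_{2}
\end{pmatrix}
\begin{pmatrix} N_{PP} \\ N_{PF} \\ N_{FP} \\ N_{FF} \end{pmatrix}
\]
```

Each row lists the ways an event can end up in one detector-level category; for example, the second entry of the \(N_{AN}\) row is a prompt first lepton passing Ana (\(\epsilon_{1}\)) times a non-prompt second lepton failing it (\(\bar{f}_{2}\)).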
The Fake Factor
The only missing terms in the previous equation are the fractions \(F_{i} = \frac{f_{i}}{\bar{f}_{i}}\), which are the so-called fake factors.
In summary:
- In the case of \(W^{\pm}W^{\pm}\) scattering we expect \(F_{1}\) and \(F_{2}\) to be equal.
- The FF is extracted from data in a separate control region.
One can argue that neglecting second-order terms in \(F\) is plausible (be careful!).
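A common form of the resulting estimate, assumed here rather than taken from the slides, is \(N_{fake} = F_{2}\,(N_{AN}-N_{AN}^{prompt}) + F_{1}\,(N_{NA}-N_{NA}^{prompt}) - F_{1}F_{2}\,(N_{NN}-N_{NN}^{prompt})\), which follows when prompt leptons are taken to always pass the Ana requirements (\(\epsilon_{i}\approx 1\)). The sketch below evaluates it with and without the second-order term; the function name and all yields are hypothetical.

```python
def n_fake(n_an, n_na, n_nn, n_an_prompt, n_na_prompt, n_nn_prompt,
           f1, f2, second_order=True):
    """Estimate the non-prompt yield in the SR from Non-Ana sideband counts.

    Assumes prompt leptons always pass the Ana selection (epsilon ~ 1).
    f1 and f2 are the fake factors of the first and second lepton.
    """
    # Subtract the simulated prompt contribution from each data yield.
    an = n_an - n_an_prompt
    na = n_na - n_na_prompt
    nn = n_nn - n_nn_prompt
    # In AN events the second lepton is Non-Ana, so it carries f2 (and vice versa).
    estimate = f2 * an + f1 * na
    if second_order:
        # Events with two non-prompt leptons are counted in both AN and NA.
        estimate -= f1 * f2 * nn
    return estimate


# Hypothetical yields, with equal fake factors for both leptons.
full = n_fake(120, 70, 20, 20, 10, 4, f1=0.25, f2=0.25)
first_order_only = n_fake(120, 70, 20, 20, 10, 4, f1=0.25, f2=0.25,
                          second_order=False)
print(full, first_order_only)
```

Comparing the two numbers shows how much the \(F_{1}F_{2}\) term matters for a given set of yields, which is the check behind the "be careful" remark.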
Fake Factor Extraction
The control region mentioned above consists of Ana and Non-Ana leptons and should be dominated by non-prompt events.
- One assumes that \(f\) and \(\bar{f}\) agree between the signal region and the control region, i.e. the fake factors are approximately the same.
- Note that
Thereby
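A usual way to express this extraction, assumed here and not taken from the slides, is the ratio of prompt-subtracted Ana to Non-Ana yields in the control region: \(F = (N_{A}^{data}-N_{A}^{prompt}) / (N_{N}^{data}-N_{N}^{prompt})\). The sketch below computes it for a single bin; the function name and the yields are hypothetical, and a real measurement would typically be binned, e.g. in lepton \(p_{T}\).

```python
def fake_factor(n_ana_data, n_ana_prompt, n_nonana_data, n_nonana_prompt):
    """Fake factor from control-region yields of the additional lepton.

    Data yields are corrected by subtracting the prompt contribution
    taken from simulation, so that only non-prompt leptons remain.
    """
    numerator = n_ana_data - n_ana_prompt
    denominator = n_nonana_data - n_nonana_prompt
    if denominator <= 0:
        raise ValueError("Non-Ana yield after prompt subtraction must be positive")
    return numerator / denominator


# Hypothetical control-region yields for one bin.
ff = fake_factor(n_ana_data=500, n_ana_prompt=100,
                 n_nonana_data=2000, n_nonana_prompt=400)
print(ff)
```

The explicit check on the denominator guards against bins where the prompt subtraction removes all Non-Ana events, in which case the ratio is undefined.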
Dilepton Control Region
We will use a control region with one Ana-selected lepton and at least one additional Ana/Non-Ana lepton. The criteria for the CR are:
- Statistical significance: large number of total events and high purity in non-prompt events.
- Composition of the non-prompt leptons: similar to that in the SR, to support the assumption that the FF is the same in both regions.
- Orthogonality to the signal region.
- Data modeling: Data in CR well modeled by MC.
Dilepton Control Region

Comparison of the signal region and the control region in terms of the Monte Carlo samples from which the non-prompt leptons originate.
Validation Region
In order to evaluate the control region, the previously defined criteria are revisited:
- Statistical significance
- Data modeling
- Non-prompt composition
- Orthogonality to the signal region
- The validity of the data-driven method is verified using the low dijet invariant mass validation region.
- The data in this validation region are sufficiently well modeled by the sum of the data-driven non-prompt background estimate and the prompt and charge-flip contributions estimated from Monte Carlo simulations.
Current status


Learning the basics about the Analysis framework


References
- https://cds.cern.ch/record/2746597 (Max's M.Sc. thesis)
- http://cds.cern.ch/record/2719126 (Carsten's PhD thesis)
- https://link.aps.org/doi/10.1103/PhysRevLett.123.161801
- https://cds.cern.ch/record/2309552
Thank you!
[FENYX] Non-prompt Background ssWW
By Sebastian Ordoñez