Non-prompt background estimation in same charged \(W^{\pm}W^{\pm}\) scattering

Sebastian Ordoñez

jsordonezs@unal.edu.co

Supervisors: Diego A. Milanés (UNAL)

Joany Manjarrés (TUD)

Outline

Introduction
- Motivation
- Theoretical Foundation
- Background estimation
Data-driven Fake Factor Method
- Non-prompt Background
- Fake Factor Extraction
- Dilepton Control Region
- Validation Region
Current status
- Analysis Framework
References

Motivation

The \(W^{\pm}W^{\pm}jj\) final state has the largest ratio of electroweak to strong production cross section among the VBS diboson processes.

Misidentified leptons are the second-largest background in \(W^{\pm}W^{\pm} jj-EW\) scattering.
The largest experimental uncertainty in the referenced publication (https://arxiv.org/pdf/1906.03203.pdf) comes from the data-driven estimation of the non-prompt background.

VBS, \(VV\longrightarrow VV\) with \(V=W\) or \(Z\), is an important process for studying the mechanism of electroweak symmetry breaking as well as physics beyond the Standard Model (BSM).

In 2017, the ATLAS and CMS Collaborations announced the first observation of vector boson scattering (VBS): https://arxiv.org/pdf/1906.03203.pdf

Motivation

Taken from https://arxiv.org/pdf/1906.03203.pdf

Consequently, it is important to reduce the large uncertainty of this data-driven estimation of the non-prompt background in the new, ongoing analysis, which uses collisions at an energy of \(13\,\text{TeV}\) with an integrated luminosity of \(138.7\,\text{fb}^{-1}\).

Motivation

https://atlas.cern/updates/feature/vector-boson-scattering

Theoretical Foundation

The scattering of vector bosons is extremely sensitive to the exact properties of the electroweak symmetry breaking (EWSB) by the Higgs mechanism.

Production can proceed through diagrams with only electroweak vertices, \(W^{\pm}W^{\pm} jj-EW\), or through diagrams including at least one strong vertex, \(W^{\pm}W^{\pm} jj-QCD\).

VBS is expected to proceed via several processes, including the self-interaction of four gauge bosons as well as the exchange of a Higgs boson.

Theoretical Foundation

\(VV jj-EW\) without any scattering of the vector bosons, with fermionic decays of the bosons.

\(VV jj-QCD\) with fermionic decays of the vector bosons.

The final states studied for \(W^{\pm}W^{\pm} jj-EW\) consist of two same-charge leptons, the two corresponding neutrinos, and two jets close to the beam axis originating from the initial quarks.

Background Estimation

Data-driven background estimation typically relies on models with statistically independent control (CR), validation (VR) and signal (SR) regions.

SR: regions in a phase space that are defined through selections on kinematic variables, enriched in potential signal of interest.
CR: regions enriched in non-prompt background and extrapolated to SR using a fake factor.
VR: regions in phase space between CR and SR, where the extrapolation is verified.

[Figure: diagram of the CR, VR and SR, with the labels "FF test" and "FF apply".]

\(W^{\pm}W^{\pm} jj-EW\) Signal Region

Object Selection: requirements on kinematic properties of objects measured inside of the ATLAS detector to select the leptons and jets that are of relevance for the analysis.

Object selection for Ana and Veto electrons

Object selection for Ana and Veto muons

\(W^{\pm}W^{\pm} jj-EW\) Signal Region

Event Selection: requirements on the whole event, which discard entire events rather than just single objects.

Data-driven Matrix Method

Prompt leptons: originating from a \(W^{\pm}\) or \(Z\) decay.
Non-prompt leptons: coming from other sources, e.g. hadron decays, and faking a signal lepton (the main issue).

In order to estimate the non-prompt background, we introduce Non-Ana leptons.

Non-Ana leptons: kinematically close to Ana leptons but more likely to be non-prompt.

Data-driven Matrix Method

Object selection for Ana, Non-Ana and Veto electrons

Fake Factor Method

At MC truth level, each lepton is either prompt (P) or non-prompt (F), so we can have:

\(N_{PP}\): Events with two prompt leptons.
\(N_{PF}\), \(N_{FP}\): Events with one non-prompt and one prompt lepton.
\(N_{FF}\): Events with two non-prompt leptons.

At detector level we deal with Ana (A) and Non-Ana (N) leptons:

\(N_{AA}\): Events with two Ana leptons.
\(N_{AN}\), \(N_{NA}\): Events with one Non-Ana and one Ana lepton.
\(N_{NN}\): Events with two Non-Ana leptons.

Fake Factor Method

These quantities are related by

\[
\boxed{N_{AA}-\epsilon_{1}\epsilon_{2}N_{PP}=(N_{NA}-\bar{\epsilon}_{1}\epsilon_{2}N_{PP})\frac{f_{1}}{\bar{f}_{1}}+(N_{AN}-\epsilon_{1}\bar{\epsilon}_{2}N_{PP})\frac{f_{2}}{\bar{f}_{2}}-(N_{NN}-\bar{\epsilon}_{1}\bar{\epsilon}_{2}N_{PP})\frac{f_{1}f_{2}}{\bar{f}_{1}\bar{f}_{2}}}
\]

where \(\epsilon_{i}\) (\(f_{i}\)) is the probability that a prompt (non-prompt) lepton with index \(i\) passes the Ana requirements, and \(\bar{\epsilon}_{i}=1-\epsilon_{i}\), \(\bar{f}_{i}=1-f_{i}\).

The left-hand side is the number of non-prompt events in the \(W^{\pm}W^{\pm} jj-EW\) signal region, \(N_{fake}\).
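
The boxed relation can be traced back to a matrix relation between the detector-level and truth-level counts. The following is a sketch, not written out on the slides, under the usual matrix-method assumptions: each lepton passes or fails the Ana requirements independently, \(\bar{\epsilon}_{i}=1-\epsilon_{i}\), \(\bar{f}_{i}=1-f_{i}\), and the first (second) letter of every subscript refers to lepton 1 (2).

```latex
\[
\begin{pmatrix} N_{AA} \\ N_{NA} \\ N_{AN} \\ N_{NN} \end{pmatrix}
=
\begin{pmatrix}
\epsilon_{1}\epsilon_{2} & \epsilon_{1}f_{2} & f_{1}\epsilon_{2} & f_{1}f_{2} \\
\bar{\epsilon}_{1}\epsilon_{2} & \bar{\epsilon}_{1}f_{2} & \bar{f}_{1}\epsilon_{2} & \bar{f}_{1}f_{2} \\
\epsilon_{1}\bar{\epsilon}_{2} & \epsilon_{1}\bar{f}_{2} & f_{1}\bar{\epsilon}_{2} & f_{1}\bar{f}_{2} \\
\bar{\epsilon}_{1}\bar{\epsilon}_{2} & \bar{\epsilon}_{1}\bar{f}_{2} & \bar{f}_{1}\bar{\epsilon}_{2} & \bar{f}_{1}\bar{f}_{2}
\end{pmatrix}
\begin{pmatrix} N_{PP} \\ N_{PF} \\ N_{FP} \\ N_{FF} \end{pmatrix}
\]
```

Moving the \(N_{PP}\) column to the left-hand side and combining the \(N_{NA}\), \(N_{AN}\) and \(N_{NN}\) rows with weights \(f_{1}/\bar{f}_{1}\), \(f_{2}/\bar{f}_{2}\) and \(-f_{1}f_{2}/(\bar{f}_{1}\bar{f}_{2})\) cancels the cross terms and leaves \(\epsilon_{1}f_{2}N_{PF}+f_{1}\epsilon_{2}N_{FP}+f_{1}f_{2}N_{FF}\), which is the non-prompt part of \(N_{AA}\).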

\(N_{AN}\), \(N_{NA}\) and \(N_{NN}\) are measured in data in the SR.
\(N_{AN}^{prompt}\), \(N_{NA}^{prompt}\) and \(N_{NN}^{prompt}\) are simulated by MC.
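
As a numerical cross-check of the relation between detector-level and truth-level counts, the sketch below folds a set of truth-level yields into \(N_{AA}\), \(N_{NA}\), \(N_{AN}\), \(N_{NN}\) and compares both sides of the boxed equation. All numbers and function names are hypothetical; the code assumes that the two leptons pass or fail the Ana requirements independently and that \(\bar{\epsilon}_{i}=1-\epsilon_{i}\), \(\bar{f}_{i}=1-f_{i}\).

```python
from itertools import product


def detector_counts(n_truth, eff, fake):
    """Fold truth-level counts (keys 'PP', 'PF', 'FP', 'FF') into
    detector-level counts (keys 'AA', 'NA', 'AN', 'NN').

    eff = (eps1, eps2): probability that a prompt lepton passes Ana.
    fake = (f1, f2): probability that a non-prompt lepton passes Ana.
    """
    counts = {}
    for det in product("AN", repeat=2):
        total = 0.0
        for truth, n in n_truth.items():
            weight = 1.0
            for i in range(2):
                p_pass = eff[i] if truth[i] == "P" else fake[i]
                weight *= p_pass if det[i] == "A" else 1.0 - p_pass
            total += weight * n
        counts["".join(det)] = total
    return counts


def fake_yield_from_sidebands(counts, n_pp, eff, fake):
    """Right-hand side of the boxed relation."""
    e1, e2 = eff
    f1, f2 = fake
    big_f1 = f1 / (1.0 - f1)
    big_f2 = f2 / (1.0 - f2)
    na = counts["NA"] - (1.0 - e1) * e2 * n_pp
    an = counts["AN"] - e1 * (1.0 - e2) * n_pp
    nn = counts["NN"] - (1.0 - e1) * (1.0 - e2) * n_pp
    return na * big_f1 + an * big_f2 - nn * big_f1 * big_f2


# Hypothetical truth-level yields and probabilities, for illustration only.
n_truth = {"PP": 1000.0, "PF": 200.0, "FP": 150.0, "FF": 50.0}
eff = (0.90, 0.85)
fake = (0.20, 0.10)

counts = detector_counts(n_truth, eff, fake)
lhs = counts["AA"] - eff[0] * eff[1] * n_truth["PP"]
rhs = fake_yield_from_sidebands(counts, n_truth["PP"], eff, fake)
print(f"non-prompt events in AA: lhs = {lhs:.6f}, rhs = {rhs:.6f}")
```

Both sides agree for any choice of inputs with \(f_{i}<1\), which illustrates that the relation is exact under the stated assumptions rather than an expansion in \(f_{i}\).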

The Fake Factor

The only missing terms in the previous equation are the fractions \(\frac{f_{i}}{\bar{f}_{i}}\), which are the so-called fake factors:

\[
F_{i}\equiv\frac{f_{i}}{\bar{f}_{i}}
\]

In summary we have

\[
\boxed{N_{fake}=(N_{NA}-N_{NA}^{prompt})F_{1}+(N_{AN}-N_{AN}^{prompt})F_{2}-(N_{NN}-N_{NN}^{prompt})F_{1}F_{2}}
\]

In the case of \(W^{\pm}W^{\pm}\) scattering we expect \(F_{1}\) and \(F_{2}\) to be equal.
The FF is extracted from data in a separate control region.

One can argue that neglecting terms of second order in \(F\) is plausible when \(F\) is small, but this should be checked with care:

\[
\boxed{N_{fake}=(N_{NA}-N_{NA}^{prompt})F_{1}+(N_{AN}-N_{AN}^{prompt})F_{2}+\mathcal{O}(F^{2})}
\]
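
To make the size of the neglected term concrete, the following sketch evaluates \(N_{fake}\) with and without the \(F_{1}F_{2}\) term. The function name and all input yields are hypothetical and chosen only for illustration; they are not results of the analysis.

```python
def n_fake(n_na, n_an, n_nn, p_na, p_an, p_nn, f1, f2, second_order=True):
    """Non-prompt yield in the signal region from the fake factor formula.

    n_* are data counts with Non-Ana leptons, p_* the prompt MC predictions
    for the same selections, f1 and f2 the fake factors F_1 and F_2.
    """
    first_order = (n_na - p_na) * f1 + (n_an - p_an) * f2
    if not second_order:
        return first_order
    return first_order - (n_nn - p_nn) * f1 * f2


# Hypothetical yields and a common fake factor F_1 = F_2 = 0.1.
inputs = dict(n_na=120.0, n_an=90.0, n_nn=40.0,
              p_na=20.0, p_an=15.0, p_nn=2.0, f1=0.1, f2=0.1)

full = n_fake(**inputs)
approx = n_fake(**inputs, second_order=False)
relative_shift = (approx - full) / full
print(f"full = {full:.2f}, first order = {approx:.2f}, "
      f"shift = {100 * relative_shift:.1f}%")
```

Whether the shift is negligible depends on the size of \(F\) and of \(N_{NN}-N_{NN}^{prompt}\) compared with the other uncertainties, which is why the approximation has to be checked case by case.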

Fake Factor Extraction

The control region mentioned above contains Ana and Non-Ana leptons and should be dominated by non-prompt events.

One assumes that \(f\) and \(\bar{f}\) agree between the signal region and the control region, i.e. the fake factors are approximately the same.
Note that

\[
f=\frac{N_{A}^{fake}}{N^{fake}}
\]

\[
\bar{f}=\frac{N_{N}^{fake}}{N^{fake}}
\]

Therefore

\[
\boxed{F=\frac{f}{\bar{f}}=\frac{N_{A}^{fake}}{N_{N}^{fake}}=\frac{N_{A}-N_{A}^{prompt}}{N_{N}-N_{N}^{prompt}}}
\]
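
The extraction formula can be sketched in a few lines. The counts below are hypothetical placeholders for control-region data and prompt MC yields, and the function name is illustrative and not part of the analysis framework.

```python
def fake_factor(n_ana, n_ana_prompt, n_non_ana, n_non_ana_prompt):
    """F = (N_A - N_A^prompt) / (N_N - N_N^prompt) in the control region."""
    numerator = n_ana - n_ana_prompt
    denominator = n_non_ana - n_non_ana_prompt
    if denominator <= 0:
        raise ValueError("no non-prompt Non-Ana leptons left after prompt subtraction")
    return numerator / denominator


# Hypothetical control-region counts: data and prompt MC prediction.
ff = fake_factor(n_ana=150.0, n_ana_prompt=50.0,
                 n_non_ana=1100.0, n_non_ana_prompt=100.0)

# Since F = f / (1 - f), the underlying probability is f = F / (1 + F).
f_prob = ff / (1.0 + ff)
print(f"F = {ff:.3f}, f = {f_prob:.3f}")
```

The guard on the denominator reflects the requirement that the control region be dominated by non-prompt events: if the prompt subtraction removes all Non-Ana leptons, the ratio is undefined.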

Dilepton Control Region

We will use a control region with one selected Ana lepton and at least one additional Ana or Non-Ana lepton. The criteria for the CR are:

Statistical significance: a large total number of events and a high purity in non-prompt events.
Composition of the non-prompt leptons: similar to the signal region, to support the assumption that the FF is the same in both regions.
Orthogonality to the signal region.
Data modeling: data in the CR are well modeled by MC.
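
Two of these criteria can be checked with very little code. The sketch below assumes that events are identified by hypothetical (run number, event number) pairs and that the non-prompt purity is estimated as the fraction of data not accounted for by prompt MC; both choices are illustrative assumptions, not the definitions used in the analysis.

```python
def regions_are_orthogonal(sr_events, cr_events):
    """True if no event identifier appears in both regions."""
    return set(sr_events).isdisjoint(cr_events)


def non_prompt_purity(n_data, n_prompt_mc):
    """Fraction of the data yield not explained by prompt MC."""
    if n_data <= 0:
        raise ValueError("region contains no data events")
    return (n_data - n_prompt_mc) / n_data


# Hypothetical (run, event) identifiers and yields.
sr_ids = [(1, 101), (1, 102), (2, 7)]
cr_ids = [(1, 201), (2, 5)]

orthogonal = regions_are_orthogonal(sr_ids, cr_ids)
purity = non_prompt_purity(n_data=1200.0, n_prompt_mc=180.0)
print(f"orthogonal: {orthogonal}, non-prompt purity: {purity:.2f}")
```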

Dilepton Control Region

[Figure: comparison of the signal region and the control region in terms of the Monte Carlo samples from which the non-prompt leptons originate.]

Validation Region

To evaluate the control region, the criteria defined previously are revisited:

Statistical significance
Data modeling
Non-prompt composition
Orthogonality to the signal region.

The data-driven method is validated using the low dijet invariant mass validation region.

The data in this validation region is sufficiently well modeled by the sum of the data-driven estimated non-prompt background and the prompt and charge flip contribution estimated by Monte Carlo simulations.

Current status

Learning the basics about the Analysis framework

