Abstract
Background modelling is one of the main challenges in particle physics data analysis. Commonly employed
strategies are the use of simulated events of the background process and the fit of parametric
background models to the observed data. However, reliable simulations are not always available or may
be extremely costly to produce. As a result, the uncertainties arising from simulation-based background
modelling or from limited simulation statistics are in many cases the limiting factor in the analysis
sensitivity. At the same time, parametric models are limited by the a priori unknown functional form
and parameter values of the modelled background. These issues become ever more pressing when studying
exclusive signatures involving hadronic backgrounds, and when large datasets become available, as is
the case at the LHC.
Two novel non-parametric data-driven background modelling techniques are presented, which address these
issues for a broad class of searches and measurements by providing an almost fully generic background
modelling strategy. The first method uses data from a relaxed version of the event selection to
estimate a graph of conditional probability density functions of the variables used in the analysis,
accounting for all significant correlations. A background model is then generated by sampling events
from this graph, before the full event selection is applied. In the second method, a generative
adversarial network is trained to estimate the joint probability density function of the variables used
in the analysis, conditioned on the variable used to blind the signal region. This training proceeds in
the sidebands, and the conditional probability density function is interpolated into the signal region
to estimate the background. Results are presented which demonstrate the performance of both methods,
and their impact on two benchmark analyses is discussed.
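
As a rough illustration of the first method, the sketch below estimates a two-variable chain p(x) p(y | x) with histograms from toy "relaxed selection" data, samples new events from it, and only then applies a tighter cut. Everything here is an assumption made for illustration: the toy Gaussian data, the binning, the cut value, and the names fit_chain and sample_chain are hypothetical, and a two-node histogram chain is a deliberate simplification of a general graph of conditional probability density functions.

```python
import numpy as np

def fit_chain(data, bins):
    """Estimate the joint counts of (x, y); p(x) and p(y | x) follow from them."""
    counts, x_edges, y_edges = np.histogram2d(data[:, 0], data[:, 1], bins=bins)
    return counts, x_edges, y_edges

def sample_chain(counts, x_edges, y_edges, n, rng):
    """Ancestral sampling: draw x from p(x), then y from p(y | x)."""
    p_x = counts.sum(axis=1) / counts.sum()
    ix = rng.choice(len(p_x), size=n, p=p_x)
    # Inverse-CDF draw of the y bin, using the row of counts selected by each x bin.
    cdf_y = np.cumsum(counts[ix], axis=1)
    cdf_y /= cdf_y[:, -1:]
    iy = (rng.random(n)[:, None] > cdf_y).sum(axis=1)
    # Place each sampled event uniformly inside its bin.
    x = rng.uniform(x_edges[ix], x_edges[ix + 1])
    y = rng.uniform(y_edges[iy], y_edges[iy + 1])
    return np.column_stack([x, y])

rng = np.random.default_rng(0)
# Toy stand-in for data passing a relaxed selection: two correlated variables.
relaxed = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.8], [0.8, 1.0]], size=20000)
counts, x_edges, y_edges = fit_chain(relaxed, bins=30)

# Generate the background model first, then apply the (hypothetical) full selection.
model = sample_chain(counts, x_edges, y_edges, 50000, rng)
selected = model[model[:, 1] > 1.0]
```

Because y is drawn conditionally on x, the correlation between the two variables in the relaxed data carries over to the sampled events, and the sample can be made larger than the original dataset before the full selection is applied.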