Extended Likelihood¶
Unbinned Extended Likelihood¶
Let \(x\) be a random variable distributed according to a p.d.f. \(~f\left(x\,\middle|\,\vec{\theta}\right)\),
with \(n\) observations, \(\vec{x} = \left(x_1, \cdots, x_n\right)\), and \(m\) unknown parameters, \(\vec{\theta} = \left(\theta_1, \cdots, \theta_m\right)\). The likelihood would normally then be

\begin{equation} L\left(\vec{\theta}\right) = \prod_{i=1}^{n} f\left(x_i\,\middle|\,\vec{\theta}\right)\,. \end{equation}

However, if \(n\) itself is a Poisson random variable with mean \(\nu\),

\begin{equation} P\left(n\,;\nu\right) = \frac{\nu^{n} e^{-\nu}}{n!}\,, \end{equation}

then it follows that

\begin{equation} L\left(\nu, \vec{\theta}\right) = \frac{\nu^{n} e^{-\nu}}{n!} \prod_{i=1}^{n} f\left(x_i\,\middle|\,\vec{\theta}\right)\,. \end{equation}
This equation is known as the “extended likelihood function”, as we have “extended” the information encoded in the likelihood to include the expected number of events — a quantity of great importance to physicists. It can be seen by inspection, though, that the extended likelihood still has the form of a likelihood, so no different treatment is required in finding its maximum likelihood estimators.
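As a minimal sketch of how this might look in code (an illustration, not from the text above; the function name extended_nll and the example p.d.f. are arbitrary choices), the extended negative log-likelihood is just the usual per-event sum plus a Poisson term for the total number of events:

import numpy as np
from scipy.stats import norm


def extended_nll(nu, data, pdf):
    # -ln L = nu - n ln(nu) - sum_i ln f(x_i), dropping the constant ln(n!) term
    return nu - len(data) * np.log(nu) - np.sum(np.log(pdf(data)))


# e.g. with a standard normal p.d.f. and a few example observations
data = np.array([-0.3, 0.1, 1.2, 0.7])
print(extended_nll(nu=4.0, data=data, pdf=norm.pdf))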
\(\nu\) is dependent on \(\vec{\theta}\)¶
In the instance that \(\nu\) is a function of \(\vec{\theta}\), \(\nu = \nu\left(\vec{\theta}\right)\), then

\begin{equation} L\left(\vec{\theta}\right) = \frac{\nu\left(\vec{\theta}\right)^{n} e^{-\nu\left(\vec{\theta}\right)}}{n!} \prod_{i=1}^{n} f\left(x_i\,\middle|\,\vec{\theta}\right)\,, \end{equation}

such that

\begin{equation} \ln L\left(\vec{\theta}\right) = -\nu\left(\vec{\theta}\right) + n \ln \nu\left(\vec{\theta}\right) + \sum_{i=1}^{n} \ln f\left(x_i\,\middle|\,\vec{\theta}\right) - \ln n!\,, \end{equation}

where \(n\) is a constant of the data, so the \(\ln n!\) term has no effect on finding the estimators of any parameters and can be safely ignored. Thus,

\begin{equation} \ln L\left(\vec{\theta}\right) = -\nu\left(\vec{\theta}\right) + \sum_{i=1}^{n} \ln\left[\nu\left(\vec{\theta}\right) f\left(x_i\,\middle|\,\vec{\theta}\right)\right]\,. \end{equation}
Note that as the resultant estimators, \(\hat{\vec{\theta}}\), exploit information from both \(n\) and \(\vec{x}\), this should generally lead to smaller variances for \(\hat{\vec{\theta}}\).
\(\nu\) is independent of \(\vec{\theta}\)¶
In the instance that \(\nu\) is independent of \(\vec{\theta}\), then

\begin{equation} L\left(\nu, \vec{\theta}\right) = \frac{\nu^{n} e^{-\nu}}{n!} \prod_{i=1}^{n} f\left(x_i\,\middle|\,\vec{\theta}\right)\,, \end{equation}

such that

\begin{equation} -\ln L\left(\nu, \vec{\theta}\right) = \nu - n \ln \nu - \sum_{i=1}^{n} \ln f\left(x_i\,\middle|\,\vec{\theta}\right) + \ln n!\,. \end{equation}

As \(L\) is maximized with respect to a variable \(\alpha\) when \(-\ln L\) is minimized, then it is seen from

\begin{equation} \frac{\partial\left(-\ln L\right)}{\partial \nu} = 1 - \frac{n}{\nu} = 0 \end{equation}

that the maximum likelihood estimator for \(\nu\) is

\begin{equation} \hat{\nu} = n\,, \end{equation}

and that

\begin{equation} \frac{\partial\left(-\ln L\right)}{\partial \theta_j} = -\sum_{i=1}^{n} \frac{\partial \ln f\left(x_i\,\middle|\,\vec{\theta}\right)}{\partial \theta_j} = 0 \end{equation}

results in the same estimators \(\hat{\vec{\theta}}\) as in the “usual” maximum likelihood case.
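To illustrate this numerically (a hypothetical sketch; the exponential p.d.f. \(f\left(x\,\middle|\,\tau\right) = e^{-x/\tau}/\tau\), sample size, and seed are arbitrary choices), minimizing the extended negative log-likelihood over \(\left(\nu, \tau\right)\) should return \(\hat{\nu} = n\) and the same \(\hat{\tau}\) (the sample mean) as the ordinary unbinned fit:

import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
x = rng.exponential(scale=2.0, size=50)  # toy observations
n = len(x)


def nll_extended(params):
    # -ln L(nu, tau) = nu - n ln(nu) - sum_i ln f(x_i | tau), dropping ln(n!)
    nu, tau = params
    return nu - n * np.log(nu) - np.sum(-np.log(tau) - x / tau)


fit = minimize(nll_extended, x0=[n / 2, 1.0], bounds=[(1e-3, None), (1e-3, None)])
nu_hat, tau_hat = fit.x
print(f"nu_hat  = {nu_hat:.3f}  (n = {n})")
print(f"tau_hat = {tau_hat:.3f}  (sample mean = {x.mean():.3f})")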
If the p.d.f. is of the form of a mixture model,

\begin{equation} f\left(x\,\middle|\,\vec{\theta}\right) = \sum_{i=1}^{m} \theta_i f_i\left(x\right)\,, \end{equation}

and an estimate of the weights is of interest, then as the parameters are not fully independent, given the constraint

\begin{equation} \sum_{i=1}^{m} \theta_i = 1\,, \end{equation}

then one of the \(m\) parameters can be replaced with

\begin{equation} \theta_m = 1 - \sum_{i=1}^{m-1} \theta_i\,, \end{equation}
so that the p.d.f. contains only \(m-1\) free parameters. This then allows a likelihood to be constructed from which the estimators of the remaining parameters can be found.
Equivalently, the extended likelihood function can be used, as

\begin{equation} L\left(\nu, \vec{\theta}\right) = \frac{\nu^{n} e^{-\nu}}{n!} \prod_{j=1}^{n} \sum_{i=1}^{m} \theta_i f_i\left(x_j\right)\,. \end{equation}

Letting \(\mu_i\), the expected number of events of type \(i\), be \(\mu_i \equiv \theta_i \nu\), for \(\vec{\mu} = \left(\mu_1, \cdots, \mu_m\right)\), then

\begin{equation} L\left(\vec{\mu}\right) = \frac{e^{-\sum_{i=1}^{m} \mu_i}}{n!} \prod_{j=1}^{n} \sum_{i=1}^{m} \mu_i f_i\left(x_j\right)\,. \end{equation}
Here, the \(\vec{\mu}\) are unconstrained and all of the parameters are treated symmetrically, such that the \(\hat{\mu}_i\) are the maximum likelihood estimators of the expected numbers of events of type \(i\).
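As a sketch of such a fit (an illustration with arbitrarily chosen component shapes and yields, not from the text above), consider a two-component mixture of a Gaussian signal on top of an exponential background; minimizing the extended negative log-likelihood directly yields the estimated event yields \(\hat{\mu}_S\) and \(\hat{\mu}_B\):

import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm, expon

rng = np.random.default_rng(1)

# Toy dataset: Gaussian "signal" events plus exponential "background" events
x_sig = rng.normal(loc=5.0, scale=0.5, size=rng.poisson(40))
x_bkg = rng.exponential(scale=3.0, size=rng.poisson(160))
data = np.concatenate([x_sig, x_bkg])


def f_sig(x):
    return norm.pdf(x, loc=5.0, scale=0.5)


def f_bkg(x):
    return expon.pdf(x, scale=3.0)


def nll_mixture(mu):
    # -ln L(mu_S, mu_B) = (mu_S + mu_B) - sum_j ln[mu_S f_S(x_j) + mu_B f_B(x_j)]
    mu_S, mu_B = mu
    return (mu_S + mu_B) - np.sum(np.log(mu_S * f_sig(data) + mu_B * f_bkg(data)))


fit = minimize(nll_mixture, x0=[20.0, 100.0], bounds=[(0.0, None), (0.0, None)])
mu_S_hat, mu_B_hat = fit.x
print(f"mu_S_hat = {mu_S_hat:.1f}, mu_B_hat = {mu_B_hat:.1f}, n = {len(data)}")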
Toy Example¶
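The following toy fit considers two counting measurements with observed yields n_observed, expected signal yields S, and expected background yields B, where a single signal strength f scales the signal in both; f is estimated by minimizing the summed Poisson negative log-likelihood (with the constant \(\ln n!\) terms dropped).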
import numpy as np
from scipy.optimize import minimize


def NLL(x, n, S, B):
    """Summed Poisson negative log-likelihood over the measurements,
    with the constant ln(n!) terms dropped. x[0] is the signal strength f."""
    nll = sum(
        (x[0] * S[meas] + B[meas]) - (n[meas] * np.log(x[0] * S[meas] + B[meas]))
        for meas in range(len(n))
    )
    return nll


# Observed counts and the expected signal and background yields
# for each of the two measurements
n_observed = [6, 24]
f = np.array([1.0])  # initial guess for the signal strength
S = [0.9, 4.0]
B = [0.2, 24.0]

model = minimize(NLL, f, args=(n_observed, S, B), method="L-BFGS-B", bounds=[(0, 10)])
print(f"The MLE estimate for f: {model.x[0]}")
The MLE estimate for f: 2.6155471948792126
HistFactory Example¶
Consider a single channel with one signal and one background contribution (and no systematics). For \(n\) events, signal model \(f_{S}\left(x_e\right)\), background model \(f_{B}\left(x_e\right)\), \(S\) expected signal events, \(B\) expected background events, and signal strength \(\mu\), a “marked Poisson model” [2] may be constructed, which treating the data as fixed results in the likelihood of

\begin{equation} L\left(\mu\right) = \mathrm{Pois}\left(n\,\middle|\,\mu S + B\right) \prod_{e=1}^{n} \frac{\mu S\, f_{S}\left(x_e\right) + B\, f_{B}\left(x_e\right)}{\mu S + B}\,, \end{equation}

and so

\begin{equation} L\left(\mu\right) = \frac{e^{-\left(\mu S + B\right)}}{n!} \prod_{e=1}^{n} \left[\mu S\, f_{S}\left(x_e\right) + B\, f_{B}\left(x_e\right)\right]\,. \end{equation}
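A minimal numerical sketch of this single-channel likelihood (assuming, purely for illustration, a Gaussian signal shape, an exponential background shape, and the yields given below) minimizes \(-\ln L\left(\mu\right) = \left(\mu S + B\right) - \sum_{e} \ln\left[\mu S\, f_{S}\left(x_e\right) + B\, f_{B}\left(x_e\right)\right]\), up to constant terms, to find \(\hat{\mu}\):

import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import norm, expon

rng = np.random.default_rng(2)

# Assumed event-level models and expected yields for this sketch
S, B = 30.0, 100.0


def f_S(x):
    return norm.pdf(x, loc=4.0, scale=0.4)


def f_B(x):
    return expon.pdf(x, scale=3.0)


# Toy dataset generated with a true signal strength of mu = 1
x_e = np.concatenate([
    rng.normal(4.0, 0.4, size=rng.poisson(S)),
    rng.exponential(3.0, size=rng.poisson(B)),
])


def nll(mu):
    # -ln L(mu) up to terms constant in mu
    return (mu * S + B) - np.sum(np.log(mu * S * f_S(x_e) + B * f_B(x_e)))


result = minimize_scalar(nll, bounds=(0.0, 10.0), method="bounded")
print(f"The MLE estimate for mu: {result.x:.3f}")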
Binned Extended Likelihood¶
References and Acknowledgements¶
1. Glen Cowan, Statistical Data Analysis, 1998
2. ROOT collaboration, K. Cranmer, G. Lewis, L. Moneta, A. Shibata and W. Verkerke, HistFactory: A tool for creating statistical models for use with RooFit and RooStats, 2012
3. Vince Croft, Discussions with the author at CERN, July 2017