
Signal significance incorporating systematic uncertainty for continuous test

Yi Ding Weiming Song Kai Zhu
Abstract

To properly estimate signal significance while accounting for both statistical and systematic uncertainties, we study the impact of typical systematic uncertainties, such as those on the background shape, the signal shape, and the number of background events, on the significance calculated with the continuous test method. Our investigation reveals unexpected and complex features, leading us to recommend a conservative approach: one should estimate the signal significance by conducting trials with as many combinations as possible of the uncertainties associated with the fitting procedure, and then select the “worst” outcome as the final result.

Jilin University, Changchun, Jilin, China

Institute of High Energy Physics, Beijing, China

1 Introduction

Scientists often lay claim to new discoveries, yet quantifying the extent of their deviation from established knowledge remains a persistent challenge. In recent decades, the concept of “significance” has gained widespread adoption in scientific disciplines, particularly in physics, economics, biology, psychology, sociology, and medicine. It serves as a means to gauge the confidence with which a scientist asserts a new discovery. For instance, in high-energy physics, a conventional “rule” for significance is based on standard deviations ($\sigma$), where a signal with a significance of $3\sigma$ is considered evidence, and only one with a significance of $5\sigma$ is deemed an observation. The paper [1] by P. K. Sinervo provides a historical review of the concept of “statistical significance” in observations and its application in high-energy experiments. Accurate estimation of significance is crucial in data analysis, as overestimation can damage one’s reputation, while underestimation may lead to overlooking important discoveries. A “significant” observation typically allows for the elimination of one or more hypotheses in favor of alternatives within a statistical framework. In essence, this is a statistical problem, which is why significance is sometimes referred to as “statistical significance”. Two fundamental methods are used to calculate significance: the counting method and the continuous test method. This paper follows the definition and calculation of statistical significance proposed in Ref. [2], which relates the integral probability of the normal distribution to the observed p-value. This approach yields explicit expressions for both counting experiments and continuous test statistics. For counting experiments,

$$\int^{S}_{-S}N(0,1)=1-p(n_{obs})=\sum^{n_{obs}-1}_{n=0}\frac{b^{n}}{n!}e^{-b}\ , \tag{1}$$

where $S$, $n_{obs}$, $b$, and $N(0,1)$ are the signal significance, the number of observed events, the expected number of background events, and the standard normal distribution, respectively. For the continuous test,

$$\int^{S}_{-S}N(0,1)=1-p(t_{obs})=\int^{t_{obs}}_{0}\chi^{2}(t;r)dt\ , \tag{2}$$

where $t_{obs}\equiv 2[\ln L_{max}(s+b)-\ln L_{max}(b)]$, $L$ is the likelihood obtained from the fits, $L_{max}(s+b)$ is the maximum likelihood value with both signal and background included in the fit, and $L_{max}(b)$ is the maximum likelihood value with only background included in the fit. The parameter $r$ is the difference in the number of degrees of freedom between the two fits.
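As an illustration, Eqs. 1 and 2 can be evaluated numerically with the Python standard library. The sketch below is not the implementation used for the results in this paper; the function names are introduced here for illustration only. The $\chi^{2}$ tail probability is obtained for integer $r$ from the recurrence $Q(a+1,x)=Q(a,x)+x^{a}e^{-x}/\Gamma(a+1)$ of the regularized upper incomplete gamma function, and the Poisson sum is assumed to involve moderate values of $b$.

```python
import math
from statistics import NormalDist


def significance_from_p(p):
    """Solve integral_{-S}^{S} N(0,1) = 1 - p for S (two-sided convention of Eqs. 1-2)."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must satisfy 0 < p <= 1")
    # P(|Z| > S) = p  <=>  P(Z < -S) = p / 2
    return -NormalDist().inv_cdf(p / 2.0)


def poisson_p_value(n_obs, b):
    """p(n_obs) = P(n >= n_obs | b), i.e. one minus the sum in Eq. 1."""
    if n_obs <= 0:
        return 1.0
    # Sum the upper tail directly so that very small p-values keep their precision.
    n = n_obs
    term = math.exp(n * math.log(b) - b - math.lgamma(n + 1))
    total = 0.0
    while True:
        total += term
        n += 1
        term *= b / n
        if n > b and term <= 1e-17 * total:
            return min(total, 1.0)


def chi2_p_value(t_obs, r):
    """p(t_obs) = P(chi2_r >= t_obs) for t_obs > 0 and integer r >= 1 (Eq. 2)."""
    x = 0.5 * t_obs
    if r % 2:
        a, q = 0.5, math.erfc(math.sqrt(x))  # r = 1: Q(1/2, x)
    else:
        a, q = 1.0, math.exp(-x)             # r = 2: Q(1, x)
    while a < 0.5 * r:
        # Q(a + 1, x) = Q(a, x) + x**a * exp(-x) / Gamma(a + 1)
        q += math.exp(a * math.log(x) - x - math.lgamma(a + 1.0))
        a += 1.0
    return q


# Counting experiment with illustrative numbers: 10 observed events, b = 3.
print(significance_from_p(poisson_p_value(10, 3.0)))
# Continuous test: for r = 1 the result equals sqrt(t_obs) up to rounding.
print(significance_from_p(chi2_p_value(25.0, 1)))
```

For $r=1$, Eq. 2 reduces to $S=\sqrt{t_{obs}}$, which provides a convenient cross-check of any implementation.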

The significance calculation may initially appear to be a purely statistical exercise, involving the numbers of observed events, background events, and signal events, as well as the statistical uncertainties on these numbers. However, the calculation must extend beyond pure statistical analysis, as there are always systematic uncertainties related to factors such as the number of background events, the efficiency, the signal shape, and the background shape. These additional uncertainties introduce new complexities and difficulties to the significance calculation. For the counting method, approaches to account for systematic uncertainty in the significance calculation have been proposed [3; 4]. The central idea in these papers is to vary the conditions of the calculation according to the systematic uncertainty, and then select the “worst” (i.e., the lowest) significance as the final result. However, to the best of our knowledge, the treatment of systematic uncertainty in the continuous test method remains unaddressed. Therefore, this paper explores the impact of systematic uncertainties on signal significance under various typical conditions and seeks to provide a solution for properly estimating signal significance when employing the continuous test method in the presence of systematic uncertainty.

2 Effects of systematic uncertainties on significance

We determine the signal significance using the continuous test approach with a toy model constructed from ad hoc invariant mass distributions. In this model, the data consist of two components, namely signal and background, and are generated with specific numbers and distributions. We assess the impact of three types of systematic uncertainties, related to the background shape, the number of background events, and the signal shape.

To begin, we generate a one-variable sample comprising 1500 events using the following formula

$$F(x)=(1-f)\cdot G(m,\sigma;x)+f\cdot P(a_{0},a_{1};x)\ , \tag{3}$$

where $G(m,\sigma;x)$ and $P(a_{0},a_{1};x)$ are Gaussian and polynomial functions, respectively, and $f$ is the background fraction. The values of the corresponding parameters are listed in Table 1. This sample is used in all the studies that follow. A fit to the generated sample, with $F(x)$ given by Eq. 3, is presented in Fig. 1. The statistical significance is estimated to be $6.7\sigma$ using the continuous test method described by Eq. 2.

Figure 1: Fit to the generated sample using the generating function.
$N_{evt}$ | $f$ | $m$ | $\sigma$ | $a_{0}$ | $a_{1}$
1500 | 0.9 | 50 | 5 | 0.0 | -0.1
Table 1: Parameters used to generate the toy MC sample.
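The generation and fit described above can be sketched with the Python standard library. Several ingredients are assumptions made only for this illustration, since they are not specified in the text: the range of $x$ is taken as $[0,100]$, the polynomial is taken as a Chebyshev-like series $1+a_{0}T_{1}(u)+a_{1}T_{2}(u)+a_{2}T_{3}(u)$ on the rescaled variable $u\in[-1,1]$, and only the background fraction $f$ is left free in the fit, so that $r=1$ and $S=\sqrt{t_{obs}}$. Because of these choices, the result need not reproduce the $6.7\sigma$ quoted above.

```python
import math
import random

X_MIN, X_MAX = 0.0, 100.0  # assumed range of x; not stated in the text


def signal_pdf(x, m, sigma):
    """Gaussian G(m, sigma; x); truncation at the range edges is neglected."""
    z = (x - m) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def background_pdf(x, a0, a1, a2=0.0):
    """Assumed form of P: 1 + a0*T1(u) + a1*T2(u) + a2*T3(u) with u in [-1, 1]."""
    u = 2.0 * (x - X_MIN) / (X_MAX - X_MIN) - 1.0
    shape = 1.0 + a0 * u + a1 * (2.0 * u * u - 1.0) + a2 * (4.0 * u ** 3 - 3.0 * u)
    # Only T2 has a non-zero integral over [-1, 1], hence the factor (1 - a1/3).
    return shape / ((X_MAX - X_MIN) * (1.0 - a1 / 3.0))


def generate(n_evt, f, m, sigma, a0, a1, rng):
    """Draw n_evt values of x from the mixture of Eq. 3."""
    bkg_max = (1.0 + abs(a0) + abs(a1)) / ((X_MAX - X_MIN) * (1.0 - a1 / 3.0))
    sample = []
    for _ in range(n_evt):
        if rng.random() < f:
            # Background event: accept-reject under a flat envelope.
            while True:
                x = rng.uniform(X_MIN, X_MAX)
                if rng.random() * bkg_max <= background_pdf(x, a0, a1):
                    break
        else:
            # Signal event: Gaussian draw, repeated if it leaves the range.
            while True:
                x = rng.gauss(m, sigma)
                if X_MIN <= x <= X_MAX:
                    break
        sample.append(x)
    return sample


def fit_background_fraction(sample, m, sigma, a0, a1):
    """Maximise ln L over f alone (shapes fixed); return (f_hat, t_obs)."""
    g = [signal_pdf(x, m, sigma) for x in sample]
    b = [background_pdf(x, a0, a1) for x in sample]

    def nll(f):
        return -sum(math.log((1.0 - f) * gi + f * bi) for gi, bi in zip(g, b))

    # Golden-section search; the negative log-likelihood is convex in f.
    lo, hi = 0.0, 1.0
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    while hi - lo > 1e-6:
        c = hi - inv_phi * (hi - lo)
        d = lo + inv_phi * (hi - lo)
        if nll(c) < nll(d):
            hi = d
        else:
            lo = c
    f_hat = 0.5 * (lo + hi)
    # f = 1 corresponds to the background-only hypothesis.
    return f_hat, 2.0 * (nll(1.0) - nll(f_hat))


rng = random.Random(12345)  # arbitrary seed, fixed for reproducibility
sample = generate(1500, 0.9, 50.0, 5.0, 0.0, -0.1, rng)  # values from Table 1
f_hat, t_obs = fit_background_fraction(sample, 50.0, 5.0, 0.0, -0.1)
significance = math.sqrt(max(t_obs, 0.0))  # r = 1, so S = sqrt(t_obs)
print(f"f_hat = {f_hat:.3f}, t_obs = {t_obs:.1f}, S = {significance:.1f} sigma")
```

Fixing the shape parameters keeps the sketch short; a fit that also floats $m$, $\sigma$, or the polynomial coefficients would require a multidimensional minimizer and a correspondingly different $r$.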

2.1   Background shape

We investigate the impact of systematic uncertainty associated with the background shape by fitting the generated sample using the formula:

$$F(x)=(1-f)\cdot G(m,\sigma;x)+f\cdot P(a_{0},a_{1},a_{2};x)\ . \tag{4}$$

This formula is identical to the one used to generate the sample, except for an additional higher-order term in the polynomial series, represented by its coefficient $a_{2}$. We determine the significance of the signal by applying Eq. 2, with $a_{2}$ varied from $-0.4$ to $0.4$ in increments of $0.025$. It is important to note that the fit function reduces to the generating function when $a_{2}=0$. The resulting significance is presented in Fig. 2 as a function of $a_{2}$.

Figure 2: Significance as a function of the background shape, which is modified by an additional polynomial term with coefficient $a_{2}$. The vertical line marks the position where the fit function reduces to the generating function, i.e., $a_{2}=0$.

From Fig. 2, it is evident that the background shape can have an unpredictable impact on the signal significance in both direction and magnitude. Within the parameter space, the significance changes gradually around $a_{2}=0$, but increases rapidly when the value of $a_{2}$ deviates significantly from the true value, regardless of whether it is larger or smaller than the true value. In such cases, the fit function is unable to adequately describe the data. In summary, the significance may either increase or decrease, and establishing a correlation between changes in background shape and changes in significance appears to be extremely difficult.

2.2   Number of background events

We investigate the impact of the systematic uncertainty associated with the number of background events by fitting the generated sample with the generating function, while the background fraction $f$ is varied from $0.8$ to $1.0$ in increments of $0.005$. It is important to note that the fit function reduces to the generating function when $f=0.9$. The resulting significance is presented in Fig. 3 as a function of $f$.

Figure 3: Significance as a function of the background fraction $f$, which is set to a sequence of values. The vertical line marks the position where the fit function matches the generating function, i.e., $f=0.9$.

From Fig. 3, we observe that the number of background events can unexpectedly impact the signal significance. For instance, when the proportion of background events increases slightly, the significance also increases, contrary to our expectations of a decrease. This feature makes it nearly impossible to predict the variation in significance in relation to changes in the number of background events.

2.3   Signal shape

We investigate the impact of the systematic uncertainty associated with the signal shape by fitting the generated sample with the generating function, while $\sigma$, which describes the width of the signal, is varied from $3.0$ to $7.0$ in increments of $0.1$. It is important to note that the fit function reduces to the generating function when $\sigma=5.0$. The resulting significance is presented in Fig. 4 as a function of $\sigma$.

Figure 4: Significance as a function of the signal width $\sigma$, which is set to a sequence of values. The vertical line marks the position where the fit function matches the generating function, i.e., $\sigma=5.0$.

From Fig. 4, it is evident that in the vicinity of the true value, the significance increases as the signal width decreases and decreases as the signal width increases. This aligns with our expectations, as a narrower signal facilitates easier discrimination between signal and background. However, when the signal width becomes very narrow, the significance decreases. We attribute this behavior to the fit function’s inability to accurately describe the generated sample under such circumstances. Despite the seemingly linear correlation between the signal width and the significance, we conclude that it would be overly ambitious to make any qualitative predictions about the significance based on variations in the signal shape.

3 Comparison between counting method and continuous test method

It is instructive to compare the effectiveness of the continuous test method and the counting method in estimating significance. Intuitively, the continuous test should offer advantages when the signal shape differs significantly from the background shape, as it exploits the additional information carried by the shapes. However, if the signal shape closely resembles that of the background, separating signal and background in the fit would be challenging, making the counting method a better choice. To test this idea, we generated a series of samples with varying signal widths, i.e., $\sigma$ ranging from $5.0$ to $14.0$ in increments of $0.1$, and calculated the significance using the two methods described by Eqs. 1 and 2. The results, depicted in Fig. 5, confirm our expectations.

Figure 5: Significance calculated with the counting method and the continuous test method as a function of the signal width, which is set to a sequence of values.

4 Discussion

After investigating the impact of various systematic uncertainties on significance, our conclusion is that these uncertainties typically lead to unpredictable effects on signal significance. Due to the complex relation between cause and result, we recommend a conservative approach: estimate the signal significance by conducting trials with all possible combinations of uncertainties associated with the fitting procedure, and then select the “worst” outcome as the final result. Additionally, our study suggests that when the signal shape is distinguishable from the background, the continuous test method should be used for calculating signal significance, whereas the counting method is favored when the signal shape is close to that of the background. It is important to note that this study focuses solely on the significance calculation for signals of known location, without considering the look-elsewhere effect, which is beyond the scope of this paper and may require dedicated methods for further exploration.
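The recommended procedure can be sketched as a scan over all combinations of the varied fit conditions, keeping the lowest significance. In the sketch below, `worst_case_significance` is a name introduced for illustration, `placeholder_significance` is an arbitrary stand-in for a real fit (its formula carries no physical meaning), and the variation ranges are hypothetical; in practice each call would rerun the fit and evaluate Eq. 2.

```python
import itertools


def worst_case_significance(significance_fn, variations):
    """Evaluate significance_fn for every combination of the variations.

    variations maps each parameter name to the list of values to try.
    Returns (lowest significance, configuration that produced it).
    """
    names = list(variations)
    results = []
    for values in itertools.product(*(variations[name] for name in names)):
        config = dict(zip(names, values))
        results.append((significance_fn(**config), config))
    return min(results, key=lambda item: item[0])


def placeholder_significance(a2, f, sigma):
    """Stand-in for a real fit; the formula is arbitrary and purely illustrative."""
    return 6.0 + 2.0 * a2 - 5.0 * (f - 0.9) - 0.3 * (sigma - 5.0)


# Hypothetical variation ranges for the three sources studied in Section 2.
variations = {
    "a2": [-0.1, 0.0, 0.1],      # background shape
    "f": [0.88, 0.90, 0.92],     # number of background events
    "sigma": [4.5, 5.0, 5.5],    # signal shape
}
worst, config = worst_case_significance(placeholder_significance, variations)
print(worst, config)
```

The number of trials grows as the product of the numbers of variations per source, so a coarse grid may be needed when each trial involves a full fit.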

References

  • [1] P. K. Sinervo, Signal significance in particle physics, arXiv:hep-ex/0208005 [hep-ex].
  • [2] Y. S. Zhu, On statistical significance of signal, High Ener. Phys. Nucl. Phys. 30, 331 (2006).
  • [3] S. I. Bityukov, Signal significance in the presence of systematic and statistical uncertainties, JHEP 09, 060 (2002), arXiv:hep-ph/0207130 [hep-ph].
  • [4] R. D. Cousins, J. T. Linnemann and J. Tucker, Evaluation of three methods for calculating statistical significance when incorporating a systematic uncertainty into a test of the background-only hypothesis for a Poisson process, Nucl. Instrum. Meth. A 595, no. 2, 480-501 (2008), arXiv:physics/0702156 [physics.data-an].