
(a) Institute of High Energy Physics, Chinese Academy of Sciences, Beijing 100049, China
(b) University of Chinese Academy of Sciences, Beijing 100049, China

The $Higgs\to b\bar{b}, c\bar{c}, gg$ measurement at CEPC

Yongfeng Zhu (footnote 1), Hanhua Cui (a, b; footnote 2), Manqi Ruan (a, b; footnote 3)
Footnotes 1 and 2: Also at Some University.
Footnote 3: Corresponding author. [email protected]

Abstract

Accurately measuring the properties of the Higgs boson is one of the core physics objectives of the Circular Electron Positron Collider (CEPC). As a Higgs factory, the CEPC is expected to operate at a centre-of-mass energy of $240\,GeV$, deliver an integrated luminosity of $5.6\,ab^{-1}$, and produce one million Higgs bosons according to the CEPC Conceptual Design Report (CDR). Combining measurements of the $\ell^{+}\ell^{-}H$, $\nu\bar{\nu}H$, and $q\bar{q}H$ channels, we conclude that the signal strength of $H\to b\bar{b}/c\bar{c}/gg$ can be measured with a relative accuracy (relative statistical uncertainty only) of 0.27%/4.03%/1.56%. Extrapolating to the recently released TDR operating parameters, corresponding to an integrated luminosity of $20\,ab^{-1}$, the relative accuracy of the $H\to b\bar{b}/c\bar{c}/gg$ signal strength is 0.14%/2.13%/0.82% (relative statistical uncertainty only). We analyze the dependence of the expected accuracies on the critical detector performances: Color Singlet Identification (CSI) for the $q\bar{q}H$ channel and flavor tagging for both the $\nu\bar{\nu}H$ and $q\bar{q}H$ channels. Compared to the baseline CEPC detector performance, ideal flavor tagging can improve the $H\to b\bar{b}/c\bar{c}/gg$ signal strength accuracy by 2%/63%/13% in the $\nu\bar{\nu}H$ channel and by 35%/122%/181% in the $q\bar{q}H$ channel. A strong dependence of the anticipated accuracies in the $q\bar{q}H$ channel on the CSI performance is identified. The relevant systematic uncertainties are also discussed in this paper.

Keywords:
Higgs Physics, Branching fraction, CEPC
arXiv: 2203.01469

1 Introduction

Lepton Higgs factories [FCCCDR; ILC:2019gyn; Robson:2018enq; CEPCStudyGroup:2018ghi] could precisely determine the properties of the Higgs boson, provide crucial information on top of that from the HL-LHC [HLLHC], and search for New Physics signatures beyond the Standard Model (SM). Intensive studies of the physics potential of various future facilities have been conducted [ESBook], leading to the conclusion that "an electron-positron Higgs factory is the highest priority for the next collider" [ES]. Several electron-positron Higgs factories have been proposed, including the International Linear Collider (ILC) [ILC:2019gyn], the Compact Linear $e^{+}e^{-}$ Collider (CLIC) [Robson:2018enq], the Future Circular Collider $e^{+}e^{-}$ (FCC-ee) [FCC:2018evy], and the Circular Electron Positron Collider (CEPC) [CEPCStudyGroup:2018ghi].

The CEPC is designed with a circumference of 100 km and two interaction points [CEPCAcc]. It can operate at multiple centre-of-mass energies, including $240\,GeV$ as a Higgs factory, $160\,GeV$ for the $W^{+}W^{-}$ threshold scan, and $91\,GeV$ as a Z factory. The main SM processes and corresponding cross sections are shown in figure 1. It also has the potential to increase its centre-of-mass energy to $360\,GeV$ for top-quark pair production. In the future, it can be upgraded to a proton-proton collider to directly search for new physics signals at a centre-of-mass energy of about $100\,TeV$, which is an order of magnitude higher than that of the LHC. When operating at $240\,GeV$, the CEPC could produce Higgs bosons through Higgs-strahlung (ZH), WW fusion ($e^{+}e^{-}\to\nu_{e}\bar{\nu}_{e}H$), and ZZ fusion ($e^{+}e^{-}\to e^{+}e^{-}H$), with more than 96% of the Higgs bosons produced by the ZH process. The corresponding Feynman diagrams are shown in figure 2.

Figure 1: The cross sections for unpolarized $e^{+}e^{-}$ collisions [An:2018dwb]. The right side shows the expected number of events at the nominal parameters of the CEPC Higgs run at a centre-of-mass energy of $240\,GeV$.

Figure 2: The Feynman diagrams of the Higgs boson production processes in electron-positron collisions [An:2018dwb]: (a) $e^{+}e^{-}\to ZH$, (b) $e^{+}e^{-}\to\nu_{e}\bar{\nu}_{e}H$, and (c) $e^{+}e^{-}\to e^{+}e^{-}H$.

Measuring the branching fractions of the $H\to b\bar{b}/c\bar{c}/gg$ decays is one of the core CEPC physics objectives. This paper evaluates the statistical accuracies achievable for these measurements. According to the particles generated in association with the Higgs boson, the analysis channels are classified into three categories: $\ell^{+}\ell^{-}H$, $\nu\bar{\nu}H$, and $q\bar{q}H$. The expected performance in the $\ell^{+}\ell^{-}H$ channel is analyzed in ref. [Bai:2019qwd]. This paper focuses on the analysis and detector performance optimization studies in the $\nu\bar{\nu}H$ and $q\bar{q}H$ channels and combines the results from all three channels.

This paper is organized as follows. Section 2 introduces the detector, the software, and the simulated data samples used in this analysis. Section 3 presents the analyses in the $\nu\bar{\nu}H$ and $q\bar{q}H$ channels, and combines the results from all three channels. Section 4 analyzes the dependence of the expected accuracies on critical detector performances, including flavor tagging performance and Color Singlet Identification (CSI) [CSI]. Section 5 discusses various systematic uncertainties and possible strategies for controlling them. The conclusions are summarized in the last section.

2 Detector, software and data samples

Figure 3: The CEPC baseline detector [CEPCStudyGroup:2018ghi]. From inner to outer, the detector is composed of a silicon pixel vertex detector, a silicon inner tracker, a TPC, a silicon external tracker, an ECAL, an HCAL, a solenoid of 3 Tesla and a return yoke embedded with a muon detector. In the forward regions, five pairs of silicon tracking disks are installed to enlarge the tracking acceptance.

The CEPC uses a Particle Flow Oriented (PFO) detector as its baseline detector [CEPCStudyGroup:2018ghi]. This detector reconstructs and identifies all visible particles in the final state and measures their energy and momentum in the most suitable subdetector systems. From inner to outer, this detector is composed of a silicon pixel vertex detector, a silicon inner tracker, a Time Projection Chamber (TPC) surrounded by a silicon external tracker, a silicon-tungsten sampling Electromagnetic Calorimeter (ECAL), a steel-Glass Resistive Plate Chambers (GRPC) sampling Hadronic Calorimeter (HCAL), a 3 Tesla superconducting solenoid, and a flux return yoke embedded with a muon detector. The structure of the CEPC detector is shown in figure 3. Precise measurement of the $H\to b\bar{b}/c\bar{c}/gg$ branching fractions requires good jet flavor tagging performance, which is highly dependent on the CEPC vertex detector. The vertex detector is described in subsection 4.3, which focuses on the optimization of flavor tagging performance.

Figure 4: The information flow of the CEPC software chain [Zhu:2018ift].

A baseline reconstruction software chain (see figure 4) was developed to estimate the physics potential based on the simulation and reconstruction of physics objects with realistic detector effects. Unless explicitly stated otherwise, the data flow of the CEPC baseline software starts with the event generators WHIZARD [whizard] and PYTHIA 6.4 [pythia]. The detector geometry is implemented in MokkaPlus [mokka], a GEANT4-based simulation framework. MokkaPlus calculates the energy deposition in the sensitive volumes and produces simulated hits. The reconstruction modules include tracking, particle flow, and high-level reconstruction algorithms. The tracker hits are reconstructed into tracks based on CLUPATRA [track]. The Particle Flow algorithm, ARBOR [arbor], reads the reconstructed tracks and the calorimeter hits to build reconstructed particles. The physics objects, including electrons, muons, taus, missing energy, jets, etc., are reconstructed from the reconstructed particles.

The CEPC detector is expected to record at least one million Higgs boson events and about one billion SM background events, see figure 1. We classify the SM backgrounds into several categories, including two-fermion and four-fermion processes. The two-fermion processes include the $q\bar{q}$, Bhabha, $\mu^{+}\mu^{-}$, and $\tau^{+}\tau^{-}$ processes, while the four-fermion processes include the single-Z, single-W, ZZ, WW, and mixed processes. The mixed processes are used to properly model the interference between intermediate processes, for example the 4-quark final state $u\bar{u}d\bar{d}$, which can be generated from both ZZ ($Z\to u\bar{u}$, $Z\to d\bar{d}$) and $W^{+}W^{-}$ ($W^{+}\to u\bar{d}$, $W^{-}\to\bar{u}d$) processes.

The samples used in this paper were fully simulated with the CEPC baseline detector concept and reconstructed with its baseline software. The following analyses and the quoted factors are based on an integrated luminosity of $5.6\,ab^{-1}$. We simulated 217,000/566,000 $\nu\bar{\nu}H$ (including WW fusion)/$q\bar{q}H$ events, corresponding to 84%/74% of the statistics predicted by the SM. We also simulated 47 million SM background events, including all major SM processes. The ratio between the number of simulated events and the SM prediction is referred to as the scaling factor. To make the best use of limited computational resources, four-fermion backgrounds are assigned larger scaling factors (20% to 83%), while the scaling factors for two-fermion backgrounds range from 0.23% to 2.8%, because two-fermion backgrounds have huge overall statistics but are relatively easy to distinguish from the signal. The $\gamma\gamma\to hadrons$ process is also included and is generated with PYTHIA 8.3 [Bierlich:2022pfr]. In this process, two photons are radiated by the incoming electrons and their interaction yields hadrons. A detailed description of this process can be found in [gamma; Helenius:2017aqz]. Because the $\gamma\gamma\to hadrons$ background can be easily separated from the signal events, we simulated 1.36 million events, corresponding to 0.28% of its total statistics. Appendix A describes the samples in detail, including the process, cross section, expected event number, simulated event number, and scaling factor.
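The role of the scaling factor can be illustrated with a short sketch. It assumes that each simulated event is weighted by the inverse of its scaling factor when yields are normalized to $5.6\,ab^{-1}$; this weighting convention and the function names are illustrative assumptions, not taken from the analysis code.

```python
# Minimal sketch: normalizing simulated yields with the scaling factor.
# Assumption: weight = 1 / scaling factor (names are hypothetical).

def scaling_factor(n_simulated, n_expected):
    """Ratio of simulated events to the SM-predicted number of events."""
    return n_simulated / n_expected


def event_weight(scale):
    """Weight applied to each simulated event to recover the expected yield."""
    return 1.0 / scale


def expected_yield(n_selected_simulated, scale):
    """Expected number of selected events at the nominal luminosity."""
    return n_selected_simulated * event_weight(scale)


# gamma gamma -> hadrons: 1.36 million simulated events at a 0.28% scaling factor
GAMMA_GAMMA_SCALE = 0.0028
n_expected_gamma_gamma = 1.36e6 / GAMMA_GAMMA_SCALE
print(f"expected gamma-gamma events: {n_expected_gamma_gamma:.3g}")
```

With such small scaling factors, a single surviving simulated two-fermion or $\gamma\gamma\to hadrons$ event represents several hundred expected events, which is why the statistical treatment of empty selections matters for these samples.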

3 Measurement of the relative statistical uncertainties

The observables of our analyses are the numbers of $\nu\bar{\nu}H$ and $q\bar{q}H$ events in which the Higgs boson decays into jets fragmented from quarks or gluons. The analysis is generally divided into two steps. The first step is to distinguish the Higgs-to-two-jets signal events from the background events. The second step is to separate the different Higgs decay modes based on the flavor tagging information. The analysis in the $\ell^{+}\ell^{-}H$ channel is described in detail in ref. [Bai:2019qwd] and is briefly summarized in subsection 3.1. The analyses of the $\nu\bar{\nu}H$ and $q\bar{q}H$ channels are described in subsections 3.2 and 3.3, respectively.

3.1 $\ell^{+}\ell^{-}H$

This analysis is based on a centre-of-mass energy of $250\,GeV$ and an integrated luminosity of $5\,ab^{-1}$, which is slightly different from the nominal setting of a centre-of-mass energy of $240\,GeV$ and an integrated luminosity of $5.6\,ab^{-1}$. The signal events have two isolated leptons, $e^{+}e^{-}$ or $\mu^{+}\mu^{-}$, mostly from the Z-boson decay in the Higgs-strahlung process. The invariant mass and recoil mass of these two leptons should be close to the masses of the Z boson and the Higgs boson, respectively. The signal events also have two jets generated from the Higgs boson decay. Jet kinematics, i.e. the invariant mass and angle of the two jets, is used to improve the separation between signal and background.

After event selection, a template fitting method is used to determine the component fractions of the $H\to b\bar{b}$, $H\to c\bar{c}$, and $H\to gg$ processes. The relative accuracy of the $H\to b\bar{b}/c\bar{c}/gg$ signal strength, corresponding to a centre-of-mass energy of $250\,GeV$ and an integrated luminosity of $5\,ab^{-1}$, is 1.1%/10.5%/5.4% in the $\mu^{+}\mu^{-}H$ channel and 1.6%/14.7%/10.5% in the $e^{+}e^{-}H$ channel. The relative accuracy in the $\mu^{+}\mu^{-}H$ channel is better than that in the $e^{+}e^{-}H$ channel because, first, the background in the $e^{+}e^{-}H$ channel is significantly larger than that in the $\mu^{+}\mu^{-}H$ channel due to the single-Z processes, and second, the momentum resolution for $\mu^{\pm}$ is better than that for $e^{\pm}$. Extrapolating to the CEPC nominal settings under the assumption that the signal efficiencies and background suppression rates are the same at the two centre-of-mass energies, the $H\to b\bar{b}/c\bar{c}/gg$ signal strength accuracy is 1.57%/14.43%/10.31% in the $e^{+}e^{-}H$ channel and 1.06%/10.16%/5.23% in the $\mu^{+}\mu^{-}H$ channel.

3.2 $\nu\bar{\nu}H$

This subsection describes the measurement of the branching fractions of $H\to b\bar{b}/c\bar{c}/gg$ in the $\nu\bar{\nu}H$ channel. The first step separates the Higgs-to-two-jets signal events from the entire sample corresponding to the SM prediction, using a cut-based event selection and the TMVA tool [TMVA]. The cut variables are designed according to the characteristics of the signal and background, which are described below, and the cut flow is summarized in table 1. Here $\gamma\gamma$ abbreviates the $\gamma\gamma\to hadrons$ process.

| Cut | $\nu\bar{\nu}H, H\to q\bar{q}/gg$ | 2f | SW | SZ | WW | ZZ | Mixed | ZH | $\gamma\gamma$ | $\frac{\sqrt{S+B}}{S}$ (%) |
|---|---|---|---|---|---|---|---|---|---|---|
| total | 178890 | 8.01E8 | 1.95E7 | 9.07E6 | 5.08E7 | 6.39E6 | 2.18E7 | 961606 | 4.91E8 | 20.92 |
| recoilMass (GeV) $\in(74,131)$ | 157822 | 5.11E7 | 2.17E6 | 1.38E6 | 4.78E6 | 1.30E6 | 1.08E6 | 74991 | 2.69E7 | 5.98 |
| visEn (GeV) $\in(109,143)$ | 142918 | 2.37E7 | 1.35E6 | 8.81E5 | 3.60E6 | 1.03E6 | 6.29E5 | 50989 | 1.31E7 | 4.67 |
| leadLepEn (GeV) $\in(0,42)$ | 141926 | 2.08E7 | 3.65E5 | 7.24E5 | 2.81E6 | 9.72E5 | 1.34E5 | 46963 | 1.31E7 | 4.41 |
| multiplicity $\in(40,130)$ | 139545 | 1.66E7 | 2.36E5 | 5.24E5 | 2.62E6 | 9.07E5 | 4977 | 42751 | 1.24E7 | 4.15 |
| leadNeuEn (GeV) $\in(0,41)$ | 138653 | 1.46E7 | 2.24E5 | 4.72E5 | 2.49E6 | 8.69E5 | 4552 | 42303 | 1.10E7 | 3.94 |
| Pt (GeV) $\in(20,60)$ | 121212 | 248715 | 1.56E5 | 2.48E5 | 1.51E6 | 4.31E5 | 999 | 35453 | 1437 | 1.37 |
| Pl (GeV) $\in(0,50)$ | 118109 | 52784 | 1.05E5 | 74936 | 7.30E5 | 1.13E5 | 847 | 34279 | 1078 | 0.94 |
| -log10(Y23) $\in(3.375,+\infty)$ | 96156 | 40861 | 26088 | 60349 | 2.25E5 | 82560 | 640 | 10691 | 1078 | 0.76 |
| InvMass (GeV) $\in(110,134)$ | 71758 | 22200 | 11059 | 6308 | 77912 | 13680 | 248 | 6915 | 359 | 0.64 |
| BDT $\in(-0.02,1)$ | 60887 | 9140 | 266 | 2521 | 3761 | 3916 | 58 | 1897 | $0^{*}$ | 0.47 |

Table 1: The event selection of $\nu\bar{\nu}H(H\to q\bar{q}/gg)$, based on an integrated luminosity of $5.6\,ab^{-1}$. Here $\gamma\gamma$ abbreviates the $\gamma\gamma\to hadrons$ process. The symbol $0^{*}$ indicates that the number of $\gamma\gamma\to hadrons$ events is less than $3.09/0.0028$ at a confidence level of 95% according to the Feldman-Cousins approach in the case of observing zero events, where 3.09 is quoted from ref. [Feldman:1997qc] and 0.0028 is the scaling factor of the $\gamma\gamma\to hadrons$ process.
  • Most of the $\nu\bar{\nu}H$ events are from the ZH process with $Z\to\nu\bar{\nu}$, while WW fusion contributes about 13% (ignoring the interference when calculating the contribution of WW fusion to $\nu\bar{\nu}H$). The signal events contain only the jets from the Higgs boson decay. Therefore, the signal should have a recoil mass peak at the mass of the Z boson. However, the SM process $\nu\bar{\nu}Z(Z\to q\bar{q})$ is an irreducible background for this analysis. In the recoil mass distribution (see the left plot of figure 5), the signal and the $\nu\bar{\nu}Z(Z\to q\bar{q})$ background peak at the mass of the Z boson, and the other SM backgrounds peak at the two sides of the distribution. An optimized cut on the recoil mass has a signal efficiency of 88% and reduces the background by more than one order of magnitude, see the first row in table 1. The signal has significant invisible energy, so the visible energy (visEn, the second row in table 1) is about half of the total energy. The leptonic and semi-leptonic backgrounds have high-energy leptons in the final state, and the fully hadronic backgrounds have a larger multiplicity in the final state. Therefore, the leading lepton energy (leadLepEn, the third row in table 1) is used to suppress leptonic and semi-leptonic backgrounds, and the multiplicity (the fourth row in table 1) is used to suppress leptonic and some fully hadronic backgrounds. With the above cuts, 0.01% of the leptonic backgrounds, 9.24% of the semi-leptonic backgrounds and 4.82% of the fully hadronic backgrounds remain in the selected sample.

    Figure 5: Left: the distribution of the recoil mass for all SM samples. Right: the distribution of the invariant mass of the SM samples after the Y23 cut. Both plots show $\nu\bar{\nu}H(H\to q\bar{q})$, $\nu\bar{\nu}Z(Z\to q\bar{q})$, and the other backgrounds.
  • The remaining backgrounds are dominated by 2f processes consisting of $e^{+}e^{-}\to q\bar{q}$. The $e^{+}e^{-}\to q\bar{q}$ backgrounds in which high-energy Initial State Radiation (ISR) is detected by the detector have energetic neutral particles in the final state. Meanwhile, the final state particles of $e^{+}e^{-}\to q\bar{q}$ tend to fly into the end-cap region, so the visible transverse momentum ($Pt$, the sixth row in table 1) is lower and the visible longitudinal momentum ($Pl$, the seventh row in table 1) is higher than in the signal events. With cuts on the leading neutral energy (leadNeuEn, the fifth row in table 1), $Pt$, and $Pl$, 0.3% of the $e^{+}e^{-}\to q\bar{q}$ backgrounds remain in the selected sample.

  • A cut on Y23 (footnote 4) [Catani:1991hj; Cacciari:2011ma] is applied to suppress the backgrounds, resulting in an improvement of the signal strength accuracy from 0.94% to 0.76%, see the eighth row in table 1. The signal should have an invariant mass (InvMass, the ninth row in table 1) close to that of the Higgs boson. After applying the above cuts, the distributions of the invariant mass for signal and backgrounds are shown in the right plot of figure 5. The $\nu\bar{\nu}Z(Z\to q\bar{q})$ background peaks at the Z boson mass, while the other SM backgrounds exhibit a flat distribution with statistics comparable to those of $\nu\bar{\nu}Z(Z\to q\bar{q})$ and the signal. The cut on the invariant mass preserves 75% of the remaining signal events and vetoes 69% of the remaining backgrounds.

    Footnote 4: The Durham distance at which a two-jet system can be reconstructed into a three-jet system.
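The cut-based selection described above can be sketched as a sequence of window cuts. The windows below are those listed in table 1; the dictionary keys, the helper names, and the treatment of the interval boundaries (taken as inclusive here) are illustrative assumptions. The recoil mass follows the standard definition $M_{recoil}^{2}=(\sqrt{s}-E)^{2}-|\vec{p}|^{2}$ for the visible system.

```python
import math

# Cut windows from table 1 (nu nu H channel); key names are hypothetical.
CUTS = {
    "recoilMass": (74.0, 131.0),          # GeV
    "visEn": (109.0, 143.0),              # GeV
    "leadLepEn": (0.0, 42.0),             # GeV
    "multiplicity": (40, 130),
    "leadNeuEn": (0.0, 41.0),             # GeV
    "Pt": (20.0, 60.0),                   # GeV
    "Pl": (0.0, 50.0),                    # GeV
    "minusLog10Y23": (3.375, math.inf),
    "InvMass": (110.0, 134.0),            # GeV
}


def recoil_mass(sqrt_s, energy, px, py, pz):
    """Mass recoiling against a visible system with four-momentum (E, p)."""
    m2 = (sqrt_s - energy) ** 2 - (px * px + py * py + pz * pz)
    return math.sqrt(max(m2, 0.0))


def passes_selection(event, cuts=CUTS):
    """True if every variable lies inside its window (boundaries inclusive, an assumption)."""
    return all(lo <= event[name] <= hi for name, (lo, hi) in cuts.items())


def cut_flow(events, cuts=CUTS):
    """Number of events surviving after each successive cut."""
    remaining = list(events)
    flow = []
    for name, (lo, hi) in cuts.items():
        remaining = [e for e in remaining if lo <= e[name] <= hi]
        flow.append((name, len(remaining)))
    return flow
```

Applying `cut_flow` to weighted signal and background samples would yield rows analogous to those of table 1; event weights are omitted here for brevity.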

After the cut-based event selection, a Boosted Decision Tree (BDT) is implemented to further improve the selection performance. The input variables include the cut variables mentioned above and the four-momenta of the two jets. Figure 6 shows the BDT responses. With the optimized cut of the BDT response at 0.02, over 71% of the backgrounds are rejected at the cost of a 10% signal loss. With a signal efficiency of 34%, the first step reduces the background by more than four orders of magnitude, resulting in a relative accuracy of 0.47% in the measurement of $\nu\bar{\nu}H, H\to q\bar{q}/gg$. The selection efficiency for $H\to b\bar{b}/c\bar{c}/gg$ is 35%/33%/35%. The inhomogeneity in the $H\to b\bar{b}/c\bar{c}/gg$ selection is mainly caused by the cut variables Y23 and invariant mass, which are discussed in section 5.
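The figure of merit in the last column of table 1 is $\sqrt{S+B}/S$. As a quick cross-check, the sketch below evaluates it for the BDT row of table 1; the function and variable names are illustrative.

```python
import math


def relative_accuracy(signal, backgrounds):
    """Relative statistical uncertainty sqrt(S + B) / S of the signal yield."""
    total_background = sum(backgrounds)
    return math.sqrt(signal + total_background) / signal


# BDT row of table 1: signal, then 2f, SW, SZ, WW, ZZ, Mixed, ZH, gamma-gamma
SIGNAL_AFTER_BDT = 60887
BACKGROUNDS_AFTER_BDT = [9140, 266, 2521, 3761, 3916, 58, 1897, 0]
SIGNAL_TOTAL = 178890

accuracy_percent = 100.0 * relative_accuracy(SIGNAL_AFTER_BDT, BACKGROUNDS_AFTER_BDT)
efficiency = SIGNAL_AFTER_BDT / SIGNAL_TOTAL
print(f"relative accuracy: {accuracy_percent:.2f}%, signal efficiency: {efficiency:.0%}")
```

The result agrees with the 0.47% accuracy and the 34% signal efficiency quoted in the text.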

Figure 6: BDT output distributions for signal and background events. The samples used here are those that passed all the cuts introduced above.

The second step focuses on the separation of the different Higgs decay modes and can be divided into two stages. The first stage aims to obtain the optimized flavor tagging performance matrix (see below) of the CEPC baseline detector, and the second stage calculates the signal strength accuracy.

Figure 7: The distributions of b/c-likeness for $\nu\bar{\nu}H(H\to b\bar{b})$ (top left), $\nu\bar{\nu}H(H\to c\bar{c})$ (top right), $\nu\bar{\nu}H(H\to gg)$ (middle left), and SM backgrounds (middle right). The optimized flavor tagging performance matrix is shown in the bottom plot, where each element represents the flavor identification efficiency.

In the first stage, the particles in the final state are forced into two jets by using the Durham [Catani:1991hj] jet clustering algorithm implemented in the LCFIPlus software package [LCFIPlus]. The jet clustering algorithm considers single reconstructed particles and composite objects such as reconstructed secondary vertices as basic candidates. For each jet, the flavor tagging algorithm is used to calculate its likeness to reference samples of b or c jets. The flavor tagging used in this work is also implemented in the LCFIPlus software package and is performed using a Gradient Boosted Decision Tree (GBDT). The training is applied to the simulated $Z\to q\bar{q}$ sample produced at $\sqrt{s}$ of $91.2\,GeV$. The reconstructed jets in the sample are divided into four categories depending on the number of secondary vertices and isolated leptons in the jet: jets with a secondary vertex and a lepton, jets with a secondary vertex but without a lepton, jets without a secondary vertex but with a lepton, and jets without a secondary vertex or a lepton. In each category, two types of flavor tagging algorithms are trained using the GBDT method, one for b-tagging and the other for c-tagging.

Figure 8: The distributions of $\nu\bar{\nu}H(H\to b\bar{b})$ (top left), $\nu\bar{\nu}H(H\to c\bar{c})$ (top right), $\nu\bar{\nu}H(H\to gg)$ (bottom left), and backgrounds (bottom right) based on the optimized flavor tagging performance matrix.

The distributions of b/c-likeness are shown in figure 7 for $\nu\bar{\nu}H(H\to b\bar{b})$, $\nu\bar{\nu}H(H\to c\bar{c})$, $\nu\bar{\nu}H(H\to gg)$, and the remaining SM backgrounds. The phase space spanned by the b/c-likeness is divided into three regions corresponding to identified b jets, c jets, and gluon jets. We then obtain the fraction of b jets identified as b jets, of b jets identified as c jets, and so on. These fractions can be represented by a migration matrix, the form of which is shown in figure 7. We optimize the working point (the phase space separation) to maximize the trace of the migration matrix. The optimized migration matrix is shown in the bottom plot of figure 7. In principle, the working point can be optimized independently for the $H\to b\bar{b}$, $c\bar{c}$, and $gg$ measurements. We evaluated the corresponding performance and found that the final accuracy improves only at the sub-percent level. Since the improvement is not significant, a uniform matrix for $\nu\bar{\nu}H(H\to b\bar{b}/c\bar{c}/gg)$ is used for simplicity. According to the identified jet-flavor combinations, the signal events and backgrounds are classified into six categories (see figure 8).
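The working-point optimization can be sketched as a scan that maximizes the trace of the migration matrix. The exact shape of the three regions in the b/c-likeness plane is not specified in the text, so the sketch below assumes two rectangular thresholds (a jet is called b if its b-likeness exceeds `b_cut`, otherwise c if its c-likeness exceeds `c_cut`, otherwise gluon). This region definition, the toy jets, and all names are illustrative assumptions.

```python
from itertools import product

FLAVORS = ("b", "c", "g")


def classify(b_like, c_like, b_cut, c_cut):
    """Assign a flavor from the b/c-likeness using two thresholds (assumed region shape)."""
    if b_like > b_cut:
        return "b"
    if c_like > c_cut:
        return "c"
    return "g"


def migration_matrix(jets, b_cut, c_cut):
    """matrix[true][reco]: fraction of jets of a true flavor identified as each flavor."""
    counts = {t: {r: 0 for r in FLAVORS} for t in FLAVORS}
    for true_flavor, b_like, c_like in jets:
        counts[true_flavor][classify(b_like, c_like, b_cut, c_cut)] += 1
    matrix = {}
    for t in FLAVORS:
        n_true = sum(counts[t].values())
        matrix[t] = {r: (counts[t][r] / n_true if n_true else 0.0) for r in FLAVORS}
    return matrix


def trace(matrix):
    """Sum of the diagonal (correct identification) efficiencies."""
    return sum(matrix[f][f] for f in FLAVORS)


def optimize_working_point(jets, grid):
    """Grid scan for the thresholds that maximize the trace of the migration matrix."""
    return max(product(grid, grid),
               key=lambda cuts: trace(migration_matrix(jets, *cuts)))


def event_category(flavor_1, flavor_2):
    """Unordered flavor pair of the two jets: bb, bc, bg, cc, cg, or gg."""
    return "".join(sorted((flavor_1, flavor_2)))


# Toy jets: (true flavor, b-likeness, c-likeness); values are made up for illustration.
TOY_JETS = [("b", 0.9, 0.1), ("b", 0.8, 0.3), ("c", 0.2, 0.7),
            ("c", 0.4, 0.6), ("g", 0.1, 0.1), ("g", 0.3, 0.2)]
GRID = [0.1 * k for k in range(1, 10)]
best_cuts = optimize_working_point(TOY_JETS, GRID)
```

The unordered pairs returned by `event_category` correspond to the six categories into which the events are classified.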

In the second stage, the relative accuracy of the signal strength is calculated from the log-likelihood function of eq. (1) [Zyla:2020zbs; MLE],

$$-2\cdot\log(\ell)=\sum_{i=1}^{6}\frac{[S_{b}\cdot N_{b,i}+S_{c}\cdot N_{c,i}+S_{g}\cdot N_{g,i}+N_{bkg,i}-N_{i}]^{2}}{N_{i}}, \qquad (1)$$

where $S_{b}$ represents the signal strength of $\nu\bar{\nu}H(H\to b\bar{b})$, $N_{b,i}$ represents the event count of $\nu\bar{\nu}H(H\to b\bar{b})$ in the $i$-th bin, $N_{bkg,i}$ represents the event count of the backgrounds in the $i$-th bin, and $N_{i}$ represents the total event count ($\nu\bar{\nu}H$ with $H\to b\bar{b}/c\bar{c}/gg$ and backgrounds) in the $i$-th bin; $S_{c}$, $S_{g}$, $N_{c,i}$, and $N_{g,i}$ are defined analogously. The error covariance matrix is obtained from the Hessian matrix of the log-likelihood function with respect to the three signal strengths. The relative accuracies of the signal strengths are the square roots of the diagonal elements of the covariance matrix. They are 0.49%/5.75%/1.82% for $\nu\bar{\nu}H(H\to b\bar{b}/c\bar{c}/gg)$.
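For the quadratic form of eq. (1), the Hessian with respect to the signal strengths is $H_{jk}=2\sum_{i}N_{j,i}N_{k,i}/N_{i}$, so the covariance matrix is $2H^{-1}=\left(\sum_{i}N_{j,i}N_{k,i}/N_{i}\right)^{-1}$. The sketch below implements this under the assumption that $N_{i}$ is fixed to the SM expectation (all signal strengths equal to one); the function names and the toy inputs used to exercise it are illustrative, not the analysis templates.

```python
import math


def fisher_matrix(templates, background):
    """F[j][k] = sum_i N_{j,i} * N_{k,i} / N_i, i.e. half the Hessian of eq. (1)."""
    n_sig = len(templates)
    n_bins = len(background)
    totals = [sum(t[i] for t in templates) + background[i] for i in range(n_bins)]
    return [[sum(templates[j][i] * templates[k][i] / totals[i]
                 for i in range(n_bins) if totals[i] > 0)
             for k in range(n_sig)]
            for j in range(n_sig)]


def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting for a small square matrix."""
    n = len(matrix)
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if aug[pivot][col] == 0.0:
            raise ValueError("singular matrix: templates are degenerate")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def signal_strength_uncertainties(templates, background):
    """Square roots of the diagonal of the covariance matrix (relative accuracies)."""
    covariance = invert(fisher_matrix(templates, background))
    return [math.sqrt(covariance[k][k]) for k in range(len(templates))]
```

When the three templates populate disjoint bins without background, the result reduces to $1/\sqrt{N}$ per decay mode; overlapping templates, caused by flavor mis-identification, inflate the uncertainties through the off-diagonal terms.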

3.3 $q\bar{q}H$

In this subsection, the accuracy of the $q\bar{q}H(H\to b\bar{b}/c\bar{c}/gg)$ signal strength is analyzed. The analysis is similar to that in the $\nu\bar{\nu}H$ channel. Since the backgrounds consist of leptonic, semi-leptonic, and fully hadronic samples, the first step is divided into three stages to select the signal events step by step.

  1. Selection of fully hadronic events.

  2. Selection of events with 4-jet topology.

  3. Selection of ZH events.

| Cut | $q\bar{q}H, H\to q\bar{q}/gg$ | 2f | SW | SZ | WW | ZZ | Mixed | ZH | $\gamma\gamma$ | $\frac{\sqrt{S+B}}{S}$ (%) |
|---|---|---|---|---|---|---|---|---|---|---|
| total | 527488 | 8.01E8 | 1.95E7 | 9.07E6 | 5.08E7 | 6.39E6 | 2.18E7 | 613008 | 4.91E8 | 7.10 |
| multiplicity $\in(27,+\infty)$ | 527488 | 3.04E8 | 1.46E7 | 3.37E6 | 4.85E7 | 6.00E6 | 1.81E7 | 577930 | 4.12E8 | 5.39 |
| leadLepEn (GeV) $\in(0,59)$ | 527036 | 2.98E8 | 6.76E6 | 2.44E6 | 3.93E7 | 5.40E6 | 1.79E7 | 531411 | 4.12E8 | 5.31 |
| visEn (GeV) $\in(199,278)$ | 510731 | 1.21E8 | 1.29E6 | 551105 | 2.14E7 | 3.06E6 | 1.71E7 | 180571 | 22643 | 2.52 |
| leadNeuEn (GeV) $\in(0,57)$ | 509623 | 5.68E7 | 716161 | 168030 | 2.04E7 | 2.93E6 | 1.65E7 | 176387 | 21205 | 1.94 |
| thrust $\in(0,0.86)$ | 460535 | 7.81E6 | 473732 | 132126 | 1.88E7 | 2.60E6 | 1.54E7 | 167863 | 6110 | 1.47 |
| $-log(Y_{34})$ $\in(0,5.8875)$ | 451468 | 4.90E6 | 181432 | 119836 | 1.74E7 | 2.40E6 | 1.45E7 | 165961 | 4672 | 1.40 |
| HiggsJetsA $\in(2.18,4)$ | 326207 | 2.83E6 | 110156 | 58613 | 4.54E6 | 870276 | 3.74E6 | 96560 | 2156 | 1.08 |
| ZJetsA $\in(1.97,4)$ | 279030 | 1.37E6 | 33491 | 37101 | 2.39E6 | 496611 | 2.00E6 | 74005 | 1797 | 0.93 |
| ZHiggsA $\in(2.32,4)$ | 274530 | 1.32E6 | 17026 | 33847 | 2.28E6 | 468340 | 1.91E6 | 69620 | 1797 | 0.92 |
| circle | 268271 | 1.20E6 | 10193 | 31567 | 2.13E6 | 424514 | 1.79E6 | 65434 | $0^{*}$ | 0.90 |
| BDT $\in(0.02,1)$ | 192278 | 378300 | 40 | 307 | 271436 | 141446 | 244126 | 30022 | $0^{*}$ | 0.57 |

Table 2: The event selection of $q\bar{q}H(H\to q\bar{q}/gg)$, based on an integrated luminosity of $5.6\,ab^{-1}$. Here $\gamma\gamma$ abbreviates the $\gamma\gamma\to hadrons$ process. The symbol $0^{*}$ indicates that the number of $\gamma\gamma\to hadrons$ events is less than $3.09/0.0028$ at a confidence level of 95% according to the Feldman-Cousins approach in the case of observing zero events, where 3.09 is quoted from ref. [Feldman:1997qc] and 0.0028 is the scaling factor of the $\gamma\gamma\to hadrons$ process.

The cut flow corresponding to these three stages is given in table 2, where $\gamma\gamma$ abbreviates the $\gamma\gamma\to hadrons$ process. The first stage aims to suppress the leptonic and semi-leptonic backgrounds that have low multiplicity, high-energy leptons ($e^{\pm}/\mu^{\pm}$), or invisible leptons ($\nu/\bar{\nu}$). Thus, with cuts on the multiplicity (the first row in table 2), the leading lepton energy (leadLepEn, the second row in table 2), and the visible energy (visEn, the third row in table 2), the background statistics are reduced to 26%. The cut on the visible energy can also veto some $e^{+}e^{-}\to q\bar{q}$ backgrounds with high-energy ISR that has escaped the detector. In the second stage, cuts on the leading neutral energy (leadNeuEn, the fourth row in table 2, which aims to suppress $e^{+}e^{-}\to q\bar{q}$ with high-energy ISR detected by the detector), the thrust (footnote 5) [ATLAS:2020vup] (the fifth row in table 2), and Y34 (footnote 6) (the sixth row in table 2) are used to select 4-quark samples from the fully hadronic samples. The second stage reduces the remaining background by almost an order of magnitude at the cost of losing 11% of the signal events.

Footnote 5: To evaluate the thrust of an event, first determine the thrust axis $n_{T}$, which is the direction of maximum momentum flow. The thrust is then defined as the fraction of the particle momentum that flows along the thrust axis.

Footnote 6: The Durham distance at which a three-jet system can be reconstructed into a four-jet system.

Since the signal contains only four jets, the particles in the final state are first forced into four jets using the Durham algorithm. Because there are two bosons in the signal, the four jets in the final state are then paired into two di-jet systems by minimizing the $\chi^{2}$ of eq. (2),

$$\chi^{2}=\frac{(M_{12}-M_{B1})^{2}}{\sigma_{B1}^{2}}+\frac{(M_{34}-M_{B2})^{2}}{\sigma_{B2}^{2}}, \qquad (2)$$

where $M_{12}$ and $M_{34}$ are the masses of the di-jet systems, and $M_{B1}$ and $M_{B2}$ are the reference masses of the Z, W, or Higgs boson. The $\sigma$ is the convolution of the boson width and the detector resolution. According to [CEPCStudyGroup:2018ghi], the detector resolution is 4% of the boson mass. After pairing the four jets into two di-jet systems, we refer to the di-jet system with the larger invariant mass ($M_{heavy}$) as the heavy di-jet system and the one with the smaller invariant mass ($M_{light}$) as the light di-jet system. Several angular variables can be used to separate the signal from the remaining backgrounds: the angle between the two jets of the heavy di-jet system (HiggsJetsA, the seventh row in table 2), the angle between the two jets of the light di-jet system (ZJetsA, the eighth row in table 2), and the angle between the two di-jet systems (ZHiggsA, the ninth row in table 2). These three cut variables reject more than 84% of the backgrounds (from $3.97\times 10^{7}$ to $6.10\times 10^{6}$). For the signal, one di-jet system should have an invariant mass near that of the Higgs boson and the other should have an invariant mass near that of the Z boson. A circular selection (the tenth row in table 2), $(M_{heavy}-125)^{2}+(M_{light}-91)^{2}\le 29^{2}$, can then be used to select signal events, as shown in figure 9.
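The pairing of eq. (2) and the circular selection can be sketched as follows. The sketch assumes the ZH hypothesis, models $\sigma$ as the quadrature sum of the boson width and a 4% mass resolution (one possible reading of "convolution"), and uses illustrative width values; the jets are represented as $(E, p_x, p_y, p_z)$ tuples and all names are hypothetical.

```python
import math

M_Z, M_H = 91.2, 125.0          # reference masses in GeV
WIDTH_Z, WIDTH_H = 2.5, 0.004   # approximate widths in GeV (illustrative values)

# The three ways of splitting four jets into two pairs
PAIRINGS = (((0, 1), (2, 3)), ((0, 2), (1, 3)), ((0, 3), (1, 2)))


def dijet_mass(jet_a, jet_b):
    """Invariant mass of two jets given as (E, px, py, pz)."""
    e, px, py, pz = (a + b for a, b in zip(jet_a, jet_b))
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))


def sigma(mass, width, resolution=0.04):
    """Assumed quadrature sum of the boson width and the detector resolution."""
    return math.hypot(width, resolution * mass)


def best_pairing(jets):
    """Return (chi2, z_pair, h_pair) minimizing eq. (2) under the ZH hypothesis."""
    best = None
    for pair_1, pair_2 in PAIRINGS:
        # Try both assignments of the two pairs to the Z and the Higgs boson
        for z_pair, h_pair in ((pair_1, pair_2), (pair_2, pair_1)):
            m_z = dijet_mass(jets[z_pair[0]], jets[z_pair[1]])
            m_h = dijet_mass(jets[h_pair[0]], jets[h_pair[1]])
            chi2 = ((m_z - M_Z) / sigma(M_Z, WIDTH_Z)) ** 2 \
                 + ((m_h - M_H) / sigma(M_H, WIDTH_H)) ** 2
            if best is None or chi2 < best[0]:
                best = (chi2, z_pair, h_pair)
    return best


def passes_circle(m_heavy, m_light, radius=29.0):
    """Circular selection (M_heavy - 125)^2 + (M_light - 91)^2 <= 29^2."""
    return (m_heavy - 125.0) ** 2 + (m_light - 91.0) ** 2 <= radius ** 2
```

For the WW and ZZ hypotheses mentioned in the text, the same function could be reused with different reference masses and widths.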

Figure 9: The mass distributions of the two di-jet systems. The left plot refers to the signal, the right plot to the backgrounds.

Figure 10: The optimized flavor tagging performance matrix for the heavy di-jet system (upper left) and the light di-jet system (upper right). The distributions of $q\bar{q}H(H\to b\bar{b})$ (middle left), $q\bar{q}H(H\to c\bar{c})$ (middle right), $q\bar{q}H(H\to gg)$ (bottom left), and backgrounds (bottom right).

To fully exploit the characteristics of the signal and backgrounds, a BDT is used to suppress the backgrounds. The input variables include the cut variables mentioned above, the four-momenta of the four jets, and several event shape variables ATLAS:2020vup , including the max-broadening (footnote 7), the C-parameter (footnote 8), and the D-parameter (footnote 9). Finally, the total SM background is reduced to 1.07 million events, while more than 36% of the total $q\bar{q}H(H\to q\bar{q}/gg)$ signal events survive, resulting in a relative uncertainty of 0.57%.

Footnote 7: The max-broadening is related to the transverse momentum measured with respect to the thrust axis. A plane perpendicular to the thrust axis divides the space into two hemispheres. The jet broadening of each hemisphere is defined as $B=\frac{1}{2\sum_{j=1}^{N_{particles}}|P_{j}|}\sum_{i=1,P_{i}\cdot n_{T}>0}^{N_{particles}}|P_{i}\times n_{T}|$, where $P_{i}$ is the 3-momentum of particle i, $n_{T}$ is the thrust axis, and the condition $P_{i}\cdot n_{T}>0$ ($P_{i}\cdot n_{T}<0$) assigns the particles to the two hemispheres. The max-broadening is the larger of the two hemisphere broadenings.

Footnote 8: The C-parameter is computed from the linearized sphericity tensor, defined as $L^{ab}=\frac{1}{\sum_{j=1}^{N_{particles}}|P_{j}|}\sum_{i=1}^{N_{particles}}\frac{P_{i}^{a}P_{i}^{b}}{|P_{i}|}$, where $P_{i}$ is the 3-momentum of particle i and $P_{i}^{a}$ denotes component a of that 3-momentum. The C-parameter is then $C=3(\lambda_{1}\lambda_{2}+\lambda_{1}\lambda_{3}+\lambda_{2}\lambda_{3})$, where $\lambda_{1}$, $\lambda_{2}$, and $\lambda_{3}$ are the eigenvalues of $L^{ab}$.

Footnote 9: The D-parameter is calculated as $D=27\cdot\lambda_{1}\cdot\lambda_{2}\cdot\lambda_{3}$.
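The event shape definitions above translate directly into a few lines of NumPy. The sketch below is only an illustration: the function names are hypothetical, the thrust axis is taken as an input rather than computed, and all particles are assumed to have non-zero momentum.

```python
import numpy as np


def linearized_sphericity(momenta):
    """L^{ab} = sum_i P_i^a P_i^b / |P_i|, divided by sum_j |P_j|."""
    p = np.asarray(momenta, dtype=float)
    norms = np.linalg.norm(p, axis=1)
    return np.einsum("ia,ib->ab", p / norms[:, None], p) / norms.sum()


def c_and_d_parameters(momenta):
    """C = 3 (l1 l2 + l1 l3 + l2 l3) and D = 27 l1 l2 l3 from the eigenvalues of L."""
    l1, l2, l3 = np.linalg.eigvalsh(linearized_sphericity(momenta))
    return 3.0 * (l1 * l2 + l1 * l3 + l2 * l3), 27.0 * l1 * l2 * l3


def max_broadening(momenta, thrust_axis):
    """Larger of the two hemisphere broadenings for a given thrust axis."""
    p = np.asarray(momenta, dtype=float)
    n = np.asarray(thrust_axis, dtype=float)
    n = n / np.linalg.norm(n)
    denom = 2.0 * np.linalg.norm(p, axis=1).sum()
    transverse = np.linalg.norm(np.cross(p, n), axis=1)  # |P_i x n_T|
    along = p @ n                                        # P_i . n_T
    return max(transverse[along > 0].sum(), transverse[along < 0].sum()) / denom
```

As a sanity check, a perfectly isotropic configuration gives C = D = 1, while a pencil-like back-to-back configuration gives C = D = 0.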

After the event selection, an optimized flavor tagging performance matrix is obtained by setting an optimized working point on the b/c-likeness distributions of the two jets from the heavy di-jet system; it is shown in the top row of figure 10. Compared to $\nu\bar{\nu}H$, the diagonal elements decrease by 13%/8%/10% for $b/c/g$. In other words, the identification performance for $b/c/g$ jets in the $q\bar{q}H$ channel is slightly worse than in the $\nu\bar{\nu}H$ channel, because the visible decay products of the Z boson degrade the jet clustering performance. Based on the optimized flavor tagging performance matrix, the identified flavor combinations of $q\bar{q}H(H\to b\bar{b})$, $q\bar{q}H(H\to c\bar{c})$, $q\bar{q}H(H\to gg)$, and backgrounds are shown in figure 10. The x and y axes represent the flavors of the two jets from the heavy and light di-jet systems, respectively. The light di-jet system in signal events corresponds to the Z boson. Because of the imperfect flavor tagging performance, the Z boson decay pattern in figure 10 is not clearly visible. Using a log-likelihood function similar to that of the $\nu\bar{\nu}H$ analysis, the relative accuracy for $q\bar{q}H(H\to b\bar{b}/c\bar{c}/gg)$ is calculated to be 0.35%/7.74%/3.96%.

3.4 Combination

To first order, the $H\to b\bar{b}/c\bar{c}/gg$ signal strength is measured independently in three channels: $\nu\bar{\nu}H$, $q\bar{q}H$, and $\ell^{+}\ell^{-}H$. At a centre-of-mass energy of $240\,GeV$ and an integrated luminosity of $5.6\,ab^{-1}$, the relative statistical accuracy of the $H\to b\bar{b}/c\bar{c}/gg$ signal strength reaches 0.27%/4.03%/1.56% when these three channels are combined. The results are summarized in table 3. According to the recently released Snowmass report Gao:2022lew , the CEPC operating at $240\,GeV$ will accumulate an integrated luminosity of $20\,ab^{-1}$. The relative statistical accuracy of the $H\to b\bar{b}/c\bar{c}/gg$ signal strength would then be 0.14%/2.13%/0.82%.

Z decay mode | $H\to b\bar{b}$ | $H\to c\bar{c}$ | $H\to gg$
$Z\to e^{+}e^{-}$ | 1.57% | 14.43% | 10.31%
$Z\to\mu^{+}\mu^{-}$ | 1.06% | 10.16% | 5.23%
$Z\to q\bar{q}$ | 0.35% | 7.74% | 3.96%
$Z\to\nu\bar{\nu}$ | 0.49% | 5.75% | 1.82%
combination | 0.27% | 4.03% | 1.56%
Table 3: The signal strength accuracies for different channels.
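The combination row of table 3 can be cross-checked with a short calculation. The sketch below assumes that the four Z decay modes are statistically independent and are combined by inverse-variance weighting, and that statistical uncertainties scale as the inverse square root of the integrated luminosity. Both are assumptions of this illustration, not a description of the fit actually performed; the function names are hypothetical.

```python
import math

# Relative statistical uncertainties in percent, copied from table 3.
TABLE3 = {
    "bb": {"ee": 1.57, "mumu": 1.06, "qq": 0.35, "nunu": 0.49},
    "cc": {"ee": 14.43, "mumu": 10.16, "qq": 7.74, "nunu": 5.75},
    "gg": {"ee": 10.31, "mumu": 5.23, "qq": 3.96, "nunu": 1.82},
}


def combine(uncertainties):
    """Inverse-variance combination of independent measurements."""
    return 1.0 / math.sqrt(sum(1.0 / u ** 2 for u in uncertainties))


def scale_to_luminosity(uncertainty, lumi_from=5.6, lumi_to=20.0):
    """Rescale a statistical uncertainty assuming it goes as 1/sqrt(luminosity)."""
    return uncertainty * math.sqrt(lumi_from / lumi_to)


if __name__ == "__main__":
    for mode, channels in TABLE3.items():
        combined = combine(channels.values())
        print(mode, f"{combined:.3f}%", f"{scale_to_luminosity(combined):.3f}%")
```

Under these assumptions the combined values agree with the combination row of table 3, and the rescaled values agree with the quoted $20\,ab^{-1}$ projections to within rounding.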

4 Dependence of accuracies on critical detector performances

Flavor tagging and color singlet identification (CSI), i.e. the reconstruction of a color singlet that decays into two jets, are the two critical detector performances for the $H\to b\bar{b}/c\bar{c}/gg$ signal strength measurement. In the $\nu\bar{\nu}H$ channel, the impact of the flavor tagging performance is analyzed in subsection 4.1. In the $q\bar{q}H$ channel there are four jets from the two bosons, so the critical detector performances include not only flavor tagging but also CSI; their impact on the anticipated physics reach is evaluated in subsection 4.2.

4.1 $\nu\bar{\nu}H$: Flavor tagging

The flavor tagging performance can be described by the migration matrix, which is defined by the bottom plot in figure 7. We use three reference points for the migration matrix: the identity matrix, corresponding to perfect flavor tagging; the flat matrix (all elements equal to one-third), corresponding to no flavor tagging; and the matrix of the CEPC baseline detector, shown in the bottom plot of figure 7.

Figure 11: The dependence of the $\nu\bar{\nu}H(H\to b\bar{b}/c\bar{c}/gg)$ signal strength accuracy on the flavor tagging performance. The left plot is for $\nu\bar{\nu}H(H\to b\bar{b})$, and the right plot is for $\nu\bar{\nu}H(H\to c\bar{c}/gg)$. The bigger markers correspond to the results of the CEPC baseline detector. When the key vertex detector parameters (inner radius, material budget, and spatial resolution) are changed by a factor of 0.5/2 from the baseline design (the geometry used in this simulation), the $Tr_{mig}$ value changes from 2.35 to 2.54/2.16, shown as two vertical orange lines.

An interpolation method is used to obtain different flavor tagging performance matrices, as given by eq. (3),

$$\begin{split}M_{mig}&=\frac{Tr_{mig}-Tr_{opt}}{Tr_{I}-Tr_{opt}}\cdot(M_{I}-M_{opt})+M_{opt}\\ M_{mig}&=\frac{Tr_{mig}-Tr_{opt}}{Tr_{1/3}-Tr_{opt}}\cdot(M_{1/3}-M_{opt})+M_{opt}\end{split}$$ (3)

where $M_{I}$ is the perfect flavor tagging performance matrix (identity matrix), $M_{1/3}$ is the matrix without flavor tagging (all elements equal to $1/3$), $M_{opt}$ is the flavor tagging performance matrix of the CEPC baseline detector, and $Tr_{I}$, $Tr_{1/3}$, and $Tr_{opt}$ are the traces of these three matrices. $Tr_{mig}$ is a variable that has a one-to-one relationship with the interpolated matrix $M_{mig}$. The value of $Tr_{opt}$ lies between $Tr_{1/3}$ and $Tr_{I}$. If $Tr_{mig}$ is greater than $Tr_{opt}$, the upper formula of eq. (3) is used; otherwise the lower formula is used. The value of $Tr_{mig}$ is varied from 1.0 (no flavor tagging) to 3.0 (perfect flavor tagging) in increments of 0.1. The dependence of the signal strength accuracy on the flavor tagging performance is shown in figure 11. Accuracies corresponding to the baseline CEPC detector are represented by the markers at $Tr_{mig}$ = 2.34. With an ideal flavor tagging algorithm, the signal strength accuracy is 0.48%/3.53%/1.61% for $\nu\bar{\nu}H(H\to b\bar{b}/c\bar{c}/gg)$, which is a 2%/63%/13% improvement over the baseline CEPC detector (0.49%/5.75%/1.82%). The flavor tagging performance depends on the design of the vertex detector. If the key vertex detector parameters (inner radius, material budget, and spatial resolution) are changed by a factor of 0.5/2 VTX from the baseline design (the geometry used in this simulation), the $Tr_{mig}$ value changes from 2.35 to 2.54/2.16, as shown in figure 11.
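The interpolation of eq. (3) can be sketched as follows. The baseline matrix used in the example is a made-up placeholder with trace 2.5, not the CEPC matrix from figure 7, and the function name is hypothetical.

```python
import numpy as np


def interpolate_migration_matrix(m_opt, tr_mig):
    """Interpolate between M_opt and the identity (upper formula of eq. 3)
    or the flat 1/3 matrix (lower formula), so that the trace equals tr_mig."""
    m_opt = np.asarray(m_opt, dtype=float)
    tr_opt = np.trace(m_opt)
    if tr_mig > tr_opt:
        m_ref = np.eye(3)                    # perfect flavor tagging, trace 3
    else:
        m_ref = np.full((3, 3), 1.0 / 3.0)   # no flavor tagging, trace 1
    weight = (tr_mig - tr_opt) / (np.trace(m_ref) - tr_opt)
    return weight * (m_ref - m_opt) + m_opt


# Hypothetical baseline matrix (rows: true b/c/g, columns: tagged b/c/g).
M_OPT_EXAMPLE = np.array([[0.90, 0.07, 0.03],
                          [0.10, 0.70, 0.20],
                          [0.02, 0.08, 0.90]])

if __name__ == "__main__":
    for tr in np.arange(1.0, 3.05, 0.1):     # scan in steps of 0.1, as in the text
        m = interpolate_migration_matrix(M_OPT_EXAMPLE, tr)
        print(f"Tr_mig={tr:.1f}  diagonal={np.round(np.diag(m), 3)}")
```

Because the interpolation is linear, each row of the interpolated matrix still sums to one whenever the rows of the baseline matrix do.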

Figure 12: The dependence of the $q\bar{q}H(H\to b\bar{b}/c\bar{c}/gg)$ signal strength accuracy on the flavor tagging performance. The left plot is for $q\bar{q}H(H\to b\bar{b})$, and the right plot is for $q\bar{q}H(H\to c\bar{c}/gg)$. The bigger markers correspond to the results of the CEPC baseline detector. When the key vertex detector parameters (inner radius, material budget, and spatial resolution) are changed by a factor of 0.5/2 from the baseline design, the $Tr_{mig}$ value changes from 2.12 to 2.31/1.93, shown as two vertical orange lines.

4.2 $q\bar{q}H$: Flavor tagging & CSI

Similar to $\nu\bar{\nu}H$, the dependence of the $q\bar{q}H(H\to b\bar{b}/c\bar{c}/gg)$ signal strength accuracy on the flavor tagging performance is shown in figure 12. With perfect flavor tagging performance, the relative accuracy is 0.26%/3.48%/1.41% for $q\bar{q}H(H\to b\bar{b}/c\bar{c}/gg)$, which is a 35%/122%/181% improvement over the baseline CEPC detector (0.35%/7.74%/3.96%). The improvement is particularly large for $q\bar{q}H(H\to gg)$ because, after event selection, the backgrounds consist mainly of $e^{+}e^{-}\to q\bar{q}/W^{+}W^{-}/ZZ/Mixed$ processes with c-jets/light-jets in the final state, while the CEPC baseline performance classifies almost all light-jets and 30% of c-jets as gluon-jets. When the key vertex detector parameters (inner radius, material budget, and spatial resolution) are changed by a factor of 0.5/2 VTX from the baseline design, the $Tr_{mig}$ value changes from 2.12 to 2.31/1.93, as shown in figure 12.
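The quoted improvements are numerically consistent with defining the improvement as the ratio of the baseline to the ideal uncertainty, minus one. This definition is inferred from the numbers and is not stated explicitly in the text, so the short check below should be read as an assumption-based illustration.

```python
def relative_improvement(baseline, ideal):
    """Improvement in percent, assuming the definition baseline / ideal - 1."""
    return (baseline / ideal - 1.0) * 100.0


# (baseline, ideal) uncertainties in percent for H -> bb / cc / gg, from the text.
NUNUH = [(0.49, 0.48), (5.75, 3.53), (1.82, 1.61)]
QQH = [(0.35, 0.26), (7.74, 3.48), (3.96, 1.41)]

if __name__ == "__main__":
    for name, pairs in (("nunuH", NUNUH), ("qqH", QQH)):
        print(name, [round(relative_improvement(b, i)) for b, i in pairs])
```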

The performance of the CSI can be evaluated by the angles between the reconstructed bosons and the MC truth bosons, $\alpha_{1}$ and $\alpha_{2}$, as shown in figure 13. Since the CSI evaluator used in this paper relies on MC truth information, it serves only as a demonstrator to illustrate the importance of an excellent CSI reconstruction.

Figure 13: The definition of $\alpha_{1}$ and $\alpha_{2}$.
Figure 14: After the whole event selection in table 2, the distributions of $\log_{10}(\alpha_{1})$ versus $\log_{10}(\alpha_{2})$ for signal and background are shown in the left and right plots, respectively.
Figure 15: The distributions of $(\log_{10}(\alpha_{1})+3)^{2}+(\log_{10}(\alpha_{2})+3)^{2}$. The left plot corresponds to the signal and backgrounds after the whole event selection in table 2. The right plot corresponds to $e^{+}e^{-}\to W^{+}W^{-}\to 4\ quarks$ before and after the whole event selection in table 2, illustrating that the event selection strongly suppresses the backgrounds with good CSI performance.

After the entire event selection in table 2, the distributions of $\log_{10}(\alpha_{1})$ versus $\log_{10}(\alpha_{2})$ are shown in figure 14. A selection based on the circle $(\log_{10}(\alpha_{1})+3)^{2}+(\log_{10}(\alpha_{2})+3)^{2}=11.11$ can improve the signal-to-background ratio. The distribution of $(\log_{10}(\alpha_{1})+3)^{2}+(\log_{10}(\alpha_{2})+3)^{2}$ for the signal and backgrounds is shown in figure 15, which shows that most backgrounds have relatively poor CSI performance compared to the signal events. This is because the backgrounds with good CSI performance are strongly suppressed by the event selection. For example, the right plot in figure 15 shows the distributions of $(\log_{10}(\alpha_{1})+3)^{2}+(\log_{10}(\alpha_{2})+3)^{2}$ for the $e^{+}e^{-}\to W^{+}W^{-}\to 4\ quarks$ samples, where the red line corresponds to all events and the blue line to the events remaining after event selection. To illustrate the performance of the CSI, each of these two distributions is normalized to unit area. Only the backgrounds with poor CSI performance pass the event selection. An ideal CSI performance evaluator, such as the quantity shown in the left plot of figure 15, shows a potential to improve the relative accuracy of the $q\bar{q}H(H\to b\bar{b}/c\bar{c}/gg)$ signal strength by 6%/77%/90%. This motivates future developments aimed at improving the CSI reconstruction or developing a performance estimator based on reconstructed quantities.
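A minimal sketch of the truth-based CSI quantity is given below. It assumes that $\alpha_{1}$ and $\alpha_{2}$ are measured in radians, and that events inside the circle (small angles, i.e. good CSI) are the ones kept. Both choices, as well as the function names, are assumptions of this illustration.

```python
import math


def opening_angle(p_reco, p_truth):
    """Angle (rad) between a reconstructed and an MC truth boson 3-momentum."""
    dot = sum(a * b for a, b in zip(p_reco, p_truth))
    norm = math.sqrt(sum(a * a for a in p_reco)) * math.sqrt(sum(b * b for b in p_truth))
    return math.acos(max(-1.0, min(1.0, dot / norm)))


def csi_radius_squared(alpha1, alpha2):
    """(log10(alpha1) + 3)^2 + (log10(alpha2) + 3)^2 for positive angles."""
    return (math.log10(alpha1) + 3.0) ** 2 + (math.log10(alpha2) + 3.0) ** 2


def passes_csi_circle(alpha1, alpha2, r2_max=11.11):
    """True if the event lies inside the circle quoted in the text."""
    return csi_radius_squared(alpha1, alpha2) <= r2_max
```

The offset of 3 places the centre of the circle at angles of $10^{-3}$ for both bosons, so the quantity grows as either reconstructed boson direction departs from the truth direction.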

4.3 Possible improvements to the flavor tagging

Subsections 4.1 and 4.2 quantify the impact of the flavor tagging performance on the target measurements, which strongly motivates improving the flavor tagging performance. Better flavor tagging can be pursued by optimizing the vertex detector and by developing advanced reconstruction algorithms. The CEPC vertex detector is designed as a barrel-shaped structure with three concentric cylinders of double-sided layers, whose parameters are listed in table 4. Its main features are a single-point resolution of the first layer better than $3\,\mu m$, a material budget of less than 0.15% $X_{0}$ per layer, and a first layer located close to the beam pipe at a radius of $16\,mm$. The flavor tagging algorithm used in this analysis is implemented in the LCFIPlus package and is based on a GBDT.

Layer | R (mm) | single-point resolution ($\mu m$) | material budget
Layer 1 | 16 | 2.8 | 0.15% $X_{0}$
Layer 2 | 18 | 6 | 0.15% $X_{0}$
Layer 3 | 37 | 4 | 0.15% $X_{0}$
Layer 4 | 39 | 4 | 0.15% $X_{0}$
Layer 5 | 58 | 4 | 0.15% $X_{0}$
Layer 6 | 60 | 4 | 0.15% $X_{0}$
Table 4: The baseline design parameters of the CEPC vertex system.

For the optimization of the vertex detector, a previous study VTX quantifies the correlation between the flavor tagging performance and the relevant detector properties, including the inner radius, material budget, and spatial resolution of the vertex system, as shown in figure 16. This study shows that significant improvements can be achieved if these quantities are reduced relative to the baseline design. Considering the optimal and conservative scenarios proposed in ref. VTX , where these three parameters are scaled by a factor of 0.5/2 relative to the baseline design (the geometry used in this simulation), $Tr_{mig}$ changes from 2.35 to 2.54/2.16 in the $\nu\bar{\nu}H$ channel, as shown in figure 11, and from 2.12 to 2.31/1.93 in the $q\bar{q}H$ channel, as shown in figure 12. The details can be found in appendix B.

Figure 16: The c-tagging performance with a parameter scan on the basis of the CEPC baseline. The y-axis represents the efficiency times purity of c-jets tagged in $\nu\nu H,H\to b\bar{b},c\bar{c},gg$ samples. The x-axis represents the relative difference of vertex detector parameters to the CEPC baseline.

5 Discussion of systematic uncertainties

The systematic uncertainties relevant to the analyses presented in this manuscript originate from many sources, including the measurement of the integrated luminosity, the jet energy scale, the track momentum resolution, the reconstructed invariant mass and visible energy of hadronic systems, the flavor tagging performance, the jet configuration, and the CSI. Some of these are discussed below.

  • According to the CEPC conceptual design report CEPCStudyGroup:2018ghi , the integrated luminosity is required to be measured with a relative accuracy of $10^{-3}$ for CEPC Higgs operation and $10^{-4}$ for Z-pole operation. The systematic uncertainty caused by the integrated luminosity is therefore negligible for the $H\to c\bar{c}/gg$ signal strength measurement, while it reaches a level comparable to the statistical uncertainty for the $H\to b\bar{b}$ measurement with an integrated luminosity of $20\,ab^{-1}$ CEPCPhysicsStudyGroup:2022uwl ; Gao:2022lew . The accuracy requirement for the integrated luminosity measurement should be tightened to cope with the accuracy of the $H\to b\bar{b}$ signal strength measurement at the $10^{-4}$ level, i.e. to an accuracy of $5\times 10^{-4}$. Since the statistics of the main physics processes used for luminosity monitoring, i.e. small-angle Bhabha and di-photon events, scale with the integrated luminosity, this goal should in principle be achievable, although very precise control of the Lumi-Cal installation, calibration, and monitoring would be essential.

  • The event selection relies on the reconstruction of the hadronic system, in particular its momentum and energy. Therefore, understanding the jet energy scale and resolution is crucial to control the systematic uncertainty. The analysis in ref. Lai:2021rko shows that the jet energy scale can be controlled within 0.5% at the baseline CEPC detector, leading to an uncertainty of the order of $10^{-4}$ on the selection efficiency, mainly due to the cut on the recoil mass. These uncertainties can be further reduced by data-driven methods, i.e. by reconstructing the differential jet energy scale in situ using semi-leptonic WW and ZZ events as well as $e^{+}e^{-}\to q\bar{q}$ events. Similarly, the track momentum scale and the photon energy scale can in principle be calibrated using narrow resonances such as $K_{S}^{0}$ tai , $\Lambda$, $J/\psi$, and $\pi^{0}$ as standard candles, since these physics objects are abundant in hadronic events. The in-situ calibration can be applied to control the relevant systematic errors.

    Figure 17: The selection efficiency of $\nu\bar{\nu}H(H\to b\bar{b}/c\bar{c}/gg)$ for each cut variable.
  • The total energy and momentum of the hadronically decaying Higgs boson depend significantly on its decay mode, since heavy flavor quarks can decay semi-leptonically. The semi-leptonic decays of b/c quarks generate neutrinos, leading to a significant distortion of the reconstructed invariant mass for $H\to b\bar{b}/c\bar{c}$ events, which in turn results in a flavor-dependent efficiency in the event selection chain, as shown in figure 17. To accurately calculate the corresponding efficiencies for different flavors, it is essential to control the shape of the relevant distribution, which can be done by several methods. First, the charged lepton from the semi-leptonic decay can be identified efficiently Yu:2021pxc . Second, using physics events with a pair of b-jets (i.e. $e^{+}e^{-}\to q\bar{q}$ events, $\nu\bar{\nu}Z$ events, and Z-pole events) and a restrictive selection on the b-likeness of one jet, we can obtain highly pure and inclusive b-jet samples, since the decay modes of the two b-jets are in principle independent. Third, the $\ell^{+}\ell^{-}H$ channel with lepton identification inside the jet also offers the possibility to control the b-invariant mass. To conclude, there are multiple ways to mitigate this effect.

  • A much more subtle systematic effect for the $H\to gg$ measurement arises from the fact that gluon jets have a different configuration from quark jets. As shown in figure 17, the Y23 cut leads to an efficiency of 83%/82%/68% for the $H\to b\bar{b}/c\bar{c}/gg$ selection. For the $q\bar{q}H$ channel, as shown in figure 18, the thrust cut results in an efficiency of 87%/87%/93% for the $H\to b\bar{b}/c\bar{c}/gg$ selection. The cut variables related to the jet configuration in the $q\bar{q}H$ channel also include -log(Y34) and the angle between the two jets of the heavy/light di-jet system. Therefore, it is essential to understand and calibrate the spatial configurations of the jets. This requirement can be addressed using sophisticated QCD calculations and their comparison with low pile-up LHC data as well as three-jet events in CEPC Z-pole operation.

    Figure 18: The selection efficiency of $q\bar{q}H(H\to b\bar{b}/c\bar{c}/gg)$ for each cut variable.
  • The systematic uncertainties caused by the flavor tagging performance can be characterized by the uncertainties in the flavor tagging performance matrix, i.e. the difference between the actual flavor tagging performance matrix and that obtained from simulation. Using the abundant hadronic and semi-leptonic events at CEPC, as well as prior knowledge of the branching ratios of the W and Z boson decays into different quark flavors, we can derive the flavor tagging performance matrix using a data-driven method. The statistics of relevant hadronic events, e.g. WW, ZZ, and ISR-return-Z processes in CEPC Higgs runs, are 2-3 orders of magnitude higher than those of the Higgs signal. In addition, the CEPC is expected to acquire several $10^{12}$ hadronic Z events in its Z-pole operation, 6 orders of magnitude more than the expected number of Higgs events. These samples, especially the semi-leptonic WW events and the hadronic Z events, can be selected with very high purity. Therefore, data-driven methods could control the systematic uncertainties caused by the flavor tagging performance to a negligible level if the following two conditions are met. The first is that the detector is well understood and remains stable during physics data acquisition, which requires dedicated performance and detector stability analyses. The second is that the relative difference between gluon jets and quark jets in their flavor tagging behavior is well controlled, which requires dedicated QCD and performance studies.

  • CSI is essential for the $q\bar{q}H$ analyses, not only because an ideal CSI can greatly improve the final accuracy, but also because the CSI induces a flavor dependence in the event selection efficiency. The circular cut (the tenth row in table 2), $(M_{heavy}-125)^{2}+(M_{light}-91)^{2}\leq 29^{2}$, which is based on information from the CSI, has an efficiency of 97%/99%/99% for $H\to b\bar{b}/c\bar{c}/gg$. In this manuscript, the CSI is based on the jet clustering and jet matching procedures. To investigate the systematic uncertainty caused by the CSI, we replace the Durham algorithm with the Valencia algorithm VLC and obtain efficiencies of 97%/99%/99% for $H\to b\bar{b}/c\bar{c}/gg$. The difference between the Durham and Valencia algorithms is smaller than the MC statistical uncertainty. However, the relative difference in event selection efficiency between the different flavors is at the percent level and has to be controlled in a further step.

To conclude, we categorize the leading systematic uncertainties in these analyses into three groups. The first group consists of those that are significantly smaller than the statistical uncertainties, including the reconstructed energy/momentum scale of the physics objects. The second group consists of those comparable to the statistical uncertainty, especially the integrated luminosity. The third group consists of those that can be significantly larger than the statistical uncertainty, including the CSI and the jet configuration. A full quantification of the systematic uncertainty is beyond the scope of this paper and awaits real data and new methods.
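To make the comparison between the luminosity uncertainty and the statistical uncertainty concrete, the following sketch adds the two in quadrature. It assumes that the relative luminosity uncertainty propagates one-to-one to the signal strength and is uncorrelated with the statistical uncertainty. This simplification is made for illustration only; the function name is hypothetical.

```python
import math


def total_uncertainty(stat, *systematics):
    """Quadrature sum of a statistical and any number of systematic uncertainties."""
    return math.sqrt(stat ** 2 + sum(s ** 2 for s in systematics))


# Statistical uncertainties (%) at 20 ab^-1, taken from the text.
STAT_20 = {"bb": 0.14, "cc": 2.13, "gg": 0.82}

if __name__ == "__main__":
    for lumi_unc in (0.1, 0.05):  # 1e-3 and 5e-4, expressed in percent
        for mode, stat in STAT_20.items():
            print(f"lumi {lumi_unc}%  H->{mode}: {total_uncertainty(stat, lumi_unc):.3f}%")
```

With these inputs, the luminosity term visibly changes the total only for $H\to b\bar{b}$, in line with the grouping above.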

6 Conclusion

We estimate the anticipated accuracy for the $H\to b\bar{b}/c\bar{c}/gg$ measurements at CEPC with its nominal luminosity of $5.6\,ab^{-1}$ corresponding to the CEPC CDR CEPCStudyGroup:2018ghi and $20\,ab^{-1}$ as proposed for Snowmass 2021 Gao:2022lew . Using the CEPC CDR baseline detector, we simulated MC samples corresponding to the CEPC Higgs operation and combined the accuracies obtained in the $\ell^{+}\ell^{-}H$, $\nu\bar{\nu}H$, and $q\bar{q}H$ channels. We conclude that the signal strength of $H\to b\bar{b}/c\bar{c}/gg$ can be measured with a statistical uncertainty of 0.27%/4.03%/1.56% and 0.14%/2.13%/0.82%, corresponding to integrated luminosities of $5.6\,ab^{-1}$ and $20\,ab^{-1}$, respectively. In addition, we discuss the relevant systematic uncertainties, critical performance, and vertex system optimization. We also identify several critical topics that should be studied in detail in the future.

We found that the systematic uncertainty caused by the integrated luminosity is comparable to the statistical uncertainty of the $H\to b\bar{b}$ signal strength measurement, that the systematic uncertainty caused by CSI and the jet configuration can be much larger than the statistical uncertainty, and that there are multiple ways to control the systematic uncertainties caused by reconstructed hadronic systems. Data-driven methods are expected to control the systematic uncertainty. Moreover, the complicated patterns and the deviations of the branching ratios of the Z-boson decay from the naive expectations in figure 10 show that flavor tagging would lead to serious systematic uncertainties that need to be controlled based on a much better understanding of fragmentation, hadronization, and gluon splitting.

The dependence of the measurement accuracy on critical detector performance aspects, specifically flavor tagging and CSI, has been analyzed. Compared with the flavor tagging performance of the baseline CEPC detector, perfect flavor tagging performance could improve the relative accuracy of the $H\to b\bar{b}/c\bar{c}/gg$ signal strength by 35%/122%/181% in the $q\bar{q}H$ channel and 2%/63%/13% in the $\nu\bar{\nu}H$ channel. An ideal CSI or a reliable evaluator for CSI performance can significantly improve the physics reach, which motivates us to pay more attention to CSI.

From our analysis of relevant performance and systematic uncertainties, we conclude that the critical detector and reconstruction performance of flavor tagging and CSI has a very strong impact on the anticipated precision. Therefore, we would like to encourage the design and optimization of the vertex system towards better precision, smaller inner radius, and lower material budget. We also encourage the development of advanced reconstruction algorithms, probably in synchronization with QCD studies, to achieve a better CSI performance. The per-mille accuracy also places high demands on systematic control, especially on the stability of detector operation. We also note a significant difference in spatial configuration between quark and gluon jets, which can lead to significant systematic uncertainties. Dedicated studies of the theoretical calculation of QCD and comparison with the available data are therefore crucial to control these uncertainties.

Appendix A Cross section, expected and simulated event number, and scaling factor

Table 5 lists the cross sections, the numbers of expected events, the numbers of simulated events, and the scaling factors used in this analysis. The scaling factor is defined as the number of simulated events divided by the number of expected events. The single-Z process consists of an electron-positron pair and an on-shell Z boson in the final state. The single-W process consists of an $e^{\pm}$ together with its neutrino and an on-shell W boson in the final state. The ZZ and WW processes consist of two on-shell bosons decaying into four fermions. The mixed process consists of two mutually charge-conjugated pairs in the final state, which could originate from either a virtual WW or ZZ pair.

name | channel | cross section (fb) | expected (million) | simulated (k) | scaling factor (%)
ZH | $Z\to\nu\nu$, Higgs inclusive decay | 46.29 | 0.26 | 217 | 83
ZH | $Z\to e^{+}e^{-}$, Higgs inclusive decay | 7.04 | 0.04 | 87 | 221
ZH | $Z\to\mu^{+}\mu^{-}$, Higgs inclusive decay | 6.77 | 0.04 | 72 | 191
ZH | $Z\to\tau^{+}\tau^{-}$, Higgs inclusive decay | 6.75 | 0.04 | 82 | 217
ZH | $Z\to q\bar{q}$, Higgs inclusive decay | 136.81 | 0.77 | 566 | 74
ZZ | $Z\to c\bar{c},Z\to d\bar{d}/b\bar{b}$ | 98.97 | 0.55 | 123 | 22
ZZ | $ZZ\to 4$ down quarks | 233.46 | 1.31 | 288 | 22
ZZ | $ZZ\to 4$ up quarks | 85.68 | 0.48 | 102 | 21
ZZ | $Z\to u\bar{u},Z\to s\bar{s}/b\bar{b}$ | 98.56 | 0.55 | 120 | 22
ZZ | $Z\to\mu^{+}\mu^{-},Z\to\mu^{+}\mu^{-}$ | 15.56 | 0.09 | 23 | 26
ZZ | $Z\to\tau^{+}\tau^{-},Z\to\tau^{+}\tau^{-}$ | 4.61 | 0.03 | 25 | 97
ZZ | $Z\to\mu^{+}\mu^{-},Z\to\nu_{\tau}\nu_{\tau}$ | 19.38 | 0.11 | 24 | 22
ZZ | $Z\to\tau^{+}\tau^{-},Z\to\mu^{+}\mu^{-}$ | 18.65 | 0.10 | 22 | 21
ZZ | $Z\to\tau^{+}\tau^{-},Z\to\nu_{\tau}\nu_{\tau}$ | 9.61 | 0.05 | 25 | 46
ZZ | $Z\to\mu^{+}\mu^{-},Z\to$ down quarks | 136.14 | 0.76 | 644 | 85
ZZ | $Z\to\mu^{+}\mu^{-},Z\to$ up quarks | 87.39 | 0.49 | 110 | 23
ZZ | $Z\to\nu\nu,Z\to$ down quarks | 139.71 | 0.78 | 175 | 22
ZZ | $Z\to\nu\nu,Z\to$ up quarks | 84.38 | 0.47 | 105 | 22
ZZ | $Z\to\tau^{+}\tau^{-},Z\to$ down quarks | 67.31 | 0.38 | 312 | 83
ZZ | $Z\to\tau^{+}\tau^{-},Z\to$ up quarks | 41.56 | 0.23 | 193 | 83
WW | $ccbs$ | 5.89 | 0.03 | 24 | 75
WW | $ccds$ | 170.18 | 0.95 | 203 | 21
WW | $cusd$ | 3478.89 | 19.5 | 2668 | 14
WW | $uusd$ | 170.45 | 0.95 | 194 | 4.92
WW | $WW\to 4$-lepton | 403.66 | 2.26 | 488 | 20
WW | $W\to\mu\nu_{\mu},W\to qq$ | 2423.43 | 13.6 | 9215 | 68
WW | $W\to\tau\nu_{\tau},W\to qq$ | 2423.56 | 13.6 | 2745 | 20
SW | $e\nu_{e},W\to\mu\nu_{\mu}$ | 436.70 | 2.44 | 538 | 22
SW | $e\nu_{e},W\to\tau\nu_{\tau}$ | 435.93 | 2.44 | 535 | 22
SW | $e\nu_{e},W\to qq$ | 2612.62 | 14.63 | 9233 | 63
SZ | $e^{+}e^{-},Z\to e^{+}e^{-}$ | 78.49 | 0.44 | 97 | 22
SZ | $e^{+}e^{-},Z\to\mu^{+}\mu^{-}$ | 845.81 | 4.74 | 520 | 11
SZ | $e^{+}e^{-},Z\to\nu\nu$ | 28.94 | 0.16 | 36 | 22
SZ | $e^{+}e^{-},Z\to\tau^{+}\tau^{-}$ | 147.28 | 0.82 | 180 | 22
SZ | $e^{+}e^{-},Z\to$ down quarks | 125.83 | 0.70 | 153 | 22
SZ | $e^{+}e^{-},Z\to$ up quarks | 190.21 | 1.06 | 231 | 22
SZ | $\nu^{+}\nu^{-},Z\to\mu^{+}\mu^{-}$ | 43.42 | 0.24 | 37 | 15
SZ | $\nu^{+}\nu^{-},Z\to\tau^{+}\tau^{-}$ | 14.57 | 0.08 | 22 | 27
SZ | $\nu^{+}\nu^{-},Z\to$ down quarks | 90.03 | 0.50 | 90 | 18
SZ | $\nu^{+}\nu^{-},Z\to$ up quarks | 55.59 | 0.31 | 71 | 23
mix | $ZZ/WW\to\mu\mu\nu_{\mu}\nu_{\mu}$ | 221.10 | 1.24 | 263 | 21
mix | $ZZ/WW\to\tau\tau\nu_{\tau}\nu_{\tau}$ | 211.18 | 1.18 | 262 | 83
mix | $ZZ/WW\to ccss$ | 1607.55 | 9.00 | 1966 | 22
mix | $ZZ/WW\to uudd$ | 1610.32 | 9.02 | 1604 | 18
mix | $SW/SZ\to ee\nu_{e}\nu_{e}$ | 249.48 | 1.40 | 307 | 22
2f | $e^{+}e^{-}$ | 24770.90 | 138.72 | 314 | 0.23
2f | $\mu^{+}\mu^{-}$ | 5332.71 | 29.86 | 278 | 0.93
2f | $\tau^{+}\tau^{-}$ | 4752.89 | 26.62 | 746 | 2.80
2f | $q\bar{q}$ | 54106.86 | 303.00 | 7437 | 2.45
2f | $\gamma\gamma\to hadrons$ | 87670 | 490.95 | 1366 | 0.28
Table 5: The cross section, expected event number, simulated event number, and scaling factor of the signal and various backgrounds.
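The "expected" and "scaling factor" columns follow from the cross section and the integrated luminosity. The sketch below assumes the CDR luminosity of $5.6\,ab^{-1}$ (5600 $fb^{-1}$), which is consistent with the expected event numbers in the table. The function names are hypothetical, and the values in the table may differ from this calculation in the last digit because of rounding.

```python
LUMINOSITY_FB = 5600.0  # 5.6 ab^-1 expressed in fb^-1 (assumed CDR luminosity)


def expected_events(cross_section_fb, luminosity_fb=LUMINOSITY_FB):
    """Expected number of events: cross section times integrated luminosity."""
    return cross_section_fb * luminosity_fb


def scaling_factor_percent(simulated_events, cross_section_fb,
                           luminosity_fb=LUMINOSITY_FB):
    """Simulated events divided by expected events, in percent."""
    return 100.0 * simulated_events / expected_events(cross_section_fb, luminosity_fb)


if __name__ == "__main__":
    # Example rows: ZH with Z -> nu nu, and the two-fermion q qbar background.
    for label, xsec, simulated_k in (("ZH, Z->nunu", 46.29, 217), ("2f qq", 54106.86, 7437)):
        n_exp = expected_events(xsec)
        sf = scaling_factor_percent(simulated_k * 1e3, xsec)
        print(f"{label}: expected {n_exp / 1e6:.2f} million, scaling factor {sf:.2f}%")
```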

Appendix B Dependence of flavor tagging performance on vertex detector design

The relationship between the c-tagging efficiency times purity ($\epsilon\cdot p$) and $Tr_{mig}$ in the $q\bar{q}H$ channel is fitted with the empirical formula $Tr_{mig}=1.11\cdot\log_{10}(\epsilon\cdot p)+3.23$, where $\epsilon\cdot p$ ranges from 0.02 to 0.8. Considering the relationship between $\epsilon\cdot p$ and the vertex detector parameters shown in figure 16, the empirical formula relating $Tr_{mig}$ to the vertex detector parameters is

$$Tr_{mig}=2.12+0.05\cdot\log_{2}\frac{R_{material}^{0}}{R_{material}}+0.04\cdot\log_{2}\frac{R_{resolution}^{0}}{R_{resolution}}+0.10\cdot\log_{2}\frac{R_{radius}^{0}}{R_{radius}},$$ (4)

where $R_{material}^{0}$ is the default material budget and $R_{material}$ is the modified material budget, and similarly for the other parameters. In the $\nu\bar{\nu}H$ channel, the relationship between the c-tagging efficiency times purity and $Tr_{mig}$ is fitted with the empirical formula $Tr_{mig}=1.12\cdot\log_{10}(\epsilon\cdot p)+3.28$, where $\epsilon\cdot p$ ranges from 0.02 to 0.8. The empirical formula relating $Tr_{mig}$ to the vertex detector parameters is

$$Tr_{mig}=2.35+0.05\cdot\log_{2}\frac{R_{material}^{0}}{R_{material}}+0.04\cdot\log_{2}\frac{R_{resolution}^{0}}{R_{resolution}}+0.10\cdot\log_{2}\frac{R_{radius}^{0}}{R_{radius}}.$$ (5)
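Equations (4) and (5) can be evaluated with a few lines of Python to check the scenarios quoted in section 4. The sketch below is illustrative: the function and argument names are hypothetical, and each ratio is defined as the modified value divided by the baseline value, so that $\log_{2}(R^{0}/R)=-\log_{2}(\mathrm{ratio})$.

```python
import math

# Baseline Tr_mig values and fit coefficients from eqs. (4) and (5).
TR_BASELINE = {"qqH": 2.12, "nunuH": 2.35}
EFF_PURITY_FIT = {"qqH": (1.11, 3.23), "nunuH": (1.12, 3.28)}


def tr_mig_from_vertex(channel, material_ratio=1.0, resolution_ratio=1.0, radius_ratio=1.0):
    """Tr_mig for vertex parameters scaled by the given modified/baseline ratios."""
    return (TR_BASELINE[channel]
            - 0.05 * math.log2(material_ratio)
            - 0.04 * math.log2(resolution_ratio)
            - 0.10 * math.log2(radius_ratio))


def tr_mig_from_eff_purity(channel, eff_times_purity):
    """Empirical fit Tr_mig = a * log10(eff * purity) + b, quoted for 0.02 to 0.8."""
    slope, offset = EFF_PURITY_FIT[channel]
    return slope * math.log10(eff_times_purity) + offset


if __name__ == "__main__":
    for channel in TR_BASELINE:
        optimal = tr_mig_from_vertex(channel, 0.5, 0.5, 0.5)
        conservative = tr_mig_from_vertex(channel, 2.0, 2.0, 2.0)
        print(f"{channel}: optimal {optimal:.2f}, conservative {conservative:.2f}")
```

Scaling all three parameters by 0.5 or 2 shifts $Tr_{mig}$ by 0.19 in either direction, which reproduces the 2.31/1.93 and 2.54/2.16 values quoted for the two channels.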
Acknowledgements.
We thank Chengdong FU, Gang LI, and Xianghu ZHAO for providing the simulation tools and samples. We thank Hao Liang, Dan Yu, and Yudong Wang for useful discussions. This project is supported by the International Partnership Program of the Chinese Academy of Sciences (Grant No. 113111KYSB20190030) and the Innovative Scientific Program of the Institute of High Energy Physics.

References

  • (1) A. Abbrescia, M. AbdusSalam et al., FCC-ee: The Lepton Collider, Eur. Phys. J. Spec. Top. 228, 261-623 (2019).
  • (2) H. Aihara et al. [ILC], The International Linear Collider. A Global Project, [arXiv:1901.09829 [hep-ex]].
  • (3) A. Robson, P. N. Burrows, N. Catalan Lasheras, L. Linssen, M. Petric, D. Schulte, E. Sicking, S. Stapnes and W. Wuensch, The Compact Linear e+e- Collider (CLIC): Accelerator and Detector, [arXiv:1812.07987 [physics.acc-ph]].
  • (4) J. B. Guimarães da Costa et al. [CEPC Study Group], CEPC Conceptual Design Report: Volume 2 - Physics & Detector, [arXiv:1811.10545 [hep-ex]].
  • (5) M. Cepeda, S. Gori, P. Ilten et al., Higgs Physics at the HL-LHC and HE-LHC, [arXiv:1902.00134 [hep-ph]].
  • (6) R. K. Ellis, B. Heinemann, J. de Blas, M. Cepeda, C. Grojean, F. Maltoni, A. Nisati, E. Petit, R. Rattazzi and W. Verkerke, et al. Physics Briefing Book: Input for the European Strategy for Particle Physics Update 2020, [arXiv:1910.11775 [hep-ex]].
  • (7) European Strategy Group, Deliberation document on the 2020 Update of the European Strategy for Particle Physics (Brochure), CERN-ESU-016, doi:10.17181/ESU2020Deliberation
  • (8) A. Abada et al. [FCC], FCC-ee: The Lepton Collider: Future Circular Collider Conceptual Design Report Volume 2, Eur. Phys. J. ST 228 (2019) no.2, 261-623 doi:10.1140/epjst/e2019-900045-4
  • (9) CEPC Study Group, CEPC Conceptual Design Report: Volume 1 - Accelerator, [arXiv:1809.00285 [physics.acc-ph]].
  • (10) F. An, Y. Bai, C. Chen, X. Chen, Z. Chen, J. Guimaraes da Costa, Z. Cui, Y. Fang, C. Fu and J. Gao, et al. Precision Higgs physics at the CEPC, Chin. Phys. C 43, no.4, 043002 (2019) doi:10.1088/1674-1137/43/4/043002 [arXiv:1810.09037 [hep-ex]].
  • (11) Y. Bai, C. Chen, Y. Fang, G. Li, M. Ruan, J. Y. Shi, B. Wang, P. Y. Kong, B. Y. Lan and Z. F. Liu, Measurements of decay branching fractions of $H\to b\bar{b}/c\bar{c}/gg$ in associated $(e^{+}e^{-}/\mu^{+}\mu^{-})H$ production at the CEPC, Chin. Phys. C 44 (2020) no.1, 013001 doi:10.1088/1674-1137/44/1/013001 [arXiv:1905.12903 [hep-ex]].
  • (12) A. Buckley, G. Callea, A. J. Larkoski and S. Marzani, An Optimal Observable for Color Singlet Identification, SciPost Phys. doi:10.21468/SciPostPhys.9.2.026
  • (13) Y. Zhu and M. Ruan, Performance study of the separation of the full hadronic WW and ZZ events at the CEPC, [arXiv:1812.09478 [hep-ex]].
  • (14) W. Kilian, T. Ohl, J. Reuter, WHIZARD: Simulating Multi-Particle Processes at LHC and ILC, Eur.Phys.J.C 71 (2011) 1742 [arXiv: 0708.4233 [hep-ph]]
  • (15) T. Sjostrand, S. Mrenna and P. Z. Skands, PYTHIA 6.4 Physics and Manual, doi:10.1088/1126-6708/2006/05/026 [arXiv:hep-ph/0603175 [hep-ph]].
  • (16) P. Moras de Freitas et al., MOKKA: A detailed Geant4 simulation for the international linear collider detectors, https://flcwiki.desy.de/Mokka
  • (17) F. Gaede, S. Aplin, R. Glattauer, C. Rosemann, and G. Voutsinas, Track reconstruction at the ILC: the ILD tracking software, J. Phys. Conf. Ser., Vol. 513, P. 022011, 2014
  • (18) M. Ruan, Arbor, a new approach of the Particle Flow Algorithm, [arXiv: 1403.4784 [physics.ins-det]]
  • (19) C. Bierlich, S. Chakraborty, N. Desai, L. Gellersen, I. Helenius, P. Ilten, L. Lönnblad, S. Mrenna, S. Prestel and C. T. Preuss, et al. A comprehensive guide to the physics and usage of PYTHIA 8.3, [arXiv:2203.11601 [hep-ph]].
  • (20) D. Buskulic et al., An experimental study of $\gamma\gamma\rightarrow hadrons$ at LEP, Physics Letters B, Elsevier, 1993, 313, pp.509-519. <in2p3-00004532>
  • (21) I. Helenius, Photon-photon and photon-hadron processes in Pythia 8, CERN Proc. 1, 119 (2018) doi:10.23727/CERN-Proceedings-2018-001.119 [arXiv:1708.09759 [hep-ph]].
  • (22) G. J. Feldman and R. D. Cousins, A Unified approach to the classical statistical analysis of small signals, Phys. Rev. D 57, 3873-3889 (1998) doi:10.1103/PhysRevD.57.3873 [arXiv:physics/9711021 [physics.data-an]].
  • (23) A. Hoecker, P. Speckmayer, J. Stelzer, J. Therhaag, E. von Toerne, H. Voss, and D. Dannheim, TMVA - Toolkit for multivariate data analysis, [arXiv:physics/0703039 [physics.data-an]]
  • (24) S. Catani, Y. L. Dokshitzer, M. Olsson, G. Turnock and B. R. Webber, New clustering algorithm for multi - jet cross-sections in e+ e- annihilation, Phys. Lett. B 269 (1991), 432-438 doi:10.1016/0370-2693(91)90196-W
  • (25) M. Cacciari, G. P. Salam and G. Soyez, FastJet User Manual, Eur. Phys. J. C 72 (2012), 1896, pg.22
  • (26) Taikan Suehara, Tomohiko Tanabe, LCFIPlus, A Framework for jet Analysis in Linear Collider Studies, [arXiv: 1506.08371 [physics.ins-det]]
  • (27) C. Patrignani et al. [Particle Data Group], Review of Particle Physics, Chin. Phys. C, 40, 100001 (2016), pg.523
  • (28) L. Xia, Study of constraint and impact of a nuisance parameter in maximum likelihood method, [arXiv:1805.03961v4 [physics.data-an]].
  • (29) G. Aad et al. [ATLAS], Measurement of hadronic event shapes in high-pT multijet final states at $\sqrt{s}$ = 13 TeV with the ATLAS detector, JHEP 01 (2021), 188 doi:10.1007/JHEP01(2021)188 [arXiv:2007.12600 [hep-ex]].
  • (30) J. Gao [CEPC Accelerator Study Group], Snowmass2021 White Paper AF3-CEPC, [arXiv:2203.09451 [physics.acc-ph]].
  • (31) Z. Wu, G. Li, D. Yu, C. Fu, Q. Ouyang, M. Ruan, Study of vertex optimization at the CEPC, Journal of Instrumentation. 13. T09002-T09002, doi:10.1088/1748-0221/13/09/T09002
  • (32) H. Cheng et al. [CEPC Physics Study Group], The Physics potential of the CEPC. Prepared for the US Snowmass Community Planning Exercise (Snowmass 2021), [arXiv:2205.08553 [hep-ph]].
  • (33) P. Z. Lai, M. Ruan and C. M. Kuo, Jet performance at the circular electron-positron collider, JINST 16, no.07, P07037 (2021) doi:10.1088/1748-0221/16/07/P07037 [arXiv:2104.05029 [hep-ex]].
  • (34) T. Zheng et al., Reconstructing $K^{0}_{S}$ and $\varLambda$ in the CEPC baseline detector, The European Physical Journal Plus, doi:10.1140/epjp/s13360-020-00272-4
  • (35) D. Yu, T. Zheng and M. Ruan, Lepton identification performance in Jets at a future electron positron Higgs Z factory, doi:10.1088/1748-0221/16/06/P06013 [arXiv:2105.01246 [hep-ex]].
  • (36) M. Boronat et al. A robust jet reconstruction algorithm for high-energy lepton colliders, doi:10.1016/j.physletb.2015.08.055 [arXiv:1404.4294[hep-ex]].