
The Belle Collaboration


Study of the muon decay-in-flight in the $\tau^{-}\to\mu^{-}\bar{\nu}_{\mu}\nu_{\tau}$ decay to measure the Michel parameter $\xi^{\prime}$

D. Bodrov  0000-0001-5279-4787    P. Pakhlov 0000-0001-7426-4824    I. Adachi 0000-0003-2287-0173    H. Aihara  0000-0002-1907-5964    S. Al Said 0000-0002-4895-3869    D. M. Asner 0000-0002-1586-5790    H. Atmacan 0000-0003-2435-501X    T. Aushev  0000-0002-6347-7055    R. Ayad 0000-0003-3466-9290    V. Babu  0000-0003-0419-6912    Sw. Banerjee  0000-0001-8852-2409    P. Behera  0000-0002-1527-2266    K. Belous 0000-0003-0014-2589    J. Bennett 0000-0002-5440-2668    M. Bessner  0000-0003-1776-0439    B. Bhuyan  0000-0001-6254-3594    T. Bilka 0000-0003-1449-6986    D. Biswas  0000-0002-7543-3471    A. Bobrov  0000-0001-5735-8386    A. Bondar 0000-0002-5089-5338    J. Borah  0000-0003-2990-1913    A. Bozek 0000-0002-5915-1319    M. Bračko 0000-0002-2495-0524    P. Branchini  0000-0002-2270-9673    T. E. Browder 0000-0001-7357-9007    A. Budano 0000-0002-0856-1131    M. Campajola 0000-0003-2518-7134    D. Červenkov  0000-0002-1865-741X    M.-C. Chang 0000-0002-8650-6058    B. G. Cheon  0000-0002-8803-4429    K. Chilikin 0000-0001-7620-2053    K. Cho  0000-0003-1705-7399    S.-J. Cho 0000-0002-1673-5664    S.-K. Choi 0000-0003-2747-8277    Y. Choi 0000-0003-3499-7948    S. Choudhury 0000-0001-9841-0216    D. Cinabro 0000-0001-7347-6585    S. Das  0000-0001-6857-966X    G. De Nardo  0000-0002-2047-9675    G. De Pietro 0000-0001-8442-107X    R. Dhamija 0000-0001-7052-3163    F. Di Capua 0000-0001-9076-5936    J. Dingfelder 0000-0001-5767-2121    Z. Doležal 0000-0002-5662-3675    T. V. Dong 0000-0003-3043-1939    D. Epifanov  0000-0001-8656-2693    T. Ferber  0000-0002-6849-0427    D. Ferlewicz 0000-0002-4374-1234    B. G. Fulsom  0000-0002-5862-9739    V. Gaur  0000-0002-8880-6134    A. Garmash  0000-0003-2599-1405    A. Giri 0000-0002-8895-0128    P. Goldenzweig 0000-0001-8785-847X    E. Graziani  0000-0001-8602-5652    D. Greenwald 0000-0001-6964-8399    T. Gu 0000-0002-1470-6536    Y. Guan 0000-0002-5541-2278    K. Gudkova 0000-0002-5858-3187    C. 
Hadjivasiliou 0000-0002-2234-0001    S. Halder  0000-0002-6280-494X    K. Hayasaka  0000-0002-6347-433X    H. Hayashii 0000-0002-5138-5903    M. T. Hedges  0000-0001-6504-1872    D. Herrmann 0000-0001-9772-9989    W.-S. Hou  0000-0002-4260-5118    C.-L. Hsu 0000-0002-1641-430X    T. Iijima  0000-0002-4271-711X    K. Inami 0000-0003-2765-7072    N. Ipsita  0000-0002-2927-3366    A. Ishikawa 0000-0002-3561-5633    R. Itoh 0000-0003-1590-0266    M. Iwasaki 0000-0002-9402-7559    W. W. Jacobs  0000-0002-9996-6336    E.-J. Jang 0000-0002-1935-9887    Q. P. Ji  0000-0003-2963-2565    S. Jia 0000-0001-8176-8545    Y. Jin 0000-0002-7323-0830    K. K. Joo 0000-0002-5515-0087    D. Kalita  0000-0003-3054-1222    A. B. Kaliyar 0000-0002-2211-619X    K. H. Kang 0000-0002-6816-0751    T. Kawasaki  0000-0002-4089-5238    C. Kiesling  0000-0002-2209-535X    C. H. Kim 0000-0002-5743-7698    D. Y. Kim 0000-0001-8125-9070    K.-H. Kim  0000-0002-4659-1112    Y.-K. Kim  0000-0002-9695-8103    H. Kindo  0000-0002-6756-3591    K. Kinoshita 0000-0001-7175-4182    P. Kodyš  0000-0002-8644-2349    A. Korobov  0000-0001-5959-8172    S. Korpar 0000-0003-0971-0968    P. Križan  0000-0002-4967-7675    P. Krokovny  0000-0002-1236-4667    T. Kuhr 0000-0001-6251-8049    M. Kumar 0000-0002-6627-9708    R. Kumar 0000-0002-6277-2626    K. Kumara 0000-0003-1572-5365    Y.-J. Kwon  0000-0001-9448-5691    J. S. Lange  0000-0003-0234-0474    S. C. Lee  0000-0002-9835-1006    J. Li  0000-0001-5520-5394    L. K. Li 0000-0002-7366-1307    Y. Li  0000-0002-4413-6247    J. Libby  0000-0002-1219-3247    K. Lieret 0000-0003-2792-7511    Y.-R. Lin 0000-0003-0864-6693    D. Liventsev  0000-0003-3416-0056    Y. Ma  0000-0001-8412-8308    M. Masuda  0000-0002-7109-5583    T. Matsuda 0000-0003-4673-570X    S. K. Maurya  0000-0002-7764-5777    F. Meier 0000-0002-6088-0412    M. Merola 0000-0002-7082-8108    F. Metzner 0000-0002-0128-264X    K. Miyabayashi 0000-0003-4352-734X    R. Mizuk  0000-0002-2209-6969    G. 
B. Mohanty 0000-0001-6850-7666    R. Mussa  0000-0002-0294-9071    I. Nakamura 0000-0002-7640-5456    M. Nakao  0000-0001-8424-7075    D. Narwal  0000-0001-6585-7767    Z. Natkaniec 0000-0003-0486-9291    A. Natochii 0000-0002-1076-814X    L. Nayak 0000-0002-7739-914X    M. Nayak 0000-0002-2572-4692    N. K. Nisar  0000-0001-9562-1253    S. Nishida 0000-0001-6373-2346    S. Ogawa  0000-0002-7310-5079    H. Ono  0000-0003-4486-0064    P. Oskin  0000-0002-7524-0936    G. Pakhlova  0000-0001-7518-3022    S. Pardi  0000-0001-7994-0537    H. Park 0000-0001-6087-2052    J. Park 0000-0001-6520-0028    S.-H. Park 0000-0001-6019-6218    S. Patra  0000-0002-4114-1091    S. Paul 0000-0002-8813-0437    T. K. Pedlar 0000-0001-9839-7373    R. Pestotnik 0000-0003-1804-9470    L. E. Piilonen  0000-0001-6836-0748    T. Podobnik 0000-0002-6131-819X    E. Prencipe  0000-0002-9465-2493    M. T. Prim 0000-0002-1407-7450    A. Rabusov  0000-0001-8189-7398    G. Russo  0000-0001-5823-4393    S. Sandilya  0000-0002-4199-4369    A. Sangal 0000-0001-5853-349X    L. Santelj 0000-0003-3904-2956    V. Savinov  0000-0002-9184-2830    G. Schnell  0000-0002-7336-3246    C. Schwanda  0000-0003-4844-5028    Y. Seino  0000-0002-8378-4255    K. Senyo 0000-0002-1615-9118    W. Shan  0000-0003-2811-2218    M. Shapkin 0000-0002-4098-9592    C. Sharma 0000-0002-1312-0429    J.-G. Shiu  0000-0002-8478-5639    J. B. Singh  0000-0001-9029-2462    A. Sokolov  0000-0002-9420-0091    E. Solovieva  0000-0002-5735-4059    M. Starič  0000-0001-8751-5944    Z. S. Stottler 0000-0002-1898-5333    M. Sumihama  0000-0002-8954-0585    M. Takizawa  0000-0001-8225-3973    U. Tamponi 0000-0001-6651-0706    K. Tanida  0000-0002-8255-3746    F. Tenchini  0000-0003-3469-9377    R. Tiwary  0000-0002-5887-1883    K. Trabelsi 0000-0001-6567-3036    M. Uchida  0000-0003-4904-6168    T. Uglov  0000-0002-4944-1830    Y. Unno 0000-0003-3355-765X    K. Uno 0000-0002-2209-8198    S. Uno 0000-0002-3401-0480    S. E. 
Vahsen  0000-0003-1685-9824    G. Varner 0000-0002-0302-8151    A. Vinokurova  0000-0003-4220-8056    A. Vossen  0000-0003-0983-4936    D. Wang  0000-0003-1485-2143    E. Wang 0000-0001-6391-5118    M.-Z. Wang 0000-0002-0979-8341    S. Watanuki  0000-0002-5241-6628    X. Xu  0000-0001-5096-1182    B. D. Yabsley 0000-0002-2680-0474    W. Yan  0000-0003-0713-0871    S. B. Yang 0000-0002-9543-7971    J. Yelton  0000-0001-8840-3346    J. H. Yin 0000-0002-1479-9349    C. Z. Yuan  0000-0002-1652-6686    Y. Yusa  0000-0002-4001-9748    Z. P. Zhang  0000-0001-6140-2044    V. Zhilich 0000-0002-0907-5565    V. Zhukova 0000-0002-8253-641X
Abstract

We present the first measurement of the Michel parameter $\xi^{\prime}$ in the $\tau^{-}\to\mu^{-}\bar{\nu}_{\mu}\nu_{\tau}$ decay using the full data sample of $988\,\text{fb}^{-1}$ collected by the Belle detector operating at the KEKB asymmetric-energy $e^{+}e^{-}$ collider. The method is based on the reconstruction of the $\mu^{-}\to e^{-}\bar{\nu}_{e}\nu_{\mu}$ decay-in-flight in the Belle central drift chamber and relies on the correlation between the muon spin and the momentum of its daughter electron. We study the main sources of background that can imitate the signal decay, such as kaon and pion decays-in-flight and charged-particle scattering on the detector material. Highly efficient methods for their suppression are developed and applied to select 165 signal-candidate events. We obtain $\xi^{\prime}=0.22\pm 0.94\pm 0.42$, where the first uncertainty is statistical and the second is systematic. The result is in agreement with the Standard Model prediction of $\xi^{\prime}=1$.

pacs:
13.35.Dx, 12.15.Ji, 14.60.Fg

I Introduction

In the Standard Model (SM), the $\tau$ lepton decay proceeds through a weak charged current, whose amplitude can be approximated with high accuracy by the four-fermion interaction with the $V-A$ Lorentz structure. A deviation from this structure would indicate physics beyond the SM, which can be caused by an anomalous coupling of the $W$ boson to the $\tau$ lepton, a contribution from new gauge or charged Higgs bosons, etc. [1, 2, 3, 4]. The presence of massive neutrinos can also modify experimental observables, leading to a deviation from the SM prediction [5].

The most general form of the Lorentz-invariant, local, derivative-free, lepton-number-conserving four-fermion interaction Hamiltonian [6] leads to the following matrix element of the $\tau^{-}\to\ell^{-}\bar{\nu}_{\ell}\nu_{\tau}$ decay ($\ell=e$ or $\mu$; charge conjugation is implied throughout the paper unless otherwise indicated) written in the form of helicity projections [7, 8, 9]:

$$M=\dfrac{4G_{F}}{\sqrt{2}}\sum_{\substack{\lambda=S,V,T\\ \varepsilon,\omega=L,R}}g^{\lambda}_{\varepsilon\omega}\left\langle\bar{\ell}_{\varepsilon}\left|\Gamma^{\lambda}\right|(\nu_{\ell})_{\alpha}\right\rangle\left\langle(\bar{\nu}_{\tau})_{\beta}\left|\Gamma_{\lambda}\right|\tau_{\omega}\right\rangle, \tag{1}$$

where

$$\Gamma^{S}=1,\quad\Gamma^{V}=\gamma^{\mu},\quad\Gamma^{T}=\dfrac{i}{2\sqrt{2}}(\gamma^{\mu}\gamma^{\nu}-\gamma^{\nu}\gamma^{\mu}); \tag{2}$$

$S$, $V$, and $T$ denote scalar, vector, and tensor interactions, respectively; $\varepsilon,\omega=L,R$ denote left- and right-handed leptons, respectively. Each set of indices $\lambda$, $\varepsilon$, and $\omega$ uniquely determines the neutrino handedness $\alpha$ and $\beta$. The total strength of the weak interaction in Eq. (1) is given by $G_{F}$, while the coupling constants $g^{\lambda}_{\varepsilon\omega}$ are normalized as

$$\sum_{\varepsilon,\omega=L,R}\left(\frac{1}{4}|g^{S}_{\varepsilon\omega}|^{2}+|g^{V}_{\varepsilon\omega}|^{2}+3|g^{T}_{\varepsilon\omega}|^{2}\right)\equiv 1. \tag{3}$$

It is convenient to express the observables in the lepton decay in terms of the Michel parameters (MPs), which are bilinear combinations of the coupling constants $g^{\lambda}_{\varepsilon\omega}$. The MPs are described in detail elsewhere [10].

At present, four Michel parameters in $\tau$ decays, $\rho$, $\eta$, $\xi$, and $\xi\delta$, have been measured with accuracies at the level of a few percent [11], and the obtained values $\rho=0.745\pm 0.008$, $\eta=0.013\pm 0.020$, $\xi=0.985\pm 0.030$, and $\xi\delta=0.746\pm 0.021$ are in agreement with the SM predictions of $\rho=3/4$, $\eta=0$, $\xi=1$, and $\xi\delta=3/4$. These parameters describe the differential decay width, integrated over the neutrino momenta and summed over the daughter lepton spin. Measurements of the remaining Michel parameters, $\xi^{\prime}$, $\xi^{\prime\prime}$, $\eta^{\prime\prime}$, $\alpha^{\prime}/A$, and $\beta^{\prime}/A$, require knowledge of the daughter lepton polarization, and no measurements of them have yet been performed. The only exceptions are two parameters, $\xi\kappa$ and $\bar{\eta}$, obtained in the radiative leptonic $\tau$ decays by the Belle collaboration [12]. These parameters are related to the Michel parameters $\xi^{\prime}$ and $\xi^{\prime\prime}$ through linear combinations with the parameters $\xi$, $\xi\delta$, and $\rho$: $\xi^{\prime}=-\xi-4\xi\kappa+(8/3)\xi\delta$ and $\xi^{\prime\prime}=(16/3)\rho-4\bar{\eta}-3$. Substituting the SM values for $\xi$ and $\xi\delta$ and the value of $\xi\kappa$ for the radiative muonic $\tau$ decay from Ref. [12], one obtains $\xi^{\prime}=-2.2\pm 2.4$. However, this measurement still suffers from a very large uncertainty: the physically allowed $\xi^{\prime}$ values range from $-1$ to $1$ (in the SM, $\xi^{\prime}=1$).

In this paper, we present the first direct measurement of the Michel parameter $\xi^{\prime}$ in the $\tau^{-}\to\mu^{-}\bar{\nu}_{\mu}\nu_{\tau}$ decay. This parameter determines the longitudinal polarization of muons $P_{L}$ and enters the term of the $\tau^{-}\to\mu^{-}\bar{\nu}_{\mu}\nu_{\tau}$ differential decay width that does not depend on the $\tau$ lepton polarization. The parameter $\xi^{\prime}$ is written in terms of the coupling constants $g^{\lambda}_{\varepsilon\omega}$ as

$$\xi^{\prime}=1-2\sum_{\omega=L,R}\left(\frac{1}{4}|g^{S}_{R\omega}|^{2}+|g^{V}_{R\omega}|^{2}+3|g^{T}_{R\omega}|^{2}\right). \tag{4}$$

Thus, a measurement of $\xi^{\prime}$ provides the information required to calculate the probability for an unpolarized $\tau$ lepton to decay to a right-handed muon: $Q^{\mu}_{R}=(1-\xi^{\prime})/2$. This paper is accompanied by a Letter in Physical Review Letters [13].
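As an illustration of Eqs. (3) and (4), the following minimal Python sketch evaluates the normalization, $\xi^{\prime}$, and $Q^{\mu}_{R}$ for a given set of coupling constants. The dictionary layout and the function names are assumptions made for this example only; the mixed-coupling values are hypothetical and are not taken from any measurement.

```python
import math

# Weights of the scalar, vector, and tensor terms in Eqs. (3) and (4).
WEIGHTS = {"S": 0.25, "V": 1.0, "T": 3.0}

def normalization(g):
    """Left-hand side of Eq. (3); g maps (lambda, eps, omega) -> coupling."""
    return sum(WEIGHTS[lam] * abs(val) ** 2 for (lam, _eps, _om), val in g.items())

def xi_prime(g):
    """Eq. (4): only couplings with a right-handed daughter lepton (eps = R) enter."""
    right = sum(WEIGHTS[lam] * abs(val) ** 2
                for (lam, eps, _om), val in g.items() if eps == "R")
    return 1.0 - 2.0 * right

def prob_right_handed(xi_p):
    """Q_R = (1 - xi') / 2, the probability of a right-handed muon."""
    return (1.0 - xi_p) / 2.0

# SM: the only nonzero coupling is g^V_LL = 1.
sm = {("V", "L", "L"): 1.0}

# Hypothetical admixture of a right-handed muon coupling (illustration only).
mixed = {("V", "L", "L"): math.sqrt(0.9), ("V", "R", "L"): math.sqrt(0.1)}

print(xi_prime(sm), prob_right_handed(xi_prime(sm)))
print(xi_prime(mixed), prob_right_handed(xi_prime(mixed)))
```

In the SM case the sketch returns $\xi^{\prime}=1$ and $Q^{\mu}_{R}=0$; the hypothetical 10% admixture lowers $\xi^{\prime}$ to about 0.8.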

II Method

II.1 Differential decay width

The method of the muon polarization measurement is based on the reconstruction of the $\mu^{-}\to e^{-}\bar{\nu}_{e}\nu_{\mu}$ decay, since the electron momentum in the muon rest frame correlates with the muon spin. The idea was initially suggested in Ref. [14], where it was proposed to use stopped muons. Recently, it was proposed to use the muon decay-in-flight (kink) in the tracking system of the detector to measure $\xi^{\prime}$ in the $\tau^{-}\to\mu^{-}\bar{\nu}_{\mu}\nu_{\tau}$ decay in a future experiment at the Super Charm-Tau Factory [15, 16]. In this paper, we rely on the adaptation of this method to the $B$-factories from Ref. [17].

The differential decay width of the cascade decay $\tau^{-}\to\mu^{-}(\to e^{-}\bar{\nu}_{e}\nu_{\mu})\bar{\nu}_{\mu}\nu_{\tau}$ obtained in Ref. [17] is

$$\begin{aligned}\dfrac{d^{3}\Gamma}{dx\,dy\,d\!\cos{\theta_{e}}}&=\mathcal{B}_{\mu\to e\nu\nu}\dfrac{12\Gamma_{\tau\to\mu\nu\nu}}{1-3x_{0}^{2}}y^{2}\sqrt{x^{2}-x_{0}^{2}}\\ &\quad\times\left[(3-2y)F_{IS}(x)+(2y-1)F_{IP}(x)\cos{\theta_{e}}\right].\end{aligned} \tag{5}$$

Here $\Gamma_{\tau\to\mu\nu\nu}$ is the partial width of the $\tau^{-}\to\mu^{-}\bar{\nu}_{\mu}\nu_{\tau}$ decay; $\mathcal{B}_{\mu\to e\nu\nu}$ is the branching fraction of the $\mu^{-}\to e^{-}\bar{\nu}_{e}\nu_{\mu}$ decay; $x=E_{\mu}/W_{\mu\tau}$ is the reduced muon energy in the $\tau$ rest frame [$W_{\mu\tau}=(m^{2}_{\mu}+m^{2}_{\tau})/(2m_{\tau})$ is the maximum muon energy]; $x_{0}=m_{\mu}/W_{\mu\tau}$ is the reduced muon mass; and $y=2E_{e}/m_{\mu}$ is the ratio of the electron energy to its maximum value in the muon rest frame. The functions $F_{IS}(x)$ and $F_{IP}(x)$ are expressed in terms of the Michel parameters and depend only on $x$:

$$\begin{aligned} F_{IS}(x)&=x(1-x)+\dfrac{2}{9}\rho(4x^{2}-3x-x_{0}^{2})+\eta x_{0}(1-x),\\ F_{IP}(x)&=\dfrac{1}{54}\sqrt{x^{2}-x_{0}^{2}}\left[-9\xi^{\prime}\left(2x-3+\dfrac{x_{0}^{2}}{2}\right)+4\xi\left(\delta-\dfrac{3}{4}\right)\left(4x-3-\dfrac{x_{0}^{2}}{2}\right)\right].\end{aligned} \tag{6}$$

Since $\rho$, $\eta$, $\xi$, and $\xi\delta$ are measured very precisely, we fix their values to the SM expectations (it is checked that this assumption has a negligible effect); thus, only $\xi^{\prime}$ in Eqs. (5) and (6) is to be determined.
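The shape of Eq. (5) with the functions of Eq. (6) can be sketched in a few lines of Python. This is only an illustration: the constant prefactor $\mathcal{B}_{\mu\to e\nu\nu}\,12\Gamma_{\tau\to\mu\nu\nu}/(1-3x_{0}^{2})$ is dropped, the mass values are assumed from standard tables, and the function names are invented for this example.

```python
import math

M_MU = 0.1056584   # muon mass in GeV/c^2 (assumed standard value)
M_TAU = 1.77686    # tau mass in GeV/c^2 (assumed standard value)
W_MU_TAU = (M_MU**2 + M_TAU**2) / (2.0 * M_TAU)  # maximum muon energy
X0 = M_MU / W_MU_TAU                             # reduced muon mass

def f_is(x, rho=0.75, eta=0.0):
    """F_IS(x) from Eq. (6); defaults are the SM values of rho and eta."""
    return x * (1 - x) + (2 / 9) * rho * (4 * x * x - 3 * x - X0**2) + eta * X0 * (1 - x)

def f_ip(x, xi_prime, xi=1.0, delta=0.75):
    """F_IP(x) from Eq. (6); defaults are the SM values of xi and delta."""
    root = math.sqrt(x * x - X0**2)
    return root / 54 * (-9 * xi_prime * (2 * x - 3 + X0**2 / 2)
                        + 4 * xi * (delta - 0.75) * (4 * x - 3 - X0**2 / 2))

def width_shape(x, y, cos_theta_e, xi_prime):
    """Eq. (5) without its constant prefactor; valid for X0 <= x <= 1, 0 <= y <= 1."""
    root = math.sqrt(x * x - X0**2)
    return y * y * root * ((3 - 2 * y) * f_is(x)
                           + (2 * y - 1) * f_ip(x, xi_prime) * cos_theta_e)

# Forward-backward difference in cos(theta_e) at the electron energy endpoint.
for xp in (0.0, 1.0):
    diff = width_shape(0.8, 1.0, 1.0, xp) - width_shape(0.8, 1.0, -1.0, xp)
    print(f"xi' = {xp}: forward - backward = {diff:.4f}")
```

With the other parameters fixed to the SM values, the $\cos\theta_{e}$ dependence is proportional to $\xi^{\prime}$ and vanishes at $y=1/2$, which is why the sensitivity comes from the angular distribution of energetic electrons.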

The variable $\theta_{e}$ is the angle between $\vec{n}_{\mu}$ and $\vec{n}_{e}^{\prime}$, where $\vec{n}_{\mu}$ is the direction opposite to the $\tau$ lepton momentum in the muon rest frame at the muon production vertex, and $\vec{n}_{e}^{\prime}$ is the direction of the electron in the muon rest frame at the muon decay vertex. The former vector is represented in the conventional coordinate system introduced in Ref. [17] as $(\bar{x}_{1},\,\bar{x}_{2},\,\bar{x}_{3})$, while $\vec{n}_{e}^{\prime}$ is represented in the coordinate system obtained from the initial one by rotation through an angle $\phi$ (the angle through which the muon momentum rotates in the magnetic field of the Belle detector before the decay). The procedure of the coordinate system rotation and the $\theta_{e}$ calculation is explained in detail in Ref. [17].

The angle $\theta_{e}$ has a simple physical meaning when the muon decays immediately at the production vertex: it is the angle between the mother and daughter charged leptons in the muon rest frame. Once the muon propagates in the magnetic field of the detector, its momentum in the laboratory frame and its spin in the muon rest frame are rotated through the same angle $\phi$ (assuming $g_{\mu}-2\approx 0$ without loss of precision [11]). The rotation of the coordinate system in each event is designed to compensate for the effect of the magnetic field, reducing the event to the case of an instantaneous muon decay.

II.2 $\tau$ lepton momentum reconstruction

For the $\xi^{\prime}$ measurement, knowledge of the $\tau$ lepton momentum is essential. While the $\tau$ energy is known from the beam energy (up to initial-state radiation), it is not feasible to reconstruct the true direction of the $\tau$ momentum because of the neutrinos in the final state. However, it is possible to find the region in which the $\tau$ lepton momentum is directed using the second (tagging) $\tau$ lepton in the event [18]. The method is based on the kinematics of the $\tau^{+}\tau^{-}$-pair production and decay in the center-of-mass (c.m.) frame.

For hadronic modes of the tagging $\tau$, the angle $\psi$ between the $\tau$ lepton and daughter hadron momenta is given by

$$\cos{\psi}=\dfrac{2E_{\tau}E_{h}-m_{\tau}^{2}-m_{h}^{2}}{2p_{\tau}p_{h}}. \tag{7}$$

Here $E_{\tau}$ and $p_{\tau}$ are the $\tau$ lepton energy and momentum magnitude in the c.m. frame; $E_{h}$, $p_{h}$, and $m_{h}$ are the hadron system energy, momentum magnitude, and invariant mass, respectively. We use all one-prong and three-prong modes of the tagging $\tau$, including $\tau^{+}\to\ell^{+}\nu_{\ell}\bar{\nu}_{\tau}$ ($\ell=e$ or $\mu$); however, we treat the leptonic mode as a hadronic one for simplicity.

For the signal $\tau^{-}\to\mu^{-}\bar{\nu}_{\mu}\nu_{\tau}$ decay, the angle $\chi$ between the $\tau$ lepton and daughter muon momenta is restricted to

$$\dfrac{2E_{\tau}E_{\mu}-m_{\tau}^{2}-m_{\mu}^{2}}{2p_{\tau}p_{\mu}}\leq\cos{\chi}\leq\dfrac{E_{\tau}E_{\mu}-m_{\tau}m_{\mu}}{p_{\tau}p_{\mu}}. \tag{8}$$

Here, $E_{\mu}$ and $p_{\mu}$ are the muon energy and momentum magnitude in the c.m. frame.
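Equations (7) and (8) are straightforward to evaluate numerically. The sketch below is an illustration under stated assumptions: the $\tau$ energy is set to half of an assumed c.m. energy of 10.58 GeV with initial-state radiation ignored, the particle masses are assumed standard values, and the example momenta are hypothetical.

```python
import math

M_TAU = 1.77686    # GeV/c^2 (assumed standard value)
M_MU = 0.1056584   # GeV/c^2 (assumed standard value)
M_PI = 0.13957     # GeV/c^2 (assumed standard value)
E_TAU = 10.58 / 2  # GeV; assumes running at the Upsilon(4S) and no ISR

def cos_psi(e_tau, p_h, m_h):
    """Eq. (7): angle between the tau and the hadron system in the c.m. frame."""
    p_tau = math.sqrt(e_tau**2 - M_TAU**2)
    e_h = math.sqrt(p_h**2 + m_h**2)
    return (2 * e_tau * e_h - M_TAU**2 - m_h**2) / (2 * p_tau * p_h)

def cos_chi_limits(e_tau, p_mu):
    """Eq. (8): lower and upper limits on cos(chi) for the signal muon."""
    p_tau = math.sqrt(e_tau**2 - M_TAU**2)
    e_mu = math.sqrt(p_mu**2 + M_MU**2)
    lower = (2 * e_tau * e_mu - M_TAU**2 - M_MU**2) / (2 * p_tau * p_mu)
    upper = (e_tau * e_mu - M_TAU * M_MU) / (p_tau * p_mu)
    return lower, upper

# Hypothetical event: a 2 GeV/c pion on the tag side, a 2 GeV/c muon on the signal side.
print("cos(psi) =", cos_psi(E_TAU, 2.0, M_PI))
low, high = cos_chi_limits(E_TAU, 2.0)
print("cos(chi) in [", low, ",", min(high, 1.0), "]")
```

In this example the upper limit of Eq. (8) exceeds unity, so the physical range is set by the lower limit together with $\cos\chi\leq 1$; the sketch clips the printed upper value accordingly.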

Thus, the true $\tau^{+}\tau^{-}$-pair production axis lies on a generatrix of a cone with an apex angle of $2\psi$ and inside a cone with an apex angle of $2\chi$ (see Fig. 1).

Figure 1: Geometric interpretation of the $\tau$ lepton momentum reconstruction.

This restricts the region of possible $\tau$ lepton directions to an arc $(\Phi_{1},\,\Phi_{2})$, where $\Phi_{1}$ and $\Phi_{2}$ are defined as

$$\begin{aligned} \Phi_{1}&=\pi+\arcsin{\dfrac{\cos{\psi}\cos{\alpha}+\cos{\chi}}{\sin{\psi}\sin{\alpha}}},\\ \Phi_{2}&=2\pi-\arcsin{\dfrac{\cos{\psi}\cos{\alpha}+\cos{\chi}}{\sin{\psi}\sin{\alpha}}}.\end{aligned} \tag{9}$$

Here $\alpha$ is the angle between $\vec{p}_{\mu}$ and $\vec{p}_{h}$. For simplicity, we use the average value $\Phi=(\Phi_{1}+\Phi_{2})/2=-\pi/2$ (modulo $2\pi$) instead of averaging Eq. (5) over $(\Phi_{1},\,\Phi_{2})$. This approximation has a negligible impact on the $\xi^{\prime}$ measurement: the increase of the statistical uncertainty is less than $0.01$.
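A short Python sketch of Eq. (9) makes the statement about the average explicit: the arcsine terms cancel in $\Phi_{1}+\Phi_{2}$, so the midpoint of the arc is always $3\pi/2$, which is $-\pi/2$ modulo $2\pi$. The function name and the input values are hypothetical, and taking $\cos\chi$ at the boundary of the cone is an assumption of this example.

```python
import math

def phi_arc(cos_psi, cos_chi, cos_alpha):
    """Eq. (9): end points (Phi_1, Phi_2) of the allowed arc, or None if
    the argument of the arcsine is outside [-1, 1]."""
    sin_psi = math.sqrt(1.0 - cos_psi**2)
    sin_alpha = math.sqrt(1.0 - cos_alpha**2)
    arg = (cos_psi * cos_alpha + cos_chi) / (sin_psi * sin_alpha)
    if abs(arg) > 1.0:
        return None
    a = math.asin(arg)
    return math.pi + a, 2.0 * math.pi - a

# Hypothetical kinematics: muon and hadron nearly back to back.
arc = phi_arc(cos_psi=0.9, cos_chi=0.9, cos_alpha=-0.95)
phi_1, phi_2 = arc
midpoint = 0.5 * (phi_1 + phi_2)
print(phi_1, phi_2, midpoint, midpoint - 2.0 * math.pi)
```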

II.3 Decay vertex reconstruction

Track reconstruction at Belle is optimized for long-lived particles that originate close to the interaction point (IP) of the beams and does not contain dedicated algorithms for identifying charged-particle decays in flight. However, tracks that do not point to the IP are also reconstructed with considerable efficiency. In our case, the muon track is reconstructed first, and it may absorb some hits produced by the daughter electron, thereby degrading the muon momentum resolution. The remaining hits are used to reconstruct the electron track.

We define the decay vertex as the point of closest approach of the muon and electron tracks in the region of the muon track endpoint and the electron track starting point.
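The idea of a point of closest approach can be illustrated with a simplified sketch. It assumes that each track is approximated near the kink by a straight line (a point and a tangent direction), whereas the actual tracks are helices; the function name and the example coordinates are hypothetical and do not represent the Belle reconstruction software.

```python
import math

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def closest_approach(p1, d1, p2, d2):
    """Closest approach of the lines p1 + s*d1 and p2 + t*d2.
    Returns (midpoint between the two closest points, distance between them)."""
    w = tuple(x - y for x, y in zip(p1, p2))
    a, b, c = _dot(d1, d1), _dot(d1, d2), _dot(d2, d2)
    d, e = _dot(d1, w), _dot(d2, w)
    denom = a * c - b * b
    if denom < 1e-12:            # (nearly) parallel lines
        s, t = 0.0, e / c
    else:
        s = (b * e - c * d) / denom
        t = (a * e - b * d) / denom
    q1 = tuple(p + s * v for p, v in zip(p1, d1))
    q2 = tuple(p + t * v for p, v in zip(p2, d2))
    vertex = tuple(0.5 * (u + v) for u, v in zip(q1, q2))
    distance = math.sqrt(sum((u - v) ** 2 for u, v in zip(q1, q2)))
    return vertex, distance

# Hypothetical tangents (cm): muon track end and electron track start.
vertex, distance = closest_approach((0.0, 0.0, 0.0), (1.0, 0.0, 0.0),
                                    (5.0, 1.0, 2.0), (0.0, 1.0, 0.0))
print(vertex, distance)
```

The returned distance is the quantity that a requirement on the separation of the two tracks at the vertex would act on.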

III The data sample and the Belle detector

This analysis is based on a data sample taken at or near the $\Upsilon(1S)$, $\Upsilon(2S)$, $\Upsilon(3S)$, $\Upsilon(4S)$, and $\Upsilon(5S)$ resonances with an integrated luminosity of $988\,\text{fb}^{-1}$, corresponding to about $912\times 10^{6}$ $\tau^{+}\tau^{-}$ pairs [19]. The data were collected with the Belle detector [20] at the KEKB asymmetric-energy $e^{+}e^{-}$ collider [21, and references therein].

The Belle detector is a large-solid-angle magnetic spectrometer that consists of a silicon vertex detector (SVD), a 50-layer central drift chamber (CDC), an array of aerogel threshold Cherenkov counters (ACC), a barrel-like arrangement of time-of-flight scintillation counters (TOF), and an electromagnetic calorimeter comprised of CsI(Tl) crystals (ECL) located inside a superconducting solenoid coil that provides a 1.5 T magnetic field. An iron flux-return located outside of the coil is instrumented to detect $K_{L}^{0}$ mesons and to identify muons (KLM).

The most critical subdetector for this study is the CDC [23]. It has a length of $2400\,\text{mm}$ and inner and outer radii of $83$ and $874\,\text{mm}$, respectively. This size is large enough to reliably reconstruct both the daughter electron and mother muon tracks.

To study the background processes, optimize the selection criteria, and obtain the fit function, signal and background Monte Carlo (MC) samples are used.

A signal MC sample of $e^{+}e^{-}\to\tau^{+}\tau^{-}$ events with the subsequent $\tau^{-}\to\mu^{-}(\to e^{-}\bar{\nu}_{e}\nu_{\mu})\bar{\nu}_{\mu}\nu_{\tau}$ cascade decay is $\sim 50$ times larger than the data. The production and subsequent decay of $\tau^{+}\tau^{-}$ pairs are generated with the KKMC [24] and TAUOLA [25, 26] generators, respectively, and the decay products are propagated by GEANT3 [27] to simulate the detector response. The $\mu^{-}\to e^{-}\bar{\nu}_{e}\nu_{\mu}$ decay is also generated by GEANT3, assuming that muons are unpolarized, as if $\xi^{\prime}=0$. To speed up the signal MC sample generation, we reduce the muon lifetime in GEANT3 by a factor of 100. This procedure is justified and only slightly biases the distribution of the muon decay length because the CDC size is much smaller than the average flight distance of the muon from the $\tau$ decay, which is of the order of a kilometer. We evaluate the effect of this reduction of the muon lifetime in the MC generation and quote a systematic uncertainty associated with it.
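The argument about the muon lifetime can be checked with a back-of-the-envelope estimate. The sketch below assumes $c\tau_{\mu}\approx 658.6$ m, a hypothetical muon momentum of 1 GeV/$c$, and a straight flight path equal to the CDC outer radius; it is an order-of-magnitude illustration, not the procedure used for the systematic uncertainty.

```python
import math

M_MU = 0.1056584           # GeV/c^2 (assumed standard value)
C_TAU_MU_M = 658.64        # muon c*tau in meters (assumed standard value)
CDC_OUTER_RADIUS_M = 0.874 # from the CDC dimensions quoted in the text

def mean_decay_length_m(p_gev, lifetime_scale=1.0):
    """Mean lab-frame decay length beta*gamma*c*tau = (p/m)*c*tau."""
    return (p_gev / M_MU) * C_TAU_MU_M * lifetime_scale

def decay_probability(distance_m, p_gev, lifetime_scale=1.0):
    """Probability to decay within distance_m of straight flight."""
    return -math.expm1(-distance_m / mean_decay_length_m(p_gev, lifetime_scale))

p = 1.0  # GeV/c, hypothetical muon momentum
nominal = decay_probability(CDC_OUTER_RADIUS_M, p)
reduced = decay_probability(CDC_OUTER_RADIUS_M, p, lifetime_scale=0.01)
print("mean decay length [m]:", mean_decay_length_m(p))
print("enhancement of decays inside the CDC:", reduced / nominal)
```

Under these assumptions the mean decay length is several kilometers, and shortening the lifetime by a factor of 100 increases the number of decays inside the CDC by slightly less than 100, which shows the small bias of the decay-length distribution mentioned above.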

An example of an $e^{+}e^{-}\to\tau^{+}\tau^{-}$ MC event display with the $\mu^{-}\to e^{-}\bar{\nu}_{e}\nu_{\mu}$ kink in the CDC is presented in Fig. 2. The $\mu^{-}\to e^{-}\bar{\nu}_{e}\nu_{\mu}$ decay that occurred in the central volume of the CDC is clearly observed as a kinked track: the trajectory curvature changes because the daughter electron has a smaller momentum in the laboratory frame than the mother muon. Both the electron and muon trajectories are reconstructed as separate tracks by the Belle track reconstruction algorithm.

Figure 2: Event display of an MC event $e^{+}e^{-}\to\tau^{+}\tau^{-}\to(\pi^{+}\pi^{-}\pi^{+}\bar{\nu}_{\tau})(\mu^{-}\bar{\nu}_{\mu}\nu_{\tau})$ with the $\mu^{-}\to e^{-}\bar{\nu}_{e}\nu_{\mu}$ decay in the CDC (the arrow points to the decay vertex). The Belle detector, without the KLM, is shown projected onto the $x$–$y$ plane.

The background consists of $\tau^{+}\tau^{-}$-pair events without a $\mu^{-}\to e^{-}\bar{\nu}_{e}\nu_{\mu}$ decay and non-$\tau^{+}\tau^{-}$-pair events. The MC sample for the former contribution is generated in the same way as the signal, with the exception of the $\mu^{-}\to e^{-}\bar{\nu}_{e}\nu_{\mu}$ decay generation step. The non-$\tau^{+}\tau^{-}$-pair background consists of the dimuon $e^{+}e^{-}\to\mu^{+}\mu^{-}$ process, $e^{+}e^{-}\to q\bar{q}$ ($q=u,\,d,\,s$, and $c$) continuum and $e^{+}e^{-}\to\Upsilon(4S)\to B\bar{B}$ events, two-photon mediated processes ($e^{+}e^{-}\to e^{+}e^{-}\ell^{+}\ell^{-},\,e^{+}e^{-}q\bar{q}$, where $\ell=e,\,\mu$ and $q=u,\,d,\,s$, and $c$), and Bhabha scattering, generated with the KKMC, EvtGen [28], AAFH [29], and BHLUMI [30] generators, respectively. Final-state radiation is simulated using the PHOTOS [31] package for all charged final-state particles.

A list of the background MC samples is presented in Table 1 with the ratio of the number of generated events $N^{\text{gen}}_{\text{MC}}$ to the expected number of corresponding events in data (the product of the integrated luminosity $\mathcal{L}^{\text{int}}_{\text{data}}$ and the process cross section $\sigma_{\text{proc}}$).

Table 1: Background MC samples with their sizes.

| Process | $N^{\text{gen}}_{\text{MC}}/(\mathcal{L}^{\text{int}}_{\text{data}}\sigma_{\text{proc}})$ |
| --- | --- |
| $e^{+}e^{-}\to\tau^{+}\tau^{-}$ background | 4.5 |
| $e^{+}e^{-}\to\mu^{+}\mu^{-}$ | 4.4 |
| $e^{+}e^{-}\to q\bar{q}$ ($q=u,\,d,\,s,\,c$) | 5.8 |
| $e^{+}e^{-}\to\Upsilon(4S)\to B\bar{B}$ | 10.2 |
| $e^{+}e^{-}\to e^{+}e^{-}\ell^{+}\ell^{-}$ ($\ell=e,\,\mu$) | 6.9 |
| $e^{+}e^{-}\to e^{+}e^{-}q\bar{q}$ ($q=u,\,d$) | 7.5 |
| $e^{+}e^{-}\to e^{+}e^{-}q\bar{q}$ ($q=s,\,c$) | 8.1 |
| Bhabha scattering | 0.2 |
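The ratios in Table 1 translate directly into the factors by which MC yields are scaled to the data luminosity. The following Python sketch illustrates this; the dictionary keys and the selected-event counts are hypothetical labels and numbers chosen for the example.

```python
# N_gen(MC) / (L_int(data) * sigma_proc) from Table 1; keys are informal labels.
MC_TO_DATA_RATIO = {
    "tautau background": 4.5,
    "mumu": 4.4,
    "qqbar (u, d, s, c)": 5.8,
    "BBbar": 10.2,
    "ee ll": 6.9,
    "ee qq (u, d)": 7.5,
    "ee qq (s, c)": 8.1,
    "Bhabha": 0.2,
}

def expected_in_data(n_selected_mc, process):
    """Scale a number of selected MC events to the data luminosity."""
    return n_selected_mc / MC_TO_DATA_RATIO[process]

# Hypothetical selected-event counts, for illustration only.
print(expected_in_data(45, "tautau background"))
print(expected_in_data(1, "Bhabha"))
```

A single selected Bhabha MC event corresponds to five expected events in data, so the Bhabha sample has the weakest statistical power of the listed samples.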

IV Event selection

The event selection is performed in three steps. The first step is the preselection of candidates in $\tau^{+}\tau^{-}$ events with the $\tau^{+}\tau^{-}$-pair decay topology of interest. The second step is dedicated to the kink candidate selection. In the last step, we apply a boosted decision tree (BDT) classifier [32, 33] to select signal event candidates and suppress the kink background.

IV.1 Preselection

In the first step, $\tau^{+}\tau^{-}$-pair event candidates are required to pass the preliminary selection criteria. They are used to select the $\tau^{+}\tau^{-}$-pair decay topology and suppress the contribution from Bhabha scattering, $e^{+}e^{-}\to\mu^{+}\mu^{-}$, two-photon production, $e^{+}e^{-}\to q\bar{q}$ ($q=u$, $d$, $s$, or $c$), and $B\bar{B}$ events.

In the c.m. frame, the event is divided into two hemispheres by the plane perpendicular to the thrust vector $\vec{n}_{T}$. The vector $\vec{n}_{T}$ is the unit vector that maximizes the thrust magnitude $T$:

$$T=\max_{\vec{n}_{T}}\frac{\sum_{i}|\vec{p}_{i}\cdot\vec{n}_{T}|}{\sum_{i}|\vec{p}_{i}|}. \tag{10}$$

Here $\vec{p}_{i}$ is the momentum of the $i$th track; the summation is over all tracks in the event. The signal hemisphere is determined by the muon candidate momentum direction. The complementary one is called the tagging hemisphere.
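For the few tracks of a 1–1 or 1–3 event, Eq. (10) can be evaluated by an exhaustive search, using the fact that the maximizing axis points along $\sum_{i}s_{i}\vec{p}_{i}$ for some choice of signs $s_{i}=\pm 1$. The Python sketch below illustrates this approach; it is one possible algorithm chosen for this example, not necessarily the one used in the Belle software, and the function names are hypothetical.

```python
import itertools
import math

def thrust(momenta):
    """Return (T, axis) of Eq. (10) for a non-empty list of 3-momenta."""
    total = sum(math.sqrt(px * px + py * py + pz * pz) for px, py, pz in momenta)
    best_norm, best_axis = 0.0, None
    # The overall sign of the axis is irrelevant, so fix the first sign to +1.
    for rest in itertools.product((1, -1), repeat=len(momenta) - 1):
        signs = (1,) + rest
        sx = sum(s * p[0] for s, p in zip(signs, momenta))
        sy = sum(s * p[1] for s, p in zip(signs, momenta))
        sz = sum(s * p[2] for s, p in zip(signs, momenta))
        norm = math.sqrt(sx * sx + sy * sy + sz * sz)
        if norm > best_norm:
            best_norm, best_axis = norm, (sx / norm, sy / norm, sz / norm)
    return best_norm / total, best_axis

def hemisphere(p, axis):
    """+1 or -1 depending on the side of the plane perpendicular to the axis."""
    return 1 if sum(a * b for a, b in zip(p, axis)) >= 0 else -1

# Two back-to-back tracks give T = 1; three orthogonal unit momenta give 1/sqrt(3).
t_pair, axis_pair = thrust([(0.0, 0.0, 1.0), (0.0, 0.0, -1.0)])
t_orth, _ = thrust([(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)])
print(t_pair, t_orth)
```

The cost grows as $2^{n-1}$ with the number of tracks $n$, which is negligible for the low-multiplicity topologies considered here.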

In the present analysis, the decay mode of the second $\tau$ lepton is not important. Therefore, our selection uses only the information about the event topology formed by charged tracks from the IP. In the signal hemisphere, we require only one track from the IP, with impact parameters in the $r\phi$-plane and along the $z$-axis (the direction opposite to the $e^{+}$ beam) of $dr_{\text{sig}}<2\,\text{cm}$ and $|dz_{\text{sig}}|<4\,\text{cm}$, respectively. We also require one secondary electron candidate track in the signal hemisphere; however, at this step, the parameters of this track are not used. As the $\tau$ lepton decays dominantly into one or three charged tracks in the final state, in the tagging hemisphere we require one (topology 1–1) or three (topology 1–3) charged tracks from the IP with impact parameters of $dr_{\text{tag}}<0.5\,\text{cm}$ and $|dz_{\text{tag}}|<2\,\text{cm}$. The total charge of the event is required to be zero.

Some events may contain photons, for example, from $\pi^{0}$ decays. Photons are selected with the energy requirement $E_{\gamma}>50\,\text{MeV}$. In the signal hemisphere, the maximum photon energy and the total sum of the photon energies are required to be less than $300\,\text{MeV}$ and $400\,\text{MeV}$, respectively, and $\pi^{0}$ candidates (combinations of two photons with $|M(\gamma\gamma)-m_{\pi^{0}}|<15\,\text{MeV}/c^{2}$, corresponding to a window of approximately $\pm 3\sigma$ in resolution) are vetoed.

For topology 1–1, the primary backgrounds are Bhabha scattering, two-photon interactions, $e^{+}e^{-}\to\mu^{+}\mu^{-}$, and $e^{+}e^{-}\to q\bar{q}$ ($q=u$, $d$, $s$, or $c$). For topology 1–3, the main background is $e^{+}e^{-}\to q\bar{q}$ ($q=u$, $d$, $s$, or $c$). To suppress the contribution of these processes, additional requirements are used. They are based on the fact that $e^{+}e^{-}\to\tau^{+}\tau^{-}$ events with subsequent $\tau$ decays are characterized by large missing energy ($E_{\text{miss}}$) and missing momentum ($\vec{p}_{\text{miss}}$) due to undetected neutrinos. The missing four-momentum $(\vec{p}_{\text{miss}},\,E_{\text{miss}})$ is defined as

$$P_{\text{miss}}=P^{*}_{\text{beam}}-P^{*}_{\text{trk(IP)}}-P^{*}_{\gamma}, \tag{11}$$

where $P^{*}_{\text{beam}}$ is the beam four-momentum in the c.m. frame, $P^{*}_{\text{trk(IP)}}$ is the sum of the four-momenta of all tracks from the $\tau^{+}\tau^{-}$ pair in the c.m. frame, and $P^{*}_{\gamma}$ is the sum of the four-momenta of all photons in the c.m. frame. Another feature of $\tau^{+}\tau^{-}$ events is a nearly uniform distribution of $\cos{\theta_{\text{miss}}}$, where $\theta_{\text{miss}}$ is the angle between $\vec{p}_{\text{miss}}$ and the $z$-axis.

Thus, we apply the following requirements: missing mass ($m_{\text{miss}}^{2}=P_{\text{miss}}^{2}$) $1\,\text{GeV}/c^{2}<m_{\text{miss}}<7\,\text{GeV}/c^{2}$, missing angle $\pi/6<\theta_{\text{miss}}<5\pi/6$, thrust magnitude $0.85<T<0.99$, and invariant mass of the tag-side tracks $m^{\text{tag}}_{\text{trk}}<1.8\,\text{GeV}/c^{2}$.
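The missing four-momentum of Eq. (11) and the requirements listed above can be sketched as follows. The example assumes four-vectors given as $(E, p_{x}, p_{y}, p_{z})$ in GeV in the c.m. frame, a beam four-momentum of $(\sqrt{s}, 0, 0, 0)$ with an assumed $\sqrt{s}=10.58$ GeV, and hypothetical track values; the function names are invented for this illustration.

```python
import math

def four_sum(vectors):
    """Component-wise sum of four-vectors (E, px, py, pz)."""
    return tuple(sum(v[i] for v in vectors) for i in range(4))

def missing_four_momentum(sqrt_s, tracks, photons):
    """Eq. (11) in the c.m. frame, where the beam four-momentum is (sqrt_s, 0, 0, 0)."""
    vis = four_sum(list(tracks) + list(photons))
    return (sqrt_s - vis[0], -vis[1], -vis[2], -vis[3])

def passes_topology_cuts(p_miss, thrust_value, m_tag):
    """Requirements on m_miss, theta_miss, T, and the tag-side track mass."""
    e, px, py, pz = p_miss
    p2 = px * px + py * py + pz * pz
    m2 = e * e - p2
    if m2 <= 0.0 or p2 == 0.0:
        return False
    m_miss = math.sqrt(m2)
    theta = math.acos(max(-1.0, min(1.0, pz / math.sqrt(p2))))
    return (1.0 < m_miss < 7.0
            and math.pi / 6 < theta < 5 * math.pi / 6
            and 0.85 < thrust_value < 0.99
            and m_tag < 1.8)

# Hypothetical 1-1 event: one muon and one pion, no photons.
muon = (1.50372, 1.0, 0.5, 1.0)
pion = (2.18392, -1.5, -0.5, -1.5)
p_miss = missing_four_momentum(10.58, [muon, pion], [])
m_miss = math.sqrt(p_miss[0] ** 2 - sum(c * c for c in p_miss[1:]))
print(m_miss, passes_topology_cuts(p_miss, thrust_value=0.95, m_tag=0.1396))
```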

To suppress the remaining Bhabha scattering contribution, we apply an electron veto to the tag-side track for topology 1–1 using identification based on information from the CDC, ACC, and ECL [34]. We require its likelihood ratio $\mathcal{R}(e/x)=\mathcal{L}_{e}/(\mathcal{L}_{e}+\mathcal{L}_{x})$ to be less than $0.4$, where $\mathcal{L}_{e}$ and $\mathcal{L}_{x}$ are the likelihood values of the track for the electron and non-electron hypotheses, respectively. This requirement rejects about 80% of events with an electron on the tag side.

IV.2 Kink selection

In this subsection, we describe the preselection of candidates for events with the $\mu^{-}\to e^{-}\bar{\nu}_{e}\nu_{\mu}$ decay in the CDC. As mentioned above, the daughter electron track originating from the muon decay in the signal hemisphere is required to infer the muon polarization. To suppress random combinations with tracks from the IP, we require the electron candidate impact parameter in the $r\phi$-plane to be $dr_{e}>4\,\text{cm}$. To reconstruct the $\mu^{-}\to e^{-}\bar{\nu}_{e}\nu_{\mu}$ decay inside the CDC, both the muon track and the electron track have to be reconstructed, each leaving enough hits in the tracker. The last point of the muon track and the first point of the electron track must be inside the CDC, at least $10\,\text{cm}$ from its walls.

The track helix is parametrized by five parameters, whose determination requires at least five hits in the CDC. It is also important to discard fake tracks; thus, the total number of CDC hits is required to be larger than $7$ for the electron candidates and larger than $10$ for the muon candidates. Both tracks from the $\mu^{-}\to e^{-}\bar{\nu}_{e}\nu_{\mu}$ decay are shorter than the average track of a nondecayed particle from the IP; therefore, we require the number of their CDC hits to be less than $40$. Since the decayed muon does not leave the drift chamber, the absence of associated hits in the outer TOF, ECL, and KLM systems is required. The electron tracks originate outside the SVD and are stopped in the ECL; therefore, for them, we veto signals from the SVD and KLM systems.

Finally, the distance between the muon and electron tracks at the decay vertex is required to be less than $5\,\text{cm}$. This requirement is loose enough to keep almost 100% of kink events while rejecting random combinations of tracks.

The overwhelming majority of events that pass these selection criteria have the form of a track kink. One of these processes is $\mu^{-}\to e^{-}\bar{\nu}_{e}\nu_{\mu}$, and the rest are backgrounds that mimic the signal. They are light meson decays ($\pi^{-}\to\mu^{-}\bar{\nu}_{\mu}$, $K^{-}\to\mu^{-}\bar{\nu}_{\mu}$, $K^{-}\to\pi^{0}\mu^{-}\bar{\nu}_{\mu}$, $K^{-}\to\pi^{0}e^{-}\bar{\nu}_{e}$, $K^{-}\to\pi^{-}\pi^{0}$, $K^{-}\to\pi^{-}\pi^{+}\pi^{-}$, $K^{-}\to\pi^{-}\pi^{0}\pi^{0}$), as well as electron, muon, and hadron scattering. In Table 2, the signal and the main background processes are listed with their relative contributions. About $20\%$ of pion decay events, $30\%$ of kaon decay events, and $30\%$ of hadron scattering events come from $e^{+}e^{-}\to q\bar{q}$, while all other events are mainly from $e^{+}e^{-}\to\tau^{+}\tau^{-}$.

Table 2: Relative contribution of the signal and background processes after the kink selection and before applying the BDT requirement.
       Type                                          Contribution (%)
       $\mu^{-}\to e^{-}\bar{\nu}_{e}\nu_{\mu}$      3.2
       $\pi^{-}\to\mu^{-}\bar{\nu}_{\mu}$            22.4
       $K^{-}\to\pi^{-}\pi^{0}$                      3.3
       $K^{-}\to 3$ body                             4.6
       $K^{-}\to\mu^{-}\bar{\nu}_{\mu}$              45.9
       $e$-scattering                                9.5
       $\mu$-scattering                              1.1
       hadron scattering                             10.0

The kinks formed by a decay-in-flight are characterized by the daughter particle kinematics in the mother particle rest frame, which is determined by the momentum magnitude and emission angle. These two variables are only well defined for the correct pair of mass hypotheses assigned to the tracks; e.g., for $\mu^{-}\to e^{-}\bar{\nu}_{e}\nu_{\mu}$, the electron and muon mass hypotheses are assigned to the daughter and mother particles, respectively. To indicate which pair is used in a particular case, we introduce the following notation: $p_{p_{1}p_{2}}$ and $\theta_{p_{1}p_{2}}$ denote the daughter particle momentum and emission angle in the mother particle rest frame with the $p_{1}$ and $p_{2}$ mass hypotheses assigned to the daughter and mother tracks, respectively. Here we measure the daughter particle emission angle from the direction of the mother particle in the laboratory frame because this angle determines the efficiency to reconstruct a decay-in-flight. The efficiency to reconstruct the daughter track from a kink has a maximum for daughter particles emitted perpendicular to the mother particle direction, while it drops for daughter particles emitted along the muon direction.
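The definition of $p_{p_{1}p_{2}}$ and $\cos{\theta_{p_{1}p_{2}}}$ can be illustrated with a short sketch: the daughter momentum measured in the laboratory frame is boosted into the rest frame of the mother, with both energies computed from the assumed mass hypotheses. The function name, the `MASS` table, and the rounded mass values are illustrative assumptions and not part of the analysis code.

```python
import numpy as np

# Approximate particle masses in GeV/c^2 (rounded; assumed for illustration).
MASS = {"e": 0.000511, "mu": 0.105658, "pi": 0.139570, "K": 0.493677}

def rest_frame_kinematics(p_daughter, p_mother, hyp_daughter, hyp_mother):
    """Return (p*, cos(theta)) of the daughter in the mother rest frame.

    p_daughter, p_mother: laboratory 3-momenta in GeV/c.
    hyp_daughter, hyp_mother: keys of MASS, i.e. the pair (p1, p2).
    The angle is measured from the mother direction in the laboratory frame.
    """
    p_d = np.asarray(p_daughter, dtype=float)
    p_m = np.asarray(p_mother, dtype=float)
    if not np.any(p_m):
        raise ValueError("mother momentum must be non-zero to define the angle")
    m_d, m_m = MASS[hyp_daughter], MASS[hyp_mother]
    e_d = np.sqrt(p_d @ p_d + m_d**2)   # daughter energy under hypothesis p1
    e_m = np.sqrt(p_m @ p_m + m_m**2)   # mother energy under hypothesis p2
    beta = p_m / e_m                    # mother velocity (units of c)
    gamma = e_m / m_m
    beta2 = beta @ beta
    # Lorentz boost of the daughter 3-momentum into the mother rest frame.
    p_star = p_d + ((gamma - 1.0) * (p_d @ beta) / beta2 - gamma * e_d) * beta
    p_mag = np.linalg.norm(p_star)
    cos_theta = (p_star @ p_m) / (p_mag * np.linalg.norm(p_m))
    return p_mag, cos_theta
```

Calling the function with a wrong pair of hypotheses still returns numbers, but a two-body peak is then smeared or shifted, which is the behavior the text describes for variables calculated with the wrong hypotheses.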

In the present study, we use three pairs $(p_{1},\,p_{2})$: $(e,\,\mu)$, $(\pi,\,K)$, and $(\mu,\,\pi)$. For these mass hypotheses, we plot the $p_{p_{1}p_{2}}$ and $\cos{\theta_{p_{1}p_{2}}}$ distributions in Fig. 3. Good agreement between the MC simulation (filled histograms) and the data (points with errors) is observed.

Both Table 2 and Fig. 3 show that the largest contribution to the background comes from the pion and kaon two-body decays. These processes are characterized by a peak in the momentum distribution of the daughter particle in the rest frame of the decayed one, which is clearly observed for the $K^{-}\to\mu^{-}\bar{\nu}_{\mu}$ and $K^{-}\to\pi^{-}\pi^{0}$ decays in Fig. 3(c) and for the $\pi^{-}\to\mu^{-}\bar{\nu}_{\mu}$ decay in Fig. 3(e), confirming the correctness of the applied kink selection procedure. Before the selection based on the BDT, the signal contribution is small and hardly visible in Fig. 3. The signal shape has no sharp structures because the decay is a three-body one, and it is further smeared in the variables calculated with the wrong pair of mass hypotheses.
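For orientation, the position of each two-body peak follows from standard two-body kinematics. The sketch below evaluates the daughter momentum in the mother rest frame for the three decays mentioned above, under the correct mass hypotheses; the rounded masses and all names are assumptions introduced for illustration.

```python
import math

# Approximate masses in MeV/c^2 (rounded; assumed for illustration).
MASS_MEV = {"mu": 105.658, "pi": 139.570, "pi0": 134.977, "K": 493.677, "nu": 0.0}

def two_body_momentum(m_mother, m1, m2):
    """Daughter momentum in the mother rest frame (same units as the masses, c = 1)."""
    s = m_mother**2
    return math.sqrt((s - (m1 + m2) ** 2) * (s - (m1 - m2) ** 2)) / (2.0 * m_mother)

peaks = {
    "pi -> mu nu": two_body_momentum(MASS_MEV["pi"], MASS_MEV["mu"], MASS_MEV["nu"]),
    "K -> mu nu": two_body_momentum(MASS_MEV["K"], MASS_MEV["mu"], MASS_MEV["nu"]),
    "K -> pi pi0": two_body_momentum(MASS_MEV["K"], MASS_MEV["pi"], MASS_MEV["pi0"]),
}

for name, p in peaks.items():
    print(f"{name}: {p:.1f} MeV/c")
```

With these inputs the momenta come out near 30, 236, and 205 MeV/c, respectively. A peak reconstructed with a different daughter hypothesis (for example, $K^{-}\to\mu^{-}\bar{\nu}_{\mu}$ viewed in $p_{\pi K}$) is displaced from these values.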

We define the signal region as $p_{e\mu}<70\,\text{MeV}/c$. As can be seen from Fig. 3(a), the largest background contributions to this region are from pion decay and electron scattering. However, in the region populated by the $\pi^{-}\to\mu^{-}\bar{\nu}_{\mu}$ decay, $p_{\mu\pi}<100\,\text{MeV}/c$, there are almost no $\mu^{-}\to e^{-}\bar{\nu}_{e}\nu_{\mu}$ events [see Fig. 3(e)], which makes it possible to effectively suppress the $\pi^{-}\to\mu^{-}\bar{\nu}_{\mu}$ background, as well as a significant part of the electron scattering events.

Figure 3: Momentum $p_{p_{1}p_{2}}$ and angular $\cos{\theta_{p_{1}p_{2}}}$ distributions for the daughter particle (mass hypothesis $p_{1}$) in the mother particle (mass hypothesis $p_{2}$) rest frame. (a) $p_{e\mu}$, (b) $\cos{\theta_{e\mu}}$, (c) $p_{\pi K}$, (d) $\cos{\theta_{\pi K}}$, (e) $p_{\mu\pi}$, and (f) $\cos{\theta_{\mu\pi}}$.
Figure 4: $O_{\text{BDT}}$ distribution for the signal and background (training and test samples).

IV.3 BDT based signal selection

To further suppress the background, we apply the BDT machine learning (ML) classification algorithm. To separate the signal from the background, we select twelve features based on the physics of the background processes. The first two features are $p_{\mu\pi}$ and $p_{\pi K}$. The next group of five features is responsible for the particle identification (PID) of muon and electron candidates. They are defined as likelihood ratios $\mathcal{R}(\ell/x)=\mathcal{L}_{\ell}/(\mathcal{L}_{\ell}+\mathcal{L}_{x})$, where $\ell=\mu$ or $e$, and $\mathcal{L}_{\ell}$ and $\mathcal{L}_{x}$ are the likelihood values of the track for the muon (electron) and non-muon (non-electron) hypotheses, respectively. For muon candidates, we use PID based on the $dE/dx$ losses inside the CDC against the electron, pion, kaon, and proton hypotheses ($x=e$, $\pi$, $K$, and $p$, respectively). For electron candidates, PID is based on the $dE/dx$ losses inside the CDC and ECL information; here only $\mathcal{R}(e/\mu)$ is used. Two more features are related to the decay vertex; they are the $z$-coordinate of the last point of the mother particle track and the distance between the daughter and mother tracks at the decay vertex. Finally, to suppress the residual contribution from $e^{+}e^{-}\to q\bar{q}$, we use $m_{\text{miss}}$, $\cos{\theta_{\text{miss}}}$, and the thrust magnitude as separation variables.

Although the $\cos{\theta_{p_{1}p_{2}}}$ variables show good separation power (see Fig. 3), we do not use them in the BDT because they are strongly correlated with $\cos{\theta_{e}}$ (the main variable used to fit $\xi^{\prime}$) and would, therefore, bias the $\xi^{\prime}$ measurement with poorly controlled systematics.

The distribution of the BDT output variable $O_{\text{BDT}}$ is shown in Fig. 4 for signal and background for the training and test samples. The optimal selection of $O_{\text{BDT}}>0.0979$ is obtained by maximizing the ratio $N_{\text{sig}}/\sqrt{N_{\text{sig}}+N_{\text{bckg}}}$, where $N_{\text{sig}}$ is the number of selected signal events, and $N_{\text{bckg}}$ is the number of selected background events. The obtained signal selection efficiency is $\varepsilon_{\text{sig}}\approx 80\%$, while the background is suppressed by a factor of fifty.

To illustrate the performance of the BDT, we plot the electron candidate momentum in the muon rest frame in Fig. 5. Because the Belle track reconstruction algorithm is not optimized for kink events, a wide tail appears above the kinematic threshold of $53\,\text{MeV}/c$ in the $\mu^{-}\to e^{-}\bar{\nu}_{e}\nu_{\mu}$ decay. The relative contributions of the signal and background processes after the BDT application are listed in Table 3. About $6\%$ of the $\mu^{-}\to e^{-}\bar{\nu}_{e}\nu_{\mu}$ decays come from non-$\tau^{+}\tau^{-}$ events.

Figure 5: Momentum distribution for the electron and muon mass hypotheses for the daughter and mother particles, respectively. The dashed line shows the $53\,\text{MeV}/c$ threshold.

Table 3: Relative contribution of the signal and background processes after the BDT application.
       Type                                          Contribution (%)
       $\mu^{-}\to e^{-}\bar{\nu}_{e}\nu_{\mu}$      77.8
       $\pi^{-}\to\mu^{-}\bar{\nu}_{\mu}$            2.2
       $K^{-}\to 3$ body                             4.3
       $K^{-}\to\mu^{-}\bar{\nu}_{\mu}$              3.7
       $e$-scattering                                9.6
       $\mu$-scattering                              0.2
       hadron scattering                             2.2

Finally, the numbers of reconstructed signal $\mu^{-}\to e^{-}\bar{\nu}_{e}\nu_{\mu}$ decays and background events are estimated from the MC simulation to be $139\pm 2$ and $50\pm 5$, respectively, where the uncertainty is due to the limited size of the MC samples. In the data, $165$ signal-candidate events pass all the applied selection criteria.

V Background study

In the present study, the background suppression and determination of the fit function are based on the MC simulation; thus, it is important to control the differences between the MC samples and the data and take them into account as systematic uncertainties. Therefore, we conduct a study of background processes in the data and the MC simulation using large pure samples with different types of kink candidates (pion and kaon decays, hadron and electron scattering).

Light meson decays are selected in two ways. The first method is based on the BDT described in the previous section, where we label the $\pi^{-}\to\mu^{-}\bar{\nu}_{\mu}$ or $K^{-}\to\mu^{-}\bar{\nu}_{\mu}$ decay as the signal. The samples obtained in this way have a purity close to unity.

The distributions of $p_{\mu\pi}$ for the selected $\pi^{-}\to\mu^{-}\bar{\nu}_{\mu}$ sample and $p_{\mu K}$ for the selected $K^{-}\to\mu^{-}\bar{\nu}_{\mu}$ sample are shown in Fig. 6 and Fig. 7(a), respectively. In the former plot, the $p_{\mu\pi}$ distribution in the MC sample is shifted to higher muon momentum compared to the data.

Figure 6: $p_{\mu\pi}$ distribution for the $\pi^{-}\to\mu^{-}\bar{\nu}_{\mu}$ event candidates selected with the BDT from the $\tau$ sample.

This effect is related to the imperfection of the track reconstruction algorithm in the case of a kink and is especially pronounced in the pion decay due to the low energy release. In Fig. 7(a), we observe that the muon momentum peak has a larger width in the data, indicating a better resolution in the MC simulation, while the kaon momentum distribution in the laboratory frame, $p_{K}$, plotted in Fig. 7(b), shows agreement between the MC simulation and the data within the statistical uncertainties of both samples.

Figure 7: (a) $p_{\mu K}$ and (b) $p_{K}$ distributions for the $K^{-}\to\mu^{-}\bar{\nu}_{\mu}$ event candidates selected with the BDT from the $\tau$ sample.
Figure 8: $D^{*+}$ sample. (a) $p_{\mu\pi}$ distribution for pion kinks; (b) $p_{\pi K}$, (c) $p_{\mu K}$, and (d) $p_{K}$ distributions for kaon kinks.

It is also important to control systematics caused by the BDT application; thus, background processes have to be studied in samples obtained without ML algorithms. To obtain a high-purity kink sample without the BDT, we use the decay chain $D^{*+}\to D^{0}(\to K^{-}\pi^{+})\pi^{+}$ since there is a large sample of $D^{*+}$ mesons collected by the Belle detector, and both decays, $D^{*+}\to D^{0}\pi^{+}$ and $D^{0}\to K^{-}\pi^{+}$, are well studied. In these events, it is possible to reconstruct the light meson decay-in-flight and tag the kink type. A detailed description of the $D^{*+}$ sample selection is given in Appendix A.

For the $\pi^{-}\to\mu^{-}\bar{\nu}_{\mu}$ kinks selected from the $D^{*+}$ sample, we plot the $p_{\mu\pi}$ distribution in Fig. 8(a). It is similar to Fig. 6, although the statistics are several times smaller.

Kaon kinks from the $D^{*+}$ sample include both two-body and three-body kaon decays and a large number of events with hadron scattering. To illustrate the abundance of the selected processes, we plot the $p_{\pi K}$ distribution in Fig. 8(b). Here we observe a relatively large contribution of $K^{-}\to\pi^{-}\pi^{0}$ decays compared to Fig. 3(c), where such events are suppressed by requirements on photons and $\pi^{0}$s. For this kaon decay mode, we confirm the agreement between the MC simulation and the data within statistical uncertainties. For $K^{-}\to\mu^{-}\bar{\nu}_{\mu}$ decays, we observe a discrepancy; therefore, to study this process in more detail, we plot the $p_{\mu K}$ distribution in Fig. 8(c). Here the discrepancy in the muon momentum peak width for $K^{-}\to\mu^{-}\bar{\nu}_{\mu}$ decays is similar to that in Fig. 7(a) for the $\tau$ sample, which confirms that this is a systematic effect.

Kaon kinks from the $D^{*+}$ sample also include hadron scattering events [e.g., Fig. 8(b)]. This process is typical for slow hadrons, as can be seen from Fig. 8(d), where $p_{K}$ is plotted. For the first two bins with data, we observe a significant discrepancy between the data and MC samples, while agreement is observed for kaon kinks from the $\tau$ sample [Fig. 7(b)]. Another confirmation that the MC simulation does not reproduce hadron scattering is the underestimation of the number of events in the hadron scattering region observed in Fig. 8(b). The difference between the MC simulation and the data is expected since this process is not perfectly described by GEANT3. For larger $p_{K}$, the MC simulation reproduces the data within statistical uncertainties for both the $D^{*+}$ and $\tau$ samples.

The electron scattering process makes a significant contribution to the background. The study of this process is based on the sample obtained from $\gamma$-conversion on the detector material in the vicinity of the IP. The selection of the $\gamma$-conversion sample is described in detail in Appendix B. To illustrate the electron scattering process, we use the same pair of mass hypotheses as in the fit of the $\mu^{-}\to e^{-}\bar{\nu}_{e}\nu_{\mu}$ process. The $p_{e\mu}$ distribution is shown in Fig. 9. A discrepancy between the MC and data samples is observed in the shape of the electron spectrum and is taken into account in the systematics.

Figure 9: $p_{e\mu}$ distribution for the electron scattering kinks selected from the $\gamma$-conversion sample.

In conclusion, a complete study of the main background processes has been performed. All observed discrepancies are taken into account as systematic uncertainties. In addition, the discussed samples provide important information about secondary and primary tracks in events that contain kinks for all main particle types except primary muons. This information is also used in the systematics estimation, as described in the corresponding section below.

VI Fit function and fit result

According to Eq. (II.1), the term proportional to $\xi^{\prime}$ depends on $\cos{\theta_{e}}$, $y$, and $x$. Since the dependence on $x$ is very weak (footnote 3: if $x_{0}^{2}$ is neglected in Eq. (II.1), the dependence on $x$ factorizes), we can integrate over it without loss of sensitivity. In contrast, the dependence on $y$ is strong, and integration over it dramatically decreases the sensitivity to $\xi^{\prime}$; thus, we perform a two-dimensional (2D) fit on the $(y,\,\cos{\theta_{e}})\equiv(y,\,c)$ distribution using an unbinned maximum-likelihood method.

The likelihood function is

$$\mathcal{L}=\prod\limits_{i=1}^{n}\mathcal{P}(y_{i},\,c_{i};\,\xi^{\prime}), \tag{12}$$

where $n$ denotes the number of data events, and $\mathcal{P}(y,\,c;\,\xi^{\prime})$ is a probability density function (PDF)

$$\mathcal{P}(y,\,c;\,\xi^{\prime})=p\,\mathcal{P}_{\text{sig}}(y,\,c;\,\xi^{\prime})+(1-p)\,\mathcal{P}_{\text{bckg}}(y,\,c). \tag{13}$$

Here $p=N_{\text{sig}}/(N_{\text{sig}}+N_{\text{bckg}})=0.74$ is the signal purity, i.e., the ratio of the number of signal events $N_{\text{sig}}$ to the total number of events $N_{\text{sig}}+N_{\text{bckg}}$, $\mathcal{P}_{\text{sig}}(y,\,c;\,\xi^{\prime})$ is the signal PDF, and $\mathcal{P}_{\text{bckg}}(y,\,c)$ is the background PDF. The signal purity and both PDFs are obtained from the MC simulation.

The signal PDF can be determined from the theoretical PDF $\mathcal{P}_{\text{th}}(\tilde{y},\,\tilde{c},\,\boldsymbol{z};\,\xi^{\prime})$ by applying efficiency corrections and performing a convolution with the detector resolution:

$$\begin{aligned}
\mathcal{P}_{\text{sig}}(y,\,c;\,\xi^{\prime})&=\dfrac{1}{N(\xi^{\prime})}\int\mathcal{P}_{\text{th}}(\tilde{y},\,\tilde{c},\,\boldsymbol{z};\,\xi^{\prime})\\
&\times\eta(\tilde{y},\,\tilde{c},\,\boldsymbol{z})\,g(y,\,c,\,\tilde{y},\,\tilde{c},\,\boldsymbol{z})\,d\boldsymbol{z}\,d\tilde{y}\,d\tilde{c},\\
N(\xi^{\prime})&=\int\mathcal{P}_{\text{th}}(\tilde{y},\,\tilde{c},\,\boldsymbol{z};\,\xi^{\prime})\,\eta(\tilde{y},\,\tilde{c},\,\boldsymbol{z})\\
&\times g(y,\,c,\,\tilde{y},\,\tilde{c},\,\boldsymbol{z})\,d\boldsymbol{z}\,d\tilde{y}\,d\tilde{c}\,dy\,dc.
\end{aligned} \tag{14}$$

Here $(\tilde{y},\,\tilde{c})\equiv(\tilde{y},\,\cos{\tilde{\theta}_{e}})$ are the “true” physical quantities of interest, and $\boldsymbol{z}$ is a vector of the remaining true physical variables not used in the fit. The functions $\eta(\tilde{y},\,\tilde{c},\,\boldsymbol{z})$ and $g(y,\,c,\,\tilde{y},\,\tilde{c},\,\boldsymbol{z})$ are the efficiency and resolution functions, respectively. The theoretical PDF, $\mathcal{P}_{\text{th}}(\tilde{y},\,\tilde{c},\,\boldsymbol{z};\,\xi^{\prime})$, is obtained from the differential decay width given by Eq. (II.1).

Both the efficiency and resolution functions are too complicated to express in analytic form; thus, it is almost impossible to calculate the $\mathcal{P}_{\text{sig}}(y,\,c;\,\xi^{\prime})$ function given in Eq. (14) analytically. Fortunately, the theoretical PDF is linear in $\xi^{\prime}$; therefore, we rewrite it as follows:

$$\mathcal{P}_{\text{th}}(\tilde{y},\,\tilde{c},\,\boldsymbol{z};\,\xi^{\prime})=A(\tilde{y},\,\tilde{c},\,\boldsymbol{z})+\xi^{\prime}B(\tilde{y},\,\tilde{c},\,\boldsymbol{z}). \tag{15}$$

Using this form, we rewrite Eq. (14):

$$\mathcal{P}_{\text{sig}}(y,\,c;\,\xi^{\prime})=\dfrac{\bar{A}(y,\,c)+\xi^{\prime}\bar{B}(y,\,c)}{\tilde{A}+\xi^{\prime}\tilde{B}}, \tag{16}$$

where

$$\begin{aligned}
\bar{A}(y,\,c)&=\int A(\tilde{y},\,\tilde{c},\,\boldsymbol{z})\,\eta(\tilde{y},\,\tilde{c},\,\boldsymbol{z})\,g(y,\,c,\,\tilde{y},\,\tilde{c},\,\boldsymbol{z})\,d\boldsymbol{z}\,d\tilde{y}\,d\tilde{c},\\
\bar{B}(y,\,c)&=\int B(\tilde{y},\,\tilde{c},\,\boldsymbol{z})\,\eta(\tilde{y},\,\tilde{c},\,\boldsymbol{z})\,g(y,\,c,\,\tilde{y},\,\tilde{c},\,\boldsymbol{z})\,d\boldsymbol{z}\,d\tilde{y}\,d\tilde{c},\\
\tilde{A}&=\int\bar{A}(y,\,c)\,dy\,dc,\quad\tilde{B}=\int\bar{B}(y,\,c)\,dy\,dc.
\end{aligned} \tag{17}$$

In this study, the dependence of the signal PDF normalization on $\xi^{\prime}$ is negligible as $\tilde{A}\gg\tilde{B}$.

To calculate $\bar{A}(y,\,c)/\tilde{A}$ and $\bar{B}(y,\,c)/\tilde{A}$, we use two MC samples generated with $\xi^{\prime}=1$ and $\xi^{\prime}=-1$. Their distributions in $(y,\,c)$ are determined exactly by the following PDFs: $\mathcal{P}_{+1}(y,\,c)=\mathcal{P}_{\text{sig}}(y,\,c;\,+1)$ and $\mathcal{P}_{-1}(y,\,c)=\mathcal{P}_{\text{sig}}(y,\,c;\,-1)$, respectively, providing

$$\mathcal{P}_{\text{sig}}(y,\,c;\,\xi^{\prime})=\dfrac{1}{2}\left\{\mathcal{P}_{+1}(y,\,c)+\mathcal{P}_{-1}(y,\,c)+\xi^{\prime}\left[\mathcal{P}_{+1}(y,\,c)-\mathcal{P}_{-1}(y,\,c)\right]\right\}. \tag{18}$$

All the PDFs can be obtained in the form of 2D histograms of $(y,\,c)$ or in the form of smooth functions describing the distributions in $(y,\,c)$. Since the signal MC sample is large, 2D histograms of $(y,\,c)$ can already be considered almost smooth functions (there are no statistically significant fluctuations), which can be used in the fit without loss of accuracy. Thus, for simplicity, we obtain $\mathcal{P}_{\pm 1}(y,\,c)$ in the form of a 2D histogram of $10\times 10$ bins with interpolation of the intermediate values. Alternatively, we use a smooth function to evaluate the systematic uncertainty, as described in Sec. VII.4.
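The structure of the fit defined by Eqs. (12), (13), and (18) can be sketched as follows. The sketch assumes that the per-event values of $\mathcal{P}_{+1}$, $\mathcal{P}_{-1}$, and $\mathcal{P}_{\text{bckg}}$ have already been evaluated (for example, by interpolating the template histograms at each event's $(y,\,c)$); the function names and the brute-force scan, used here in place of a proper minimizer, are illustrative assumptions.

```python
import numpy as np

def signal_pdf(p_plus, p_minus, xi):
    """Eq. (18): signal PDF as a linear combination of the xi' = +1 and -1 templates."""
    return 0.5 * ((p_plus + p_minus) + xi * (p_plus - p_minus))

def negative_log_likelihood(xi, p_plus, p_minus, p_bkg, purity=0.74):
    """-ln L from Eqs. (12) and (13), given per-event PDF values as arrays."""
    total = purity * signal_pdf(p_plus, p_minus, xi) + (1.0 - purity) * p_bkg
    if np.any(total <= 0.0):
        return np.inf  # unphysical region: the total PDF must stay positive
    return -np.sum(np.log(total))

def fit_xi(p_plus, p_minus, p_bkg, purity=0.74, grid=None):
    """Return the xi' value on the grid that minimizes -ln L (illustrative scan)."""
    if grid is None:
        grid = np.linspace(-3.0, 3.0, 6001)
    nll = np.array(
        [negative_log_likelihood(x, p_plus, p_minus, p_bkg, purity) for x in grid]
    )
    return grid[int(np.argmin(nll))]
```

Because the signal PDF is linear in $\xi^{\prime}$, the templates need to be evaluated only once per event, and each likelihood evaluation reduces to simple array arithmetic.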

In contrast to the signal, the background MC sample has modest statistics, and it is not feasible to increase it. Therefore, $\mathcal{P}_{\text{bckg}}(y,\,c)$ is obtained by approximating a $6\times 6$-bin histogram of the $(y,\,c)$ distribution with a smooth parametric function so that $\chi^{2}/\text{n.d.f.}\approx 1$.

The fit procedure is tested on ensembles of 1000 statistically independent simulated samples of the size expected in the data with eleven $\xi^{\prime}$ seed values from $-1$ to $1$ in steps of $0.2$, and no statistically significant biases are observed.

Finally, the fit to the data yields $\xi^{\prime}=0.22\pm 0.94$. The projections of the data and the fit function onto the $y$ and $\cos{\theta_{e}}$ axes are shown in Fig. 10.

Figure 10: The fit of the data with $\xi^{\prime}=0.22\pm 0.94$. Points with errors correspond to the data, the solid line histogram corresponds to the fit function, and the shaded area corresponds to the background function. (a) Projection onto $y$, (b) projection onto $\cos{\theta_{e}}$.

The variation of $-2\left[\ln{\mathcal{L}(\xi^{\prime})}-\ln{\mathcal{L}(\xi^{\prime}_{\text{fit}})}\right]$ as a function of the $\xi^{\prime}$ value is shown in Fig. 11.

Figure 11: The variation of $-2\left[\ln{\mathcal{L}(\xi^{\prime})}-\ln{\mathcal{L}(\xi^{\prime}_{\text{fit}})}\right]$ as a function of the $\xi^{\prime}$ value.

The $\xi^{\prime}=-1$ scenario is more than one standard deviation away from the measured $\xi^{\prime}$ value.

For a more detailed illustration of the fit, we plot three slices in $y$ ($0<y<0.52$, $0.52<y<0.78$, and $0.78<y<1.3$) projected onto $\cos{\theta_{e}}$ (Fig. 12). In addition to the fit function with $\xi^{\prime}=0.22$, we show fit functions with $\xi^{\prime}=-1$ (dashed) and with $\xi^{\prime}=1$ (dash-dotted). For $y<0.52$, there is almost no sensitivity to $\xi^{\prime}$, while for $0.78<y<1.3$, the sensitivity is maximal. This behavior is expected from the theoretical function given by Eq. (II.1). The total $\chi^{2}$ for the fit projections shown in Fig. 12 is $31$ with $\text{n.d.f.}=29$, demonstrating that the fit describes the data well. The total $\chi^{2}$ for the projections shown in Fig. 12 is $37$ for the function with $\xi^{\prime}=-1$ and $30$ for the function with $\xi^{\prime}=1$.

Figure 12: Projection onto $\cos{\theta_{e}}$ for slices in $y$: (a) $0<y<0.52$, (b) $0.52<y<0.78$, and (c) $0.78<y<1.3$. Points with errors correspond to the data, the solid line corresponds to the $\xi^{\prime}=0.22$ fit function, the dashed line corresponds to the $\xi^{\prime}=-1$ fit function, the dash-dotted line corresponds to the $\xi^{\prime}=1$ fit function, and the shaded area corresponds to the background function.

VII Systematic uncertainties

The systematic uncertainties are estimated using the most conservative approach. For each source of systematics, we generate an ensemble of 1000 toy MC samples for each of eleven $\xi^{\prime}$ seed values from $-1$ to $1$ in steps of $0.2$, with the same statistics as estimated from the signal and background MC samples. Each sample is generated according to the 2D distribution in $y$ and $\cos{\theta_{e}}$ obtained by varying the signal and background distributions within the expected uncertainties (the observed discrepancies between the MC simulation and the data described in the previous sections). Then all samples are fitted with the default PDF, and the average of the obtained $\xi^{\prime}$ values over the 1000 samples is calculated. The maximum difference between these mean values and the default one is taken as the systematic uncertainty.
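The last step of this procedure can be sketched in a few lines. The sketch assumes that the fitted $\xi^{\prime}$ values of all toy samples are already available as arrays, and it interprets “the default one” as the mean fitted value obtained from toy samples generated with the unvaried distributions; both the interpretation and the names are assumptions made for illustration.

```python
import numpy as np

# Eleven xi' seed values from -1 to 1 in steps of 0.2.
SEEDS = np.linspace(-1.0, 1.0, 11)

def systematic_shift(fits_varied, fits_default):
    """Maximum |<xi'>_varied - <xi'>_default| over the seed values.

    fits_varied, fits_default: arrays of shape (n_seeds, n_toys) holding the
    xi' values fitted with the default PDF to toys generated from the varied
    and the unvaried distributions, respectively.
    """
    mean_varied = np.mean(fits_varied, axis=1)    # average over the toy samples
    mean_default = np.mean(fits_default, axis=1)
    return float(np.max(np.abs(mean_varied - mean_default)))
```

Taking the maximum over the seed values, rather than the shift at a single seed, is what makes the estimate conservative.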

We also estimate the systematic uncertainties from the data by varying the PDF functions used in the fit to obtain the difference between a new ξ\xi^{\prime} value and the default one. We use this method as a crosscheck since it is less robust for systematics evaluation and always gives a lower value compared to the estimation from toy MC samples.

In this study, we distinguish four main categories of the systematic error sources: “background,” “PID in BDT,” “signal,” and “fit procedure.”

VII.1 Background systematics

This category of systematic errors includes the uncertainties in the expected background fraction of each type used in the fit PDF as well as the particular background shape.

The signal purity $p$ is obtained from the MC simulation with a statistical uncertainty of $0.02$ induced by the limited number of signal and background MC events. While the signal MC sample is generated with large statistics, the size of the background MC sample is moderate, so it makes the major contribution to this uncertainty. The observed discrepancies between the data and simulation lead to an additional systematic uncertainty of $0.02$ in the purity value. The variation of $p$ within the combined error results in a systematic uncertainty of $0.10$.
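As a rough cross-check, the purity and its statistical uncertainty can be reproduced from the MC yields quoted in Sec. IV.3 ($139\pm 2$ signal and $50\pm 5$ background events). The sketch assumes uncorrelated uncertainties on the two yields and standard linear error propagation; combining the two $0.02$ contributions in quadrature is also an assumption, since the text does not state how they are combined.

```python
import math

n_sig, err_sig = 139.0, 2.0   # MC signal yield and its uncertainty
n_bkg, err_bkg = 50.0, 5.0    # MC background yield and its uncertainty

total = n_sig + n_bkg
purity = n_sig / total

# Partial derivatives of p = N_sig / (N_sig + N_bckg).
d_sig = n_bkg / total**2
d_bkg = -n_sig / total**2

# Linear propagation, assuming the two yields are uncorrelated.
stat_err = math.hypot(d_sig * err_sig, d_bkg * err_bkg)

# Assumed quadrature combination of the statistical (0.02) and
# data/MC discrepancy (0.02) uncertainties on the purity.
combined_err = math.hypot(0.02, 0.02)

print(f"p = {purity:.3f} +- {stat_err:.3f} (stat), combined error ~ {combined_err:.3f}")
```

The propagated uncertainty is dominated by the background term, in line with the statement that the moderate size of the background MC sample makes the major contribution.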

The PDF shape and relative contribution of each type of background process are sources of systematics because the MC simulation does not reproduce the data perfectly. The main background contamination comes from kaon and pion decays and from electron and hadron scattering. For each of them, we conducted a dedicated study described in Sec. V. The prepared pure background samples allow us to observe discrepancies between the MC simulation and the data in both normalization and shape. To take these discrepancies into account, we reweight each type of background MC sample based on particular kinematic characteristics. For the reweighted background sample, we obtain new values of the parameters of the smooth background function. The estimated systematic uncertainties are the following: $0.05$ for $\pi^{-}\to\mu^{-}\bar{\nu}_{\mu}$, $0.06$ for $K^{-}\to 3$ body, $0.05$ for $K^{-}\to\mu^{-}\bar{\nu}_{\mu}$, $0.10$ for $e$-scattering, and $0.11$ for hadron scattering. The statistical uncertainties of the parameters of the background PDF do not have much impact on the shape and have already been taken into account in the signal purity systematics.

The combined background systematic uncertainty is $0.20$.

VII.2 PID in BDT systematics

To estimate the effects of PID usage in the BDT, we take advantage of the availability of various tagged kinks in the data selected without BDT application. This systematic uncertainty contains two separate contributions: PID uncertainties of primary muons and daughter electrons.

The PID uncertainty of the daughter track is easier to analyze since we have tagged secondary electrons from electron scattering ($\gamma$-conversion sample), muons from kaon decays ($D^{*+}$ sample), and pions from kaon decays ($D^{*+}$ sample). We reweight the $\mathcal{R}(e/\mu)$ distribution for both the signal and background according to the weights obtained from the corresponding sample and then apply a new PDF for toy MC sample generation. The obtained systematic uncertainty is $0.13$.

To identify muons, we use PID with all four pairs of particle hypotheses (muon against electron, pion, kaon, or proton) in the BDT. To simplify, we evaluate the systematics of PID for all of them separately. Although they are correlated, the separate analysis only increases the systematic uncertainty.

For all kink mother particle types except for the muon, we have a clean sample providing the corresponding PID distribution in the data (electrons from the $\gamma$-conversion sample, kaons and pions from the $D^{*+}$ sample). Muons do not have a suitable sample; therefore, we treat them as pions instead. This replacement is justified since we use only $dE/dx$ losses, and they are almost the same for the muon and pion mass hypotheses.

We reweight the $\mathcal{R}(\mu/x)$ distribution ($x=e$, $\pi$, $K$, or $p$) for both the signal and background and then apply a new PDF for toy MC sample generation. The obtained uncertainties are $0.13$ for $\mathcal{R}(\mu/e)$, $0.09$ for $\mathcal{R}(\mu/\pi)$, $0.10$ for $\mathcal{R}(\mu/K)$, and $0.06$ for $\mathcal{R}(\mu/p)$.

The combined PID in BDT systematic uncertainty is $0.24$.

VII.3 Signal PDF systematics

Here, we study all sources of systematic uncertainties related to the signal PDF $\mathcal{P}_{\text{sig}}$. These include the signal reconstruction efficiency, which depends on the muon laboratory momentum $p_{\mu}$ and the electron emission angle in the muon rest frame $\theta_{e\mu}$, the electron momentum resolution in the muon rest frame, and the systematics of the signal MC sample generation method.

The systematic uncertainty due to the discrepancy in reconstruction efficiency between the data and the MC simulation consists of two different contributions: one is related to the trigger efficiency of the selected topology, and the other is due to the kink reconstruction efficiency. The trigger efficiency uncertainty results in a small discrepancy between the primary muon momentum distributions in the MC and data samples. We obtain weights for this estimation from the sample of $\tau^{-}\to\mu^{-}\bar{\nu}_{\mu}\nu_{\tau}$ decays without a $\mu^{-}\to e^{-}\bar{\nu}_{e}\nu_{\mu}$ kink. This systematic uncertainty is $0.08$.

The reconstruction efficiency also strongly depends on the electron emission angle $\theta_{e\mu}$. We control this effect using the largest selected background sample of $K^{-}\to\mu^{-}\bar{\nu}_{\mu}$ decays (from $D^{*+}$ decays). Since kaons are pseudoscalars, their decay angular distribution is uniform. Therefore, after reconstruction, the $\cos{\theta_{\mu K}}$ distribution represents the kink reconstruction efficiency. Applying weights from the kaon decay to our signal, we estimate the systematic uncertainty to be $0.09$.

To take into account the systematics of the electron momentum resolution in the muon rest frame, we also exploit the $K^{-}\to\mu^{-}\bar{\nu}_{\mu}$ decay sample. We use kaons selected from the $\tau$ sample since there the discrepancy in the $p_{\mu K}$ resolution is slightly larger than for kaon kinks from the $D^{*+}$ sample. We estimate the systematic uncertainty to be 0.03.

The systematic uncertainty induced by the muon lifetime reduction for the signal MC sample generation is 0.07. It is estimated by comparing the signal MC sample to the ten times smaller MC sample generated with the default muon lifetime.

The combined signal PDF systematic uncertainty is 0.14.

VII.4 Fit procedure systematics

To estimate the systematic uncertainty of the fit procedure, we compare the fit results of the Michel parameter $\xi^{\prime}$ for two different $\mathcal{P}_{\text{sig}}$. The first one is the default signal PDF in the form of a histogram. The second one is obtained from the MC $(y,\,\cos{\theta_{e}})$ distribution by smoothing a $16\times 16$-bin 2D histogram with a parametric function. We estimate the systematic uncertainty of this source to be 0.25.

In addition, we check the difference in the result by varying the bin width of the default $\mathcal{P}_{\text{sig}}$. The bin size cannot be varied much: with too fine a binning, a few empty bins appear, leading to a bias of the fit, while with too coarse a binning, the sensitivity suffers due to the sharpness of the fitting function in some regions. Thus, $8\times 8$ and $12\times 12$ grids are used to check the variation. The obtained difference is small and does not exceed 0.13.

In conclusion, we consider 0.25 as a systematic uncertainty of the fit procedure.

VII.5 Systematics summary

Finally, the combined overall systematic uncertainty of the Michel parameter $\xi^{\prime}$ measurement is estimated to be $\sigma_{\xi^{\prime}}=0.42$. In Table 4, we summarize the results of the systematic uncertainty estimation for all sources.

Table 4: Sources of the systematic uncertainties of the Michel parameter $\xi^{\prime}$ measurement (absolute values).
       Source                                        Uncertainty
Background
       Purity ($p$)                                  0.10
       $\pi^{-}\to\mu^{-}\bar{\nu}_{\mu}$ MC         0.05
       $K^{-}\to 3$ body MC                          0.06
       $K^{-}\to\mu^{-}\bar{\nu}_{\mu}$ MC           0.05
       $e$-scattering MC                             0.10
       hadron scattering MC                          0.11
PID in BDT
       $\mathcal{R}(e/\mu)$                          0.13
       $\mathcal{R}(\mu/e)$                          0.13
       $\mathcal{R}(\mu/\pi)$                        0.09
       $\mathcal{R}(\mu/K)$                          0.10
       $\mathcal{R}(\mu/p)$                          0.06
Signal PDF
       $p_{\mu}$ efficiency                          0.08
       $\cos{\theta_{e\mu}}$ efficiency              0.09
       $p_{e\mu}$ resolution                         0.03
       Signal MC generation                          0.07
Fit procedure
       Fit function                                  0.25
       Total                                         0.42
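
The group totals quoted in the text (0.24 for PID in BDT, 0.14 for the signal PDF) and the overall total of 0.42 are reproduced when the individual sources in Table 4 are added in quadrature. The short check below assumes the sources are treated as uncorrelated; the dictionary layout and function name are illustrative only.

```python
import math

# Individual systematic uncertainties, copied from Table 4.
SYSTEMATICS = {
    "Background": [0.10, 0.05, 0.06, 0.05, 0.10, 0.11],
    "PID in BDT": [0.13, 0.13, 0.09, 0.10, 0.06],
    "Signal PDF": [0.08, 0.09, 0.03, 0.07],
    "Fit procedure": [0.25],
}


def quadrature(values):
    """Quadrature sum, valid for uncorrelated sources (assumed here)."""
    return math.sqrt(sum(v * v for v in values))


group_totals = {name: quadrature(vals) for name, vals in SYSTEMATICS.items()}
total = quadrature([v for vals in SYSTEMATICS.values() for v in vals])

for name, value in group_totals.items():
    print(f"{name:14s} {value:.2f}")
print(f"{'Total':14s} {total:.2f}")
```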

The systematic uncertainty is significantly smaller than the statistical one in this analysis, which demonstrates the potential of the applied method. The method allows for a significant improvement in accuracy in the near future, both in experiments that are already running and in those under development. Thus, the control of systematic uncertainty with increasing statistics is a task worth considering.

A qualitative argument that the statistical uncertainty will dominate the systematic one in similar analyses at near-future experiments is discussed in detail in Ref. [17]. This analysis confirms in practice the validity of that conclusion: most of the systematic sources are controlled with large independent samples, and no limiting factors for further improvements in accuracy have yet been observed.

VIII Result

We measure the Michel parameter $\xi^{\prime}$ to be

$$\xi^{\prime}=0.22\pm 0.94\pm 0.42, \tag{19}$$

where the first uncertainty is statistical and the second one is systematic. This result is consistent with the Standard Model prediction of $\xi^{\prime}_{\text{SM}}=1$. The combined uncertainty, $\sigma_{\xi^{\prime}}=1.03$, is more than two times smaller compared to the previous Belle result $\xi^{\prime}=-2.2\pm 2.4$ calculated from $\xi\kappa$ obtained in the study of the $\tau^{-}\to\mu^{-}\bar{\nu}_{\mu}\nu_{\tau}\gamma$ decay [12].
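
As a quick numerical cross-check of these statements, the sketch below combines the statistical and systematic uncertainties in quadrature (assuming they are independent), evaluates the distance from the Standard Model value in units of the combined uncertainty, and compares with the uncertainty of the previous result. Variable names are illustrative.

```python
import math

xi_prime, stat, syst = 0.22, 0.94, 0.42   # values from Eq. (19)
xi_prime_sm = 1.0                          # Standard Model prediction
previous_uncertainty = 2.4                 # previous Belle result, Ref. [12]

# Quadrature combination, assuming independent stat. and syst. parts.
combined = math.hypot(stat, syst)

# Deviation from the Standard Model in units of the combined uncertainty.
deviation = (xi_prime_sm - xi_prime) / combined

# Improvement factor relative to the previous uncertainty.
improvement = previous_uncertainty / combined

print(f"combined = {combined:.2f}")
print(f"deviation from SM = {deviation:.2f} sigma")
print(f"improvement factor = {improvement:.1f}")
```

Under these assumptions the measured value lies within one combined standard deviation of the Standard Model prediction.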

Based on the experience gained, it is possible to improve the result in the near future in the Belle II experiment [35], taking into account the upgraded detector with an enlarged CDC and the implementation of improved tracking algorithms. In particular, the implementation of a kink reconstruction algorithm will provide a better momentum resolution, which is important both for background suppression and for the sensitivity, which is otherwise smeared by the resolution.

IX Conclusion

In summary, we report the first direct measurement of the Michel parameter $\xi^{\prime}$ in the $\tau^{-}\to\mu^{-}\bar{\nu}_{\mu}\nu_{\tau}$ decay with the full Belle data sample using the $\mu^{-}\to e^{-}\bar{\nu}_{e}\nu_{\mu}$ decay-in-flight in the Belle drift chamber. The obtained value of $\xi^{\prime}=0.22\pm 0.94\pm 0.42$, where the first uncertainty is statistical and the second one is systematic, is in agreement with the Standard Model prediction $\xi^{\prime}_{\text{SM}}=1$.

ACKNOWLEDGMENTS

This work, based on data collected using the Belle detector, which was operated until June 2010, was supported by the Ministry of Education, Culture, Sports, Science, and Technology (MEXT) of Japan, the Japan Society for the Promotion of Science (JSPS), and the Tau-Lepton Physics Research Center of Nagoya University; the Australian Research Council including grants DP210101900, DP210102831, DE220100462, LE210100098, LE230100085; Austrian Federal Ministry of Education, Science and Research (FWF) and FWF Austrian Science Fund No. P 31361-N36; the National Natural Science Foundation of China under Contracts No. 11675166, No. 11705209; No. 11975076; No. 12135005; No. 12175041; No. 12161141008; Key Research Program of Frontier Sciences, Chinese Academy of Sciences (CAS), Grant No. QYZDJ-SSW-SLH011; Project ZR2022JQ02 supported by Shandong Provincial Natural Science Foundation; the Ministry of Education, Youth and Sports of the Czech Republic under Contract No. LTT17020; the Czech Science Foundation Grant No. 22-18469S; Horizon 2020 ERC Advanced Grant No. 884719 and ERC Starting Grant No. 947006 “InterLeptons” (European Union); the Carl Zeiss Foundation, the Deutsche Forschungsgemeinschaft, the Excellence Cluster Universe, and the VolkswagenStiftung; the Department of Atomic Energy (Project Identification No. RTI 4002) and the Department of Science and Technology of India; the Istituto Nazionale di Fisica Nucleare of Italy; National Research Foundation (NRF) of Korea Grant Nos. 
2016R1D1A1B02012900, 2018R1A2B3003643, 2018R1A6A1A06024970, RS202200197659, 2019R1I1A3A01058933, 2021R1A6A1A03043957, 2021R1F1A1060423, 2021R1F1A1064008, 2022R1A2C1003993; Radiation Science Research Institute, Foreign Large-size Research Facility Application Supporting project, the Global Science Experimental Data Hub Center of the Korea Institute of Science and Technology Information and KREONET/GLORIAD; the Polish Ministry of Science and Higher Education and the National Science Center; the Ministry of Science and Higher Education of the Russian Federation, Agreement 14.W03.31.0026, and the HSE University Basic Research Program, Moscow; University of Tabuk research grants S-1440-0321, S-0256-1438, and S-0280-1439 (Saudi Arabia); the Slovenian Research Agency Grant Nos. J1-9124 and P1-0135; Ikerbasque, Basque Foundation for Science, Spain; the Swiss National Science Foundation; the Ministry of Education and the Ministry of Science and Technology of Taiwan; and the United States Department of Energy and the National Science Foundation. These acknowledgements are not to be interpreted as an endorsement of any statement made by any of our institutes, funding agencies, governments, or their representatives. We thank the KEKB group for the excellent operation of the accelerator; the KEK cryogenics group for the efficient operation of the solenoid; and the KEK computer group and the Pacific Northwest National Laboratory (PNNL) Environmental Molecular Sciences Laboratory (EMSL) computing group for strong computing support; and the National Institute of Informatics, and Science Information NETwork 6 (SINET6) for valuable network support.

Appendix A Kink events selection in the decay chain $D^{*+}\to D^{0}(\to K^{-}\pi^{+})\pi^{+}$

We reconstruct $D^{*+}$ candidates in the decay chain $D^{*+}\to D^{0}(\to K^{-}\pi^{+})\pi^{+}$. The following selection criteria on the $K^{-}\pi^{+}$ and $K^{-}\pi^{+}\pi^{+}$ invariant masses are used: $1.82\,\text{GeV}/c^{2}<M(K^{-}\pi^{+})<1.9\,\text{GeV}/c^{2}$ and $|M(K^{-}\pi^{+}\pi^{+})-M(K^{-}\pi^{+})+M^{\text{PDG}}(D^{0})-M^{\text{PDG}}(D^{*+})|<3\,\text{MeV}/c^{2}$, providing a large sample of $D^{*+}$ candidates. The momentum of the $D^{*+}$ candidates in the c.m. frame is required to satisfy $p_{D^{*+}}>2.3\,\text{GeV}/c$ since our MC simulation of $e^{+}e^{-}\to\Upsilon(4S)\to B\bar{B}$ does not reproduce the $K^{-}\pi^{+}\pi^{+}$ invariant mass distribution well for both the $D^{*+}$ signal and combinatorial background. For larger momentum, $D^{*+}$ mesons are produced in the continuum, and our MC simulation of $e^{+}e^{-}\to q\bar{q}$ describes the combinatorial background well. However, there is a discrepancy between the data and MC samples in the $D^{*+}$ peak since the effects of the $c$-quark fragmentation were not properly accounted for in the MC simulation. The fragmentation affects mainly the momentum spectrum; therefore, we reweight MC events with a real $D^{*+}\to D^{0}\pi^{+}$ decay in bins of the $D^{*+}$ momentum. The following procedure is used: the $M(K^{-}\pi^{+}\pi^{+})$ distribution in data is fitted in bins of $p_{D^{*+}}$, and the number of $D^{*+}$ mesons is obtained. After that, we determine the weight for the MC event with a real $D^{*+}$ as $w(p_{D^{*+}})=N^{\text{data}}_{D^{*+}}(p_{D^{*+}})/N^{\text{MC}}_{D^{*+}}(p_{D^{*+}})$. We perform this procedure with $D^{*+}$ candidates reconstructed before any kink selection.
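
The selection and the momentum reweighting described above can be sketched as follows. This is only an illustration: the function names are hypothetical, the PDG masses are approximate values inserted by hand, and the per-bin yields would come from the fits to $M(K^{-}\pi^{+}\pi^{+})$ rather than from the toy numbers used here.

```python
# Approximate PDG masses in GeV/c^2 (assumed values, not quoted in the text).
M_D0_PDG = 1.86484
M_DSTAR_PDG = 2.01026


def passes_dstar_selection(m_kpi, m_kpipi, p_cm):
    """Apply the mass-window and c.m. momentum requirements.

    m_kpi, m_kpipi: invariant masses in GeV/c^2; p_cm: momentum in GeV/c.
    """
    in_d0_window = 1.82 < m_kpi < 1.9
    # |M(Kpipi) - M(Kpi) + M_PDG(D0) - M_PDG(D*+)| < 3 MeV/c^2
    delta = m_kpipi - m_kpi + M_D0_PDG - M_DSTAR_PDG
    in_delta_m_window = abs(delta) < 0.003
    return in_d0_window and in_delta_m_window and p_cm > 2.3


def momentum_weights(n_data, n_mc):
    """w(p) = N_data(p) / N_MC(p) per momentum bin; empty MC bins keep w = 1."""
    return [d / m if m > 0 else 1.0 for d, m in zip(n_data, n_mc)]


# Toy D*+ yields in two momentum bins (illustrative numbers only).
toy_weights = momentum_weights([120, 90], [100, 100])
```

Each MC event with a real $D^{*+}$ would then be assigned the weight of the momentum bin it falls into.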

For further event selection, we require one of the $D^{0}$ daughter tracks to pass the kink selection criteria described in Sec. IV.2. The second track is identified using the information from the CDC, TOF, and ACC combined to form likelihoods $\mathcal{L}_{i}$ ($i=\pi$ or $K$). To select the pion (kaon) kink, we require $\mathcal{R}(K/\pi)=\mathcal{L}_{K}/(\mathcal{L}_{K}+\mathcal{L}_{\pi})>0.6$ [$\mathcal{R}(\pi/K)>0.6$] for $K^{-}$ ($\pi^{+}$) from the $D^{0}$ meson.
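
The likelihood-ratio requirement on the second track can be written compactly. The sketch below uses hypothetical function names and toy likelihood values; only the ratio definition and the 0.6 threshold come from the text.

```python
def likelihood_ratio(l_a, l_b):
    """R(a/b) = L_a / (L_a + L_b)."""
    total = l_a + l_b
    if total <= 0:
        raise ValueError("likelihoods must sum to a positive value")
    return l_a / total


def is_kaon(l_k, l_pi, threshold=0.6):
    """Tag the second track as a kaon when selecting a pion kink."""
    return likelihood_ratio(l_k, l_pi) > threshold


def is_pion(l_k, l_pi, threshold=0.6):
    """Tag the second track as a pion when selecting a kaon kink."""
    return likelihood_ratio(l_pi, l_k) > threshold


# Toy likelihoods for a kaon-like track.
print(is_kaon(0.8, 0.2), is_pion(0.8, 0.2))
```

Because $\mathcal{R}(K/\pi)+\mathcal{R}(\pi/K)=1$, a threshold above 0.5 guarantees that a track cannot pass both requirements.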

To illustrate the result of the selection, we plot the $K^{-}\pi^{+}$ invariant mass for pion and kaon kinks in Fig. 13(a) and (b), respectively. As can be seen, each sample consists of the corresponding kinks, as well as hadron scattering events.

Appendix B Kink events selection in the $\gamma$-conversion process

We select $\gamma$-conversion events from the sample of $\tau^{+}\tau^{-}$ pairs with 1–1 and 1–3 topologies prepared according to the preselection criteria described in Sec. IV.1. Although this preselection limits the available statistics, the kink reconstruction efficiency here is similar to that in the main analysis.

The conversion is reconstructed on the one-track side from two oppositely charged tracks. To suppress background from other $V$-shaped processes like $K^{0}_{S}$ decay, the invariant mass of the $e^{+}e^{-}$ pair $m_{e^{+}e^{-}}$ is required to be less than $40\,\text{MeV}/c^{2}$. Since $\gamma$-conversion occurs on the detector material, the radius of the conversion vertex in the $r\phi$-plane has to be larger than $2\,\text{cm}$. To suppress a random combination of the tracks, the distance between two tracks in projection onto the $z$-axis is required to be less than $5\,\text{cm}$. The daughter electron is reconstructed as a kink with the selection criteria described in Sec. IV.2. Finally, using the identification of the daughter positron $\mathcal{R}(e/x)>0.8$, we obtain a clean sample of identified electron scattering events.
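
The requirements listed above can be collected into a single predicate. The sketch below is illustrative: the `ConversionCandidate` container and its field names are hypothetical, and the kink flag stands in for the full kink selection of Sec. IV.2, which is not reproduced here.

```python
from dataclasses import dataclass


@dataclass
class ConversionCandidate:
    m_ee: float             # e+e- invariant mass, GeV/c^2
    vertex_r: float         # conversion vertex radius in the r-phi plane, cm
    dz: float               # distance between the two tracks along z, cm
    electron_is_kink: bool  # daughter electron passes the kink selection
    positron_pid: float     # R(e/x) of the daughter positron


def is_conversion_kink(c):
    """Apply the gamma-conversion requirements quoted in the text."""
    return (
        c.m_ee < 0.040            # suppress K0S and other V-shaped processes
        and c.vertex_r > 2.0      # conversion happens in detector material
        and abs(c.dz) < 5.0       # reject random track combinations
        and c.electron_is_kink
        and c.positron_pid > 0.8  # identified positron
    )


good = ConversionCandidate(0.005, 6.0, 0.3, True, 0.95)
near_ip = ConversionCandidate(0.005, 0.5, 0.3, True, 0.95)
print(is_conversion_kink(good), is_conversion_kink(near_ip))
```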

To illustrate the result of the described procedure, we plot the $e^{+}e^{-}$-pair invariant mass in Fig. 14(a) and the radius of the conversion vertex in the $r\phi$-plane in Fig. 14(b). As expected, $m_{e^{+}e^{-}}$ is concentrated near zero. In the distribution of the $\gamma$-conversion vertex radius, the SVD structure is clearly observed. The selected sample consists of pure electron scattering events.

Figure 13: $K^{-}\pi^{+}$ invariant mass for (a) $\pi^{+}$ kink candidates and (b) $K^{-}$ kink candidates.
Figure 14: (a) Invariant mass $m_{e^{+}e^{-}}$ and (b) radius of the conversion vertex for the selected $\gamma$-conversion events, where one of the electrons is reconstructed as an electron scattering kink.

References