This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

thanks: These two authors contributed equally.thanks: These two authors contributed equally.

Covariance Analysis of Impulsive Streaking

Jun Wang1,2,3    Zhaoheng Guo2,3,4    Erik Isele1,2,3    Philip H. Bucksbaum1,2,3    Agostino Marinelli1,3    James P. Cryan1,3 [email protected]    Taran Driver1,3 [email protected] 1Stanford PULSE Institute, SLAC National Accelerator Laboratory, Menlo Park, CA 94025, USA 2Department of Applied Physics, Stanford University, Stanford, CA 94305, USA 3SLAC National Accelerator Laboratory, Menlo Park, CA 94025, USA 4Paul Scherrer Institute, Villigen, Switzerland
Abstract

We present a comprehensive framework of modeling covariance in angular streaking experiments. Within the impulsive streaking regime, the displacement of electron momentum distribution (MD) provides a tight connection between the dressing-free MD and the dressed MD. Such connection establishes universal structures in the composition of streaking covariance that are common across different MDs, regardless of their exact shape. Building on this robust framework, we have developed methods for retrieving temporal information from angular streaking measurements. By providing a detailed understanding of the covariance structure in angular streaking experiments, our work enables more accurate and robust temporal measurements in a wide range of experimental scenarios.

I Introduction

The motion of electrons in molecules and condensed phase systems takes place on the attosecond timescale. It is now possible to generate light pulses and trains of pulses with sub-femtosecond (i.e. attosecond) duration. These technical developments have launched the field of experimental attosecond science [1]. Even with access to attosecond pulses, measuring electron motion with attosecond time resolution is a significant technical challenge. Time-resolved measurements require the ability to synchronize an attosecond pulse with a second event, such as the interaction with a second laser pulse, with sub-femtosecond precision. One method that has proven highly successful involves combining the attosecond pulse with a longer duration infrared laser pulse, and using the oscillating electric field of the infrared laser pulse to map time onto a measurable quantity such as the momentum of an emitted electron [2]. The term ‘attosecond streaking’ has been coined to describe this class of experiments, as the action of the field on the electron is analogous to the action of the time-varying voltage in a streak camera [3, 2]. Since the period of oscillation of an infrared laser pulse is on the few-femtosecond timescale and the action of the field on an electron depends on the phase of the oscillation, attosecond streaking experiments have emerged as a powerful probe of attosecond electron dynamics [4], including measurements of photoemission delays [5], Auger-Meitner decay [6], and characterization of attosecond pulses  [7, 8].

The principal of laser streaking measurements relies on a time reference for the ultrafast process, which can be provided either by the precise timing stability (few-attosecond) between the attosecond pulse and the infrared field, or through a single-shot self-referencing technique. The timing stability can be achieved when the attosecond pulse has been produced by high harmonic generation (HHG), since the HHG emission is naturally synchronized with the infrared driving field. Meanwhile, attosecond x-ray free-electron lasers (XFELs) have many advantageous properties: continuous tunability of photon energy, peak powers that can approach one terawatt, and roughly Fourier-limited pulse durations [9, 10]. The inherent timing jitter between an infrared laser and the XFEL pulses is typically larger than the infrared period, in contrast to HHG-based sources. To make use of the exceptional properties of XFELs in experiments approaching the attosecond regime, approaches that employ a single-shot time reference signal are required. Such approaches have been explored recently, demonstrating the ability to achieve time resolution better than an optical cycle [11, 12, 13, 14, 15, 16]. The single-shot reference signal can be provided by photoelectrons produced from prompt ionization. When the duration of the XFEL pulse is much shorter than the infrared laser period, the streaking interaction can be treated in the impulsive regime, in which the streaking laser imparts the same momentum shift to all photoelectrons. The momentum shift of the prompt electrons therefore provides a single-shot reference for the relative x-ray/laser arrival time. Such a time reference has been employed to achieve attosecond timing resolution for measurements of Auger-Meitner decay by characterizing the photoelectrons’ momentum shift on a single-shot basis [11, 12, 17]. The single-shot quantification of the momentum shift places stringent conditions on the single-shot data and risks systematic error.

Correlation-based approaches have also been employed to overcome the inherent timing jitter in laser/x-ray measurements [18, 14, 15]. Such a technique has recently been used to extract the photoemission delay in x-ray ionization [16] and measure the delay between two attosecond pulses [13]. Here we provide a formal treatment for this covariance analysis, focusing on measurements with a circularly polarized infrared dressing field, or “angular streaking”. In addition to circumventing the requirement for single-shot analysis, this correlation-based analysis bypasses the requirement for the complex modeling of the streaking process in the retrieval of photoemission delay.

In section II, we introduce our mathematical model for the covariance analysis of impulsive angular streaking, in the presence of large jitter in the x-ray/infrared relative timing. In section III, we describe and compare several methods for extracting the time delay between two impulsive processes. These methods can be used to measure the relative photoemission delay between photoionization events produced by the same pulse, or the delay between attosecond pulses. In section IV we discuss additional considerations for the interpretation and design of attosecond angular streaking measurements. In section V we generalize our analysis to the case of an emission process with a complex (non-impulsive) profile in the time domain.

II Model

To describe the x-ray-matter interaction in the presence of an intense laser-field with vector potential 𝑨(t)\bm{A}(t), we employ the strong-field approximation to write the probability amplitude for observing a photoelectron with momentum 𝒑\bm{p} [19]:

b(𝒑;𝑨)=t0𝑑teiΦ(t;𝒑,𝑨)G(t;𝒑e𝑨(t)),b\left(\bm{p};\bm{A}\right)=\int_{t_{0}}^{\infty}dte^{-i\Phi(t;\bm{p},\bm{A})}G(t;\bm{p}-e\bm{A}(t))~{}, (1)

where Φ(t;𝒑,𝑨)t+𝑑t(𝒑e𝑨(t))2/(2m)\Phi(t;\bm{p},\bm{A})\equiv\int_{t}^{+\infty}dt^{\prime}(\bm{p}-e\bm{A}(t^{\prime}))^{2}/(2m\hbar) is the so-called Volkov phase [20], e<0e<0 and mm are the charge and mass of electron, respectively, and G(t;𝒑)G(t;\bm{p}^{\prime}) describes the electron source term for electrons with momentum 𝒑\bm{p}^{\prime} entering the continuum at time tt, and t0t_{0} is a fixed initial time before the onset of x-ray pulse. Equation (1) can be simplified using the stationary phase approximation to yield,

b(𝒑;𝑨)tsT(𝒑,𝑨)C𝒑(ts)G(ts,𝒑e𝑨(ts)),~{}b(\bm{p};\bm{A})\simeq\sum_{t_{\mathrm{s}}\in T(\bm{p},\bm{A})}C_{\bm{p}}(t_{\mathrm{s}})G(t_{\mathrm{s}},\bm{p}-e\bm{A}(t_{\mathrm{s}}))~{}, (2)

where T(𝒑,𝑨)T(\bm{p},\bm{A}) is the collection of all tst_{\mathrm{s}} that satisfy the stationary phase condition 0=(𝒑e𝑨(ts))2/2+ddt|t=tsargG(t,𝒑e𝑨(t))0=(\bm{p}-e\bm{A}(t_{\mathrm{s}}))^{2}/2+\left.\frac{d}{dt}\right|_{t=t_{s}}\arg G(t,\bm{p}-e\bm{A}(t)), and C𝒑(ts)C_{\bm{p}}(t_{s}) is a weighting factor for each stationary point. If the duration of source-term GG is much shorter than the streaking laser period TLT_{L}, a unique tst_{\mathrm{s}} dominates for all 𝒑\bm{p}. Since the derivative of the phase arg[G]\arg[G] describes an energy, we can write ddt|t=tsarg[G]=E0\left.\frac{d}{dt}\right|_{t=t_{s}}\arg[G]=-E_{0}. With this substitution, the stationary phase condition becomes (𝒑e𝑨(ts))2/2=E0(\bm{p}-e\bm{A}(t_{\mathrm{s}}))^{2}/2=E_{0}, which we recognize as an equation of motion for a classical electron in an external field, with kinetic energy E0E_{0} at time tst_{s}. Revisiting Eqn. (2) and the stationary phase condition, this uniqueness of tst_{\mathrm{s}} enables an approximation, demonstrating that the dressed photoelectron momentum distribution (MD) |b(𝒑;𝑨)|2|b(\bm{p};\bm{A})|^{2} can be approximated by a displacement of the dressing-free (i.e. 𝑨=0\bm{A}=0) MD:

|b(𝒑;𝑨)|2|b(𝒑𝒌;𝑨=0)|2,|b(\bm{p};\bm{A})|^{2}\simeq|b(\bm{p}-\bm{k};\bm{A}=0)|^{2}~{}, (3)

where 𝒌e(𝑨(ts)+τd𝑨dt|ts)\bm{k}\equiv e(\bm{A}(t_{s})+\tau\left.\frac{d\bm{A}}{dt}\right|_{t_{s}}) is the momentum shift, and τ=2𝒑2argG|ts\tau=\langle\frac{\partial^{2}}{\partial\bm{p}^{\prime 2}}\left.\arg G\right|_{t_{s}}\rangle is the momentum-averaged photoemission delay. We call this the impulsive streaking regime and label 𝒌\bm{k} the streaking vector.

Refer to caption
Figure 1: Illustration of impulsive regime of angular streaking. (a) In the presence of the circularly polarized streaking laser field (red), a pulse (purple) much shorter than the laser period ionizes the sample molecules, from which electrons are emitted (black dashed). (b) Streaked photoelectron momentum distribution (MD) at the pz=0p_{z}=0 slice simulated with strong-field approximation (SFA). A hydrogen atom IP=13.6I_{P}{=}13.6 eV is ionized by a 40.840.8 eV, 200200 as Gaussian pulse in a λ=1.85μm,|𝑨|=0.06a.u.\lambda=1.85\,\mathrm{\mu m},|\bm{A}|=0.06\,\mathrm{a.u.} dressing field. The arrow indicates e𝑨e\bm{A} at the ionizing pulse’s peak instant, with 10x-magnified length. (c) Difference of (b) from the dressing-free MD I0I^{0}, i.e. the MD when the streaking field is mistimed from the ionization pulse. (d) The dressing-free MD I0I^{0} displaced by 𝒌\bm{k}. (e) Difference between (b) and (d).

For the remainder of this work, we consider the case of a circularly polarized dressing laser propagating along the z^\hat{z}-direction,

𝑨(t)=A0(t)[cos(ωLt)x^+sin(ωLt)y^],\bm{A}(t)=A_{0}(t)\left[\cos\left(\omega_{L}t\right)\hat{x}+\sin\left(\omega_{L}t\right)\hat{y}\right], (4)

where A0(t)A_{0}(t) is the slowly varying envelope of the laser field assumed to be much longer than the laser period TLT_{L}, and ωL=2π/TL\omega_{L}=2\pi/T_{L} is the angular frequency. We consider the two-dimensional (2D) momentum distribution I(𝒓)I(\bm{r}) of the electrons, where 𝒓\bm{r} denotes the momentum in the xyxy-plane. Depending on the measurement scheme, I(𝒓)I(\bm{r}) can either be the projection of |b(𝒑)|2|b(\bm{p})|^{2} along the z^\hat{z} direction [21] or the slice of |b(𝒑)|2|b(\bm{p})|^{2} at the pz=0p_{z}=0 plane [8]. The model and methods introduced in this work are applicable to both measurement schemes, and a quantitative comparison is given in Sec. IV.

The defining property of the impulsive regime (Eqn. (3)) is the displacement of the dressing-free MD I0(𝒓)I^{0}(\bm{r}) by the streaking vector: I(𝒓;𝑨)I0(𝒓𝒌)𝒟𝒌I0(𝒓)I(\bm{r};\bm{A})\simeq I^{0}(\bm{r}-\bm{k})\equiv\mathcal{D}_{\bm{k}}I^{0}(\bm{r}), where we define the displacement operator 𝒟𝒌\mathcal{D}_{\bm{k}}. Throughout this work, the superscript “0” indicates the MD is dressing-free (𝑨=0\bm{A}=0), e.g. I0(𝒓)I^{0}(\bm{r}). An example MD simulated using Eqn. (1) is shown in Fig. 1(b). In this case, the photoelectrons are produced from direct ionization G(t,𝒑)=ieiIPt/D(𝒑)EX(t)G(t,\bm{p}^{\prime})=-ie^{iI_{P}t/\hbar}D(\bm{p}^{\prime})E_{X}(t), where IPI_{P} is the ionization potential, D(𝒑)D(\bm{p}^{\prime}) the dipole along the polarization of the ionizing pulse EX(t)E_{X}(t). As shown in Fig. 1(c-d), the effect of the vector potential on the photoelectron MD is well approximated by the displacement operator, to the 1% level in this case (panel (e)).

As discussed in Sec. I, most experimental photoelectron MDs contain multiple photoemission features. We generally denote two non-overlapping features in the MD as X(𝒓)X(\bm{r}) and Y(𝒓)Y(\bm{r}), such as the photoelectrons produced by two ionizing pulses with different wavelengths, separated in time by less than TLT_{L} [13]. Due to instabilities in the relative delay between the ionizing and streaking pulses, both XX and YY vary randomly. To extract the relative timing information between the features, we compute the covariance between XX and YY, i.e. the two-point function defined as

C[X,Y](𝒓q,𝒓p)𝔼[X(𝒓q)Y(𝒓p)]𝔼[X(𝒓q)]𝔼[Y(𝒓p)],C[X,Y](\bm{r}_{q},\bm{r}_{p})\equiv\mathbb{E}\left[X(\bm{r}_{q})Y(\bm{r}_{p})\right]-\mathbb{E}\left[X(\bm{r}_{q})\right]\mathbb{E}\left[Y(\bm{r}_{p})\right]~{}, (5)

where 𝔼[]\mathbb{E}[\cdot] refers to the expectation over all fluctuations, which is also written as 𝔼[]\langle\cdot\rangle{\equiv}\mathbb{E}\left[\cdot\right] for simplicity. We choose the feature X(𝒓)X(\bm{r}) to be the result of an impulsive process (Eqn. 3), which provides a timing reference for any general ionization feature, Y(𝒓)Y(\bm{r}). We denote the streaking vector of XX as 𝒌=𝒌X=k(cosκ𝒆x+sinκ𝒆y)\bm{k}=\bm{k}_{X}=k(\cos\kappa\,\bm{e}_{x}+\sin\kappa\,\bm{e}_{y}), with amplitude kk and direction κ\kappa.

The key to encoding the relative timing between XX and YY into the covariance is the randomness of the streaking direction κ\kappa. The distribution of κ\kappa is generally considered to be uniform 𝒰[0,2π]\mathcal{U}[0,2\pi], since the arrival time jitter spans many periods of the dressing laser [22]. Although κ\kappa is random, the relative timing between XX and YY is determined by the underlying atomic or molecular physics. The streaking amplitude kk is determined by the spatial overlap and the intensity of dressing laser, which can be assumed statistically independent from κ\kappa. The arrival time jitters may also vary kk, but since the duration of A0(t)A_{0}(t) is much longer than TLT_{L}, we can still assume kk and κ\kappa are independent. In the following, we refer to the covariance that purely results from the variations in the streaking vector 𝒌\bm{k} as the “streaking covariance”:

K[X,Y]C[𝔼[X|𝒌],𝔼[Y|𝒌]],K[X,Y]\equiv C[\mathbb{E}\left[X|\bm{k}\right],\mathbb{E}\left[Y|\bm{k}\right]]~{}, (6)

where 𝔼[|𝒌]\mathbb{E}\left[\cdot|\bm{k}\right] is the conditional expectation operator given the streaking vector, i.e. the average over all other parameters except 𝒌\bm{k}. In this way, 𝔼[X|𝒌]\mathbb{E}\left[X|\bm{k}\right] and 𝔼[Y|𝒌]\mathbb{E}\left[Y|\bm{k}\right] are two random functions that only vary with 𝒌\bm{k}, thus K[X,Y]0K[X,Y]\neq 0 if and only if 𝒌\bm{k} fluctuates.

In Sec. II.1-IV, we demonstrate the covariance analysis in the situation where YY is also impulsive. The angle ϕ\phi between the two streaking vectors, 𝒌Y\bm{k}_{Y} and 𝒌X=𝒌\bm{k}_{X}=\bm{k}, corresponds to a time delay ϕ/ωL\phi/\omega_{L} of the feature YY referenced to XX. If we assume a counter-clockwise rotation of the vector potential as specified in Eqn. (4), ϕ>0\phi>0 means that YY is delayed from XX. Typically the time delay is shorter than TLT_{L}, much shorter than the evolution of the envelope function A0(t)A_{0}(t), so the two features share the same magnitude of vector potential |𝑨(ts)||\bm{A}(t_{s})|. An interesting consequence of the stationary phase approximation in Eqn. (3) is the difference in streaking amplitudes |𝒌X|=k|\bm{k}_{X}|{=}k and |𝒌Y||\bm{k}_{Y}| for two photoelectron features dressed by the same streaking field. Combining Eqn. (4), which describes the vector potential of a circularly polarized dressing field, with the momentum shift from the stationary phase approximation 𝒌=e(𝑨(ts)+τd𝑨dt|ts)\bm{k}=e(\bm{A}(t_{s})+\tau\left.\frac{d\bm{A}}{dt}\right|_{t_{s}}), we find that |𝒌Y|=λk|\bm{k}_{Y}|=\lambda k, where the factor λ(1+(ωLτY)2)/(1+(ωLτX)2)\lambda\equiv\sqrt{(1+(\omega_{L}\tau_{Y})^{2})/(1+(\omega_{L}\tau_{X})^{2})} accounts for the difference between |𝒌X||\bm{k}_{X}| and |𝒌Y||\bm{k}_{Y}| resulting from the relative photoemission delay τYτX\tau_{Y}-\tau_{X} [23]. When |τYτX|ωL1|\tau_{Y}-\tau_{X}|\omega_{L}\ll 1 rad, this factor is λ1\lambda\simeq 1, and as |τXτY||\tau_{X}-\tau_{Y}| increases, the two streaking amplitudes diverge. In this way, the streaking vector of YY is given by 𝒌Y=λk(cos(κ+ϕ)𝒆x+sin(κ+ϕ)𝒆y)\bm{k}_{Y}=\lambda k(\cos(\kappa+\phi)\,\bm{e}_{x}+\sin(\kappa+\phi)\,\bm{e}_{y}).

II.1 Impulsive Streaking Covariance

We first derive the impulsive model for K[X,Y]K[X,Y] in the absence of machine fluctuations, i.e. when the only variation in the measurement is 𝒌\bm{k}, in which case K[X,Y]=C[X,Y]K[X,Y]=C[X,Y]. As illustrated in Eqn. (3), in the impulsive regime X(𝒓)X0(𝒓𝒌X)X(\bm{r})\simeq X^{0}(\bm{r}-\bm{k}_{X}), and Y(𝒓)Y0(𝒓𝒌Y)Y(\bm{r})\simeq Y^{0}(\bm{r}-\bm{k}_{Y}). Substituting the Taylor expansions of XX and YY into the definition of covariance Eqn. (5), we find the leading order is

C[X,Y]C[𝒟𝒌XX0,𝒟𝒌YY0]=k2λ2(X0)TR(ϕ)Y0+o(k4),C[X,Y]\simeq C[{\mathcal{D}}_{\bm{k}_{X}}{X^{0}},{\mathcal{D}}_{\bm{k}_{Y}}{Y^{0}}]\\ =\frac{\langle k^{2}\rangle\lambda}{2}(\nabla{X^{0}})^{T}~{}R(-\phi)~{}\nabla{Y^{0}}+o(k^{4})~{}, (7)

where \nabla is the 2D momentum gradient operator, R(ϕ)R(-\phi) is the 2-by-2 matrix representing the counterclockwise rotation of ϕ-\phi in the xyxy-plane. This leading order is the inner-product between X0\nabla{X^{0}} and R(ϕ)Y0R(-\phi)\nabla{Y^{0}}, also referred to as the gradient inner-product (GIP), and the delay is encoded in the rotation.

In the impulsive regime, the streaking covariance defined in Eqn. (6) is given by a bilinear operation on the unstreaked distributions, C[𝒟𝒌X,𝒟𝒌Y]X0Y0C[{\mathcal{D}}_{\bm{k}_{X}},{\mathcal{D}}_{\bm{k}_{Y}}]{X^{0}}{Y^{0}}. As shown in Eqn. (7), the leading order in this bilinear operator C[𝒟𝒌X,𝒟𝒌Y]C[{\mathcal{D}}_{\bm{k}_{X}},{\mathcal{D}}_{\bm{k}_{Y}}] can be written as

MGIPk2λ2(X0)TR(ϕ)Y0=k22G^(ϕ)X0Y0,M_{\mathrm{GIP}}\equiv\frac{\langle k^{2}\rangle\lambda}{2}(\nabla{X^{0}})^{T}\,R(-\phi)\,\nabla{Y^{0}}=\frac{\langle k^{2}\rangle}{2}\hat{G}(\phi)X^{0}Y^{0}, (8)

where G^(ϕ)λi,jRij(ϕ)ij\hat{G}(\phi)\equiv\lambda\sum_{i,j}R_{ij}(-\phi)\partial_{i}\otimes\partial_{j}, with the direct product defined as ijrq,irp,j\partial_{i}\otimes\partial_{j}\equiv\frac{\partial}{\partial r_{q,i}}\frac{\partial}{\partial r_{p,j}} to perform partial differentiation with respect to momentum coordinates 𝒓q\bm{r}_{q} and 𝒓p\bm{r}_{p}, respectively. The full expansion of C[𝒟𝒌X,𝒟𝒌Y]C[{\mathcal{D}}_{\bm{k}_{X}},{\mathcal{D}}_{\bm{k}_{Y}}] is

C[𝒟𝒌X,𝒟𝒌Y]=N=0+2NγN(G^(ϕ)+H^)Nn=0+m=0+γnγm(2)n(λ22)m,C[{\mathcal{D}}_{\bm{k}_{X}},{\mathcal{D}}_{\bm{k}_{Y}}]=\sum_{N=0}^{+\infty}2^{N}\gamma_{N}(\hat{G}(\phi)+\hat{H})^{N}-\sum_{n=0}^{+\infty}\sum_{m=0}^{+\infty}\gamma_{n}\gamma_{m}(\nabla^{2})^{n}\otimes(\lambda^{2}\nabla^{2})^{m}~{}, (9)

where H^(2I^+I^λ22)/2\hat{H}\equiv\left(\nabla^{2}\otimes\hat{I}+\hat{I}\otimes\lambda^{2}\nabla^{2}\right)/2, I^\hat{I} is the identity operator, and γnk2n/(22n(n!)2)\gamma_{n}\equiv\langle k^{2n}\rangle/(2^{2n}(n!)^{2}). According to Eqn. (9), the streaking covariance expands into multiple direct products between the nXn_{X}-th order partial derivative of XX and the nYn_{Y}-th order partial derivative of YY. We number them by the differentiation orders (nX+nY)(n_{X}+n_{Y}), e.g. the GIP term is the (1+1) order. The next orders are (1+3), (3+1), and (2+2), since all terms’ (nX+nY)(n_{X}+n_{Y}) are even numbers, as shown by Eqn. (9).

Refer to caption
Figure 2: Notation conventions used in this work. Two-dimensional MD of the continuum electron features are represented by X(𝒓)X(\bm{r}) (orange shades) and Y(𝒓)Y(\bm{r}) (green shades). Reference feature XX is always in the impulsive regime, with streaking vector 𝒌=𝒌X\bm{k}=\bm{k}_{X} (orange arrow). When feature YY is also impulsive, the angle between the two streaking vectors (arrows in respective colors) defines ϕ\phi. Angular bins are labeled by their central angular positions θq,θp\theta_{q},\theta_{p} measured from +x+x direction, with radial boundaries marked as rings in respective colors. Xq,YpX_{q},Y_{p} denote the integral of X(𝒓),Y(𝒓)X(\bm{r}),Y(\bm{r}) in the angular bins at θq,θq\theta_{q},\theta_{q}, respectively. The red arc indicates the convention for the chirality of the dressing vector potential 𝑨(t)\bm{A}(t).

In a measurement, we record electron yield with finite momentum resolution. We consider the covariance between the electron yield measured in two arbitrary regions QQ and PP. Due to the bilinearity of covariance, this is identical to the regional integral of the covariance between the densities XX and YY:

C[Qd2𝒓X(𝒓),Pd2𝒓Y(𝒓)]=Qd2𝒓Pd2𝒓′′C[X,Y](𝒓,𝒓′′).C\left[\int_{Q}d^{2}\bm{r}X(\bm{r}),\int_{P}d^{2}\bm{r}Y(\bm{r})\right]\\ =\int_{Q}d^{2}\bm{r}^{\prime}\int_{P}d^{2}\bm{r}^{\prime\prime}C[X,Y](\bm{r}^{\prime},\bm{r}^{\prime\prime})~{}. (10)

As illustrated in Fig. 2, we define two sets of 2D momentum regions of interest (ROIs): {Qq}q=1NQ\{Q_{q}\}_{q=1}^{N_{Q}} for XX and {Pp}p=1NP\{P_{p}\}_{p=1}^{N_{P}} for YY. We use subscripts q,pq,p on X,YX,Y as a shorthand for the regional integrals, e.g. XqQqX(𝒓)d2𝒓X_{q}\equiv\int_{Q_{q}}X(\bm{r})d^{2}\bm{r}, as illustrated in Fig. 2. Fig. 3(a) shows an exemplary visualization of K[X,Y]K[X,Y], where the covariance has been calculated between NQ=NP=180N_{Q}{=}N_{P}{=}180 ROIs on the pz=0p_{z}{=}0 plane, each 22^{\circ}-wide. The 180 angular bins are on a ring pmin<|𝒓|<pmaxp_{\mathrm{min}}<|\bm{r}|<p_{\mathrm{max}} and labelled by their central angular positions θq\theta_{q} or θp\theta_{p}, thus they are also referred to as angular bins. In Fig. 3, both photoelectron features XX and YY are simulated in the same ionization process as Fig. 1 but detected separately, and we set λ=1\lambda=1 to simulate the scenarios where the difference in photoemission delays is much shorter than the dressing field period ω|τXτY|1\omega|\tau_{X}-\tau_{Y}|\ll 1. As shown in the ϕ=π/3\phi=\pi/3 example in Fig. 3(a), one feature of K[X,Y]K[X,Y] is that the most positive part is around θpθqϕ\theta_{p}-\theta_{q}\sim\phi. At small streaking amplitudes, K[X,Y]K[X,Y] is well approximated by the GIP, as shown in Fig. 3(b-c). The the relative error MGIPK[X,Y]2/K[X,Y]2\|M_{\mathrm{GIP}}-K[X,Y]\|_{2}/\|K[X,Y]\|_{2} between panel (b) and (a) is evaluated as 0.030.03, which we use to quantify the accuracy of the GIP. The accuracy of the GIP model is insensitive to ϕ\phi, but it degrades as streaking amplitude kk increases, due to the increased contribution from the higher order terms. As shown by the solid curve in Fig. 3(c), the root-mean-squared (rms) relative error over ϕ[π,+π]\phi\in[-\pi,+\pi] becomes comparable to unity once kk exceeds the width of the photoelectron feature in momentum Δp\Delta p (quantified as the full-width at half-maximum, FWHM). For panels (a-b), the lower boundary pminp_{\mathrm{min}} of the ROIs is set to the maximum point of the unstreaked MD gradient pMGargmaxr𝑑θ|I0|p_{\mathrm{MG}}\equiv\arg\max_{r}\int d\theta|\nabla I^{0}|  (red solid line in Fig. 3(d)), and the upper boundary is set to where the streaked MD has fallen to zero. This choice of limit for pminp_{\min} optimizes the accuracy of the GIP model, because pMGp_{\mathrm{MG}} is close to the zero point of 2I0\nabla^{2}I^{0}. As shown in Fig. 3(c-d), when we alternate pminp_{\min} by a fraction of Δp\Delta p, the relative error is increased when k/Δp<1k/\Delta p<1.

Refer to caption
Figure 3: Streaking covariance compared to its leading order with streaking amplitude kk. Both features XX and YY are simulated under the same condition as in Fig. 1, throughout (a-e). (a) Streaking covariance K[X,Y]K[X,Y] with ϕ=π/3\phi=\pi/3. The lower radial boundary of the ROIs pminp_{\min} is at the maximal gradient line pMGp_{\mathrm{MG}}, see main text, and the upper boundary is pmax=2.15p_{\max}{=}2.15 a.u. (b) The leading order in K[X,Y]K[X,Y], i.e. the gradient inner-product (GIP, see Eqn. (7)). (c) Difference between (a) and (b), values scaled 10x for visibility. Colorbar is shared among (a-c). (d) Relative error of the GIP, depending on the normalized streaking amplitude k/Δpk/\Delta p, with Δp\Delta p denoting the momentum width FWHM (see main text). Colors correspond to the ROI lower bound pminp_{\min} marked in (e), among which the red corresponds to pmin=pMGp_{\min}{=}p_{\mathrm{MG}}. The relative error between (a) and (b) is marked as the red star. (e) Average differential MD from the unstreaked MD of each photoelectron feature 𝔼[X]𝔼[X0]\mathbb{E}\left[X\right]-\mathbb{E}\left[X^{0}\right]. Values are normalized to maxX0\max X^{0}. Radial momentum pr=|𝒓|p_{r}{=}|\bm{r}|. (f) Relative error of the GIP for various normalized radii pc/Δpp_{c}/\Delta p of the photoelectron feature, with pminp_{\min} chosen at the corresponding pMGp_{\mathrm{MG}} in each case of pc/Δpp_{c}/\Delta p.

For the photoelectron features produced in direct ionization, the discrepancy between the GIP model and the streaking covariance K[X,Y]K[X,Y] generally depends on three characteristic dimensionless quantities: (1) the size of the momentum shift normalized to the momentum spread of the photoelectron feature k/Δpk/\Delta p, (2) the normalized radius pc/Δpp_{c}/\Delta p, with pcp_{c} denoting the central momentum of the photoelectron feature, and (3) the ratio of the ionizing pulse duration to the streaking laser period ΔtX/TL\Delta t_{X}/T_{L}. In the impulsive regime ΔtXTL\Delta t_{X}\ll T_{L}, we find that the relative error is predominantly a function of k/Δpk/\Delta p, with a much weaker dependence on pc/Δpp_{c}/\Delta p, as shown in Fig. 3(f). In the case shown in panels (a-e), both XX and YY have a normalized radius of pc/Δp=6.6p_{c}/\Delta p=6.6, and when we vary pc/Δpp_{c}/\Delta p by changing ΔtX\Delta t_{X} and/or the photon energy of the ionizing pulse, the relative error curve does not significantly change as long as we remain in the impulsive regime (ΔtXTL\Delta t_{X}\ll T_{L}). The two features XX and YY can have different ΔpX,ΔpY\Delta p_{X},\Delta p_{Y}, but we note that by defining Δp\Delta p as the geometric mean Δp=ΔpXΔpY\Delta p=\sqrt{\Delta p_{X}\Delta p_{Y}}, the accuracy of the GIP can be well described by k/Δpk/\Delta p.

II.2 Contribution from Machine Fluctuations

We differentiate the shot-to-shot fluctuations in angular streaking experiments into two categories, the fluctuation of the streaking vector 𝒌\bm{k}, and everything else. The fluctuations of these other parameters (or “machine fluctuations”), such as pulse energy and central photon energy of the ionizing pulse, also affect the single-shot MD, but the relative timing between the ionizing and dressing pulses typically does not depend on these parameters. Thus in this work, machine fluctuations are assumed to be statistically independent from the random streaking vector. The streaking covariance K[X,Y]K[X,Y] defined in Eqn. (6) is therefore simplified by the impulsive condition as K[X,Y]C[𝒟𝒌XX0,𝒟𝒌YY0]K[X,Y]\simeq C[\mathcal{D}_{\bm{k}_{X}}\langle X^{0}\rangle,\mathcal{D}_{\bm{k}_{Y}}\langle Y^{0}\rangle]. We note that the expected dressing-free MDs X0,Y0\langle X^{0}\rangle,\langle Y^{0}\rangle are fixed, free from shot-to-shot variations, so the treatment of K[X,Y]K[X,Y] introduced in Sec. II.1 are also applicable when machine fluctuations are present, by simply replacing the stable X0,Y0X^{0},Y^{0} with X0,Y0\langle X^{0}\rangle,\langle Y^{0}\rangle respectively.

Both machine fluctuations and fluctuations in 𝒌\bm{k} contribute to the covariance C[X,Y]C[X,Y] defined in Eqn. (5). According to the law of total covariance [24], C[X,Y]=K[X,Y]+L[X,Y]C[X,Y]=K[X,Y]+L[X,Y] consists of the streaking covariance K[X,Y]K[X,Y] defined in Eqn. (6) and the contribution from machine fluctuations:

L[X,Y]𝔼[C[X,Y|𝒌]]𝔼[𝒟𝒌X𝒟𝒌Y]C[X0,Y0],L[X,Y]\equiv\mathbb{E}\left[C[X,Y|\bm{k}]\right]\simeq\mathbb{E}\left[{\mathcal{D}}_{\bm{k}_{X}}\otimes{\mathcal{D}}_{\bm{k}_{Y}}\right]C[X^{0},Y^{0}]~{}, (11)

where in the approximation, we used the impulsive streaking condition and the statistical independence between machine fluctuations and streaking vector.

III Methods to Retrieve Delay

Figure 4 illustrates three approaches for retrieving the relative angle ϕ\phi, which is directly proportional to the time delay between XX and YY, from the measured momentum distribution. Measuring NsN_{s} laser shots, we obtain a sample of electron yield MDs: X(i),Y(i),1iNsX^{(i)},Y^{(i)},1{\leq}i{\leq}N_{s}. The sample covariance between the regional yields XqX_{q} and YpY_{p}, (CXY)qp(XqYp¯Xq¯Yp¯)Ns/(Ns1)(\mathrm{C}_{XY})_{qp}\equiv(\overline{X_{q}Y_{p}}-\overline{X_{q}}~{}\overline{Y_{p}})N_{s}/(N_{s}-1), with ¯\overline{\bullet} the sample mean over shots, gives an estimate of the underlying covariance C[Xq,Yp]C[X_{q},Y_{p}]. As introduced in Sec. I, a key feature of the covariance analysis is leveraging the angular isotropy of the 𝒌\bm{k} distribution to circumvent the need for single-shot knowledge of 𝒌\bm{k}. While this feature avoids errors introduced control or quantification of 𝒌\bm{k}, it prevents forming a sample estimator according to K[X,Y]C[𝔼[X|𝒌],𝔼[Y|𝒌]]K[X,Y]\equiv C[\mathbb{E}\left[X|\bm{k}\right],\mathbb{E}\left[Y|\bm{k}\right]], as an estimator of the conditional expectation 𝔼[|𝒌]\mathbb{E}\left[\cdot|\bm{k}\right] would require binning shots by vector 𝒌\bm{k}. Thus our general strategy is to remove the contribution of machine fluctuations from the sample covariance CXY,CXX\mathrm{C}_{XY},\mathrm{C}_{XX} and CYY\mathrm{C}_{YY} to obtain a sample estimate of the streaking covariance (or “K-estimators”) denoted as KXY,KXX\mathrm{K}_{XY},\mathrm{K}_{XX} and KYY\mathrm{K}_{YY}. These modelling and removal procedures are detailed in Sec. IV.1. Throughout this work, the sample estimates are denoted with subscripts (e.g. CXY\mathrm{C}_{XY}), whereas the underlying covariance and streaking covariance are with brackets (e.g. C[X,Y]C[X,Y]).

Refer to caption
Figure 4: Overview of delay retrieval methods. Two photoelectron features are imaged in momentum space over many shots (top left). Streaking vectors of the two features (colored arrows at center) vary from shot to shot, but the angle ϕ\phi between the two directions remains stable. Colored rings delineate the ROI of angular bins for each feature. Based on the electron yield measured over many shots, one obtains a sample estimate of the streaking covariance matrices KXX,KXY\mathrm{K}_{XX},\mathrm{K}_{XY} and KYY\mathrm{K}_{YY} (bottom left). Two approaches to gradient reconstruction (top right) are introduced: numerical differentiation and rank-reduction, see main text. With either approach, the gradients (top right inset) support the GIP model for delay retrieval from the measured KXY\mathrm{K}_{XY}. Alternatively, a gradient-free method (bottom right) retrieves the delay from the sample correlation matrix CorrXY\mathrm{Corr}_{XY}.

The delay retrieval methods are generally separated into two classes, gradient-based (involving reconstruction of the MD gradient) and gradient-free. In the gradient-based methods, we reconstruct the density gradients of the dressing-free MD XqQqX0d2𝒓\nabla X_{q}\equiv\int_{Q_{q}}\nabla\langle X^{0}\rangle d^{2}\bm{r} and YpPpY0d2𝒓\nabla Y_{p}\equiv\int_{P_{p}}\nabla\langle Y^{0}\rangle d^{2}\bm{r}, in order to fit the GIP model (Eqn. (8)) to the sample streaking covariance KXY\mathrm{K}_{XY}. In the gradient-free method, we use the K-estimators to calculate a sample correlation matrix CorrXY\mathrm{Corr}_{XY} and leverage part of CorrXY\mathrm{Corr}_{XY} to circumvent reconstruction of the gradients. The procedures and benefits of these different approaches are described below. As introduced in Sec. I, one use case of these approaches is to measure the relative photoemission delay between two features produced by the same ionizing pulse τYτX\tau_{Y}-\tau_{X}. Although λ\lambda also depends on τYτX\tau_{Y}-\tau_{X}, it is much less sensitive than ϕ\phi when |τYτX|1/ωL|\tau_{Y}-\tau_{X}|\lesssim 1/\omega_{L}, so we focus on retrieving ϕ\phi.

III.1 Gradient-based Methods

To retrieve the delay, the GIP model of K[X,Y]K[X,Y] relies on the gradients Xq,Yp\nabla X_{q},\nabla Y_{p}:

ϕfit=argminq,p|k2λ2(Xq)TR(ϕ)Yp(KXY)qp|2.\phi_{\mathrm{fit}}=\mathrm{argmin}\sum_{q,p}\left|\frac{\langle k^{2}\rangle\lambda}{2}(\nabla X_{q})^{T}R(-\phi)\nabla Y_{p}-(\mathrm{K}_{XY})_{qp}\right|^{2}~{}. (12)

One way to reconstruct the gradients Xq,Yp\nabla X_{q},\nabla Y_{p} is through numerical differentiation (ND). Dressing-free shots provide the sample mean of the measured distributions X0¯\overline{X^{0}} and Y0¯\overline{Y^{0}} to estimate the expectations X0\langle X^{0}\rangle and Y0\langle Y^{0}\rangle, respectively. With adequate momentum resolution, we can use a finite difference scheme to numerically differentiate X0¯\overline{X^{0}}, and then integrate over QqQ_{q} to measure the gradient Xq\nabla X_{q}. An example is shown in Fig. 5(a) for the case in Fig. 1 &3. Obtaining the gradients Xq,Yp\nabla X_{q},\nabla Y_{p} from numerical differentiation, we fit the GIP model to the sample streaking covariance KXY\mathrm{K}_{XY}, with the factor k2λ\langle k^{2}\rangle\lambda treated as a free parameter in Eqn. (12). Because the amplitude ratio λ\lambda has been absorbed into the free scaling parameter, not knowing λ\lambda does not affect this method. We benchmark the accuracy of the ND method in the noiseless limit: i.e. with sufficient number of shots NsN_{s}\to\infty to suppress statistical noise, complete removal of machine fluctuations, and ignoring readout noise in the electron yield. Thus we equate KXY\mathrm{K}_{XY} to the underlying K[X,Y]K[X,Y] and obtain the delay retrieval error Δϕ=ϕfitϕ\Delta\phi=\phi_{\mathrm{fit}}-\phi at various “ground-truth” ϕ\phi, Fig. 5(b). This systematic error is zero when ϕ\phi is a multiple of π/2\pi/2 and reaches a maximum at ϕ±π/4\phi\,{\sim}\pm\pi/4. The rms error over ϕ\phi increases with kk due to the increase in higher-order terms, as shown in panel (c), but within k<Δpk<\Delta p this rms error is <0.03rad=1.7<0.03\,\mathrm{rad}{=}1.7^{\circ}, which converts to a time delay error <30as<30\,\mathrm{as} for a 2μm2\,\mathrm{\mu m}-wavelength dressing field.

Refer to caption
Figure 5: Accuracy of gradient-based delay retrieval methods in the noiseless limit. The simulation conditions, measurement scheme, including ROI boundaries, are the same as in Fig. 3(a), except with ϕ\phi and k/Δpk/\Delta p scanned. (a-c) With gradient reconstructed by numerical differentiation of the dressing-free MD. (a) Reconstructed gradient field. (b) Delay retrieval error ϕfitϕ\phi_{\mathrm{fit}}-\phi with the ND method, depending on the ground-truth ϕ\phi. Normalized streaking amplitudes k/Δpk/\Delta p are encoded by the color of traces. (c) The retrieval error in (b) root-mean-squared over ϕ\phi. (d) Gradient reconstructed with RR, at k/Δpk/\Delta p=0.5. (e-f) Same as (b-c) but with RR gradients.

Another way to reconstruct the gradient is “rank-reduction” (RR), described in Algorithm 1, which employs a low-rank approximation of K[X,X]K[X,X] and K[Y,Y]K[Y,Y] to estimate the gradient. This method can be used when each set of ROIs complete a loop on the high-momentum side of the corresponding photoelectron feature, e.g. the rings delineated in Fig. 4. Within the GIP model, K[Xq,Xq]=k2/2(Xq)TXqK[X_{q},X_{q^{\prime}}]=\langle k^{2}/2\rangle(\nabla X_{q})^{T}\nabla X_{q^{\prime}}, whose rank is at most 2 since Xq\nabla X_{q} is 2D. Thus it appears we can solve the following optimization problem given KXX\mathrm{K}_{XX}:

minimize fRR(ξ)\displaystyle\text{minimize }f_{\mathrm{RR}}(\xi)\equiv q,q|ξqTξq(KXX)qq|2Wqq,\displaystyle\sum_{q,q^{\prime}}\left|\xi_{q}^{T}\xi_{q^{\prime}}-(\mathrm{K}_{XX})_{qq^{\prime}}\right|^{2}W_{qq^{\prime}}~{}, (13a)
subject to lRR(ξ)\displaystyle\text{subject to }l_{\mathrm{RR}}(\xi)\equiv q=1NQ(sinθq,cosθq)ξqaq=0,\displaystyle\sum_{q=1}^{N_{Q}}\left(\begin{matrix}-\sin\theta_{q},&\cos\theta_{q}\end{matrix}\right)\xi_{q}a_{q}=0~{}, (13b)

where ξ=(ξ1,,ξNQ)2×NQ\xi=(\xi_{1},\cdots,\xi_{N_{Q}})\in\mathbb{R}_{2{\times}N_{Q}} represents the gradient field, aqa_{q} is the arc length of the angular bin at θq\theta_{q} and WW is a NQ×NQN_{Q}\times N_{Q} matrix weights, which can be configured to combat the effect of readout noise as described below. The reconstructed gradient field is given by the optimal point ξ\xi^{*} of Eqn. (13) up to a global factor, ξq=k2/2Xq\xi_{q}^{*}=\sqrt{\langle k^{2}/2\rangle}\nabla X_{q}. Applying the same procedure to KYY\mathrm{K}_{YY}, we obtain the optimal point ηp=k2/2λYp\eta^{*}_{p}=\sqrt{\langle k^{2}/2\rangle}\lambda\nabla Y_{p}. Thus the GIP model described in Eqn. (7) can be rewritten as MGIP=(ξ)TR(ϕ)ηM_{\mathrm{GIP}}=(\xi^{*})^{T}R(-\phi)\eta^{*}, which we then substitute into Eqn. (12) to retrieve ϕfit\phi_{\mathrm{fit}}.

The minimization of the objective function fRRf_{\mathrm{RR}} is closely related to principal component analysis (PCA). The weight matrix can be configured as uniform Wqq=1W_{qq^{\prime}}=1 when the signal-to-noise ratio is high in the electron yield readout process. When the readout signal-to-noise ratio is low, we recommend ignoring the diagonal band of KXX\mathrm{K}_{XX} by setting the diagonal band in WW to zero (e.g. Wqq=1δqqW_{qq^{\prime}}=1-\delta_{qq^{\prime}}). Guidance for defining is detailed in Sec. IV.2. With a uniform WW, a minimal point of fRRf_{\mathrm{RR}} is given by the PCA result ξP=(s(1)v(1),s(2)v(2))T\xi^{\mathrm{P}}=\left(\sqrt{s^{(1)}}v^{(1)},\sqrt{s^{(2)}}v^{(2)}\right)^{T} [25], where s(α)s^{(\alpha)} is the α\alpha-th largest eigenvalue of KXX\mathrm{K}_{XX} and v(α)v^{(\alpha)} is the corresponding eigenvector (in column vector form). When WW is non-uniform, e.g. Wqq=1δqqW_{qq^{\prime}}=1-\delta_{qq^{\prime}}, we can minimize fRRf_{\mathrm{RR}} by optimizing ξP\xi^{\mathrm{P}} using gradient descent.

The objective function fRRf_{\mathrm{RR}} is invariant under any global orthogonal transformation O:ξqOξqO{:}~{}\xi_{q}\mapsto O\xi_{q}, but only certain minimal points can satisfy the zero-loop constraint Eqn. (13b), which is a general property of a gradient field X0d𝒍=0\oint\nabla X^{0}\cdot d\bm{l}=0. Thus obtaining one minimal point ξP\xi^{\mathrm{P}}, we solve for an orthogonal matrix OO such that lRR(OξP)=0l_{\mathrm{RR}}(O\xi^{\mathrm{P}})=0 This is straightforwardly achieved by parameterizing OO with the rotation angle and parity. An alternative to the zero-loop constraint is maximizing the gradient-flux jRR(OξP)q(cosθq,sinθq)OξqPaqj_{\mathrm{RR}}(O\xi^{\mathrm{P}})\equiv-\sum_{q}(\cos\theta_{q},~{}\sin\theta_{q})O\xi^{\mathrm{P}}_{q}a_{q} with OO, which necessarily satisfies the zero-loop constraint and additionally breaks the remaining discrete degeneracy, as proven in Supplemental Sec. 2.

Algorithm 1 Rank-reduction gradient reconstruction
Perform PCA on KXX\mathrm{K}_{XX}, obtaining ξP(s(1)v(1),s(2)v(2))T\xi^{\mathrm{P}}\leftarrow(\sqrt{s^{(1)}}v^{(1)},\sqrt{s^{(2)}}v^{(2)})^{T};
if WW is not uniform then
     Update ξP\xi^{\mathrm{P}} to minimize fRR(ξP)f_{\mathrm{RR}}(\xi^{\mathrm{P}}) using gradient descent;
end if
OargmaxOjRR(OξP)O^{*}\leftarrow\arg\max_{O}j_{\mathrm{RR}}(O\xi^{\mathrm{P}}) with ξP\xi^{\mathrm{P}} fixed;
return ξ=OξP\xi^{*}=O^{*}\xi^{\mathrm{P}};

Similar to the results in Fig. 5(b-c), we benchmark the RR method under various ϕ\phi and kk in the noiseless limit as mentioned above, as shown in panels (d-f). The gradient fields Xq\nabla X_{q} and Yp\nabla Y_{p} are reconstructed from KXX\mathrm{K}_{XX} and KYY\mathrm{K}_{YY}, respectively, and we show the resultant Xq\nabla X_{q} in panel (d). Similar to the ND method, the delay retrieval error of the RR method is minimized when ϕ\phi is a multiple of π/2\pi/2. The rms error increases with kk, but the magnitude of the error is notably smaller than with the ND method. As shown in Fig. 5(f), the rms error is <1{<}1 mrad within k<Δpk{<}\Delta p, corresponding to <1{<}1 as for a 2μm2\,\mathrm{\mu m} dressing field.

The main reason for the higher accuracy of the RR method lies in the reconstructed “gradient”. The ND gradients give the first-order derivatives of the dressing-free MD Xq\nabla X_{q} and Yp\nabla Y_{p}, independent from the streaking amplitude kk. The gradient reconstructed by the RR procedure, in contrast, deviates from the first-order derivative as kk increases. To the next lowest order, the RR gradients are

ξq\displaystyle\xi_{q} =k22Qqd2𝒓(1+k48k22)X0+o(k4),\displaystyle=\sqrt{\frac{\langle k^{2}\rangle}{2}}\int_{Q_{q}}d^{2}\bm{r}\nabla\left(1+\frac{\langle k^{4}\rangle}{8\langle k^{2}\rangle}\nabla^{2}\right)\langle X^{0}\rangle+o(k^{4})~{}, (14a)
ηp\displaystyle\eta_{p} =k22λPpd2𝒓(1+k4λ28k22)Y0+o(k4),\displaystyle=\sqrt{\frac{\langle k^{2}\rangle}{2}}\lambda\int_{P_{p}}d^{2}\bm{r}\nabla\left(1+\frac{\langle k^{4}\rangle\lambda^{2}}{8\langle k^{2}\rangle}\nabla^{2}\right)\langle Y^{0}\rangle+o(k^{4})~{}, (14b)

which include the third-order derivatives in addition to the first-order (see Supplementary Sec. 1.2). As a result, when using the RR gradients, the inner product ξqTR(ϕ)ηp\xi_{q}^{T}R(\phi)\eta_{p} not only captures the (1+1) order of K[Xq,Yp]K[X_{q},Y_{p}], but also the (1+3) and (3+1) orders. In contrast when using the ND gradient, only the (1+1) order is modelled. Since the errors in the noiseless limit arise from the finite accuracy of the GIP model and are insensitive to the normalized radius pc/Δpp_{c}/\Delta p, the result in Fig. 5 is general across pc/Δpp_{c}/\Delta p.

III.2 Gradient-free Method

In some experiments, gradient reconstruction may be challenging. For example, the numerical differentiation becomes infeasible when the detector lacks angular resolution, and RR becomes infeasible when the available ROIs cannot complete a loop. In this case, it is often possible to identify a part of the streaking correlation matrix R[Xq,Yp]K[Xq,Yp]/K[Xq,Xq]K[Yp,Yp]R[X_{q},Y_{p}]\equiv{K[X_{q},Y_{p}]}/{\sqrt{K[X_{q},X_{q}]K[Y_{p},Y_{p}]}} that can be used to extract the photoemission delay without requiring knowledge of the gradient.

In the angular regions where the gradient is predominantly along the radial direction |r1θXq||rXq||r^{-1}\partial_{\theta}X_{q}|\ll|\partial_{r}X_{q}|, |r1θYp||rYp||r^{-1}\partial_{\theta}Y_{p}|\ll|\partial_{r}Y_{p}| (also referred to as the “radial regions”), the streaking covariance is approximated as

K[Xq,Yp]k22cos(ϕθp+θq)(rXq)(λrYp).K[X_{q},Y_{p}]\approx\frac{\langle k^{2}\rangle}{2}\cos(\phi-\theta_{p}+\theta_{q})(\partial_{r}X_{q})(\lambda\partial_{r}Y_{p})~{}. (15)

The cos(ϕθp+θq)\cos(\phi-\theta_{p}+\theta_{q}) factor explains the positive ridge around θpθqϕ\theta_{p}-\theta_{q}{\sim}\phi in the covariance matrix. The radial gradients in Eqn. (15), including the factor λ\lambda, cancel out, i.e. R[Xq,Yp]cos(ϕθp+θq)R\left[X_{q},Y_{p}\right]\approx\cos(\phi-\theta_{p}+\theta_{q}). Therefore we define the gradient-free model as MGFacos(ϕθp+θq)M_{\mathrm{GF}}\equiv a\cos(\phi-\theta_{p}+\theta_{q}), with free parameters aa and ϕ\phi. From the measured K-estimators, we calculate the sample correlation matrix CorrXY=KXY/KXXKYY\mathrm{Corr}_{XY}=\mathrm{K}_{XY}/\sqrt{\mathrm{K}_{XX}\mathrm{K}_{YY}}, and minimize the mean-squared error in the radial region between CorrXY\mathrm{Corr}_{XY} and the model MGFM_{\mathrm{GF}}.

Refer to caption
Figure 6: Accuracy of gradient-free methods in the absence of machine fluctuations. The simulation conditions, measurement scheme, including ROI boundaries, are the same as in Fig. 3(a). (a) Sample correlation matrix CorrXY\mathrm{Corr}_{XY} at ϕ=π/3\phi=\pi/3. The radial regions are enclosed by black rectangular boxes. (b-c) Same as Fig. 5 except using the gradient-free delay retrieval method.

The radial region is determined from the dressing-free MD. For the case shown in Figs. 3 and 5, the anisotropy parameter is β2=2\beta_{2}=2, the radial region defined by |r1θXq|<0.2|rXq||r^{-1}\partial_{\theta}X_{q}|<0.2|\partial_{r}X_{q}| is 138138^{\circ}-wide around each antinode of the dipole feature. For lower anisotropy parameters, the radial region is larger, since an isotropic feature has no angular gradient component. As shown in Fig. 6(a), within the radial region, the correlation matrix is well described by the gradient-free model MGFM_{\mathrm{GF}}. By fitting MGFM_{\mathrm{GF}} to the measured CorrXY\mathrm{Corr}_{XY} in the radial region, we obtain ϕfit\phi_{\mathrm{fit}}, and the error ϕfitϕ\phi_{\mathrm{fit}}-\phi is shown in Fig. 6(b & c). Similar to the gradient-based methods, the error is zero at ϕ=0,±π/2\phi=0,\pm\pi/2, but the rms error across ϕ[0,2π]\phi\in[0,2\pi] does not vanish when k0k\to 0. This residual error at small streaking amplitude arises from ignoring the non-radial regions, and in the case shown in Fig. 6, it amounts to 2020 mrad rms. The systematic error of the gradient-free method also increases with kk, but the increase is slower than for the ND method. In this case, comparing Fig. 6(c) to Fig. 5(c), we find the gradient-free method is more accurate than the ND method when kΔpk\gtrsim\Delta p, but remains less accurate than the RR method. Another difference between the gradient-free and gradient-based methods is the dependence on pc/Δpp_{c}/\Delta p: the gradient-free method is generally more accurate with higher pc/Δpp_{c}/\Delta p, since the magnitude of the radial gradient relative to the angular gradient increases.

IV Discussion

IV.1 Machine Fluctuations

When implementing analysis procedures that make use of fluctuations in the measured data, it is important to understand how instabilities in the measurement affect the measured correlation, in this case CXY\mathrm{C}_{XY}. When using an attosecond x-ray free electron laser there may be fluctuations in the x-ray parameters, for example the pulse energy and/or the central photon energy. In this section we describe how to manage the effect of these additional fluctuations in the delay extraction procedure.

IV.1.1 Additional Contribution from Machine Fluctuations in the Impulsive Regime

Refer to caption
Figure 7: Decomposition of covariance into contributions from streaking effects and machine fluctuations. (a) Total covariance C[X,Y]C[X,Y] at ϕ=π/3\phi=\pi/3. In this case, the intensity and central photon energy of the ionizing pulse and streaking amplitude are randomly distributed (see main text). The average machine conditions k/Δp=0.26\langle k\rangle/\Delta p=0.26, measurement scheme, including ROI boundaries, are the same as in Fig. 3(a-b). (b) Streaking covariance K[X,Y]K[X,Y]. (c-f) The machine fluctuation L[X,Y]L[X,Y] expanded into different (nX+nY)(n_{X}+n_{Y}) orders: (c) (0+0) order L0+0=C[X0,Y0]L_{0+0}=C[X^{0},Y^{0}], i.e. the unstreaked covariance, (d) (1+1) order k2/2G^(ϕ)L0+0\langle k^{2}/2\rangle\hat{G}(\phi)L_{0+0}, (e) sum of (2+0) and (0+2) orders k2/2H^L0+0\langle k^{2}/2\rangle\hat{H}L_{0+0}, (f) Higher-order terms (nX+nY4n_{X}{+}n_{Y}{\geq}4), amplified 10x for visibility.

As discussed in Sec. II.2, the measured covariance is the arithmetic sum of the streaking covariance and the machine fluctuations C[X,Y]=K[X,Y]+L[X,Y]C[X,Y]=K[X,Y]+L[X,Y]. In the the impulsive regime, L[X,Y]𝔼[𝒟𝒌X𝒟𝒌Y]C[X0,Y0]L[X,Y]\simeq\mathbb{E}\left[{\mathcal{D}}_{\bm{k}_{X}}\otimes{\mathcal{D}}_{\bm{k}_{Y}}\right]C[X^{0},Y^{0}], as written in Eqn. (11). Similar to K[X,Y]K[X,Y] in the impulsive regime, 𝔼[𝒟𝒌X𝒟𝒌Y]\mathbb{E}\left[{\mathcal{D}}_{\bm{k}_{X}}\otimes{\mathcal{D}}_{\bm{k}_{Y}}\right] also expands into direct products of partial derivative operators numbered by the orders (nX+nY)(n_{X}+n_{Y}). In contrast to K[X,Y]K[X,Y], L[X,Y]L[X,Y] has a leading order of (0+0) given by L0+0C[X0,Y0]L_{0+0}\equiv C[X^{0},Y^{0}], i.e. the covariance of the unstreaked MD. The (1+1) order of L[X,Y]L[X,Y] is given by the k2/2G^(ϕ)\langle k^{2}/2\rangle\hat{G}(\phi) operator acting on the two-point function C[X0,Y0]C[X^{0},Y^{0}], and the sum of (0+2) and (2+0) terms amounts to k2/2H^L0+0\langle k^{2}/2\rangle\hat{H}L_{0+0}. The decomposition of the total covariance C[X,Y]C[X,Y] in the presence of machine fluctuations is shown in Fig. 7. The parameters used in Fig. 7 are the same as those used in Fig. 3, but machine fluctuations have been introduced in the following manner. The central photon energy is normally distributed in 2 eV FWHM around 40.8 eV, the ionizing pulse energy follows a Gamma distribution Γ(α=β=2)\Gamma(\alpha{=}\beta{=}2), and the streaking amplitude follows a Rayleigh distribution with an expected k=0.06a.u.\langle k\rangle=0.06\,\mathrm{a.u.}. As shown in Fig. 7(a), the total covariance is mostly positive, with a positive ridge at roughly θpθqϕ\theta_{p}-\theta_{q}\sim\phi. The streaking covariance K[X,Y]K[X,Y], shown in panel (b), is nearly identical to Fig. 3(a), since the average machine condition remains the same. The composition of L[X,Y]L[X,Y] is shown in Fig. 7(c-f). The unstreaked covariance L0+0L_{0+0} is positive, which explains why the total covariance is overall more positive than the streaking covariance. At the same time, L0+0L_{0+0} is independent from the relative timing between the two features. The next order consists of the (1+1) term k2/2G^(ϕ)L0+0\langle k^{2}/2\rangle\hat{G}(\phi)L_{0+0} and the sum of (0+2) and (2+0) terms k2/2H^L0+0\langle k^{2}/2\rangle\hat{H}L_{0+0}, as shown in panels (d-e). The remaining higher-order terms in LL are negligible in this case, as shown in panel (f).

IV.1.2 Accounting for Machine Fluctuations in Delay Retrieval

As mentioned in Sec. III.1, the general strategy for obtaining the K-estimators (KXY\mathrm{K}_{XY}, KXX\mathrm{K}_{XX}, and KYY\mathrm{K}_{YY}) is to remove the contribution from machine fluctuations from the corresponding sample covariances CXY,CXX\mathrm{C}_{XY},\mathrm{C}_{XX}, and CYY\mathrm{C}_{YY}. Here we discuss two procedures for this removal: (1) using partial covariance with respect to a fluctuating global parameter(s), and (2) subtracting the unstreaked covariance.

In the linear regime of light-matter interactions, the electron yield depends on the pulse energy of the ionizing radiation, the sample density, and the detector gain. The combined effect of all these parameters can be described by a global scaling factor FF, which is uniform across momentum space and fluctuates from shot to shot. As long as FF is measured on a single-shot basis, taking the partial covariance with respect to FF removes the correlation resulting from the linear dependence of X,YX,Y on FF:

PCov[X,Y;F]C[X,Y]C[X,F]C[F,F]1C[F,Y],\mathrm{PCov}[X,Y;F]\equiv C[X,Y]-C[X,F]C[F,F]^{-1}C[F,Y]~{}, (16)

whose corresponding sample estimate is PCXY;FCXYCXFCFF1CFY\mathrm{PC}_{XY;F}\equiv\mathrm{C}_{XY}-\mathrm{C}_{XF}\mathrm{C}_{FF}^{-1}\mathrm{C}_{FY} [26]. In this case, we can decompose the two features X,YX,Y as the product of normalized distributions UU and VV with FF, X=FU,Y=FVX=FU,Y=FV. UU and VV depend on the streaking vector 𝒌\bm{k} but are statistically independent from FF. Combining Eqn. (16) with the law of total covariance, the partial covariance is PCov[X,Y;F]=F2C[U,V]=F2(K[U,V]+L[U,V])\mathrm{PCov}[X,Y;F]=\langle F^{2}\rangle C[U,V]=\langle F^{2}\rangle(K[U,V]+L[U,V]). The term F2K[U,V]\langle F^{2}\rangle K[U,V] can be recognized as the streaking covariance K[X,Y]=F2K[U,V]K[X,Y]=\langle F\rangle^{2}K[U,V] scaled by the factor F2/F21\langle F^{2}\rangle/\langle F\rangle^{2}\geq 1. The second term F2L[U,V]\langle F^{2}\rangle L[U,V] is smaller than the machine fluctuation L[X,Y]L[X,Y] by a positive definite value:

L[X,Y]F2L[U,V]=C[F,F]𝔼[𝔼[U|𝒌]𝔼[V|𝒌]].L[X,Y]-\langle F^{2}\rangle L[U,V]=C[F,F]~{}\mathbb{E}\left[\mathbb{E}\left[U|\bm{k}\right]\mathbb{E}\left[V|\bm{k}\right]\right]~{}. (17)

Therefore taking partial covariance has removed this difference that is proportional to C[F,F]C[F,F] and revealed the streaking covariance.

Another method for removing effect of machine fluctuations involves calculating the difference between measurements made with the dressing field and measurements made in the absence of the dressing field. If the sample covariance has been calculated, we use the difference CXYCX0Y0\mathrm{C}_{XY}-\mathrm{C}_{X^{0}Y^{0}} as KXYK_{XY}. In this case KXYK_{XY} becomes the sum of K[X,Y]K\left[X,Y\right] and all terms of the machine fluctuation covariance L[X,Y]L\left[X,Y\right] higher than L0+0=CX0Y0L_{0+0}=\mathrm{C}_{X^{0}Y^{0}}, as shown in Fig. 7. An improved estimate of K[X,Y]K\left[X,Y\right] can be calculated using the partial covariance, KXY=PCXY;FPCX0Y0;F\mathrm{K}_{XY}=\mathrm{PC}_{XY;F}-\mathrm{PC}_{X^{0}Y^{0};F}. In Fig. 8, we demonstrate the performance of KXY\mathrm{K}_{XY} as an estimate of the underlying streaking covariance K[X,Y]K[X,Y] in this case. Panels (a) and (b) compare the sample estimates KXY,KXX\mathrm{K}_{XY},\mathrm{K}_{XX} to the respective underlying streaking covariance K[X,Y],K[X,X]K[X,Y],K[X,X]. As shown in panel (c), the accuracy of all three delay retrieval methods (solid curves) described in Sec. III is generally worse than without machine fluctuations (dashed curves), but the change in the error is less than 20 mrad within k/Δp<2\langle k\rangle/\Delta p<2. This change in systematic error arises from the residual machine fluctuation contribution in the K-estimators. We note that to apply this subtraction procedure, it is important to acquire comparable amount of unstreaked shots as the streaked shots, to ensure the statistical noise from the unstreaked covariance does not dominate the noise in the subtraction result.

Refer to caption
Figure 8: Impact of machine fluctuation on the accuracy of delay retrieval, in the same case as Fig. 7. (a) Sample estimate of streaking covariance, see main text. KXY\mathrm{K}_{XY} is again exemplified at ϕ=π/3\phi=\pi/3. (b) The underlying streaking covariance. Color scales in (a) and (b) are symmetric about zero and bounded by the maximum absolute value in each map. (c) Delay retrieval rms error as k\langle k\rangle is scanned, of the three methods: ND-gradient (green solid), RR-gradient (red solid), and the gradient-free method (yellow solid). Dashed curves in corresponding colors are of each method in the absence of machine fluctuations, taken from Fig. 5(c)(f) and Fig. 6(c).

IV.1.3 Gradient Validation in Rank Reduction

Refer to caption
Figure 9: Validation of RR gradient in the presence of machine fluctuations, in the same case as Fig. 7. (a) Sample covariance CXX\mathrm{C}_{XX}. (b) Gradient reconstructed from CXX\mathrm{C}_{XX} with the rank-2 RR procedure (red dashed) is qualitatively different from the ND gradient (black solid). (c) Leading eigenvalues of CXX\mathrm{C}_{XX} (crosses) and KXX\mathrm{K}_{XX} (dots). Colors represent the characters of the corresponding eigenvectors in (d). (d) Eigenvectors displayed in matching markers and colors with (c).

The residual machine fluctuation contribution in KXX\mathrm{K}_{XX} is a potential systematic issue for the RR method, because the optimization problem Eqn. (13) finds a rank-2 approximation of KXX\mathrm{K}_{XX}. This is motivated by the fact that the gradient inner-product in K[X,X]K[X,X] has a rank of 2, but the residual machine fluctuations mix with the streaking covariance. Therefore in the presence of significant machine fluctuations, the gradient reconstructed with RR requires a validation. If the RR gradient differs significantly from the ND gradient, it should be considered invalid, and delay retrieval should not proceed using such a gradient. We can quantify the similarity between RR gradient ξ\xi^{*} and ND gradient g=(X1,,XNQ)g=(\nabla X_{1},\cdots,\nabla X_{N_{Q}}) by the average cosine similarity SC(a,b)|abT|/(ab)S_{C}(a,b)\equiv|ab^{T}|/(\|a\|\|b\|) of the components 𝒮(SC(ξx,gx)+SC(ξy,gy))/2\mathcal{S}\equiv(S_{C}(\xi^{*x},g^{x})+S_{C}(\xi^{*y},g^{y}))/2. For instance, applying the RR procedure to the CXX\mathrm{C}_{XX} shown in Fig. 9(a) without any removal of the machine fluctuations, the resultant RR gradient is significantly different from the ND gradient, as shown in panel (b). The average cosine similarity is 𝒮=0.5\mathcal{S}=0.5 resulting from similarity in xx component but orthogonality in yy. Given that there are two components (xx and yy) in the gradient field, we suggest a suitable threshold at 𝒮>(1/2+1)/2=3/4\mathcal{S}>(1/2+1)/2=3/4 for the validity of RR gradient. A higher threshold is less desirable, because a valid RR gradient deviates from the first-order partial derivative and outperforms the ND method in delay retrieval accuracy, as pointed out by Eqn. (14).

When an acceptable similarity between the RR gradient and ND gradient cannot be obtained, it is possible to generalize to a rank-3 approximation, i.e. to minimize fRR(ζ)f_{\mathrm{RR}}(\zeta) with ζ3×NQ\zeta\in\mathbb{R}_{3\times N_{Q}}. We obtain a minimal point ζP\zeta^{\mathrm{P}}, subject to the constraint that the rows of ζP\zeta^{\mathrm{P}} are orthogonal. We refer to the row vectors in ζP\zeta^{\mathrm{P}} as principal components. For each pair of principal components, we stack them as the ξP2×NQ\xi^{\mathrm{P}}\in\mathbb{R}_{2\times N_{Q}}, maximize the flux jRR(OξP)j_{\mathrm{RR}}(O\xi^{\mathrm{P}}) as in Algorithm 1. One should validate the reconstructed gradient ξ\xi^{*} with the ND gradient. When no pair of principal components can result in a valid ξ\xi^{*}, the RR method is not recommended. If there are multiple pairs that results in a valid ξ\xi^{*}, we choose the valid n ξ\xi^{*} with the largest spectral norm.

When the measured dressing-free MD has inversion symmetry, X0(𝒓)=X0(𝒓)X^{0}(\bm{r})=X^{0}(-\bm{r}), each principal component is either even f(𝒓)=f(𝒓)f(-\bm{r})=f(\bm{r}) or odd f(𝒓)=f(𝒓)f(-\bm{r})=-f(\bm{r}). This parity serves as a useful guide for selecting the pair of principal components: both xx and yy components of the gradient field must be odd. As shown in Fig. 9(c-d), the largest principal component of CXX\mathrm{C}_{XX} (blue) has even parity, whereas the second (orange) and third (green) are odd. After removing the machine fluctuation contribution from CXX\mathrm{C}_{XX}, the top two components of the resultant KXX\mathrm{K}_{XX} both have odd parity, and strongly resemble the second and third components of CXX\mathrm{C}_{XX}. A rank-3 approximation of CXX\mathrm{C}_{XX} with the aforementioned validation procedure results in the same RR gradient field as the reconstruction with KXX\mathrm{K}_{XX}.

IV.2 Shot Noise

Shot noise is ubiquitous in electron spectroscopy, arising from the particle nature of electrons. The measured electron yield in a momentum region QQ follows a Poisson distribution, with expectation QI(𝒓)𝑑𝒓\int_{Q}I(\bm{r})d\bm{r}, where the MD, I(𝒓)I(\bm{r}) varies from shot to shot. To understand the effect of shot noise on our measurement scheme, we denote EfxE_{\mathrm{fx}} as the expected electron yield at each set of ROIs. We also refer to this quantity as the “flux”, since it represents an average number electron counts per shot. Here we investigate the impact of shot noise on the delay retrieval methods, assuming both sets of ROIs {Qq}\{Q_{q}\} and {Pp}\{P_{p}\} receive the same electron flux. We simulate NsN_{s} measured shots by Poisson-sampling the MD, and apply the delay retrieval methods to this simulated data set. We repeat the Poisson sampling 10 times for each EfxE_{\mathrm{fx}}, and the mean Δϕ¯=ϕfit¯ϕ\overline{\Delta\phi}{=}\overline{\phi_{\mathrm{fit}}}-\phi and standard deviation σ[Δϕ]=σ[ϕfit]\sigma[\Delta\phi]{=}\sigma[\phi_{\mathrm{fit}}] over the repetitions quantify the systematic error and the statistical error, respectively.

Refer to caption
Figure 10: Effects of shot noise on the delay retrieval, in the absence of machine fluctuations. Shot noise is introduced to the case shown in Fig. 5. For each level of electron flux, Ns=7.2×105N_{s}=7.2\times 10^{5} shots are simulated with shot noise and uniformly distributed κ\kappa, to test the delay retrieval methods at various ϕ\phi: ND-gradient (green dotted), RR-gradient (red solid), diagonal-agnostic RR-gradient (red dashed), and the gradient-free method (yellow dotted). Over 10 repetitions, the mean Δϕ¯\overline{\Delta\phi} and standard deviation σ[Δϕ]\sigma[\Delta\phi] quantify the systematic error and statistical error, respectively. (a) Systematic error at k/Δp=0.5k/\Delta p{=}0.5, of the methods The inset in (a) shows the K-estimate KXX\mathrm{K}_{XX} at Efx=18E_{\mathrm{fx}}=18. (b) Statistical error at k/Δp=0.5k/\Delta p{=}0.5. When κ\kappa is sampled from 𝒰[0,2π]\mathcal{U}[0,2\pi] instead of strictly uniform, the statistical error of the gradient-based methods asymptotes to a noise floor due to finite NsN_{s} (black dot-dashed). (c) Total error of the methods ((Δϕ)2¯)1/2(\overline{(\Delta\phi)^{2}})^{1/2} over k/Δpk/\Delta p, at a fixed Efx=3×102E_{\mathrm{fx}}=3\times 10^{2}. For each method, the triangle marks the minimum.

Since shot noise is typically independent across non-intersecting regions, it does not generally induce systematic error in KXY\mathrm{K}_{XY}. There is an exception in the vicinity of the diagonal entries in KXX\mathrm{K}_{XX} and KYY\mathrm{K}_{YY}, since the corresponding pairs of ROIs have correlated shot noise. These entries in KXX\mathrm{K}_{XX} and KYY\mathrm{K}_{YY} are positively biased because the shot noise is unaccounted for in our model. For example, when regions Q1Q_{1} and Q2Q_{2} both include a contribution from the same detector pixel, (KXX)12(\mathrm{K}_{XX})_{12} is affected by such positive bias, i.e. (KXX)12K[X1,X2]>0\langle(\mathrm{K}_{XX})_{12}\rangle-K[X_{1},X_{2}]>0. The critical distance between the pair of ROIs is the noise correlation length of the detector. When the distance between two regions is shorter than this noise correlation length, a bias can be introduced.

The ND method relies on X0¯,Y0¯\overline{X^{0}},\overline{Y^{0}}, and KXY\mathrm{K}_{XY}, as illustrated in Fig. 4, all of which are free from bias induced by shot noise. Thus the systematic error of the ND method is independent of shot noise but remains at the noiseless limit, as shown by the green trace in Fig. 10(a). On the other hand, the RR method is affected by the positive bias in part of the K-estimators that arises from shot noise, as visualized in the inset of Fig. 10(a). The impact of the bias can be mitigated by assigning zero weights to these positively biased entries of KXX\mathrm{K}_{XX} in the RR procedure. In simulation, we can set Wqq=1δqqW_{qq^{\prime}}=1-\delta_{qq^{\prime}}, since it is straightforward to ensure the independence of the Poisson-sampling process between non-intersecting regions. We refer to this variant of the RR method, which is also free from the systematic errors arising from shot noise, as the diagonal-agnostic rank-reduction (daRR) method. This variation improves the accuracy of the delay extraction in low-count scenarios. In high-count scenarios, the daRR method is less accurate than the RR method, due to the small induced bias in the reconstructed gradient. As shown in Fig. 10(a), the point at which the RR method becomes more accurate than the daRR methods is Efx/NQ10E_{\mathrm{fx}}/N_{Q}\approx 10 for the parameters considered. As EfxE_{\mathrm{fx}}\to\infty, the systematic error of the RR method diminishes as 1/Efx\propto 1/E_{\mathrm{fx}}, approaching the noiseless limit.

The bias in KXX\mathrm{K}_{XX} and KYY\mathrm{K}_{YY} also affects the gradient-free delay retrieval method. Here, shot noise affects the sample correlation matrix CorrXY\mathrm{Corr}_{XY} via the positive bias in (KXX)qq(\mathrm{K}_{XX})_{qq} and (KYY)pp(\mathrm{K}_{YY})_{pp}. In the case of k/Δp=0.5,pc/Δp=6.6k/\Delta p=0.5,p_{c}/\Delta p=6.6, the systematic error of the gradient-free method exhibits two plateaus at both the low-count and the high-count ends, with a dip in between, as shown in Fig. 10(a). At high counts we approach the noiseless limit, where the error Δϕ=ϕfitϕ\Delta\phi=\phi_{\mathrm{fit}}-\phi has the same sign as ϕ\phi, as shown in Fig. 6. In the low-count region, in contrast, the error Δϕ\Delta\phi exhibits opposite sign to ϕ\phi. Between the two plateaus, when the sign of Δϕ\Delta\phi flips, a minimum systematic error is found. However, the exact EfxE_{\mathrm{fx}} for which the minimal systematic error occurs strongly depends on k/Δpk/\Delta p and pc/Δpp_{c}/\Delta p. Therefore, the flux providing the minimal error can only be accurately identified when a thorough characterization of the system parameters is possible.

The accuracy of the delay retrieval, quantified by the systematic error Δϕ¯\overline{\Delta\phi}, is insensitive to the number of shots NsN_{s}. This is because the accuracy of X0¯,Y0¯\overline{X^{0}},\overline{Y^{0}}, and K-estimators, depends on the electron flux and not NsN_{s}, when only shot noise is considered. In contrast, precision of the delay retrieval, quantified by the statistical error σ[Δϕ]\sigma[\Delta\phi], depends on NsN_{s}. This is because the variance of X0¯,Y0¯\overline{X^{0}},\overline{Y^{0}}, and K-estimators scales as 1/Ns{\propto}1/N_{s} when the measurement is repeated over a large number of shots NsN_{s}. Shot noise contributes to the variance in a way that is inversely proportional to the total electron counts detected NsEfxN_{s}E_{\mathrm{fx}}. As shown by the colored traces in Fig. 10(b), with shot noise alone, the statistical error of the retrieved delay scales as 1/NsEfx\propto 1/\sqrt{N_{s}E_{\mathrm{fx}}}. In addition to shot noise, the variance of X0¯,Y0¯\overline{X^{0}},\overline{Y^{0}}, and K-estimators, also include the sampling noise of the streaking direction κ\kappa, i.e. the non-uniformity of the distribution of a finite number of measured κ\kappa, which is 1/Ns\propto 1/N_{s} for large NsN_{s}. Therefore the statistical error shrinks as 𝒞κ/Ns+𝒞E/(NsEfx)\sqrt{\mathcal{C}_{\kappa}/N_{s}+\mathcal{C}_{E}/(N_{s}E_{\mathrm{fx}})} when both NsN_{s} and EfxE_{\mathrm{fx}} are large, where 𝒞κ\mathcal{C}_{\kappa} and 𝒞E\mathcal{C}_{E} are constants independent of NsN_{s} and EfxE_{\mathrm{fx}}. In the case shown in Fig. 10(a& b), k/Δp=0.5k/\Delta p=0.5, and these constants are 𝒞E=2.0rad2\mathcal{C}_{E}=2.0\,\mathrm{rad}^{2} and 𝒞κ=0.045rad2\mathcal{C}_{\kappa}=0.045\,\mathrm{rad}^{2}. The sampling noise of κ\kappa does not introduce systematic error in the delay retrieval, as it preserves the accuracy of X0¯,Y0¯\overline{X^{0}},\overline{Y^{0}}, and the K-estimators.

The total error of the retrieved delay, defined as (Δϕ)2¯=(Δϕ¯)2+σ[Δϕ]2\sqrt{\overline{(\Delta\phi)^{2}}}=\sqrt{(\overline{\Delta\phi})^{2}+\sigma[\Delta\phi]^{2}} which combines the systematic and statistical errors, also depends on the streaking amplitude kk. Figure 10(c) shows this total error varies with k/Δpk/\Delta p at a fixed electron flux Efx=3×102E_{\mathrm{fx}}=3\times 10^{2}. At large values of k/Δpk/\Delta p, the total error grows with kk due to the increasing systematic errors as discussed in section III. However, as k/Δpk/\Delta p approaches zero, the total error increases due to the presence of noise. This is because the streaking covariance, which scales as k2k^{2} in the small amplitude limit (Eqn. 8), is overwhelmed by the noise. The dependence on k/Δpk/\Delta p shown in Fig. 10(c) represents the typical behavior of the total error in delay retrieval, although the exact optimal streaking amplitude also depends on other parameters such as EfxE_{\mathrm{fx}} and NsN_{s}.

IV.3 Delay Fluctuation

When employing angular streaking to measure the time-delay between two ionizing pulses, the instabilities in the delay impacts the measured covariance between the two photoelectron features produced by the respective pulses. The delay fluctuation results in variation of ϕ\phi. In our framework, it is straightforward to incorporate fluctuations in ϕ\phi, so long as the flucuations are independent from the rest of machine fluctuations. Since ϕ\phi is the relative angle between the streaking directions of the two features, each individual MD is independent from ϕ\phi, thus 𝔼[I|ϕ]=𝔼[I]\mathbb{E}\left[I|\phi\right]=\mathbb{E}\left[I\right] and C[𝔼[X|ϕ],𝔼[Y|ϕ]]=0C[\mathbb{E}\left[X|\phi\right],\mathbb{E}\left[Y|\phi\right]]=0. Applying the law of total covariance:

C[X,Y]\displaystyle C[X,Y] =𝔼[C[X,Y|ϕ]]+C[𝔼[X|ϕ],𝔼[Y|ϕ]]\displaystyle=\mathbb{E}\left[C[X,Y|\phi]\right]+C[\mathbb{E}\left[X|\phi\right],\mathbb{E}\left[Y|\phi\right]]
=𝔼[C[X,Y|ϕ]],\displaystyle=\mathbb{E}\left[C[X,Y|\phi]\right]~{}, (18)

we only retain the conditional covariance C[X,Y|ϕ]C[X,Y|\phi] averaged over ϕ\phi. Then the measured covariance is simply the ensemble average of the covariance for each ϕ\phi.

To further illustrate the effect of a small delay jitter in the covariance analysis, we look into the scenario where ϕ\phi follows a normal distribution with mean value ϕ0\phi_{0} and standard deviation δϕ\delta\phi. We found that this normal distribution of delay results in a damping factor on the GIP model 𝔼[MGIP(ϕ)]=eδϕ2/2MGIP(ϕ0)\mathbb{E}\left[M_{\mathrm{GIP}}(\phi)\right]=e^{-\delta\phi^{2}/2}M_{\mathrm{GIP}}(\phi_{0}). This damping factor eδϕ2/2e^{-\delta\phi^{2}/2} has an intuitively interpretation: Increasing the delay jitter reduces the correlation between the streaking directions of the two features XX and YY, which reduces the magnitude of the streaking covariance K[X,Y]K[X,Y]. On the other hand, K[X,X]K[X,X] and K[Y,Y]K[Y,Y] are not affected by the delay jitter δϕ\delta\phi, since they originate from a single ionizing pulse and thus have no dependence on ϕ\phi. Incorporating a global scaling factor as a free parameter of the gradient-based and/or gradient-free model, we can retrieve the delay. We can also estimate the delay jitter by comparing the magnitude of KXY\mathrm{K}_{XY} to KXX\mathrm{K}_{XX} and KYY\mathrm{K}_{YY} [13].

IV.4 Instrument Specific Issues

IV.4.1 Momentum Projection

Refer to caption
Figure 11: Effects of momentum projection. (a) Relative error of the GIP term from the streaking covariance with the projection (red solid) vs. the slicing (grey dotted) measurement scheme, the latter of which is same as the pmin=pMGp_{\min}=p_{\mathrm{MG}} trace in Fig. 3(c). (b) Delay retrieval systematic error for the three methods: ND-gradient (green), RR-gradient (brown), and gradient-free (yellow), comparing the projection (solid) vs. the slicing (dotted, same as the dashed in Fig. 8(c)). (c) Difference of the average streaked MD from the unstreaked MD of each photoelectron feature 𝔼[X]𝔼[X0]\mathbb{E}\left[X\right]-\mathbb{E}\left[X^{0}\right]. Values are normalized to maxX0\max X^{0}. ROI lower boundary for panels (a-b) is the maximal gradient line pMGp_{\mathrm{MG}} of the projected unstreaked MD (red solid in panel (c)).

The framework presented above for interpreting photoelectron momentum distributions in the impulsive streaking regime can be applied to both projected and sliced MDs. A projected MD can be measured with a VMI-type spectrometer [21], and a sliced MD can be measured with an array of Time-of-Flight spectrometers (ToFs) [27]. The two schemes perform nearly identically, in terms of the accuracy of the model and the performance of the delay retrieval methods. The GIP model is slightly more accurate for the projected MD, as shown in Fig. 11(a). As a result, the systematic error of delay retrieval methods is slightly lower, as shown in panel (b). The primary reason for the improved accuracy is that the higher order ((nX+nY)=4(n_{X}+n_{Y}){=}4) terms, beyond GIP term, in streaking covariance is reduced by the project along pzp_{z}. The reduction of the higher-order terms is a result of the reduced curvature of the dressing-free MD (e.g. 2X0\nabla^{2}\langle X^{0}\rangle), when the MD is projected. At the same time, difference in accuracy and precision between the two detection modalities are minimal.

IV.4.2 Angular Sparsity

Refer to caption
Figure 12: Effects of sparse angular sampling on the RR delay retrieval methods. (a) The dressing-free MD collected by an array of 16 spectrometers, from the same ionization process as Fig. 1. Angular bins are grouped into the even (orange rectangles) and odd (cyan rectangles) ones. (b) Streaking covariance K[X,Y]K[X,Y] at ϕ=π/3\phi=\pi/3, k/Δp=0.5k/\Delta p=0.5, under the MRCO16 measurement scheme, see main text. Radial boundaries are the same as for Fig. 3(a). (c) Same as (b) but in the MRCO8v8 scheme where XX and YY are detected by the even and odd spectrometers, respectively. (d-f) Comparison of the three schemes (N180 in red, MRCO16 in green, MRCO8v8 in blue) on the RR (solid) and daRR (dashed) delay retrieval methods. (d) Systematic error in the noiseless limit: of the RR method (grey solid), almost the same for the three schemes, and of the daRR method (dashed, colors corresponding to the schemes). (e) In the presence of shot noise controlled by the electron flux EfxE_{\mathrm{fx}}, systematic error at k/Δp=0.5k/\Delta p=0.5 (black dotted line in (d)). (f) The solid and dashed traces are the same as (e) except for statistical errors of the RR and daRR methods, respectively. The dot-dashed traces are asymptotic statistical errors after introducing κ\kappa sampling noise in addition to shot noise, with colors corresponding to the schemes.

Measurement of the sliced (i.e. pz=0p_{z}=0) MD is often done using an array of time-of-flight spectrometers (ToFs), as in the aforementioned device, MRCO, which consists of 1616 ToFs, each collecting 0.2{\sim}0.2% of the full 4π4\pi solid angle [27]. We consider the impact of angular sparsity in the sampling of the MD on the delay retrieval methods. The 0.2{\sim}0.2% angular sampling is notably more sparse than the measurement schemes employed above.To simulate this angular sampling scheme, we integrate an angular window representing each ToF, these windows are shown for the dressing-free MD in Fig. 12(a). As in the previous tests, we use the optimal lower bound for the ROI of the momentum, pmin=pMGp_{\min}=p_{\mathrm{MG}}, to optimize the accuracy of the GIP model. We can then calculate the K-estimators: KXY,KXX,KYY\mathrm{K}_{XY},\mathrm{K}_{XX},\mathrm{K}_{YY} with the limited angular resolution. We compare two measurement schemes, one where both XX and YY are detected by all 16 ToFs NQ=NP=16N_{Q}{=}N_{P}{=}16, denoted MRCO16, and a second scheme where the ToFs are split into two interleaving subsets NQ=NP=8N_{Q}{=}N_{P}{=}8 for XX and YY respectively, denoted MRCO8v8. As a reference we use a densely sampled measurement scheme, denoted as “N180” in Fig. 3-10. Examples of the computed KXY\mathrm{K}_{XY} are shown in Fig. 12(b) and (c), for the MRCO16 and MRCO8v8 schemes, respectively. In both cases, the positive ridge around θpθqϕ\theta_{p}{-}\theta_{q}{\sim}\phi persists regardless of sampling scheme.

We characterize the impact of the angular sparsity in the measurement on the delay retrieval methods described in Sec. III. We focus specifically on the RR and daRR methods, initially in the noiseless limit. We find that the systematic error of the RR method is independent of the sampling scheme (<0.1<0.1 mrad difference), as shown by the solid curves in Fig. 12(d). In contrast, the systematic error of the daRR method increases with increasing angular sparsity, as shown by the dashed curves in Fig. 12(d). The number of elements in the KXX\mathrm{K}_{XX} matrix scales quadratically with NQN_{Q}, where as the number of diagonal elements scales linearly. As a result, when the daRR method ignores the diagonal elements in KXX\mathrm{K}_{XX}, a smaller value of NQN_{Q} ignores relatively more information in KXX\mathrm{K}_{XX}. For the N180 scheme, the daRR method performs nearly the same as the RR method in the noiseless limit, within 0.20.2 mrad difference. Another observation from panel (d) is that the systematic error of daRR is roughly proportional to NQ1k/ΔpN_{Q}^{-1}k/\Delta p, within k/Δp<0.6k/\Delta p<0.6, where the error if the daRR method is approximately linear with kk.

When we introduce shot noise to the simulations, the three measurement schemes (N180, MRCO16, MRCO8v8) share several other similarities in terms of the delay retrieval performance. Decreasing the electron flux in each set of ROIs, EfxE_{\mathrm{fx}}, from the noiseless limit, we see a region where the systematic error of RR method scale as 1/Efx\propto 1/E_{\mathrm{fx}} and then exceeds the daRR method, as shown by the solid curves in Fig. 12(e). This behavior is exhibited under all three schemes, although the values of EfxE_{\mathrm{fx}} below which the daRR method outperforms the RR method are different. As EfxE_{\mathrm{fx}} decreases further, the systematic error of the daRR method remains insensitive to EfxE_{\mathrm{fx}} until the electron counts become excessively scarce Efx5E_{\mathrm{fx}}\lesssim 5. On the other hand, the statistical error of both the RR and the daRR methods follows the 1/NsEfx1/\sqrt{N_{s}E_{\mathrm{fx}}} scaling, as shown by the colored traces in panel (f). Remarkably, in this asymptotic scaling region the statistical error curves for the different measurement schemes overlap each other, rather than having an offset between them. Meanwhile, in the systematic error curves of the RR method, the 1/Efx\propto 1/E_{\mathrm{fx}} scaling regions also overlap with each other when compared at the same EfxE_{\mathrm{fx}}. Note that EfxE_{\mathrm{fx}} represents the total electron counts detected in a set of angular bins, thus achieving the same EfxE_{\mathrm{fx}} requires a higher total number of electrons generated per shot when the measurement scheme is more sparse. Another common feature for measurement schemes is the value of Efx/NQE_{\mathrm{fx}}/N_{Q} where the RR and daRR methods are equally accurate. For the k/Δp=0.5k/\Delta p=0.5 case shown in panel (e), this cross-over is at Efx/NQ10E_{\mathrm{fx}}/N_{Q}\approx 10. This agreement is universal so long as k/Δp<0.6k/\Delta p<0.6 and the RR method has not reached the noiseless limit.

Next, we consider sampling noise of κ\kappa in addition to shot noise. Due to this sampling noise of κ\kappa, when the number of shots NsN_{s} is fixed and EfxE_{\mathrm{fx}} is high, the statistical error of retrieved delay approaches the lower bound σ[Δϕ]𝒞κ/Ns\sigma[\Delta\phi]\approx\sqrt{\mathcal{C}_{\kappa}/N_{s}} for both the RR and daRR methods, similar as in Fig. 10(b). As shown by the dot-dashed lines in Fig. 12(f), this lower bound at 𝒞κ/Ns\sqrt{\mathcal{C}_{\kappa}/N_{s}} is roughly the same under different levels of angular sparsity. Meanwhile, the overlap of the colored curves in Fig. 12(f) indicates that 𝒞E\mathcal{C}_{E} is also almost independent of the angular sparsity. Thus in the presence of both shot noise and sampling noise of κ\kappa, the asymptote of the statistical error σ[Δϕ]𝒞E/(NsEfx)+𝒞κ/Ns\sigma[\Delta\phi]\approx\sqrt{\mathcal{C}_{E}/(N_{s}E_{\mathrm{fx}})+\mathcal{C}_{\kappa}/N_{s}} is not significantly affected by the angular sparsity.

Recall that systematic errors are insensitive to NsN_{s}, the overlap in part of the systematic error curves indicates that in order to retrieve the delay accurately, it is crucial to control the electron counts per shot. Acquiring more shots would not improve the accuracy of the delay retrieval but only improve the precision.

V Encoding Arbitrary Signals

In this section, we extend our discussion to the situation where the emission process approaches or exceeds the period of the dressing laser field, and thus the emission is no longer impulsive. We will remain in the limit that the emission process is shorter than the duration of the dressing field envelope. For this method to work, we still require an impulsive emission process to provide a reference feature. Moreover, these two emission processes should be correlated for the instantaneous process to serve as a reference. Similar to the conventions used above, we will refer to the impulsive feature as feature XX, and the longer timescale emission process will produce feature YY.

Owing to the periodic nature of the interaction of the laser field, the time-dependence of the emission is encoded in the Fourier coefficients of the measured electron momentum distribution (MD):

𝒴m(𝒓;k)02π𝔼[Y(𝒓;𝒌)|𝒌]eimκdκ2π\mathcal{Y}_{m}(\bm{r};k)\equiv\int_{0}^{2\pi}\mathbb{E}\left[Y(\bm{r};\bm{k})|\bm{k}\right]e^{-im\kappa}\frac{d\kappa}{2\pi} (19)

Here Y(𝒓;𝒌)Y(\bm{r};\bm{k}) is the yield of electrons belonging to feature YY at detector coordinate 𝒓\bm{r} for a streaking vector 𝒌\bm{k}, which is similar to the conventions used above. In writing this expression, we have taken into account the machine fluctuations by using 𝔼[Y|𝒌]\mathbb{E}\left[Y|\bm{k}\right]. Applying the law of total covariance, we can write K[X,Y]K[X,Y] as

K[X,Y]\displaystyle K[X,Y] m=1+(μm(k)𝒴m(k)+c.c.)+C[μ0(k),𝒴0(k)]\displaystyle\simeq\sum_{m=1}^{+\infty}\left(\langle\mu_{m}(k)\mathcal{Y}_{m}(k)\rangle+\mathrm{c.c.}\right)+C\left[\mu_{0}(k),\mathcal{Y}_{0}({k})\right] (20)
μm(k)\displaystyle\mu_{m}(k) 1m!(k+2)mF10(m+1,k224)X0,m0\displaystyle\equiv\frac{1}{m!}\left(\frac{-k\partial_{+}}{\sqrt{2}}\right)^{m}{}_{0}F_{1}\left(m+1,\frac{k^{2}\nabla^{2}}{4}\right)\langle X^{0}\rangle,~{}m\geq 0 (21)

where X0\langle X^{0}\rangle is the dressing-free MD averaged over machine fluctuations, +=(x+iy)/2\partial_{+}=(\frac{\partial}{\partial x}+i\frac{\partial}{\partial y})/\sqrt{2} is an operator that acts on a MD, and F10(z,x)n=0((z1)!xn)/((z+n1)!n!){}_{0}F_{1}(z,x)\equiv\sum_{n=0}^{\infty}((z-1)!x^{n})/((z+n-1)!n!) is the confluence hypergeometric limit function [28]. It is clear from Eqn. (21) that all Fourier components, 𝒴m\mathcal{Y}_{m}, are encoded in K[X,Y]K[X,Y], but not always with the same sensitivity. This indicates a signal which is periodic with κ\kappa is encoded in the streaking covariance with the impulsive reference feature XX in the same way as the encoding of 𝔼[Y(𝒓;𝒌)|𝒌]\mathbb{E}\left[Y(\bm{r};\bm{k})|\bm{k}\right] described in Eqn. (20). This encoding in all Fourier components is a general property of a periodic signal.

In applying the law of total covariance to partition K[X,Y]K[X,Y], the conditional variable was the streaking amplitude kk. In writing Eqn. (20) we have partitioned the terms into two terms, one that results from the shot-to-shot variations of κ\kappa (the sum over m0m{\neq}0) and the other that results from variations in kk. The streaking covariance K[X,Y]K[X,Y] depends linearly on the |m|1|m|{\geq}1 components, with sensitivity μm(k)\mu_{m}(k) given by Eqn. (21). In contrast, the m=0m{=}0 component of YY only affects the second term C[μ0(k),𝒴0(k)]C\left[\mu_{0}(k),\mathcal{Y}_{0}({k})\right], which indicates that K[X,Y]K[X,Y] is sensitive to the covariance of 𝒴0\mathcal{Y}_{0} and μ0\mu_{0} arising from the variation of the streaking amplitude kk. Note that in the small streaking regime (k<ΔpXk<\Delta p_{X} of the reference feature), the leading order of the sensitivity is μmkmm!2m/2+mX0\mu_{m}\approx\frac{k^{m}}{m!2^{m/2}}\partial_{+}^{m}\langle X^{0}\rangle, where the minimal order of differentiation on X0\langle X^{0}\rangle is mm. Thus only the components 𝒴±1\mathcal{Y}_{\pm 1} are coupled to the gradient of the reference feature X0\nabla\langle X^{0}\rangle. This indicates that the delay retrieval methods discussed in the preceding sections are only sensitive to the |m|=1|m|=1 components. In the cases where feature YY is in the impulsive regime, as discussed in the preceding sections, Eqn. (20) reduces to Eqn. (9) after rearranging the terms, as shown in Supplemental Sec. 4.1.

To illustrate Eqn. (20), we simulate the MD for Auger-Meitner (AM) decay of a molecule in a superposition of two core-ionized states, prepared by KK-shell ionization with a 0.20.2 fs FWHM pulse [17]. The two intermediate cationic states have an energetic separation of ΔIP=7ωL\Delta I_{\mathrm{P}}=7\hbar\omega_{L}, and the AM decay process couples these two states to a common dication state. In our simulation, the dressing-free AM electrons have a central momentum of 4.344.34 a.u. and 4.384.38 a.u. The emitted electrons are dressed by a |𝑨|=0.1|\bm{A}|=0.1 a.u., cTL=1.85μmcT_{L}=1.85\,\mathrm{\mu m} dressing field (equivalent to Fig. 10) and the MD considered is the pz=0p_{z}=0 slice. Other details of the simulation of the AM distribution are provided in Supplemental Sec. VI. The decay of this superposition of core-ionized states results in a characteristic modulation of YY over κ\kappa at a period of 2π/72\pi/7, which results from interference between the two pathways via different intermediate states [17]. This modulation appears across a wide range of AM momenta. Here we have assumed that Y0Y^{0} is angularly isotropic. This means that the MD of AM electrons at different angular positions θp\theta_{p} are equivalent up to an offset in κ\kappa, i.e. Y(θp,r;κ,k)=Y(0,r;κθp,k)Y(\theta_{p},r;\kappa,k)=Y(0,r;\kappa-\theta_{p},k), so the yield is also modulated over angular position θp\theta_{p}.

Refer to caption
Figure 13: Encoding a κ\kappa-dependent signal in the covariance with an impulsive reference feature. (a) Auger-Meitner (AM) electron yield in the 22^{\circ} angular bin at θp=0\theta_{p}=0, over streaking direction κ\kappa of the reference feature, simulated at |𝑨|=0.1|\bm{A}|{=}0.1 a.u. (b) AM yield YpY_{p} across 180 angular bins. (c) Streaking covariance of the AM yield YpY_{p} and the photoelectron feature employed in Fig. 10. Right inset shows the θq=0\theta_{q}=0 column of this map, i.e. K[X|θq=0,Y]K[X|_{\theta_{q}{=}0},Y]. (d) Amplitude of sensitivity |μm||\mu_{m}| at k=0.1k{=}0.1 a.u., in the same angular bins as in Fig. 10. (e) The Fourier coefficient amplitudes of the streaking covariance at θq=0\theta_{q}=0 that has been shown in (c) (blue solid), compared to amplitude of the product |μm𝒴m||\mu_{m}\mathcal{Y}_{m}| (green dashed) between sensitivity μm\mu_{m} and the AM yield Fourier coefficients 𝒴m\mathcal{Y}_{m}. For reference, |𝒴m||\mathcal{Y}_{m}| is shown in black solid.

In Fig.13 we analyze the MD by partitioning the feature YY into NP=180N_{P}=180 angular bins, between radial boundaries pmin=4.40p_{\mathrm{min}}{=}4.40 a.u. and pmax=4.50p_{\mathrm{max}}{=}4.50 a.u.. Figure 13(a) shows the integral of Y(𝒓;𝒌)Y(\bm{r};\bm{k}), in the angular bin at θp=0\theta_{p}=0, as a function of the streaking directions of the reference feature κ\kappa. As shown in Fig. 13(b), the yield in these angular bins YpY_{p} is clearly modulated with both κ\kappa and θp\theta_{p}. We compute the streaking covariance K[X,Y]K[X,Y] between YY and the impulsive reference feature XX. This reference feature XX is simulated in the same dressing field (k=0.1a.u.0.5ΔpXk=0.1\,\mathrm{a.u.}\approx 0.5\Delta p_{X}), and the ROIs {Qq}\{Q_{q}\} are remain between pMGp_{\mathrm{MG}} and pmax=2.15p_{\max}{=}2.15 a.u. as in Fig. 10. The 2π/72\pi/7-period modulation is barely visible in K[X,Y]K[X,Y], as shown in Fig. 13(c). This attenuation of the 2π/72\pi/7-period modulation results from the low sensitivity of K[X,Y]K[X,Y] to 𝒴7\mathcal{Y}_{7}. We can study the sensitivity of K[X,Y]K[X,Y] to the components 𝒴m\mathcal{Y}_{m} by computing μm\mu_{m} according to Eqn. (21) and integrating over the angular bins QqQ_{q}. We show the magnitude of the resultant sensitivities in Fig. 13(d), which demonstrate a decrease with increasing mm. Such dependence on mm is determined by k,ΔpXk,\Delta p_{X}, and the ROIs on feature XX, as described below. The products of multiplying these calculated μm\mu_{m} to the corresponding Fourier components 𝒴m\mathcal{Y}_{m} yield good agreements with the Fourier coefficients of K[X,Y]K[X,Y] with respect to θp\theta_{p}, which is visualized for the θq=0\theta_{q}=0 slice in panel (e), corroborating Eqn. (21).

We notice that in the small streaking regime, according to Equations (20)(21), the sensitivity is given by Qqμmd2𝒓kmm!2m/2Qq+mX0d2𝒓\int_{Q_{q}}\mu_{m}d^{2}\bm{r}\approx\frac{k^{m}}{m!2^{m/2}}\int_{Q_{q}}\partial_{+}^{m}\langle X^{0}\rangle d^{2}\bm{r}, which is proportional to the regional integral of a mm-th order derivative of the reference feature. Therefore, the scaling of the sensitivity over mm depends on the streaking amplitude relative to the characteristic momentum scale δp\delta p far below which the variations in X0\langle X^{0}\rangle become negligible, e.g. the width ΔpX\Delta p_{X} in the example shown in Fig. 13. In the limit of small streaking, the magnitude of sensitivity |μm||\mu_{m}| scales as o((k/δp)m/m!)o((k/\delta p)^{m}/m!), which rapidly decreases with increasing mm. Even in cases where kk is comparable to δp\delta p, as it is the case in Fig. 13, the streaking covariance tends to understate the high-order Fourier components due to the 1/m!1/m! factor in the sensitivity. This also indicates that to improve the sensitivity to high-order components, it is advisable to intensify the dressing field and to reduce the smoothness of the reference feature to increase k/δpk/\delta p.

The bilinearity of covariance also leads to the independence of sensitivity μm\mu_{m} on feature YY. Rather, μm\mu_{m} is determined by the reference feature XX and the streaking amplitude kk, as shown in Eqn. (21). Therefore, when detecting YY as a function of κ\kappa by analyzing the covariance with an impulsive reference feature XX, the sensitivity to certain Fourier components 𝒴m\mathcal{Y}_{m} can be designed in simulation, by choosing the streaking amplitude kk, the shape of X0X^{0} and the position of ROIs {Qq}\{Q_{q}\}, without detailed prior knowledge in YY.

VI Concluding Remarks

In this work, we present a comprehensive analysis of modelling covariance in the impulsive regime of angular streaking experiments. The displacement of the electron momentum distribution (MD) provides a tight connection between the dressing-free MD and the dressed MD. Such connection establishes universal structures in the composition of streaking covariance that are common across different MDs, regardless of their exact shape. Building on this robust framework, we have developed methods for retrieving temporal information from angular streaking measurements. Specifically for the situations where both features XX and YY in the MD are impulsive, we have proposed and evaluated three methods for retrieving relative time delays between XX and YY: numerical differentiation (ND), rank-reduction (RR), and a gradient-free approach. Each method offers certain advantages while also depending on experimental conditions and data quality. Several key insights are obtained in our study: (1) Both the streaking effect and the machine fluctuations contribute to the covariance that can be estimated from measurements. Proper removal of the machine fluctuation contribution is crucial for isolating the streaking covariance. (2) Shot noise of the electrons impacts the delay retrieval precision, but not significantly on the accuracy of the ND method or the diagonal-agnostic variant of the RR method. (3) Our investigation of angular sparsity in electron detection has shown that in the high-count scenarios, achieving accurate delay retrieval is possible even with limited angular sampling. (4) Ae of an arbitrary periodic YYsignabe encoded to thecan he streaking covarwith a concurrent impulsive streaking signal XX, where the sensitivity to different Fourier components of YY is only determined by the dressing-free MD X0X^{0} and the dressing fieldiance .

By providing a detailed understanding of the covariance structure in angular streaking experiments, our work enables more accurate and robust temporal measurements in a wide range of experimental scenarios. Future work could focus on (1) Developing delay retrieval methods that incorporate higher orders into the model of streaking covariance, in addition to the GIP model; (2) Higher order corrections for the removal of machine fluctuation contribution, e.g. the nX+nY=2n_{X}+n_{Y}=2 terms shown in Fig. 7(c-d), which can be measured by applying finite difference schemes to CX0Y0\mathrm{C}_{X^{0}Y^{0}} or PCX0Y0;F\mathrm{PC}_{X^{0}Y^{0};F}, provided with adequate momentum resolution in all four dimensions of (𝒓q,𝒓p)(\bm{r}_{q},\bm{r}_{p}); (3) Developing adaptive algorithms that can optimize sensitivity to specific Fourier components of the non-impulsive signals. In conclusion, this work provides a solid framework for leveraging covariance analysis in angular streaking experiments, paving the way for advances in ultrafast science and attosecond metrology.

Acknowledgements

This work is primarily supported by the US DOE, Office of Science, Office of Basic Energy Sciences (BES), Chemical Sciences, Geosciences, and Biosciences Division (CSGB). Z.G. and A.M. acknowledge support from the Accelerator and Detector Research Program of the Department of Energy, Basic Energy Sciences division. Z.G. also acknowledge support from Robert Siemann Fellowship of Stanford University.

References

  • The Nobel Committee for Physics [2023] The Nobel Committee for Physics, Scientific background to the nobel prize in physics 2023 (2023).
  • Itatani et al. [2002] J. Itatani, F. Quéré, G. L. Yudin, M. Y. Ivanov, F. Krausz, and P. B. Corkum, Attosecond Streak Camera, Physical Review Letters 88, 173903 (2002).
  • Bradley et al. [1971] D. Bradley, B. Liddy, and W. Sleat, Direct linear measurement of ultrashort light pulses with a picosecond streak camera, Optics Communications 2, 391 (1971).
  • Thumm et al. [2015] U. Thumm, Q. Liao, E. M. Bothschafter, F. Süßmann, M. F. Kling, and R. Kienberger, Attosecond physics: Attosecond streaking spectroscopy of atoms and solids, in Photonics (John Wiley & Sons, Ltd, 2015) Chap. 13, pp. 387–441, https://onlinelibrary.wiley.com/doi/pdf/10.1002/9781119009719.ch13 .
  • Schultze et al. [2010] M. Schultze, M. Fieß, N. Karpowicz, J. Gagnon, M. Korbman, M. Hofstetter, S. Neppl, A. L. Cavalieri, Y. Komninos, T. Mercouris, C. A. Nicolaides, R. Pazourek, S. Nagele, J. Feist, J. Burgdörfer, A. M. Azzeer, R. Ernstorfer, R. Kienberger, U. Kleineberg, E. Goulielmakis, F. Krausz, and V. S. Yakovlev, Delay in Photoemission, Science 328, 1658 (2010).
  • Drescher et al. [2002] M. Drescher, M. Hentschel, R. Kienberger, M. Uiberacker, V. Yakovlev, A. Scrinzi, T. Westerwalbesloh, U. Kleineberg, U. Heinzmann, and F. Krausz, Time-resolved atomic inner-shell spectroscopy, Nature 419, 803 (2002).
  • Eckle et al. [2008] P. Eckle, A. N. Pfeiffer, C. Cirelli, A. Staudte, R. Dörner, H. G. Muller, M. Büttiker, and U. Keller, Attosecond Ionization and Tunneling Delay Time Measurements in Helium, Science 322, 1525 (2008).
  • Hartmann et al. [2018] N. Hartmann, G. Hartmann, R. Heider, M. S. Wagner, M. Ilchen, J. Buck, A. O. Lindahl, C. Benko, J. Grünert, J. Krzywinski, J. Liu, A. A. Lutman, A. Marinelli, T. Maxwell, A. A. Miahnahri, S. P. Moeller, M. Planas, J. Robinson, A. K. Kazansky, N. M. Kabachnik, J. Viefhaus, T. Feurer, R. Kienberger, R. N. Coffee, and W. Helml, Attosecond time–energy structure of X-ray free-electron laser pulses, Nature Photonics 12, 215 (2018).
  • Duris et al. [2020] J. Duris, S. Li, T. Driver, E. G. Champenois, J. P. MacArthur, A. A. Lutman, Z. Zhang, P. Rosenberger, J. W. Aldrich, R. Coffee, G. Coslovich, F.-J. Decker, J. M. Glownia, G. Hartmann, W. Helml, A. Kamalov, J. Knurr, J. Krzywinski, M.-F. Lin, J. P. Marangos, M. Nantel, A. Natan, J. T. O’Neal, N. Shivaram, P. Walter, A. L. Wang, J. J. Welch, T. J. A. Wolf, J. Z. Xu, M. F. Kling, P. H. Bucksbaum, A. Zholents, Z. Huang, J. P. Cryan, and A. Marinelli, Tunable isolated attosecond X-ray pulses with gigawatt peak power from a free-electron laser, Nature Photonics 14, 30 (2020), number: 1 Publisher: Nature Publishing Group.
  • Franz et al. [2024] P. Franz, S. Li, T. Driver, R. R. Robles, D. Cesar, E. Isele, Z. Guo, J. Wang, J. P. Duris, K. Larsen, J. M. Glownia, X. Cheng, M. C. Hoffmann, X. Li, M.-F. Lin, A. Kamalov, R. Obaid, A. Summers, N. Sudar, E. Thierstein, Z. Zhang, M. F. Kling, Z. Huang, J. P. Cryan, and A. Marinelli, Terawatt-scale attosecond X-ray pulses from a cascaded superradiant free-electron laser, Nature Photonics 18, 698 (2024), publisher: Nature Publishing Group.
  • Haynes et al. [2021] D. C. Haynes, M. Wurzer, A. Schletter, A. Al-Haddad, C. Blaga, C. Bostedt, J. Bozek, H. Bromberger, M. Bucher, A. Camper, S. Carron, R. Coffee, J. T. Costello, L. F. DiMauro, Y. Ding, K. Ferguson, I. Grguraš, W. Helml, M. C. Hoffmann, M. Ilchen, S. Jalas, N. M. Kabachnik, A. K. Kazansky, R. Kienberger, A. R. Maier, T. Maxwell, T. Mazza, M. Meyer, H. Park, J. Robinson, C. Roedig, H. Schlarb, R. Singla, F. Tellkamp, P. A. Walker, K. Zhang, G. Doumy, C. Behrens, and A. L. Cavalieri, Clocking Auger electrons, Nature Physics , 1 (2021), publisher: Nature Publishing Group.
  • Li et al. [2022] S. Li, T. Driver, P. Rosenberger, E. G. Champenois, J. Duris, A. Al-Haddad, V. Averbukh, J. C. T. Barnard, N. Berrah, C. Bostedt, P. H. Bucksbaum, R. N. Coffee, L. F. DiMauro, L. Fang, D. Garratt, A. Gatton, Z. Guo, G. Hartmann, D. Haxton, W. Helml, Z. Huang, A. C. LaForge, A. Kamalov, J. Knurr, M.-F. Lin, A. A. Lutman, J. P. MacArthur, J. P. Marangos, M. Nantel, A. Natan, R. Obaid, J. T. O’Neal, N. H. Shivaram, A. Schori, P. Walter, A. L. Wang, T. J. A. Wolf, Z. Zhang, M. F. Kling, A. Marinelli, and J. P. Cryan, Attosecond coherent electron motion in Auger-Meitner decay, Science 375, 285 (2022), publisher: American Association for the Advancement of Science.
  • Guo et al. [2024] Z. Guo, T. Driver, S. Beauvarlet, D. Cesar, J. Duris, P. L. Franz, O. Alexander, D. Bohler, C. Bostedt, V. Averbukh, X. Cheng, L. F. DiMauro, G. Doumy, R. Forbes, O. Gessner, J. M. Glownia, E. Isele, A. Kamalov, K. A. Larsen, S. Li, X. Li, M.-F. Lin, G. A. McCracken, R. Obaid, J. T. O’Neal, R. R. Robles, D. Rolles, M. Ruberti, A. Rudenko, D. S. Slaughter, N. S. Sudar, E. Thierstein, D. Tuthill, K. Ueda, E. Wang, A. L. Wang, J. Wang, T. Weber, T. J. A. Wolf, L. Young, Z. Zhang, P. H. Bucksbaum, J. P. Marangos, M. F. Kling, Z. Huang, P. Walter, L. Inhester, N. Berrah, J. P. Cryan, and A. Marinelli, Experimental demonstration of attosecond pump–probe spectroscopy with an X-ray free-electron laser, Nature Photonics 18, 691 (2024), publisher: Nature Publishing Group.
  • Maroju et al. [2020] P. K. Maroju, C. Grazioli, M. Di Fraia, M. Moioli, D. Ertel, H. Ahmadi, O. Plekan, P. Finetti, E. Allaria, L. Giannessi, G. De Ninno, C. Spezzani, G. Penco, S. Spampinati, A. Demidovich, M. B. Danailov, R. Borghes, G. Kourousias, C. E. Sanches Dos Reis, F. Billé, A. A. Lutman, R. J. Squibb, R. Feifel, P. Carpeggiani, M. Reduzzi, T. Mazza, M. Meyer, S. Bengtsson, N. Ibrakovic, E. R. Simpson, J. Mauritsson, T. Csizmadia, M. Dumergue, S. Kühn, H. Nandiga Gopalakrishna, D. You, K. Ueda, M. Labeye, J. E. Bækhøj, K. J. Schafer, E. V. Gryzlova, A. N. Grum-Grzhimailo, K. C. Prince, C. Callegari, and G. Sansone, Attosecond pulse shaping using a seeded free-electron laser, Nature 578, 386 (2020).
  • Maroju et al. [2023] P. K. Maroju, M. Di Fraia, O. Plekan, M. Bonanomi, B. Merzuk, D. Busto, I. Makos, M. Schmoll, R. Shah, P. R. Ribič, L. Giannessi, G. De Ninno, C. Spezzani, G. Penco, A. Demidovich, M. Danailov, M. Coreno, M. Zangrando, A. Simoncig, M. Manfredda, R. J. Squibb, R. Feifel, S. Bengtsson, E. R. Simpson, T. Csizmadia, M. Dumergue, S. Kühn, K. Ueda, J. Li, K. J. Schafer, F. Frassetto, L. Poletto, K. C. Prince, J. Mauritsson, C. Callegari, and G. Sansone, Attosecond coherent control of electronic wave packets in two-colour photoionization using a novel timing tool for seeded free-electron laser, Nature Photonics 17, 200 (2023), number: 2 Publisher: Nature Publishing Group.
  • Driver et al. [2024] T. Driver, M. Mountney, J. Wang, L. Ortmann, A. Al-Haddad, N. Berrah, C. Bostedt, E. G. Champenois, L. F. DiMauro, J. Duris, D. Garratt, J. M. Glownia, Z. Guo, D. Haxton, E. Isele, I. Ivanov, J. Ji, A. Kamalov, S. Li, M.-F. Lin, J. P. Marangos, R. Obaid, J. T. O’Neal, P. Rosenberger, N. H. Shivaram, A. L. Wang, P. Walter, T. J. A. Wolf, H. J. Wörner, Z. Zhang, P. H. Bucksbaum, M. F. Kling, A. S. Landsman, R. R. Lucchese, A. Emmanouilidou, A. Marinelli, and J. P. Cryan, Attosecond delays in X-ray molecular ionization, Nature 632, 762 (2024), publisher: Nature Publishing Group.
  • Wang et al. [2024] J. Wang, T. Driver, P. L. Franz, P. Kolorenč, E. Thierstein, R. R. Robles, E. Isele, Z. Guo, D. Cesar, O. Alexander, S. Beauvarlet, K. Borne, X. Cheng, L. F. DiMauro, J. Duris, J. M. Glownia, M. Graßl, P. Hockett, M. Hoffman, A. Kamalov, K. A. Larsen, S. Li, X. Li, M.-F. Lin, R. Obaid, P. Rosenberger, P. Walter, T. J. Wolf, J. P. Marangos, M. F. Kling, P. H. Bucksbaum, A. Marinelli, and J. P. Cryan, Probing electronic coherence between core-level vacancies at different atomic sites, Physical Review X (accepted)  (2024).
  • Haynes et al. [2020] D. C. Haynes, M. Wurzer, A. Schletter, A. Al-Haddad, C. Blaga, C. Bostedt, J. Bozek, M. Bucher, A. Camper, S. Carron, R. Coffee, J. T. Costello, L. F. DiMauro, Y. Ding, K. Ferguson, I. Grguraš, W. Helml, M. C. Hoffmann, M. Ilchen, S. Jalas, N. M. Kabachnik, A. K. Kazansky, R. Kienberger, A. R. Maier, T. Maxwell, T. Mazza, M. Meyer, H. Park, J. S. Robinson, C. Roedig, H. Schlarb, R. Singla, F. Tellkamp, K. Zhang, G. Doumy, C. Behrens, and A. L. Cavalieri, Clocking Auger Electrons, arXiv:2003.10398 [physics]  (2020), arXiv: 2003.10398.
  • Kitzler et al. [2002] M. Kitzler, N. Milosevic, A. Scrinzi, F. Krausz, and T. Brabec, Quantum Theory of Attosecond XUV Pulse Measurement by Laser Dressed Photoionization, Physical Review Letters 88, 173904 (2002).
  • Wolkow [1935] D. M. Wolkow, Über eine Klasse von Lösungen der Diracschen Gleichung, Zeitschrift für Physik 94, 250 (1935).
  • Li et al. [2018] S. Li, E. G. Champenois, R. Coffee, Z. Guo, K. Hegazy, A. Kamalov, A. Natan, J. O’Neal, T. Osipov, M. Owens, D. Ray, D. Rich, P. Walter, A. Marinelli, and J. P. Cryan, A co-axial velocity map imaging spectrometer for electrons, AIP Advances 8, 115308 (2018), publisher: American Institute of Physics.
  • Glownia et al. [2010] J. M. Glownia, J. Cryan, J. Andreasson, A. Belkacem, N. Berrah, C. I. Blaga, C. Bostedt, J. Bozek, L. F. DiMauro, L. Fang, J. Frisch, O. Gessner, M. Gühr, J. Hajdu, M. P. Hertlein, M. Hoener, G. Huang, O. Kornilov, J. P. Marangos, A. M. March, B. K. McFarland, H. Merdji, V. S. Petrovic, C. Raman, D. Ray, D. A. Reis, M. Trigo, J. L. White, W. White, R. Wilcox, L. Young, R. N. Coffee, and P. H. Bucksbaum, Time-resolved pump-probe experiments at the LCLS, Optics Express 18, 17620 (2010).
  • Kheifets et al. [2022] A. S. Kheifets, R. Wielian, V. V. Serov, I. A. Ivanov, A. L. Wang, A. Marinelli, and J. P. Cryan, Ionization phase retrieval by angular streaking from random shots of XUV radiation, Physical Review A 106, 033106 (2022), publisher: American Physical Society.
  • DeGroot and Schervish [2010] M. H. DeGroot and M. J. Schervish, Probability and Statistics, 4th ed. (Pearson, Upper Saddle River, NJ, 2010) p. 261.
  • Eckart and Young [1936] C. Eckart and G. Young, The approximation of one matrix by another of lower rank, Psychometrika 1, 211 (1936).
  • Johnson and Wichern [2007] R. A. Johnson and D. W. Wichern, in Applied Multivariate Statistical Analysis (Pearson, Upper Saddle River, NJ, 2007) 6th ed., pp. 407–410.
  • Walter et al. [2021] P. Walter, A. Kamalov, A. Gatton, T. Driver, D. Bhogadi, J.-C. Castagna, X. Cheng, H. Shi, R. Obaid, J. Cryan, W. Helml, M. Ilchen, and R. N. Coffee, Multi-resolution electron spectrometer array for future free-electron laser experiments, Journal of Synchrotron Radiation 28, 1364 (2021).
  • Petkovšek et al. [1996] M. Petkovšek, H. S. Wilf, and D. Zeilberger, A=B (A K Peters, Wellesley, MA, 1996) p. 38.