Using photoelectron spectroscopy to measure resonant inelastic X-ray scattering: A computational investigation

Daniel J. Higley [email protected] SLAC National Accelerator Laboratory, 2575 Sand Hill Road, Menlo Park, California 94025, USA Hirohito Ogasawara SLAC National Accelerator Laboratory, 2575 Sand Hill Road, Menlo Park, California 94025, USA Sioan Zohar SLAC National Accelerator Laboratory, 2575 Sand Hill Road, Menlo Park, California 94025, USA Georgi L. Dakovski SLAC National Accelerator Laboratory, 2575 Sand Hill Road, Menlo Park, California 94025, USA

Abstract

Resonant inelastic X-ray scattering (RIXS) has become an important scientific tool. Nonetheless, conventional high-resolution RIXS measurements ( $<100$ meV), especially in the soft X-ray range, require large and low-throughput grating spectrometers, which limits measurement accuracy and simplicity. Here, we computationally investigate the performance of a different method for measuring RIXS, Photoelectron Spectrometry for Analysis of X-rays (PAX). This method transforms the X-ray measurement problem of RIXS to an electron measurement problem, enabling use of compact, high-throughput electron spectrometers. In PAX, X-rays to be measured are incident on a converter material and the energy distribution of the resultant photoelectrons, the PAX spectrum, is measured with an electron spectrometer. The incident X-ray spectrum is then estimated through a deconvolution algorithm that leverages concepts from machine learning. We investigate a few example PAX cases. Using the 3d levels of Ag as a converter material, and with $10^{5}$ detected electrons, we accurately estimate features with widths of hundreds of meV in a model RIXS spectrum. Using a sharp Fermi edge to encode RIXS spectra, we accurately distinguish 100 meV FWHM peaks separated by 45 meV with $10^{7}$ electrons detected that were photoemitted from within 0.4 eV of the Fermi level.

I Introduction

Resonant Inelastic X-Ray Scattering (RIXS) has emerged as a powerful technique to study elementary excitations Ament et al. (2011). RIXS probes excitations via core-valence transitions with element-specific energies, which allows one to tune the elemental locations being probed through the incident X-ray photon energy. Because of the large momentum of X-rays, RIXS is able to probe the dispersion of elementary excitations in solids, unlike techniques based on lower-energy optical photons. Further, in contrast to other X-ray-based spectroscopies, the energy resolution of RIXS is not limited by short core hole lifetimes. Because of these strengths, much effort has been devoted to developing RIXS capabilities. In the soft X-ray range, X-ray grating and synchrotron light source development has enabled RIXS measurements with energy resolution better than 100 meV Brookes et al. (2018). Having these high energy resolutions is particularly important for studies of solids where characteristic energies of many important excitations are around 100 meV or less Ament et al. (2011); Chaix et al. (2017). Leveraging these capabilities, RIXS studies have given new insights in wide-ranging topics including solid state physics Ament et al. (2011); Le Tacon et al. (2011); Schlappa et al. (2012); Chaix et al. (2017), nanoparticles Liu et al. (2017), interfaces Rajasekaran et al. (2012), batteries House et al. (2020); Firouzi et al. (2018), liquids Wernet et al. (2015) and gases Hennies et al. (2010).

Despite these strengths and much instrumentation development Brookes et al. (2018), grating spectrometers only detect one X-ray photon for every $\approx 10^{6}$ X-ray photons incident on a sample Ghiringhelli and Braicovich (2013); Dakovski et al. (2017). Thus, RIXS is a very photon-hungry technique. In addition, the X-ray flux incident on a sample is constrained by radiation damage and X-ray source output. This limits the achievable count rate, and thus the accuracy, of RIXS measurements. Work has been done towards improving this. Transition edge sensors Uhlig et al. (2015) and off-axis zone plates Marschall et al. (2017) can make more accurate RIXS measurements than traditional instrumentation in certain cases, but the energy resolutions demonstrated so far are significantly coarser than 0.5 eV. More information could be gleaned from RIXS if one could make faster and more accurate measurements that maintained hundreds of meV or better resolution. For example, high resolution time-resolved RIXS measurements probe dynamics of elementary excitations and can discriminate well between different states that occur in sample evolution Wernet et al. (2015), but are very challenging due to their requirement of recording many accurate spectra.

Coinciding with the rise and development of RIXS, the capabilities of electron spectrometers have also greatly increased Damascelli et al. (2003). Photoelectron spectrometers can now have energy resolutions better than 20 meV for 500 eV kinetic energy electrons Seidel et al. (2017). The collection efficiency, and thus achievable signal-to-noise ratio with a given number of particles emitted from a sample, can be much higher for electron spectrometers than X-ray spectrometers with comparable resolutions. Further, the footprint of electron spectrometers with 10s of meV resolution when measuring several hundred eV electrons (few m²) is far smaller than current X-ray spectrometers with comparable resolutions for several hundred eV photons (few hundred m²) Brookes et al. (2018). These features are largely due to the ease with which electrons can be manipulated, owing to their charge, in comparison with neutral X-rays. Thus if one can transform the X-ray measurement problem of RIXS to an electron measurement problem, then there could be large gains in count rates, as well as instrumentation compactness and ease of implementation.

Figure 1: Concept of PAX. (A) Experimental schematic. The measured PAX spectrum (E, F) is approximately the convolution of the X-ray spectrum incident on the converter material (B) and the photoemission spectrum of the converter material (C, D). The X-ray spectrum (B) is a model of a RIXS spectrum recorded at the Co L₃ edge (778 eV incident X-ray photon energy). The photoemission spectra model photoemission from the Ag 3d levels (C) and a sharp Fermi edge (D). The PAX spectrum shown in (E) is calculated with the photoemission spectrum of (C), while (F) is calculated using (D).

Recently, Dakovski et al. Dakovski et al. (2017) proposed doing exactly that to measure Resonant Inelastic X-ray Scattering (RIXS) with hundreds of meV or better resolution through Photoelectron Spectrometry for Analysis of X-rays (PAX) Krause (1965); Ebel (1975). Fig. 1 gives an overview of this technique, which makes use of sharp photoemission features that occur in the photoemission spectra of materials such as Ag, Au, Pt or Al when measured with monochromatic incident X-ray radiation. Fig. 1A shows an experimental schematic. X-rays to be measured (Fig. 1B) are incident on a converter system, where absorption of the X-rays generates photoelectrons. The converter material is assumed to give some photoemission spectrum when measured with monochromatic X-ray radiation, $xps(BE)$ , as a function of binding energy, $BE$ (Fig. 1C shows an example Ag 3d photoemission spectrum, while Fig. 1D shows an example sharp Fermi edge). The emitted photoelectrons are then detected with a photoelectron spectrometer. We call the resultant electron spectrum the PAX spectrum (Fig. 1E or F). The expected shape of the PAX spectrum is given by a convolution with $h(E)=xps(-E)$ acting as an impulse response function. Convolving the spectrum of X-rays incident on the converter material (Fig. 1B) with $h(E)$ (Fig. 1C or D) approximately gives the expected value of the PAX spectrum (Fig. 1E or F),

E\{m(kE)\}=\int_{0}^{\infty}s(\hbar\omega)h(kE-\hbar\omega)d\hbar\omega,

(1)

where we only integrate over physically realistic positive photon energies. Here, $kE$ is electron kinetic energy, $s(\hbar\omega)$ is the X-ray spectrum incident on the converter material as a function of photon energy, $\hbar\omega$ , and $m(kE)$ is a measured PAX spectrum.
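As a rough numerical illustration of Eq. 1, the sketch below convolves a made-up X-ray spectrum with a made-up converter photoemission spectrum on a uniform grid. Everything in it is an assumption chosen for illustration: the grid spacing, the peak positions, widths and amplitudes, and the helper name `gaussian`. It is not the model spectrum of Fig. 1, and the doublet only loosely imitates the Ag 3d lines.

```python
import numpy as np

def gaussian(x, center, fwhm):
    """Unit-area Gaussian peak with the given full width at half maximum."""
    sigma = fwhm / (2.0 * np.sqrt(2.0 * np.log(2.0)))
    return np.exp(-0.5 * ((x - center) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

step = 0.01                                      # eV, assumed grid spacing
photon_energy = 770.0 + step * np.arange(1000)   # eV, 770.00 ... 779.99
binding_energy = 360.0 + step * np.arange(2000)  # eV, 360.00 ... 379.99

# Hypothetical X-ray spectrum s: an elastic line plus one broader loss feature.
s = gaussian(photon_energy, 778.0, 0.1) + 0.5 * gaussian(photon_energy, 776.0, 0.5)

# Hypothetical converter spectrum xps(BE): a spin-orbit doublet (illustrative values).
xps = gaussian(binding_energy, 368.0, 0.233) + (2.0 / 3.0) * gaussian(binding_energy, 374.0, 0.233)

# Impulse response h(E) = xps(-E): reverse the order so that the axis -BE is ascending.
h = xps[::-1]
neg_binding_energy = -binding_energy[::-1]

# Discrete approximation of the integral in Eq. 1 (the factor `step` plays the role of d(hw)).
expected_pax = np.convolve(s, h, mode="full") * step

# Kinetic energy axis of the full convolution: kE = hw + (-BE).
kinetic_energy = photon_energy[0] + neg_binding_energy[0] + step * np.arange(expected_pax.size)

print("strongest PAX feature at kE =", kinetic_energy[np.argmax(expected_pax)], "eV")
```

In this toy case the strongest feature of the expected PAX spectrum sits at the elastic photon energy minus the binding energy of the stronger doublet component, which is the shift that the convolution picture implies.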

Given a PAX spectrum, the ground truth X-ray spectrum, $s(\hbar\omega)$ , can be estimated directly through deconvolution or in a parameterized form such as a sum of peaks. (The ground truth X-ray spectrum is the X-ray spectrum that would be measured without noise and with perfect resolution.) The decomposition of RIXS spectra into a sum of peaks is a natural method already widely used for traditionally recorded RIXS spectra as the parameters of these peaks can be directly linked to physical characteristics of the matter under study Ament et al. (2011). For more complex RIXS spectra, or cases where less is known about the form of a RIXS spectrum before measurement, the more general case of deconvolution may be more appropriate or would be a first step in further decomposition of a RIXS spectrum into elementary features.

While Dakovski et al. Dakovski et al. (2017) demonstrated the possibility of recording PAX spectra for RIXS and estimating the corresponding RIXS spectra as a sum of peaks, a general algorithm for faithfully reconstructing X-ray spectra from PAX measurements, and a quantitative assessment of the potential of PAX for measurement of X-ray spectra are needed. Here, we fill these gaps by proposing an algorithm for analyzing PAX data using methods from statistical data analysis and machine learning Fister et al. (2007); Bertero et al. (2009); James et al. (2013), then characterizing the performance of PAX in the estimation of model RIXS spectra using the Ag 3d levels or a sharp Fermi edge as a model converter.

The rest of this report is organized as follows. In section II, we discuss the considerations for choosing a converter material for PAX, and explain why the Ag 3d lines and a sharp Fermi edge are compelling cases. In section III we describe and discuss the deconvolution algorithm we used to estimate RIXS spectra from simulated PAX spectra. In section IV we show the simulated performance of PAX in estimating model RIXS features. We find that, using the Ag 3d levels as a photoemission converter, PAX can accurately estimate the width of few hundred meV features when 10⁵ electrons are detected in the measured PAX spectrum. Using a sharp Fermi edge photoemission converter shows promise for estimating finer features. Finally, in section V we conclude and give an outlook for future investigations.

II Choice of Converter Material

Subshell	Description
Ag 3d	Two 233 meV FWHM peaks Panaccione et al. (2005)
Au 4f	Two 335 meV FWHM peaks Takata et al. (2005)
Al 2p	Two 60 meV FWHM peaks Borg et al. (2004)
Pt Valence	Sharp Fermi edge

Table 1: Description of photoemission from some electronic subshells of solids.

The converter material plays a key role in PAX measurements. Converter materials with high conversion efficiency and narrow photoemission lines are desirable. Higher conversion efficiencies give higher numbers of detected electrons and thus, potentially, higher signal-to-noise ratios. Narrow photoemission lines enable retrieval of narrow X-ray spectral features with a reasonable number of detected electrons. For thick converter materials with near normal incidence of X-rays and emission of photoelectrons, the conversion efficiency of X-rays to photoelectrons can be approximated as Henke (1972)

\epsilon=\tau_{q}\rho\lambda_{e},

(2)

where $\tau_{q}$ is the effective ionization cross-section for the creation of photoelectrons from subshell $q$ , $\rho$ is the number of atoms per unit volume within the sample which can emit photoelectrons from the subshell used for PAX, and $\lambda_{e}$ is the electron mean free path at the kinetic energy of the relevant photoelectrons.

In Fig. 2 we show the conversion efficiency estimated with Eq. 2 for some promising cases. For these calculations subshell photoemission cross sections were taken from Yeh and Lindau (1985), electron mean free paths for Ag and Au were taken from Tanuma et al. (2002), electron mean free paths for Al and Pt were taken from Shinotsuka et al. (2015), and atoms per unit volume were calculated from values in Rumble (2019). From Fig. 2, we see that, in the soft X-ray range, conversion efficiencies of nearly ten percent with photoemission linewidths of a few hundred meV are possible using the Au 4f or Ag 3d lines. Narrower photoemission features are available at the expense of a reduced conversion efficiency (such as Al 2p and Pt Fermi level photoemission shown in Fig. 2).
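To make the units in Eq. 2 concrete, the following sketch evaluates the product of cross-section, number density and mean free path. The function names are hypothetical, and the cross-section (1 Mb) and mean free path (1 nm) are round placeholder numbers rather than values from the tables cited above. Only the mass density and molar mass of Ag are standard handbook values.

```python
AVOGADRO = 6.02214076e23  # particles per mole

def number_density(mass_density_g_cm3, molar_mass_g_mol):
    """Atoms per cm^3 for an elemental solid."""
    return mass_density_g_cm3 / molar_mass_g_mol * AVOGADRO

def conversion_efficiency(cross_section_mb, atoms_per_cm3, mean_free_path_nm):
    """Eq. 2: efficiency = cross-section x number density x electron mean free path."""
    cross_section_cm2 = cross_section_mb * 1e-18  # 1 Mb = 1e-18 cm^2
    mean_free_path_cm = mean_free_path_nm * 1e-7  # 1 nm = 1e-7 cm
    return cross_section_cm2 * atoms_per_cm3 * mean_free_path_cm

rho_ag = number_density(10.49, 107.87)  # Ag: 10.49 g/cm^3, 107.87 g/mol

# Placeholder inputs (1 Mb, 1 nm): NOT taken from the cited cross-section or mean-free-path tables.
efficiency = conversion_efficiency(1.0, rho_ag, 1.0)
print(f"rho = {rho_ag:.3e} atoms/cm^3, efficiency = {efficiency:.2e}")
```

Because the three factors multiply, the efficiency scales linearly with each of them, so tabulated values can be substituted directly for the placeholders.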

III An Algorithm for Deconvolving PAX Spectra

In principle, a convolution equation similar to Eq. 1 describes the measured signal in many X-ray spectroscopies. The signal of interest, such as a RIXS spectrum or X-ray absorption spectrum, is convolved with an instrument response function and an intrinsic broadening function to give the measured spectrum. Thus, more accurate X-ray spectra can often be retrieved through deconvolution of measured results Ebel and Gurker (1975); Fister et al. (2007); Laverock et al. (2011). It is not typical, however, to analyze such spectra using deconvolution. This is because these convolutions only broaden the measured spectra, and measured spectra are still interpretable as a simple blurring of the true spectrum. For PAX, however, the X-ray spectrum is typically convolved with a more complicated function than a single peak. The converter material photoelectron spectrum could consist of, for example, two narrow peaks and a non-uniform background, as is common for core levels. It may not be easy to infer the original X-ray spectrum from the measured PAX spectrum in these cases. Thus, while deconvolution is an optional step in traditional X-ray spectroscopies, it is important for PAX measurements.

III.1 Model of PAX Spectra

Eq. 1 gives the expected value of a PAX measurement in the case that the PAX spectrum is measured at every electron kinetic energy. In reality, the measured PAX spectrum, $m[kE]$ , is discrete, with each measured point integrating over a range of electron kinetic energies. Thus, the expected value of the measured PAX spectrum is approximately given by a discrete convolution,

E\{m[kE]\}=\sum_{\hbar\omega=0}^{\infty}s[\hbar\omega]h[kE-\hbar\omega],

(3)

where $s[\hbar\omega]$ and $h[-BE]$ are the discretized versions of the X-ray spectrum incident on the converter material and the photoemission spectrum of the converter material.

Eq. 3 extends over an infinite range, but, fortunately, experimental circumstances can be chosen so that the summation is non-negligible only over a finite and experimentally tractable range. We assume that we want to estimate a RIXS spectrum from $\hbar\omega_{min}$ to $\hbar\omega_{max}$ and that we want to use photoemission features extending from at least $BE_{min}$ to $BE_{max}$ in the measurement. These ranges give a PAX spectrum extending from $kE_{min}=\hbar\omega_{min}-BE_{max}$ through $kE_{max}=\hbar\omega_{max}-BE_{min}$ . In order to accurately model this PAX spectrum we must keep all X-ray photon energies in Eq. 3 that contribute non-negligibly to the PAX spectrum over this range. X-rays with energies higher than some cutoff $\hbar\omega_{+}$ give negligible contributions (a typical cutoff may be a few hundred meV above the incident X-ray energy). The lower limit of photon energies that contribute to the PAX spectrum is set through the lowest binding energy that contributes significantly to the PAX spectrum, $BE_{-}$ . For example, if one is using valence photoemission features, this limit on binding energies is set by the typical restriction of significant photoemission intensity to positive binding energies. This sets the lower limit $\hbar\omega_{-}=kE_{min}+BE_{-}$ on the photon energies of X-rays that will contribute to the PAX spectrum.
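The range bookkeeping described above amounts to a few subtractions and additions. The sketch below collects them in one place. The function name, the dictionary keys and all numerical inputs are hypothetical and serve only to show how the limits relate to each other.

```python
def pax_energy_ranges(hw_min, hw_max, be_min, be_max, be_lowest, hw_cutoff):
    """Return the kinetic-energy window and the photon-energy limits of the sum in Eq. 3.

    All energies are in eV. be_lowest is the lowest binding energy with significant
    photoemission intensity, and hw_cutoff is the photon energy above which the
    X-ray spectrum is taken to be negligible.
    """
    ke_min = hw_min - be_max        # lowest kinetic energy of interest
    ke_max = hw_max - be_min        # highest kinetic energy of interest
    hw_lower = ke_min + be_lowest   # lowest photon energy that can reach ke_min
    return {"kE_min": ke_min, "kE_max": ke_max, "hw_lower": hw_lower, "hw_upper": hw_cutoff}

# Illustrative inputs only: spectrum of interest from 770 to 778.5 eV, photoemission
# features between 366 and 376 eV binding energy, emission down to the Fermi level.
ranges = pax_energy_ranges(770.0, 778.5, 366.0, 376.0, be_lowest=0.0, hw_cutoff=778.5)
print(ranges)
```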

With these range limitations, Eq. 3 is simplified to

E\{m[kE]\}=\sum_{\hbar\omega=\hbar\omega_{-}}^{\hbar\omega_{+}}s[\hbar\omega]h[kE-\hbar\omega],

(4)

which is practical to analyze for PAX. It is convenient to write this as

E\{\mathbf{m}\}=H\mathbf{s},

(5)

where $\mathbf{m}$ is a column vector whose entries are the measured PAX spectrum, $\mathbf{s}$ is another column vector whose entries are the desired X-ray spectrum, and $H$ is a matrix such that the entries of $H\mathbf{s}$ are the same as the discrete convolution $h\ast s$ .
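One possible way to build such a matrix is sketched below, under the assumption that the PAX spectrum is kept over the full extent of the discrete convolution (so that $H$ has more rows than columns). The function name `convolution_matrix` and the small test arrays are illustrative choices, not part of the original analysis.

```python
import numpy as np

def convolution_matrix(h, n_s):
    """Matrix H such that H @ s equals the full discrete convolution of h with s.

    Column j holds a copy of h shifted down by j rows, so H[k, j] = h[k - j].
    """
    n_h = len(h)
    H = np.zeros((n_h + n_s - 1, n_s))
    for j in range(n_s):
        H[j:j + n_h, j] = h
    return H

h = np.array([0.2, 0.5, 0.3])        # toy impulse response
s = np.array([1.0, 4.0, 2.0, 0.5])   # toy X-ray spectrum
H = convolution_matrix(h, len(s))

# The matrix product reproduces the discrete convolution h * s.
print(np.allclose(H @ s, np.convolve(h, s, mode="full")))
```

Writing the convolution as a matrix is mostly a notational convenience; for long spectra it is usually cheaper to evaluate the convolution directly than to store $H$.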

We assume we are in a regime where shot noise is the dominant noise. In this case, the measured PAX spectrum can be approximated by a Poisson process. The probability of measuring $b$ counts for a kinetic energy bin with expected value $a$ is

p(b)=\frac{\exp(-a)a^{b}}{b!}.

(6)

The probability of measuring a PAX spectrum $\mathbf{m}$ is then

p(\mathbf{m})=\prod_{i}\frac{\exp\left(-(H\mathbf{s})_{i}\right)\left(H\mathbf{s}\right)_{i}^{\mathbf{m}_{i}}}{\mathbf{m}_{i}!},

(7)

where $\mathbf{x}_{i}$ denotes the ith element of $\mathbf{x}$ .
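Under this shot-noise model, a simulated PAX measurement can be drawn by sampling each kinetic energy bin from a Poisson distribution whose mean is the expected value in that bin. The sketch below does this for an arbitrary expected shape. The function name `simulate_pax`, the Gaussian test shape and the choice to scale the shape to a target expected total are assumptions made for illustration.

```python
import numpy as np

def simulate_pax(expected_shape, total_counts, rng):
    """Draw one Poisson-distributed PAX spectrum.

    expected_shape is proportional to the expected spectrum (the entries of H s);
    it is rescaled so that the expected total number of detected electrons is total_counts.
    """
    expected = np.asarray(expected_shape, dtype=float)
    expected = expected / expected.sum() * total_counts
    return rng.poisson(expected)

rng = np.random.default_rng(0)
shape = np.exp(-0.5 * ((np.arange(200) - 100) / 10.0) ** 2)  # arbitrary test shape
m = simulate_pax(shape, 1e5, rng)
print("detected electrons:", m.sum())
```

The total number of counts in any one simulated spectrum fluctuates around the requested value, as it would in an experiment with a fixed acquisition time.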

We note that a matrix equation like Eq. 5 holds as a description of the expected value of a PAX spectrum even when the photoemission spectrum of the converter material is dependent on the incident photon energy. Thus, the methods we describe below can still be used to estimate a RIXS spectrum in such cases, albeit with likely less computational efficiency.

III.2 Regularized Maximum Likelihood Estimation for Estimating RIXS with PAX

We now describe how we estimate a ground truth RIXS spectrum given a PAX data set. We assume that we have a set of PAX spectra recorded under statistically identical conditions as well as a high accuracy measurement of the photoemission spectrum of the converter material recorded with monochromatic incident X-ray radiation. Neglecting noise of the photoemission spectrum is an acceptable approximation because the photoemission spectrum can be measured with direct photoemission which has orders of magnitude higher count rate than a PAX measurement. We estimate the ground truth RIXS spectrum from this data using regularized maximum likelihood estimation. The maximum likelihood estimate of the ground truth RIXS spectrum is the spectrum which maximizes the probability of measuring the actually measured PAX spectrum. Regularization prevents the estimate from having finer structure than is warranted for the quality of the data.

For the probability given in Eq. 7, the negative log-likelihood of $\mathbf{s}$ is

L(\mathbf{s})=\sum_{i}\left[(H\mathbf{s})_{i}-\mathbf{m}_{i}\log((H\mathbf{s})_{i})+\log(\mathbf{m}_{i}!)\right].

(8)

The gradient of this with respect to $\mathbf{s}$ is

\nabla L(\mathbf{s})=H^{T}\mathbf{1}-H^{T}\frac{\mathbf{m}}{H\mathbf{s}},

(9)

where $\mathbf{1}$ is a vector where all the entries are one and with dimension such that the equation it appears in is valid, and $x^{T}$ denotes the transpose of $x$ . Having this gradient, we can iteratively minimize the negative log likelihood (maximizing the likelihood) with the scaled gradient iteration Bertero et al. (2009)

\hat{\mathbf{s}}^{(n+1)}=\hat{\mathbf{s}}^{(n)}-\hat{\mathbf{s}}^{(n)}\nabla L(\hat{\mathbf{s}}^{(n)}),

(10)

where $\hat{\mathbf{s}}^{(n)}$ is the estimate of $\mathbf{s}$ after $n$ iterations. This gives the iteration

\hat{\mathbf{s}}^{(n+1)}=\hat{\mathbf{s}}^{(n)}\left[\mathbf{1}-H^{T}\mathbf{1}+H^{T}\frac{\mathbf{m}}{H\hat{\mathbf{s}}^{(n)}}\right].

(11)

This is equivalent to

\hat{s}^{(n+1)}=\hat{s}^{(n)}\left[1[\hbar\omega]-h^{*}\ast 1[\hbar\omega]+h^{*}\ast\frac{m}{h\ast\hat{s}^{(n)}}\right],

(12)

where $h^{*}$ is the photoemission impulse response function with the order of the entries reversed and all the entries of $1[\hbar\omega]$ are one. This iteration requires a starting point, $\hat{s}^{(0)}$ . For this, we used a smoothed version of the measured PAX spectrum.
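The relation between the matrix form (Eqs. 9 and 10) and the convolution form (Eq. 12) can be checked numerically. The sketch below is one such check, assuming that the PAX spectrum spans the full discrete convolution, that products and quotients of vectors are taken element by element, and that the action of $h^{*}$ is evaluated as a cross-correlation. All function names and the random test data are illustrative.

```python
import math
import numpy as np

def convolution_matrix(h, n_s):
    """H[k, j] = h[k - j], so that H @ s is the full convolution of h with s."""
    H = np.zeros((len(h) + n_s - 1, n_s))
    for j in range(n_s):
        H[j:j + len(h), j] = h
    return H

def neg_log_likelihood(s, H, m):
    """Eq. 8 for Poisson-distributed counts m with expected value H s."""
    mu = H @ s
    log_factorial = np.array([math.lgamma(x + 1.0) for x in m])
    return float(np.sum(mu - m * np.log(mu) + log_factorial))

def gradient(s, H, m):
    """Eq. 9, with the quotient m / (H s) taken element-wise."""
    return H.T @ np.ones(H.shape[0]) - H.T @ (m / (H @ s))

def step_matrix(s, H, m):
    """One scaled gradient step in matrix form (Eq. 10)."""
    return s - s * gradient(s, H, m)

def step_convolution(s, h, m):
    """The same step written with convolutions (Eq. 12)."""
    ratio = m / np.convolve(h, s, mode="full")
    # Convolution with the reversed response equals cross-correlation with h.
    back_projected = np.correlate(ratio, h, mode="valid")
    normalization = np.correlate(np.ones(len(m)), h, mode="valid")
    return s * (1.0 - normalization + back_projected)

rng = np.random.default_rng(1)
h = rng.uniform(0.1, 1.0, size=5)
h /= h.sum()
s = rng.uniform(50.0, 150.0, size=8)
H = convolution_matrix(h, len(s))
m = rng.poisson(H @ s).astype(float)

print(np.allclose(step_matrix(s, H, m), step_convolution(s, h, m)))
```

When the impulse response sums to one and the full convolution is kept, the normalization term equals one for every photon energy and the update reduces to a purely multiplicative correction.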

If $h^{*}\ast 1[\hbar\omega]=1[\hbar\omega]$ , then Eq. 12 simplifies to the well-known Lucy-Richardson algorithm Richardson (1972); Lucy (1974); Shepp and Vardi (1982). The Lucy-Richardson algorithm and its variants have been widely used in imaging Bertero et al. (2009); Dey et al. (2006); Starck et al. (2002) and spectroscopy Fister et al. (2007). For PAX, however, we often will not be able to reduce Eq. 12 to the Lucy-Richardson algorithm as photoemission spectra used for PAX can be non-negligible over a wide range.

It is a well-known problem that algorithms like Eq. 12 amplify high frequency noise when they are used without regularization White (1994); Bertero et al. (2009). This is essentially a result of the reduction in strength of high frequency components relative to low frequency components after convolution with an extended function. Accurately inferring the pre-convolution strength of these high frequency components requires a more accurate measure of their strength in the post-convolution data than their lower frequency counterparts.

Various regularization schemes have been proposed to avoid the amplification of high frequency noise encountered in such algorithms. This is typically achieved by enforcing some degree of smoothness of the deconvolved result. Regularization by stopping iterations after certain criteria have been met Reeves (1995), damping the effect of iterations that do not improve the reconstruction White (1994), and total variation regularization Dey et al. (2006) have been proposed. A method of regularization well suited to our case was proposed by Fister et al. Fister et al. (2007) for application to Lucy-Richardson deconvolution of X-ray absorption and inelastic X-ray scattering spectra. In this algorithm, the iterative deconvolution is stabilized against high frequency noise amplification by convolution with a Gaussian function after each iteration. Applying this to our algorithm gives a regularized version of Eq. 12,

\hat{s}^{(n+1)}=f\ast\hat{s}^{(n)}\left[1[\hbar\omega]-h^{*}\ast 1[\hbar\omega]+h^{*}\ast\frac{m}{h\ast\hat{s}^{(n)}}\right],

(13)

where $f(x)$ is a Gaussian function with unit integrated amplitude,

f(x)=\frac{1}{\sigma\sqrt{2\pi}}\exp\left(-\frac{1}{2}(x/\sigma)^{2}\right).

(14)

The width of the Gaussian, $\sigma$ , constrains the maximum roughness of deconvolved spectra and thus sets the regularization strength (it acts as a hyperparameter in the deconvolution algorithm). Smaller regularization strengths allow for more roughness in the deconvolved spectra than larger regularization strengths.

Fig. 3 shows the effect of regularization strength on the deconvolved spectra for simulations using the model ground truth spectrum shown in Fig. 1B, and model Ag 3d photoemission spectrum shown in Fig. 1C. We used an energy separation of 10 meV between points for all simulations using the Ag 3d levels as a photoemission converter. Part A of Fig. 3 shows results for $10^{4}$ detected electrons, while part B shows analogous results for $10^{7}$ detected electrons. In each case, deconvolved and ground truth spectra are shown with regularization strength decreasing from top to bottom. As the regularization strength decreases, the deconvolved spectra attain more detail and increasingly finer structure is seen. This comes at the expense, however, of more statistical variation in the deconvolved spectra. For the case with $10^{4}$ detected electrons, the deconvolved spectra accurately estimate increasingly fine spectral features with smaller regularization strengths except for the smallest regularization strength of $\sigma=2.8$ meV. For that case, the deconvolved spectrum has fine features, but they do not accurately reflect the ground truth spectrum on this scale. In contrast, with $10^{7}$ detected electrons, as shown in Fig. 3B, the deconvolved spectra accurately reflect the ground truth spectrum smoothed to an extent given by the particular regularization strength even for the smallest regularization strength shown. Thus, the optimal regularization strength is smaller for $10^{7}$ detected electrons than for $10^{4}$ detected electrons. More generally, the best regularization strength decreases with increasing numbers of detected electrons, but also depends on the converter material photoemission spectrum and the ground truth X-ray spectrum.

III.3 Estimating the Optimal Regularization Strength

We now show how we can estimate the optimal regularization strength. Fister et al. Fister et al. (2007) proposed one method of choosing the regularization based on the smallest expected feature in the ground truth spectrum. Since we do not always know this before measurements, we use a more general procedure here. We define the optimal regularization strength as that which minimizes the Root Mean Squared Error (RMSE) of a deconvolved spectrum with respect to the ground truth it estimates,

\hat{\sigma}=\operatorname*{arg\,min}_{\sigma}\left\{\sqrt{||\hat{\mathbf{s}}-\mathbf{s}||^{2}}\right\}.

(15)

Fig. 4 shows the dependence of different statistics on the regularization strength and number of detected electrons. These simulations were carried out using the Ag 3d photoemission spectrum shown in Fig. 1C as a model photoemission converter and the model RIXS spectrum of Fig. 1B as a ground truth spectrum to estimate. The results shown in Fig. 4 can be used to assess the potential of the different statistics in estimating the optimal regularization strength.

Fig. 4A shows an example ground truth spectrum and a deconvolved estimate of it from simulated PAX data. Calculating the RMSE of such deconvolved spectra as a function of the regularization strength and number of detected electrons results in the curves shown in Fig. 4B. This deconvolved RMSE decreases with increasing regularization strength down to the minimum at the optimal regularization strength, after which it gradually increases with increasing regularization strength. The optimal regularization strength is smaller for higher numbers of detected electrons. Unfortunately, this deconvolved RMSE is not experimentally accessible as the ground truth spectrum is generally unknown.

Instead of trying to determine the optimal regularization strength through direct assessment of deconvolved error, we can assess how well our model reconstructs PAX spectra from deconvolved spectra as a proxy for the accuracy of deconvolved spectra. In other words, we can compare the convolution of a deconvolved spectrum with the corresponding photoemission impulse response function to recorded PAX data. If the deconvolution is perfect and there is no noise, these spectra should be the same. It has been shown in previous deconvolution studies that such a comparison can allow one to estimate the optimal regularization strength Reeves (1992); Wahba and Wang (1990). In making these assessments, we distinguish between two cases. In the first case, we compare a reconstruction of PAX data to a training PAX spectrum (the same spectrum that was used as input for deconvolution). In the second case, we compare a reconstruction of PAX data to a validation PAX spectrum (a spectrum recorded in statistically identical conditions as the training spectrum, but that was not used as part of that deconvolution).

Fig. 4C shows a comparison of a reconstruction of a PAX spectrum to a training PAX spectrum. Fig. 4D shows the RMSE of such reconstructions as a function of regularization strength and number of detected electrons. Unfortunately, as seen there, this statistic only decreases with decreasing regularization strength and is not minimized at the same locations as the deconvolved RMSE. Thus we cannot use the minimum of this statistic to estimate the optimal regularization parameter. Deconvolutions with too small regularization strengths closely fit fine features of the recorded PAX spectrum even though fitting so closely is not warranted given the noise in the data. This is an example of overfitting. To avoid this problem, we can assess the performance of the deconvolution on data that was not used as input to the deconvolution James et al. (2013).

Fig. 4E compares the PAX reconstruction to a validation PAX spectrum. Fig. 4F shows the dependence of the RMSE of the reconstruction with respect to the validation PAX spectrum on the regularization strength and the number of detected electrons. In each case, this statistic is minimized near the optimal regularization strengths. Unlike the RMSE of deconvolved data shown in Fig. 4B, this statistic is experimentally accessible, and thus we use it to estimate the optimal regularization strength.

III.4 Stopping Criterion

To perform iterative deconvolution, it is necessary to decide when a sufficient number of iterations have been completed. To choose such a stopping criterion, we recognized that (1) we wanted to perform sufficient iterations for the deconvolution with the optimal regularization strength to converge to near its asymptotic value and (2) we wanted the number of iterations to be high enough for the optimal regularization strength to be closely approximated by that which minimizes the validation reconstruction RMSE.

Fig. 5 illustrates the stopping criterion we used for this study. This uses data simulated with the same conditions as Fig. 3A. For smaller regularization strengths, the deconvolved and validation reconstruction mean squared errors reach minima before increasing again. For larger regularization strengths, the deconvolved and validation reconstruction mean squared errors decrease to near their asymptotic values quickly without overshooting. We see that, for this data, to fulfill the above conditions, it is necessary to complete a number of iterations at least a few times larger than that at which the errors for the smallest regularization strength are minimized. In our case we chose to conduct a number of iterations at least equal to four times that at which the validation RMSE for the smallest regularization strength reaches a minimum. We note that it can still be beneficial to do more iterations where possible to better fulfill the above priorities.
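The factor-of-four rule can be written as a small helper. In the sketch below, the function name, the 1-based iteration numbering and the example error curve are assumptions made for illustration. The input is taken to be the validation reconstruction RMSE recorded after each iteration of a deconvolution run with the smallest regularization strength.

```python
import numpy as np

def minimum_iterations(validation_rmse, factor=4):
    """Smallest iteration count allowed by the stopping rule.

    validation_rmse[i] is the validation reconstruction RMSE after iteration i + 1,
    recorded with the smallest regularization strength under consideration.
    """
    best_iteration = int(np.argmin(validation_rmse)) + 1  # 1-based iteration number
    return factor * best_iteration

# Made-up error curve that reaches its minimum at the third iteration.
example_rmse = [5.0, 3.0, 2.0, 2.5, 3.0]
print(minimum_iterations(example_rmse))
```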

We have now finished describing an algorithm for estimating a ground truth X-ray spectrum from PAX data. We summarize this procedure in Algorithms 1 and 2. The code used to perform the deconvolution analysis described in this report can be found at github.com/dhigley6/PAX2.

Input: $m$: measured PAX spectrum; $h$: PAX impulse response function (measured converter photoemission spectrum as a function of negative binding energy); $\sigma$: regularization strength

Output: $\hat{s}$: regularized maximum likelihood estimate of target X-ray spectrum

$f(x)\leftarrow\frac{1}{\sigma\sqrt{2\pi}}\exp\left(-\frac{1}{2}(x/\sigma)^{2}\right)$;

$n\leftarrow 0$;

$\hat{s}^{(0)}\leftarrow f\ast m$;

while stopping criterion not met do

$\hat{s}^{(n+1)}\leftarrow f\ast\hat{s}^{(n)}\left[1[\hbar\omega]-h^{*}\ast 1[\hbar\omega]+h^{*}\ast\frac{m}{h\ast\hat{s}^{(n)}}\right]$;

$n\leftarrow n+1$;

end while

$\hat{s}\leftarrow\hat{s}^{(n)}$;

Algorithm 1 Regularized Deconvolution of PAX Data
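A compact numerical sketch of Algorithm 1 is given below. It is not the code from the repository mentioned above, and it departs from the algorithm in several labeled ways: the regularization strength is given in grid points rather than energy units, the iteration starts from a flat spectrum instead of a smoothed PAX spectrum (because the two grids have different lengths here), a fixed number of iterations replaces the stopping criterion, and the impulse response is assumed to be non-negative with unit sum so that the estimate stays non-negative. The function names are illustrative.

```python
import numpy as np

def gaussian_smooth(x, sigma_pts):
    """Convolve x with a unit-sum Gaussian of standard deviation sigma_pts (in grid points)."""
    half = min(int(np.ceil(4 * sigma_pts)), (len(x) - 1) // 2)
    offsets = np.arange(-half, half + 1)
    kernel = np.exp(-0.5 * (offsets / sigma_pts) ** 2)
    kernel /= kernel.sum()
    return np.convolve(x, kernel, mode="same")

def deconvolve(m, h, sigma_pts, n_iter=200, floor=1e-12):
    """Regularized multiplicative deconvolution in the spirit of Eq. 13.

    m spans the full convolution, so the estimate has len(m) - len(h) + 1 points.
    """
    m = np.asarray(m, dtype=float)
    h = np.asarray(h, dtype=float)
    n_s = len(m) - len(h) + 1
    normalization = np.correlate(np.ones(len(m)), h, mode="valid")  # h^* acting on ones
    s_hat = np.full(n_s, m.sum() / (n_s * h.sum()))                 # flat start (assumption)
    for _ in range(n_iter):                                         # fixed count (assumption)
        model = np.maximum(np.convolve(h, s_hat, mode="full"), floor)  # guard against 0/0
        back_projected = np.correlate(m / model, h, mode="valid")
        update = s_hat * (1.0 - normalization + back_projected)
        s_hat = gaussian_smooth(update, sigma_pts)                  # smoothing after each step
    return s_hat

# Noise-free toy example: a Gaussian peak blurred by a Gaussian impulse response.
grid = np.arange(101)
s_true = 1e5 * np.exp(-0.5 * ((grid - 50) / 4.0) ** 2)
h = np.exp(-0.5 * ((np.arange(31) - 15) / 3.0) ** 2)
h /= h.sum()
m = np.convolve(h, s_true, mode="full")
s_hat = deconvolve(m, h, sigma_pts=1.0)
print("estimated peak position:", int(np.argmax(s_hat)))
```

In this toy case the estimate is expected to be sharper than the blurred input but somewhat broader than the true peak, since the smoothing applied after each iteration limits how fine the recovered structure can be.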

Input: $M$: set of $k$ measured PAX spectra with the $n$th spectrum labeled $m_{n}$; $h$: PAX impulse response function (measured converter photoemission spectrum as a function of negative binding energy); $\mathcal{S}$: set of regularization strengths to choose among; $D(m,h,\sigma)$: regularized deconvolution of $m$ with impulse response function $h$ and regularization strength $\sigma$, as described in Algorithm 1

Output: $\hat{\sigma}$: estimate of regularization strength which minimizes the squared difference between the deconvolved and ground truth X-ray spectrum

for $\sigma\in\mathcal{S}$ do

for $m_{n}\in M$ do

$m_{-n}\leftarrow\frac{1}{k-1}\sum_{i\neq n}m_{i}$;

$\hat{s}_{-n}\leftarrow D(m_{-n},h,\sigma)$;

end for

$CV(\sigma)\leftarrow\frac{1}{k}\sum_{i=1}^{k}||\hat{s}_{-i}\ast h-m_{i}||^{2}$;

end for

$\hat{\sigma}\leftarrow\operatorname*{arg\,min}_{\sigma\in\mathcal{S}}CV(\sigma)$

/* After estimating an optimal regularization parameter, run Algorithm 1 with this regularization parameter to get the deconvolved spectrum */

Algorithm 2 Estimate Optimal Regularization Strength for Regularized Deconvolution
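The leave-one-out loop of Algorithm 2 can be sketched as follows. As in the algorithm, the deconvolution routine is passed in as an argument. To keep the block self-contained, the demonstration uses a delta-function impulse response, for which the regularized iteration of Eq. 13 reduces to Gaussian smoothing of the input, so the stand-in `smooth` is exact only for that special case. The function names, the use of grid points as the unit of $\sigma$ and the simulated data are all assumptions made for illustration.

```python
import numpy as np

def estimate_sigma(spectra, h, sigmas, deconvolve):
    """Leave-one-out cross-validation over regularization strengths.

    deconvolve(m, h, sigma) plays the role of D(m, h, sigma) in Algorithm 2.
    Returns the selected sigma and the cross-validation score of every candidate.
    """
    spectra = np.asarray(spectra, dtype=float)
    k = len(spectra)
    total = spectra.sum(axis=0)
    scores = []
    for sigma in sigmas:
        errors = []
        for n in range(k):
            training = (total - spectra[n]) / (k - 1)   # average of all spectra except n
            s_hat = deconvolve(training, h, sigma)
            reconstruction = np.convolve(h, s_hat, mode="full")
            errors.append(np.sum((reconstruction - spectra[n]) ** 2))
        scores.append(np.mean(errors))
    return sigmas[int(np.argmin(scores))], np.array(scores)

def smooth(m, h, sigma):
    """Stand-in for D(m, h, sigma), valid only for the delta-function response h = [1.0]."""
    half = int(np.ceil(4 * sigma))
    offsets = np.arange(-half, half + 1)
    kernel = np.exp(-0.5 * (offsets / sigma) ** 2)
    kernel /= kernel.sum()
    return np.convolve(m, kernel, mode="same")

rng = np.random.default_rng(0)
truth = 100.0 * np.exp(-0.5 * ((np.arange(200) - 100) / 8.0) ** 2)
spectra = rng.poisson(truth, size=(10, 200))   # ten statistically identical measurements
h = np.array([1.0])                             # delta-function impulse response
sigmas = [0.5, 1.0, 2.0, 4.0]                   # candidate strengths in grid points

sigma_hat, scores = estimate_sigma(spectra, h, sigmas, smooth)
print("selected sigma:", sigma_hat)
```

In practice an implementation of Algorithm 1 would be passed as `deconvolve` together with the measured impulse response. The cross-validation loop itself does not depend on how the deconvolution is carried out.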

IV Performance of PAX

Now that we have described an algorithm for analyzing PAX spectra, in this section we evaluate the performance of this algorithm on simulated data. We simulated PAX measurements using the Ag 3d photoemission lines to estimate features with structure on a 100 meV scale as well as a sharp Fermi edge such as seen in Au to estimate finer features.

Fig. 6 shows the performance of PAX with the Ag 3d lines as a photoemission converter (A) in estimating the model RIXS spectrum shown in Fig. 1A. Part B shows ground truth and deconvolved X-ray spectra for the number of detected electrons increasing from top to bottom. As the number of detected electrons increases, finer details of the X-ray spectrum are accurately estimated. Features with a few hundred meV width are already estimated well with $\approx 10^{5}$ detected electrons, and the width of well estimated features decreases further with increasing number of detected electrons. The RMSE of the deconvolved spectrum decreases as the number of detected electrons increases, as shown in Fig. 6C. Fig. 6D quantifies the ability of the method to accurately estimate fine features through the FWHM of the lowest energy loss peak of the deconvolved spectrum as a function of the number of detected electrons. This FWHM decreases as the number of detected electrons increases, and, after $10^{6.5}$ electrons have been detected, the width of the feature in the deconvolved spectrum is within 10 percent of its true width. We note that the integrated intensity of this first loss feature is only 2.4 percent of the total integrated intensity of the entire model RIXS spectrum. Thus one only needs to detect about $0.024\times 10^{6.5}\approx 7.6\times 10^{4}$, i.e. fewer than $10^{5}$, photoelectrons that were emitted from RIXS photons originating from this feature in order to accurately estimate its shape. We accurately estimated this feature despite it being much sharper than the 233 meV FWHM widths of the Ag 3d photoemission peaks of the model impulse response function.

Fig. 7 shows an analysis of the performance of PAX in estimating spectra with structure on scales finer than 100 meV by using a sharp Fermi edge. For this analysis we used an energy separation of 2 meV between points. The photoemission spectrum was modeled with a constant density of states near the Fermi level and a temperature of 4 K (approximately the boiling point of He). PAX spectra were simulated for $10^{7}$ electrons detected from photoemission within 0.4 eV of the Fermi level and variable separation of the X-ray doublet peaks. For each panel of Fig. 7, we compare three deconvolved simulated PAX spectra to the ground truth spectrum.

By deconvolving more than one simulated PAX spectrum we get an indication of whether features in the deconvolved spectrum are reproducible at the experimental accuracy. This check has the drawback that not all of the data are used in forming a single deconvolved spectrum. To judge whether features are reproducible without dividing the data into subsets, we can use the bootstrap method Efron and Tibshirani (1986); James et al. (2013). In this method, we approximate the distribution of PAX spectra that we would obtain from averaging the $n$ recorded PAX spectra that compose a PAX data set (in this case $n=1000$). We do this by forming bootstrapped PAX spectra from the data: we sample $n$ times from the $n$ recorded PAX spectra and average the results. The sampling is done with replacement, so that some recorded spectra may be chosen more than once and others not at all. We can then tell whether a feature is unlikely to be a random occurrence by checking whether it occurs consistently in a large fraction of the deconvolved bootstrapped PAX spectra. For clarity, only three such spectra are shown in Fig. 7, but to draw scientific conclusions one should look at several times more spectra Efron and Tibshirani (1986).
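The bootstrap resampling described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code: the recorded PAX spectra are assumed to be rows of a 2D array, and the names `bootstrap_averages`, `feature_fraction`, `deconvolve` and `has_feature` are hypothetical. The feature test itself (for example, "the deconvolved spectrum has two resolved maxima") is left to the user.

```python
import numpy as np


def bootstrap_averages(spectra, n_boot, rng=None):
    """Form n_boot bootstrapped PAX spectra from n recorded spectra.

    Each bootstrapped spectrum is the average of n spectra drawn with
    replacement from the n rows of `spectra` (shape (n, n_points)).
    """
    spectra = np.asarray(spectra, dtype=float)
    rng = np.random.default_rng() if rng is None else rng
    n = spectra.shape[0]
    out = np.empty((n_boot, spectra.shape[1]))
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)  # sampling with replacement
        out[b] = spectra[idx].mean(axis=0)
    return out


def feature_fraction(boot_spectra, deconvolve, has_feature):
    """Fraction of deconvolved bootstrapped spectra in which a feature appears.

    deconvolve  : callable mapping one PAX spectrum to an estimated X-ray spectrum
    has_feature : callable returning True if the feature of interest is present
    """
    hits = [bool(has_feature(deconvolve(m))) for m in boot_spectra]
    return float(np.mean(hits))
```

A feature that appears in nearly all deconvolved bootstrapped spectra is unlikely to be a noise artifact, whereas one that appears in only a modest fraction should be treated with caution.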

As the peak separation decreases from top to bottom in Fig. 7, our ability to tell that the ground truth spectrum is two peaks rather than a single peak decreases. With the 70 meV peak separation (A) as well as the 45 meV peak separation (B), it is clear that the deconvolved spectra are not well represented by a single peak, and this is reproducibly the case between independent measurements or bootstrapped measurements. In contrast, for the 25 meV peak separation (C), it is no longer clear that the ground truth spectrum consists of more than one Gaussian peak, and a measurement with better statistics would be required to tell this.

V Conclusions and Outlook

We have demonstrated the potential of PAX for measurement of RIXS spectra. Using the Ag 3d levels as a photoemission converter for PAX, features with a few hundred meV FWHM can be accurately estimated with $10^{5}$ detected electrons. Even finer features can be accurately estimated with higher numbers of detected electrons. Features much finer than 100 meV could be estimated using a sharp Fermi edge photoemission converter, albeit at the expense of reduced conversion efficiency of RIXS photons to electrons.

In this report, we proposed and tested one algorithm that can be used to estimate a RIXS spectrum from measured PAX data. Our algorithm is simple and closely linked to the classic Lucy-Richardson algorithm. Recently, however, more sophisticated algorithms have shown much promise in achieving more accurate deconvolution results and reducing computational time Ikoma et al. (2018); Zhang et al. (2017). Applying such techniques to PAX could improve on the performance described here. In addition, using uncertainty quantification methods would enable more robust interpretation of X-ray spectra estimated from PAX data. This has been done for similar problems by assessing the sensitivity of deconvolved spectra to artificially added noise Fister et al. (2007), as well as with more sophisticated methods Kaipio and Somersalo (2006).

Additional experimental development could also push the capability of PAX further. The model photoemission converters highlighted here were chosen based on a survey of literature photoemission data. A systematic investigation of other materials may provide photoemission features better suited for PAX measurements. Finally, the PAX measurement method could be applied to other situations where high signal-to-noise ratio estimates of X-ray spectra are desired without using traditional grating-based technology. An example could be transmissive soft X-ray spectrometers.

Acknowledgements.

We acknowledge A. Owen for a helpful conversation. Use of the Linac Coherent Light Source (LCLS), SLAC National Accelerator Laboratory, is supported by the U.S. Department of Energy, Office of Science, Office of Basic Energy Sciences under Contract No. DE-AC02-76SF00515. Use of the Stanford Synchrotron Radiation Lightsource, SLAC National Accelerator Laboratory, is supported by the U.S. Department of Energy, Office of Science, Office of Basic Energy Sciences under Contract No. DE-AC02-76SF00515.

References

Ament et al. (2011) L. J. Ament, M. Van Veenendaal, T. P. Devereaux, J. P. Hill, and J. Van Den Brink, Rev. Mod. Phys. 83, 705 (2011).
Brookes et al. (2018) N. Brookes, F. Yakhou-Harris, K. Kummer, A. Fondacaro, J. Cezar, D. Betto, E. Velez-Fort, A. Amorese, G. Ghiringhelli, L. Braicovich, et al., Nucl. Instrum. Methods Phys. Res. A 903, 175 (2018).
Chaix et al. (2017) L. Chaix, G. Ghiringhelli, Y. Peng, M. Hashimoto, B. Moritz, K. Kummer, N. Brookes, Y. He, S. Chen, S. Ishida, et al., Nat. Phys. 13, 952 (2017).
Le Tacon et al. (2011) M. Le Tacon, G. Ghiringhelli, J. Chaloupka, M. M. Sala, V. Hinkov, M. Haverkort, M. Minola, M. Bakr, K. Zhou, S. Blanco-Canosa, et al., Nat. Phys. 7, 725 (2011).
Schlappa et al. (2012) J. Schlappa, K. Wohlfeld, K. Zhou, M. Mourigal, M. Haverkort, V. Strocov, L. Hozoi, C. Monney, S. Nishimoto, S. Singh, et al., Nature 485, 82 (2012).
Liu et al. (2017) B. Liu, M. M. Van Schooneveld, Y.-T. Cui, J. Miyawaki, Y. Harada, T. O. Eschemann, K. P. De Jong, M. U. Delgado-Jaime, and F. M. De Groot, J. Phys. Chem. C 121, 17450 (2017).
Rajasekaran et al. (2012) S. Rajasekaran, S. Kaya, T. Anniyev, H. Ogasawara, and A. Nilsson, Phys. Rev. B 85, 045419 (2012).
House et al. (2020) R. A. House, U. Maitra, M. A. Pérez-Osorio, J. G. Lozano, L. Jin, J. W. Somerville, L. C. Duda, A. Nag, A. Walters, K.-J. Zhou, et al., Nature 577, 502 (2020).
Firouzi et al. (2018) A. Firouzi, R. Qiao, S. Motallebi, C. W. Valencia, H. S. Israel, M. Fujimoto, L. A. Wray, Y.-D. Chuang, W. Yang, and C. D. Wessells, Nat. Commun. 9, 1 (2018).
Wernet et al. (2015) P. Wernet, K. Kunnus, I. Josefsson, I. Rajkovic, W. Quevedo, M. Beye, S. Schreck, S. Grübel, M. Scholz, D. Nordlund, et al., Nature 520, 78 (2015).
Hennies et al. (2010) F. Hennies, A. Pietzsch, M. Berglund, A. Föhlisch, T. Schmitt, V. Strocov, H. O. Karlsson, J. Andersson, and J.-E. Rubensson, Phys. Rev. Lett. 104, 193002 (2010).
Ghiringhelli and Braicovich (2013) G. Ghiringhelli and L. Braicovich, J. Electron Spectrosc. Relat. Phenom. 188, 26 (2013).
Dakovski et al. (2017) G. L. Dakovski, M.-F. Lin, D. S. Damiani, W. F. Schlotter, J. J. Turner, D. Nordlund, and H. Ogasawara, J. Synchrotron Radiat. 24, 1180 (2017).
Uhlig et al. (2015) J. Uhlig, W. Doriese, J. Fowler, D. Swetz, C. Jaye, D. Fischer, C. Reintsema, D. Bennett, L. Vale, U. Mandal, et al., J. Synchrotron Radiat. 22, 766 (2015).
Marschall et al. (2017) F. Marschall, Z. Yin, J. Rehanek, M. Beye, F. Döring, K. Kubiček, D. Raiser, S. T. Veedu, J. Buck, A. Rothkirch, et al., Sci. Rep. 7, 1 (2017).
Damascelli et al. (2003) A. Damascelli, Z. Hussain, and Z.-X. Shen, Rev. Mod. Phys. 75, 473 (2003).
Seidel et al. (2017) R. Seidel, M. N. Pohl, H. Ali, B. Winter, and E. F. Aziz, Rev. Sci. Instrum. 88, 073107 (2017).
Krause (1965) M. O. Krause, Phys. Rev. 140, A1845 (1965).
Ebel (1975) M. F. Ebel, X-Ray Spectrom. 4, 43 (1975).
Fister et al. (2007) T. Fister, G. Seidler, J. Rehr, J. Kas, W. Elam, J. Cross, and K. Nagle, Phys. Rev. B 75, 174106 (2007).
Bertero et al. (2009) M. Bertero, P. Boccacci, G. Desiderà, and G. Vicidomini, Inverse Probl. 25, 123006 (2009).
James et al. (2013) G. James, D. Witten, T. Hastie, and R. Tibshirani, An Introduction to Statistical Learning (Springer, 2013).
Panaccione et al. (2005) G. Panaccione, G. Cautero, M. Cautero, A. Fondacaro, M. Grioni, P. Lacovig, G. Monaco, F. Offi, G. Paolicelli, M. Sacchi, et al., J. Phys. Condens. Matter 17, 2671 (2005).
Takata et al. (2005) Y. Takata, M. Yabashi, K. Tamasaku, Y. Nishino, D. Miwa, T. Ishikawa, E. Ikenaga, K. Horiba, S. Shin, M. Arita, et al., Nucl. Instrum. Methods Phys. Res. A 547, 50 (2005).
Borg et al. (2004) M. Borg, M. Birgersson, M. Smedh, A. Mikkelsen, D. L. Adams, R. Nyholm, C.-O. Almbladh, and J. N. Andersen, Phys. Rev. B 69, 235418 (2004).
Henke (1972) B. L. Henke, Phys. Rev. A 6, 94 (1972).
Yeh and Lindau (1985) J. Yeh and I. Lindau, At. Data Nucl. Data Tables 32, 1 (1985).
Tanuma et al. (2002) S. Tanuma, S. Ichimura, K. Goto, and T. Kimura, J. Surf. Anal. 9, 285 (2002).
Shinotsuka et al. (2015) H. Shinotsuka, S. Tanuma, C. Powell, and D. Penn, Surf. Interface Anal. 47, 871 (2015).
Rumble (2019) J. R. Rumble, CRC Handbook of Chemistry and Physics, Vol. 100 (CRC Press, 2019).
Ebel and Gurker (1975) H. Ebel and N. Gurker, Phys. Lett. A 50, 449 (1975).
Laverock et al. (2011) J. Laverock, A. Preston, D. Newby Jr, K. Smith, and S. Dugdale, Phys. Rev. B 84, 235111 (2011).
Richardson (1972) W. H. Richardson, J. Opt. Soc. Am. 62, 55 (1972).
Lucy (1974) L. B. Lucy, Astron. J. 79, 745 (1974).
Shepp and Vardi (1982) L. A. Shepp and Y. Vardi, IEEE Trans. Med. Imaging 1, 113 (1982).
Dey et al. (2006) N. Dey, L. Blanc-Feraud, C. Zimmer, P. Roux, Z. Kam, J.-C. Olivo-Marin, and J. Zerubia, Microsc. Res. Tech. 69, 260 (2006).
Starck et al. (2002) J.-L. Starck, E. Pantin, and F. Murtagh, Publ. Astron. Soc. Pac. 114, 1051 (2002).
White (1994) R. L. White, in Instrumentation in Astronomy VIII, Vol. 2198 (International Society for Optics and Photonics, 1994) pp. 1342–1349.
Reeves (1995) S. J. Reeves, Int. J. Imaging Syst. Technol. 6, 387 (1995).
Reeves (1992) S. J. Reeves, J. Vis. Commun. Image Represent. 3, 433 (1992).
Wahba and Wang (1990) G. Wahba and Y. Wang, Commun. Stat. - Theory Methods 19, 1685 (1990).
Efron and Tibshirani (1986) B. Efron and R. Tibshirani, Stat. Sci. 1, 54 (1986).
Ikoma et al. (2018) H. Ikoma, M. Broxton, T. Kudo, and G. Wetzstein, Sci. Rep. 8, 1 (2018).
Zhang et al. (2017) J. Zhang, J. Pan, W.-S. Lai, R. W. Lau, and M.-H. Yang, in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2017) pp. 3817–3825.
Kaipio and Somersalo (2006) J. Kaipio and E. Somersalo, Statistical and Computational Inverse Problems (Springer, 2006).