
Cleaning Images with Gaussian Process Regression

Hengyue Zhang, Department of Physics, University of California, Santa Barbara, Santa Barbara, CA 93106, USA
Timothy D. Brandt, Department of Physics, University of California, Santa Barbara, Santa Barbara, CA 93106, USA
Abstract

Many approaches to astronomical data reduction and analysis cannot tolerate missing data: corrupted pixels must first have their values imputed. This paper presents astrofix, a robust and flexible image imputation algorithm based on Gaussian Process Regression (GPR). Through an optimization process, astrofix chooses and applies a different interpolation kernel to each image, using a training set extracted automatically from that image. It naturally handles clusters of bad pixels and image edges and adapts to various instruments and image types. For bright pixels, the mean absolute error of astrofix is several times smaller than that of median replacement and interpolation by a Gaussian kernel. We demonstrate good performance on both imaging and spectroscopic data, including the SBIG 6303 0.4m telescope and the FLOYDS spectrograph of Las Cumbres Observatory and the CHARIS integral-field spectrograph on the Subaru Telescope.

software: Astropy (Astropy Collaboration et al., 2013, 2018), Matplotlib (Hunter, 2007), NumPy (Harris et al., 2020), Photutils (Bradley et al., 2020), SciPy (Virtanen et al., 2020).

1 Introduction

The history of data artifacts is as long as the history of observational astronomy. Artifacts such as dead pixels, hot pixels, and cosmic ray hits are common in astronomical images. At best they render the affected pixels' data unusable; at worst they invalidate the entire image for downstream analysis.

In dealing with missing pixels, some astronomical procedures simply ignore them while others require imputing their values first. Optimal extraction of spectra (Horne, 1986) and Point Spread Function (PSF) photometry ignore missing data, while box spectral extraction and aperture photometry do not. Aperture photometry and box extraction have the advantage of requiring little knowledge about the PSF or line-spread function (LSF). For this reason, aperture photometry has been used for ultra-precise Kepler photometry (Jenkins et al., 2010). Box extraction is currently standard for the SPHERE (Claudi et al., 2008) and GPI (Macintosh et al., 2008) integral-field spectrographs.

In general, correcting the corrupted data in an image involves two steps: identifying what they are and imputing their values. Existing algorithms have emphasized bad pixel identification and rejection. For example, there are well-developed packages that detect cosmic rays (CRs) by comparing multiple exposures (e.g. Windhorst et al., 1994). When multiple exposures are not available, Rhoads (2000) rejects CRs by applying a PSF filter, van Dokkum (2001) by Laplacian edge detection (LACosmic), and Pych (2004) by iterative histogram analysis. Among the above methods, LACosmic offers the best performance (Farage & Pimbblet, 2005). Approaches based on machine learning like deepCR (Zhang & Bloom, 2020), a deep-learning algorithm, may offer further improvements.

In contrast, the literature on methods of imputing missing data is sparse. Currently, the most common approach is median replacement, which replaces a bad pixel with the median of its neighbors. Algorithms that apply median replacement include LACosmic. Other packages, such as astropy.convolution, take an average of the surrounding pixels, weighted by Gaussian kernels. An alternative is 1D linear interpolation; this approach is standard for the integral-field spectrographs GPI (Macintosh et al., 2008) and SPHERE (Claudi et al., 2008). deepCR, on the other hand, predicts the true pixel values with a trained neural network. However, none of these methods is statistically well-motivated, and they usually apply a fixed interpolation kernel to all images, while in reality the optimal kernel can vary from image to image. Moreover, in a contiguous region of bad pixels or near the boundary of an image, most existing data imputation approaches either have their performance compromised or must treat these regions as special cases. Only deepCR handles them naturally with minimal performance differences.

In this paper, we present astrofix, a robust and flexible data imputation approach based on Gaussian Process Regression (GPR). GPR defines a family of interpolation kernels, within which we choose the optimal kernel for each image, using a training set that is extracted automatically from that image. We structure the paper as follows: Section 2 introduces GPR and the covariance function, explains how they lead to a parametric family of interpolation kernels, and shows how this formalism handles additional missing data naturally. Section 3 discusses our strategy for choosing the optimal kernel. Section 4 specifies the implementation of our algorithm. Section 5 shows four examples—a CCD image of a globular cluster, a spectroscopic CCD exposure, a raw CHARIS image, and PSF photometry and astrometry on a stellar image—to demonstrate the consistently good performance of our algorithm. Finally, Section 6 summarizes our approach, discusses its limitations, and proposes directions for future work.

2 An Interpolation Kernel from Gaussian Process Regression

GPR is a statistical method of inferring the values of a function $f(\mathbf{r})$ at locations $\mathbf{r}_j$, given that we know its values at locations $\mathbf{r}_i$ and the covariance function $K(\mathbf{r}, \mathbf{r}')$ that gives the covariance between any two points $\mathbf{r}$ and $\mathbf{r}'$. The true values of the function are assumed to be drawn from a multivariate Gaussian distribution with zero mean and an infinite-dimensional covariance matrix given by the covariance function:

$$\begin{bmatrix} f(\mathbf{r}_i) \\ f(\mathbf{r}_j) \end{bmatrix} \sim N\left[0,\, \begin{bmatrix} K(\mathbf{r}_i, \mathbf{r}_i) & K(\mathbf{r}_j, \mathbf{r}_i) \\ K(\mathbf{r}_i, \mathbf{r}_j) & K(\mathbf{r}_j, \mathbf{r}_j) \end{bmatrix}\right]. \qquad (1)$$

In many cases, we assume that we have only imperfect measurements $g(\mathbf{r}_i)$ of the true function values $f(\mathbf{r}_i)$. We assume the measurement errors to be Gaussian with zero mean and covariance matrix $C_{\rm data}$. Then, the function values $f(\mathbf{r}_j)$ at locations $\mathbf{r}_j$ are related to the measured values $g(\mathbf{r}_i)$ at locations $\mathbf{r}_i$ by

$$\begin{bmatrix} g(\mathbf{r}_i) \\ f(\mathbf{r}_j) \end{bmatrix} \sim N\left[0,\, \begin{bmatrix} K(\mathbf{r}_i, \mathbf{r}_i) + C_{\rm data} & K(\mathbf{r}_j, \mathbf{r}_i) \\ K(\mathbf{r}_i, \mathbf{r}_j) & K(\mathbf{r}_j, \mathbf{r}_j) \end{bmatrix}\right]. \qquad (2)$$

The function $f(\mathbf{r}_j)$ then has the expectation value

$$\left\langle f(\mathbf{r}_j) \right\rangle = K(\mathbf{r}_j, \mathbf{r}_i) \left( K(\mathbf{r}_i, \mathbf{r}_i) + C_{\rm data} \right)^{-1} g(\mathbf{r}_i). \qquad (3)$$

GPR has been implemented in code packages such as GPy (The GPy authors, 2014) and George (Ambikasaran et al., 2015) and is commonly used for data interpolation and model inference, but its applications in image processing are rare because of the enormous size of imaging data. In the present paper we apply GPR to the problem of imputing missing data in a two-dimensional image. We take the function $g$ to be the counts measured by an instrument subject to read noise, shot noise, and instrumental effects; the function $f$ to be the true intensity on the sky; $\mathbf{r}_i$ to be the good pixels; and $\mathbf{r}_j$ to be the bad pixels where the measured values $g$ are not available. The problem is then equivalent to finding $\left\langle f(\mathbf{r}_j) \right\rangle$ from $g(\mathbf{r}_i)$ by Equation (3), given a covariance function $K$. The values $\left\langle f(\mathbf{r}_j) \right\rangle$ approximate the missing data at pixels $\mathbf{r}_j$ and can replace them in the original two-dimensional image.

Because of the $O(n^3)$ complexity of matrix inversion, applying Equation (3) to an entire image with several million pixels is infeasible. We instead correct one bad pixel at a time using only its neighbors. The resulting estimate of a bad pixel's value is a weighted sum of its neighbors' counts, which may be thought of as an interpolation kernel applicable to the entire image. For example, if we use all pixels in the $9\times 9$ region centered on the bad pixel, then $f(\mathbf{r}_j)$ is a scalar, $g(\mathbf{r}_i)$ an $80\times 1$ vector, $K(\mathbf{r}_j, \mathbf{r}_i)$ a $1\times 80$ vector, and $\left(K(\mathbf{r}_i, \mathbf{r}_i) + C_{\rm data}\right)^{-1}$ an $80\times 80$ matrix. If $C_{\rm data}$ is diagonal with constant variance $\sigma_i \equiv \sigma$, and if $K(\mathbf{r}, \mathbf{r}')$ depends only on the relative position $\mathbf{r}_i - \mathbf{r}_j$, then the $1\times 80$ vector $G \equiv K(\mathbf{r}_j, \mathbf{r}_i)\left(K(\mathbf{r}_i, \mathbf{r}_i) + C_{\rm data}\right)^{-1}$ is the same for all $\mathbf{r}_j$. Filling in zero for the missing central pixel (so that it is neglected when estimating its value) and reshaping to a $9\times 9$ array, $G$ becomes an interpolation kernel. Figure 1 compares the convolution of an image with a GPR kernel to convolutions with a median filter and with a Gaussian kernel.

The residual of the convolution at a pixel can be thought of as the residual of image imputation if that pixel were the only bad pixel in the image. The convolution kernel that produces the smallest residual is thus the most suitable for image imputation. In our comparison, the GPR kernel best restores the original image because we have optimized it for image imputation using the methods described in Section 3.1. In Figure 1 and all subsequent figures, the kernels and residuals are plotted on a color scale such that white is zero, red is positive, and blue is negative.

Figure 1: Part of an image (top left) convolved with three different kernels: a median filter (second column), a Gaussian kernel with zero weight in the central pixel and renormalized to unit area (third column), and a GPR kernel (right column). The GPR kernel best restores the original image. It is constructed from the squared exponential covariance function with $a=3.02$ and $h=0.72$ (tuned as described in Section 3.1 using the full image). The Gaussian kernel has standard deviation $0.72$ to match the GPR kernel. The middle row shows the interpolating kernels themselves, and the bottom row shows the residuals of convolution. The original image is normalized by its brightest pixel. The kernels and residuals are plotted on a color scale such that white is zero, red is positive, and blue is negative.

2.1 The Covariance Function

The choice of covariance function determines the smoothness and the extent of the kernel. In principle, a wide variety of covariance functions could be used, but we will focus on the squared exponential covariance function because of its simplicity and its good approximation to a typical instrumental PSF:

$$K_{\rm SE}(\mathbf{r}, \mathbf{r}') = a^2 \exp\left[ -\frac{|\mathbf{r} - \mathbf{r}'|^2}{2h^2} \right]. \qquad (4)$$

The parameter $a$ in Equation (4) describes the magnitude of the correlation between pixels; it should be compared to the data variance $\sigma^2$. The kernel does not depend on $a$ and $\sigma$ individually but on the correlation-to-noise ratio $\alpha \equiv a/\sigma$: because $K \propto a^2 = \alpha^2 \sigma^2$ and $C_{\rm data} \propto \sigma^2$, we have $(K + C_{\rm data})^{-1} \propto \sigma^{-2}$, canceling $\sigma$ in Equation (3). Hence, in the rest of the paper we set $\sigma = 1$ (so that $C_{\rm data}$ is the identity matrix) and use $a$ to mean $a/\sigma$. The noiseless limit $a/\sigma \rightarrow \infty$ implies a constant, dark image and a kernel independent of $a$. The other limit, $a/\sigma \rightarrow 0$, implies a noisy image where no useful information about a pixel's value can be extracted from its neighbors. The parameter $h$ gives the characteristic scale over which the image varies; larger values of $h$ correspond to broader kernels. The values of the hyperparameters $a$ and $h$, together with a particular choice of covariance function, parameterize a family of GPR kernels.

Figure 2: Example GPR kernels constructed using the squared exponential covariance function with different values of the hyperparameters $a/\sigma$ and $h$. An increased $a$ assigns larger weights and more significant negative weights. An increased $h$ causes the weights to change over a larger scale. All of the GPR kernels are zero at the center (the location of the bad pixel) and have ring-like structures due to the azimuthal symmetry of $K_{\rm SE}$.

Figure 2 shows four examples of GPR kernels with different hyperparameters $a/\sigma$ and $h$. As $a/\sigma$ increases, GPR makes use of more pixels in the region and assigns larger weights to the adjacent pixels, allowing significant negative weights. As $h$ increases, the absolute values of the weights decay more slowly with increasing distance; the weights change over a larger scale. All GPR kernels for a single bad pixel have ring-like structure, because the value of the squared exponential covariance function depends only on the Euclidean distance to the central pixel, and they are zero at the center where the bad pixel is located.

2.2 Additional Missing Pixels in the Kernel

The fact that each particular choice of $a$, $h$, and covariance function uniquely determines a convolution kernel has several advantages. One is that when a contiguous region of bad pixels must be assigned zero weight, GPR can recompute the weights of all other pixels instead of simply renormalizing them. To give zero weight to additional bad pixels near the bad pixel being corrected, we need only remove the corresponding entries from the covariance matrices; GPR automatically compensates for the loss of information by increasing or reducing the weights of their neighbors. Panels (a) through (c) of Figure 3 show how additional bad pixels near the central bad pixel re-weight the entire GPR kernel and break its symmetry.

Figure 3: The re-weighting of GPR kernels due to additional missing data. All kernels are based on the squared exponential covariance function with hyperparameters $a/\sigma = 10$ and $h = 1$. Panel (a) shows the original kernel. Panel (b) has a bad pixel at $(1,1)$, so the pixels in that direction are assigned larger weights to compensate for the loss of information. Panel (c) has a cross of bad pixels at the center, so the weights that they originally had are redistributed to their nearest neighbors. Panel (d) shows a bad pixel at location $(1,0)$ relative to the top left corner of the image, so we are only able to construct a $5\times 6$ kernel, and the weights change accordingly.

Another advantage of GPR is its straightforward handling of bad pixels on the edges of an image. As with additional bad pixels, we need only remove all points that fall outside the image from the covariance matrices, and GPR recomputes the weights of the remaining points. Panel (d) of Figure 3 gives an example of a truncated GPR kernel due to a bad pixel near the top left corner of the image. Because the bad pixel is at $(1,0)$ relative to the corner, with a kernel width of $9$, we are only able to construct a $5\times 6$ kernel. GPR adjusts the weights accordingly to deal with the loss of information.
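The re-weighting described in this subsection amounts to deleting the rows and columns that correspond to the extra bad (or off-image) pixels from the covariance matrices before solving Equation (3). A minimal sketch, again a stand-alone illustration rather than astrofix's code (the function name and the `(dy, dx)` offset convention are ours):

```python
import numpy as np

def gpr_kernel_masked(width, a, h, extra_bad):
    """GPR kernel for a central bad pixel when additional bad pixels,
    given as (dy, dx) offsets from the center, must also get zero weight:
    their covariance entries are dropped before the solve (Section 2.2)."""
    half = width // 2
    yy, xx = np.mgrid[-half:half + 1, -half:half + 1]
    pts = np.column_stack([yy.ravel(), xx.ravel()])
    bad = {(0, 0)} | {tuple(b) for b in extra_bad}
    good = [k for k in range(width * width) if tuple(pts[k]) not in bad]
    # Covariance restricted to the surviving good pixels
    d2 = ((pts[good][:, None] - pts[good][None]) ** 2).sum(-1)
    K_ii = a ** 2 * np.exp(-d2 / (2 * h ** 2)) + np.eye(len(good))
    K_ji = a ** 2 * np.exp(-(pts[good] ** 2).sum(-1) / (2 * h ** 2))
    G = np.zeros(width * width)
    G[good] = np.linalg.solve(K_ii, K_ji)
    return G.reshape(width, width)
```

Because the remaining weights are re-solved rather than renormalized, the kernel redistributes weight toward the pixels that best compensate for the missing information.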

3 Optimizing the GPR Kernel

Because $a$, $h$, and the covariance function parameterize a family of convolution kernels, we can optimize within this parameter space to obtain the kernel most suitable for repairing a given image. In the high signal-to-noise ratio (SNR) limit, the optimization process can be viewed as modeling the instrumental effects on the structure of an image. For an image of distant stars, for example, it is roughly equivalent to finding the correlation function that best represents the instrumental PSF. The optimal kernel should not change as long as this PSF is stable across the image. Variations in the image itself, such as a spatially changing flux, do not affect the optimal kernel as long as the kernel is normalized. Hence, we optimize only once per image, and in general, one can optimize only once per instrument and per configuration.

3.1 Choosing the Hyperparameters

To choose the optimal hyperparameters $a$ and $h$, our strategy is to identify a set of well-measured pixels in the image to serve as the training set. Given $a$ and $h$, we convolve the entire image with the corresponding kernel and compute the resulting change in value at each pixel in the training set. We then find the $a$ and $h$ that minimize the mean absolute value of these residuals.

The training set has to be large enough to be representative of the image, and it should focus on the pixels of scientific importance. Generally this means pixels with count rates appreciably higher than the background level. A kernel trained on low-SNR background pixels would simply learn to average a large set of surrounding pixels rather than identifying the correlation between pixels containing astronomically useful flux. A kernel optimized on a high-SNR training set would be good at correcting pixels of scientific importance even for low-SNR images (see Section 5.4). For a typical application, then, the training set should consist of pixels well above the background level but below saturation. Section 4.1 describes astrofix’s choice of training set.

3.2 Choosing the Optimal Covariance Function

In principle, any covariance function can be used to generate GPR kernels, but only covariance functions that resemble the instrumental PSF would be useful. The squared exponential covariance function is therefore a reasonable choice. One notable alternative that we investigate here is the Ornstein-Uhlenbeck covariance function:

$$K_{\rm OU}(\mathbf{r}, \mathbf{r}') = a^2 \exp\left[ -\frac{|\mathbf{r} - \mathbf{r}'|}{h} \right]. \qquad (5)$$

Compared to the squared exponential covariance function, the Ornstein-Uhlenbeck covariance function is linear rather than quadratic in Euclidean distance in the exponent. As a consequence, it tends to under-emphasize the pixels closest to the central bad pixel, which often contain the most important information. Figure 4 compares the optimized kernels for the star image in Figure 1 using the squared exponential covariance function (Equation (4)) with those using the Ornstein-Uhlenbeck covariance function (Equation (5)). The magnitude of the weights in the central region is significantly smaller for the $K_{\rm OU}$ kernels than for the $K_{\rm SE}$ kernels.
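The difference in shape near zero separation is easy to see numerically. The sketch below directly evaluates Equations (4) and (5); the function names are ours:

```python
import numpy as np

def k_se(r, a=1.0, h=1.0):
    """Squared exponential covariance (Eq. 4) as a function of separation r."""
    return a ** 2 * np.exp(-r ** 2 / (2 * h ** 2))

def k_ou(r, a=1.0, h=1.0):
    """Ornstein-Uhlenbeck covariance (Eq. 5): linear in r in the exponent."""
    return a ** 2 * np.exp(-r / h)
```

At small separations the Ornstein-Uhlenbeck covariance falls off faster than the squared exponential, which is why $K_{\rm OU}$ kernels place less weight on the pixels nearest the bad pixel.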

Another feature of a $K_{\rm OU}$ kernel is that its more slowly varying covariance leads to smaller fluctuations in the weights and less prominent negative weights. As a consequence, most outer pixels have almost zero weight. In fact, for most values of $h$ smaller than the kernel width, almost all of the weight in a $K_{\rm OU}$ kernel is concentrated in the central $5\times 5$ region. Hence, a $K_{\rm OU}$ kernel of width $>5$ is almost identical to a $K_{\rm OU}$ kernel of width $5$, which is not the case for $K_{\rm SE}$ kernels.

As a result of these two features, a $K_{\rm OU}$ kernel is usually not as accurate or as flexible as a $K_{\rm SE}$ kernel. Figure 5 shows the histogram of residuals (normalized by the shot noise $= \sqrt{\rm count/gain}$) from convolving the star image with the two types of kernels. In terms of the mean absolute deviation of the residuals, the $K_{\rm SE}$ kernel of width 9 performs better than the other kernels, while the $K_{\rm OU}$ kernel of width 9 has almost the same residual distribution as the $K_{\rm OU}$ kernel of width 5.

astrofix does have other covariance functions implemented, but we have found the squared exponential to produce the narrowest residual distribution. We restrict further analysis in this paper to the squared exponential covariance function.

Figure 4: The optimized kernels for the star image in Figure 1 using $K_{\rm SE}$ versus $K_{\rm OU}$ as the covariance function, and using kernel width $5$ versus width $9$. Due to the linear exponent in the covariance function, the $K_{\rm OU}$ kernels have smaller maximum weights and less prominent negative weights, and most of their weight is concentrated in the central $5\times 5$ region no matter what kernel width we choose.
Figure 5: The residuals (normalized by the shot noise $= \sqrt{\rm count/gain}$) of convolving the star image in Figure 1 with the kernels in Figure 4. Both $K_{\rm SE}$ kernels outperform the $K_{\rm OU}$ kernels, and increasing the width of a $K_{\rm SE}$ kernel from 5 to 9 further improves its performance.

3.3 A Stretched Kernel for 2D Spectroscopic Data

So far, we have discussed GPR kernels whose weights depend only on the Euclidean distance $r$, but for anisotropic images a kernel with directional dependence is sometimes preferable. Such a kernel can be particularly useful for the imputation of 2D spectroscopic data. For this purpose, we define the stretched squared exponential covariance function:

$$K_{\rm sSE}(\mathbf{r}, \mathbf{r}') = a^2 \exp\left[ -\frac{1}{2} \left( \frac{(x - x')^2}{h_x^2} + \frac{(y - y')^2}{h_y^2} \right) \right], \qquad (6)$$

which produces kernels determined by three parameters: $a$, $h_x$, and $h_y$. We assume here that the preferred direction (e.g. the dispersion direction) approximately follows one of the orthogonal axes of the detector, though it would be straightforward to adopt a different angle between the preferred direction of the image and the detector axes. For isotropic images, we expect the optimization to give $h_x \approx h_y$. For vertical spectroscopic lines (horizontal dispersion) in a slit spectrograph, we expect $h_x < h_y$, because the pixel values vary on a larger scale in the $y$-direction (along the slit). Figure 6 compares the isotropic and the stretched $K_{\rm SE}$ kernels optimized for the spectroscopic arc lamp image in Section 5.3, and Figure 10 shows that the stretched kernel produces smaller residuals in this example.
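Equation (6) reduces to the isotropic squared exponential when $h_x = h_y$. A direct evaluation (the function name and argument convention are ours):

```python
import numpy as np

def k_sse(dx, dy, a, hx, hy):
    """Stretched squared exponential covariance (Eq. 6) as a function of
    the coordinate separations dx = x - x' and dy = y - y'."""
    return a ** 2 * np.exp(-0.5 * (dx ** 2 / hx ** 2 + dy ** 2 / hy ** 2))
```

With $h_x > h_y$, the covariance decays more slowly along $x$, so the resulting kernel weights pixels along $x$ more heavily; swapping the roles of $h_x$ and $h_y$ gives the slit-spectrograph case described above.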

We encourage the user to exercise care with a stretched covariance function. The extra free parameter increases the risk of overfitting the hyperparameters. A stretched kernel can also smooth out a point source on the slit, especially if the hyperparameters are optimized on the sky background, a galaxy stretching the length of the slit, or a lamp flat. Our training image in this case was an arc lamp for wavelength calibration; the best-fit hyperparameters would differ for a galaxy or a twilight spectrum.

Figure 6: The isotropic versus the stretched kernel optimized for the spectroscopic arc lamp image in Section 5.3. The stretched kernel has greater weights in the $y$-direction because the slit is oriented vertically. Despite the different parameter values of the two kernels, they assign similar weights to the nearest pixels in the $y$-direction, which results in the similar performance shown in Figure 10.

4 Implementation and Computational Cost

We implement the GPR image imputation algorithm in the Python package astrofix (Zhang & Brandt, 2021), which is available at https://github.com/HengyueZ/astrofix. astrofix accepts images with a bad pixel mask or images with bad pixels flagged as NaN, and it fixes any given image in three steps:

  1. Determine the training set of pixels that astrofix will attempt to reproduce.

  2. Find the optimal hyperparameters $a$ and $h$ (or $a$, $h_x$, and $h_y$) given the training set from Step 1.

  3. Fix the image by imputing data for the bad pixels using the optimized kernel from Step 2.

We describe the details of each step in the subsections below. Appendix A details the computational cost of our approach. The computation time is a function of the image size and the fraction of bad pixels. On a single processor, it is $\sim$10 seconds for a $2000\times 2000$ image with 0.5% bad pixels, or a $4000\times 4000$ image with 0.1% bad pixels.

4.1 Defining the Training Set

The set of pixels used to optimize the GPR should include as many pixels of scientific importance as possible, while avoiding pixels that are close to the background level, saturated, or themselves bad. By default, we include all good (unflagged) pixels that are below $1/5$ of the brightest pixel's value and more than $10$ median absolute deviations (MAD) above the median of the background. We take the median and the MAD of the image in lieu of the mean and standard deviation because of their robustness and convenience. The MAD is an underestimate of the standard deviation: for a Gaussian background distribution, $10$ MAD correspond to about $7\sigma$.
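The default selection can be sketched as follows. This is a hypothetical re-implementation for illustration (the function name and keyword arguments are ours, not astrofix's API); it applies the $1/5$ and $10$-MAD cuts described above.

```python
import numpy as np

def default_training_mask(image, sat_frac=0.2, nmad=10.0):
    """Hypothetical sketch of the default training-set cut: good pixels
    more than nmad MADs above the background median and below sat_frac
    of the brightest pixel. NaN-flagged bad pixels are excluded."""
    finite = np.isfinite(image)
    med = np.median(image[finite])
    mad = np.median(np.abs(image[finite] - med))
    upper = sat_frac * np.max(image[finite])
    return finite & (image > med + nmad * mad) & (image < upper)
```

In practice the cuts matter most for low-SNR images; for high-SNR images with many bright pixels the resulting training set is nearly insensitive to the two thresholds.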

In many cases, users can replace the default training set selection parameters ($1/5$ and $10$) with better ones by deriving the background distribution and the saturation level from the histogram of the image, but for high-SNR images with sufficiently many bright pixels, the quality of the training is largely insensitive to these parameters. The default training set usually works well without requiring a highly accurate model of the background. For low-SNR images, one has to tune the parameters more carefully or use a high-SNR image taken by the same instrument (and under the same configuration) for training.

The training set might include pixels that are themselves good but are close enough to bad pixels to corrupt the convolution. We remove such pixels during the training process described in the next subsection.

4.2 Tuning the Hyperparameters

In the high-SNR limit, the training process can be considered as modeling the structure that an instrument imparts to an image, such as the PSF. Therefore, an efficient strategy is to run the training only once for each instrument using a combined training set built from multiple images. However, the optimal kernel may still vary from image to image due to, e.g., variations in the seeing. It is not computationally expensive to redo the training as appropriate for each image (see Appendix A).

In the training process, we convolve the training set pixels with GPR kernels, varying the hyperparameters $a$ and $h$ (or $a$, $h_x$, and $h_y$) to minimize the mean absolute residual in the training set. We only need to compute the kernel once for fixed values of $a$ and $h$. The computational cost then scales as $N_{\rm train} \times w^2$, where $w$ is the width (in pixels) of the kernel and $N_{\rm train}$ is the number of pixels in the training set. We construct an image array of size $N_{\rm train} \times w^2$ to perform the convolution.

A problem naturally occurs in the limit of noiseless data, where the optimal value of the parameter $a$ (i.e. $a/\sigma$ in our notation) would be infinite. If $a \rightarrow \infty$ and $h$ is relatively large, evaluating Equation (3) requires the inversion of an ill-conditioned (though not formally singular) matrix. This leads to numerical instability in the constructed GPR kernel, and the optimization procedure can end up in a spurious minimum.
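The ill-conditioning is easy to reproduce directly. The sketch below evaluates the condition number of $K(\mathbf{r}_i,\mathbf{r}_i) + C_{\rm data}$ for the squared exponential covariance; it is a simplified version of the calculation behind Figure 7, with our own function name.

```python
import numpy as np

def condition_number(a, h, width=9):
    """Condition number of K(r_i, r_i) + C_data for the squared
    exponential covariance with sigma = 1 (C_data = identity)."""
    half = width // 2
    yy, xx = np.mgrid[-half:half + 1, -half:half + 1]
    pts = np.column_stack([yy.ravel(), xx.ravel()])
    good = np.delete(np.arange(width * width), width * width // 2)
    d2 = ((pts[good, None] - pts[None, good]) ** 2).sum(-1)
    K = a ** 2 * np.exp(-d2 / (2 * h ** 2)) + np.eye(len(good))
    return np.linalg.cond(K)
```

Large $a$ and large $h$ both drive the covariance matrix toward a nearly constant (nearly rank-one) matrix, so the condition number grows rapidly in that corner of parameter space.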

We restrict $a$ to small values by multiplying the mean absolute residual by a penalty function:

$$P = 1 + \exp\left[ \frac{a - \mu}{\tau} \right]. \qquad (7)$$

Here, $\mu$ can be interpreted as a "soft" upper bound on $a$. For $a \ll \mu$, $P \approx 1$; for $a > \mu$, $P$ increases exponentially. $\tau$ determines how far $a$ can penetrate the soft upper bound. The best values of $\mu$ and $\tau$ are determined by the largest condition number that the machine can handle in a matrix inversion without much loss of accuracy. Figure 7 shows how the condition number depends on the hyperparameters $a$ and $h$ for three different kernel widths. A larger kernel width requires inverting a larger matrix, which leads to a larger condition number. The condition number increases quickly with $h$ for small $h$ and quickly with $a$ for larger $h$. The contours indicate the upper bound on $a$ given a desired upper bound on the condition number. For instance, if the maximum acceptable condition number is $10^8$, then in the worst-case scenario (largest $w$ and $h$), $\mu = 2400$ is roughly the upper bound on $a$. By default, we set $\mu = 3000$ and $\tau = 200$, which corresponds to a maximum condition number of about $6\times 10^8$ for $w = 9$ and $h = 9$. Users can change these parameters according to their own needs by referring to Figure 7.
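Equation (7) is trivial to implement; with the default $\mu = 3000$ and $\tau = 200$ (the function name is ours):

```python
import numpy as np

def penalty(a, mu=3000.0, tau=200.0):
    """Penalty function of Equation (7): ~1 well below the soft bound mu,
    growing exponentially above it on a scale set by tau."""
    return 1.0 + np.exp((a - mu) / tau)
```

Multiplying the mean absolute residual by this factor leaves the objective essentially unchanged for reasonable $a$ while steeply disfavoring values that would make the matrix inversion unstable.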

Figure 7: Dependence of the condition number on $a$ and $h$ for three different kernel widths. The condition number increases quickly with $h$ for small $h$ and quickly with $a$ for larger $h$. The contours indicate the upper bound on $a$ given a desired upper bound on the condition number.

Then, for a given image and a chosen training set, astrofix’s steps to tune the GPR hyperparameters are:

  1. Flag all bad pixels as NaN.

  2. Construct the image array for pixels in the training set, and remove rows that contain NaN values.

  3. Propose values of $a$ and $h$ (or $a$, $h_x$, and $h_y$), and construct the corresponding kernel.

  4. Convolve the image with the kernel, restricting the convolution to pixels in the training set.

  5. Find the mean absolute residual between the convolved image and the original image for pixels in the training set.

  6. Minimize the mean absolute residual times the penalty function with respect to the parameters.

  7. Return the optimal parameter values and the residual distribution.

Steps 3 through 5 are wrapped into a single function passed to scipy.optimize.minimize, which is responsible for step 6. In step 3, we do not reweight the kernel in the presence of additional bad pixels because of the expense; instead, in step 2, we remove all affected pixels from the training set by checking each row of the image array for NaN values. In step 6, we require $a \geq 1$, because $a < 1$ would imply that the correlation between pixels is smaller than the measurement noise: if $a$ is small, neighboring pixels contribute little information about a missing data point, and interpolation gains little. We also require $1/2 \leq h \leq w$. A characteristic scale larger than the kernel width would imply a spatially uniform image, while one much smaller than a pixel would again mean that a pixel's neighbors do not contain meaningful information to interpolate its value.
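Steps 3 through 5 can be sketched as a single objective function of $(a, h)$. This is a simplified stand-in for astrofix's implementation: it loops over training pixels in pure Python rather than building the vectorized $N_{\rm train}\times w^2$ image array, uses a small kernel width, and folds in the penalty of Equation (7).

```python
import numpy as np

def mean_abs_residual(image, train_mask, a, h, width=5):
    """Sketch of steps 3-5: build the GPR kernel for (a, h), apply it at
    the training pixels, and return the mean absolute residual times the
    penalty of Eq. (7) (default mu = 3000, tau = 200)."""
    half = width // 2
    yy, xx = np.mgrid[-half:half + 1, -half:half + 1]
    pts = np.column_stack([yy.ravel(), xx.ravel()])
    c = width * width // 2
    good = np.delete(np.arange(width * width), c)
    d2 = ((pts[good, None] - pts[None, good]) ** 2).sum(-1)
    K_ii = a ** 2 * np.exp(-d2 / (2 * h ** 2)) + np.eye(len(good))
    K_ji = a ** 2 * np.exp(-((pts[c] - pts[good]) ** 2).sum(-1) / (2 * h ** 2))
    G = np.zeros(width * width)
    G[good] = np.linalg.solve(K_ii, K_ji)
    G = G.reshape(width, width)
    pad = np.pad(image, half, mode='edge')
    residuals = [abs((pad[y:y + width, x:x + width] * G).sum() - image[y, x])
                 for y, x in zip(*np.nonzero(train_mask))]
    return np.mean(residuals) * (1.0 + np.exp((a - 3000.0) / 200.0))
```

A minimizer such as scipy.optimize.minimize can then be run over $(a, h)$ within the bounds $a \geq 1$ and $1/2 \leq h \leq w$ described above.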

The computational expense of the optimization process depends on the size of the training set, but it usually takes only a few seconds. For example, it took $\approx$6 seconds to train on the CHARIS spectrograph image in Section 5.2 with $\approx$439,000 pixels in the training set. Appendix A includes a detailed analysis of the computational cost of astrofix.

4.3 Fixing the Image

The optimal kernel found in the training process relies on the assumption that for every bad pixel, no other bad pixels or image edges lie within the $w\times w$ region of the kernel. In that case, correcting the image is equivalent to a simple convolution with the optimal kernel at the location of each bad pixel. However, if there are additional bad pixels or pixels off the edges of the image, we must set their weights to zero and recompute the entire kernel by removing the corresponding entries from the covariance matrices (as discussed in Section 2.2). When the bad pixel density is large, almost every bad pixel requires computing a new kernel, so the resulting computational cost scales linearly with the bad pixel density and can be expensive (see Appendix A).

The optimal kernel that corresponds to an isolated bad pixel is usually used many times, so we pre-construct it before looping through all bad pixels. Because we correct only one bad pixel at a time and only a small fraction of the image overall, we use the weighted sum of the neighboring pixels rather than convolving the entire image. Finally, we replace the bad pixels with their imputed values only after exiting the loop. This avoids using imputed data as if they were actual measurements.

To sum up, astrofix’s steps to impute the missing data in an image are as follows:

  1. Pre-construct the optimal kernel for an isolated bad pixel.

  2. For each bad pixel:

     - Check if there are additional bad pixels or image edges nearby.

     - If not, use the optimal kernel. Otherwise, construct a kernel that has the largest possible size (≤ w in each direction) and has weights reassigned according to the locations of additional missing pixels.

     - Compute the sum of the neighboring pixels weighted by the kernel to find the bad pixel's imputed value. Store this value.

  3. Replace the bad pixels with the imputed values.

  4. Return the fixed image.
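The steps above can be sketched as follows. The squared-exponential covariance, the unit noise term, and the parameter values are illustrative rather than astrofix's exact Equation (3); the structure of the loop — a pre-built kernel for isolated bad pixels, reweighting near edges and bad neighbors, and deferred replacement — follows the algorithm described in the text:

```python
import numpy as np

def gpr_weights(offsets, a=3.0, h=1.0):
    """Interpolation weights for a bad pixel at the origin, given the
    integer (dy, dx) offsets of its usable neighbors.  Standard GPR
    predictive mean, w = C_*^T (C + I)^-1, with a squared-exponential
    covariance of amplitude a and scale h (a sketch, not astrofix's
    exact normalization).  Weights are rescaled to sum to 1 so that a
    flat image is reproduced exactly."""
    pts = np.asarray(offsets, dtype=float)
    d2 = ((pts[:, None, :] - pts[None, :, :]) ** 2).sum(-1)
    C = a * np.exp(-d2 / (2 * h ** 2)) + np.eye(len(pts))  # noise on diagonal
    c_star = a * np.exp(-(pts ** 2).sum(-1) / (2 * h ** 2))
    wts = np.linalg.solve(C, c_star)
    return wts / wts.sum()

def fix_image(img, bad_mask, w_half=2, a=3.0, h=1.0):
    """Impute bad pixels one at a time; write the results back only
    after the loop, so imputed values are never reused as data.
    Assumes every bad pixel has at least one good neighbor."""
    ny, nx = img.shape
    full = [(dy, dx) for dy in range(-w_half, w_half + 1)
            for dx in range(-w_half, w_half + 1) if (dy, dx) != (0, 0)]
    w_opt = gpr_weights(full, a, h)   # step 1: isolated-pixel kernel
    fixes = []
    for y, x in zip(*np.nonzero(bad_mask)):
        # step 2: keep only in-bounds, good neighbors
        offs = [(dy, dx) for dy, dx in full
                if 0 <= y + dy < ny and 0 <= x + dx < nx
                and not bad_mask[y + dy, x + dx]]
        wts = w_opt if len(offs) == len(full) else gpr_weights(offs, a, h)
        vals = np.array([img[y + dy, x + dx] for dy, dx in offs])
        fixes.append((y, x, wts @ vals))
    out = img.copy()
    for y, x, v in fixes:             # steps 3-4: replace, then return
        out[y, x] = v
    return out
```

Because the imputed values are stored and written back only after the loop, a cluster of bad pixels is always interpolated from genuine measurements, never from other imputed values.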

5 Examples and Results

We tested astrofix on a variety of images with real and artificial bad pixels. In this section we give a few examples to demonstrate its consistently good performance.

5.1 Artificial Bad Pixels on a CCD Image of a Globular Cluster

Images of star clusters often contain a high density of useful information. While approaches like PSF photometry can handle missing data appropriately (if a good model of the PSF is available), such an analysis is not always possible. Aperture photometry, for example, cannot tolerate missing data. We therefore test astrofix on a real CCD exposure of the globular cluster NGC 104, taken by the SBIG 6303 0.4m telescope at the Las Cumbres Observatory (LCO). We randomly generated 62069 (≈1%) artificial bad pixels and imputed their values.

The left panel of Figure 8 compares the residual of image imputation, Δμ, by GPR to that by two standard approaches: astropy.convolution (with a Gaussian kernel of standard deviation 1) and the 5×5 median filter used by LACosmic. The residual is normalized by the shot noise. Because most of the randomly selected bad pixels were near the background level, we only include in the histogram the 2642 bad pixels that originally had counts 10σ above the background. This tests astrofix's ability to restore useful information. The right panels of Figure 8 show the unnormalized residuals in a 25×25 region at the center of the cluster. GPR outperforms the other approaches, especially in dealing with contiguous regions of bad pixels. In terms of the outlier-resistant mean absolute deviation, GPR outperforms astropy's interpolation by about a factor of 2, and median replacement by more than a factor of 3.
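The shot-noise normalization and a robust spread statistic can be illustrated with synthetic numbers. The real comparison uses the 2642 bright bad pixels from the LCO image; here we draw toy Poisson-level counts, and we use the median absolute deviation as an illustrative outlier-resistant statistic:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for the true and imputed pixel values: counts well
# above background, with imputation errors at the shot-noise level.
true_counts = rng.uniform(1e3, 1e4, size=2642)
imputed = true_counts + rng.normal(0.0, np.sqrt(true_counts))

# Residual normalized by the shot noise, as in the Figure 8 histogram.
delta_mu = (imputed - true_counts) / np.sqrt(true_counts)

# Outlier-resistant spread: median absolute deviation about the median.
mad = np.median(np.abs(delta_mu - np.median(delta_mu)))
```

For perfect shot-noise-limited imputation, delta_mu is standard normal and the median absolute deviation is ≈0.67; heavier tails from a poor interpolator inflate the mean or median absolute deviation without being dominated by a few catastrophic outliers.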

Refer to caption
Figure 8: The residual (Δμ) of correcting the cluster image by GPR, astropy.convolution, and the 5×5 median filter used by LACosmic. The histogram (left panel) includes the normalized residuals of the 2642 artificial bad pixels that have counts 10σ above the background. The right panels show the unnormalized residuals in a 25×25 sample region. Good pixels are colored white; positive residuals are in red and negative residuals are in blue. The darkest blue points have values < −600: they are too discrepant to be distinguished by this color map. GPR outperforms astropy.convolution by about a factor of 2 and median replacement by a factor of 3, and it performs especially well in dealing with contiguous regions of bad pixels.

5.2 Raw CHARIS Imaging Data

Infrared detectors are more specialized than CCDs and often have a higher incidence of bad pixels. The CHARIS integral-field spectrograph (IFS), for example, uses a Hawaii2-RG array (Peters et al., 2012) and has a bad pixel fraction of about 0.5% (Brandt et al., 2017). Similar instruments include the GPI (Macintosh et al., 2014) and SPHERE (Beuzit et al., 2019) IFSs; these have comparable bad pixel rates. The data processing pipelines for GPI (Perrin et al., 2014) and SPHERE both use box extraction to reconstruct the spectrum of each lenslet. This approach cannot accommodate missing data, so both pipelines interpolate to fill in bad pixel values.

We test astrofix on a raw CHARIS image with 26338 intrinsic bad pixels flagged by values of zero. There were 438996 pixels in the training set, and the total computational time was 14.96 seconds. The top panels of Figure 9 show a slice of the raw image as well as the images corrected by GPR, by 5×5 median replacement, and by astropy.convolution (σ = 1). The bottom panels show a slice through the corresponding extracted data cubes. The original data cube comes from the standard CHARIS data reduction pipeline (Brandt et al., 2017), which ignores bad pixels. GPR clearly outperforms both astropy and the median filter and performs comparably to the standard pipeline. When the data processing pipeline cannot ignore bad pixels (e.g., box extraction for GPI and SPHERE), astrofix would be a useful interpolation method.

Refer to caption
Figure 9: The top left panel shows a 50×50 region in a raw CHARIS read with bad pixels, and the other top panels show the images corrected by GPR, the 5×5 median filter, and astropy.convolution (σ = 1). The bottom panels show a 50×50 monochromatic slice through the extracted data cubes: with the CHARIS plate scale of 16.4 mas/lenslet (Brandt et al., 2017), each bottom panel shows the same 0.82″ × 0.82″ field at λ = 1.80 μm, but extracted from a different corrected image. The original data cube comes from the standard CHARIS data reduction pipeline (Brandt et al., 2017), which ignores bad pixels. GPR clearly outperforms both astropy and the median filter and performs comparably to the standard pipeline.
Refer to caption
Figure 10: The residual (Δμ) of correcting the 2D arc lamp spectrum by GPR, sGPR, astropy.convolution (σ = 1), and the 5×5 median filter. The histogram (left panel) includes the normalized residuals of the 1524 artificial bad pixels that had counts 10σ above the background. The right panels show the unnormalized residuals in a 35×25 sample region. In this example, both GPR and sGPR outperform the other approaches by a factor > 4, and sGPR makes a small improvement over conventional GPR.

5.3 Artificial Bad Pixels on 2D Spectroscopic CCD Data

The standard GPR approach adapts well to spectroscopic data without any special treatment, but we can apply a stretched GPR kernel to further improve its performance. In a real spectroscopic CCD exposure taken by the FLOYDS spectrograph (Sand et al., 2011) at LCO, we randomly generated 53537 (≈5%) artificial bad pixels, of which 1524 had counts 10σ above the background. We applied both GPR and stretched GPR (sGPR) to impute the missing data, and their residuals (normalized by the shot noise) are compared in the left panel of Figure 10 with the residuals from astropy.convolution (σ = 1) and the 5×5 median filter. The right panels of Figure 10 show the unnormalized residuals in a 35×25 region. Both GPR and sGPR outperform the other approaches by a factor > 4, and sGPR offers a small improvement over GPR.
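A stretched kernel of this kind amounts to an anisotropic squared-exponential covariance. The parametrization below — a stretch factor s applied to the length scale along the dispersion axis — is illustrative, not astrofix's exact sGPR definition:

```python
import math

def stretched_sq_exp(dy, dx, a=3.0, h=1.0, s=2.0):
    """Anisotropic (stretched) squared-exponential covariance between
    two pixels separated by (dy, dx).  The length scale along the
    dispersion axis (x here) is stretched by a factor s relative to
    the cross-dispersion axis; s, a, and h are illustrative values.
    Setting s = 1 recovers the isotropic GPR kernel."""
    return a * math.exp(-0.5 * (dy ** 2 / h ** 2
                                + dx ** 2 / (s * h) ** 2))
```

With s > 1, the covariance decays more slowly along the dispersion direction than across it, so a bad pixel borrows more information from pixels along the spectral trace — appropriate for an arc lamp, where the image varies slowly along the dispersion axis.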

The data in this case are from an arc lamp and show little structure perpendicular to the dispersion direction. The results would differ for spectroscopy of a point source. Depending on the strength of spectral features, the image in the dispersion direction might appear smoother than in the slit direction, the reverse of this case. For this reason, we urge caution in the use of sGPR with astrofix.

5.4 PSF Photometry and Astrometry on Low-SNR Stellar Images with Bad Pixels

The previous examples demonstrate astrofix's ability to repair high-SNR images. In the low-SNR limit, however, the residual minimization process of astrofix favors a uniform kernel with a → 0 to smooth out the background noise. This reduces astrofix to a simple average, which is suboptimal for restoring the useful information in a pixel. We therefore recommend that the user either restrict the training to the brighter pixels in the low-SNR image, or train on a high-SNR image taken by the same instrument (and under the same configuration) and apply the optimized kernel to the low-SNR image. Because such a kernel is trained on strongly correlated neighboring pixels, it will be non-uniform and can produce a wider residual distribution for low-SNR bad pixels. However, the optimized kernel performs substantially better than the uniform kernel in repairing pixels of practical use, such as pixels containing the PSF of a star.

As an example, we compare astrofix and astropy.convolution in terms of PSF photometry and astrometry on the truncated stellar image in Figure 1. We generate artificial bad pixels in its PSF and add white Gaussian noise to lower its SNR. We then apply both astrofix and astropy.convolution to repair the PSF at SNR = 5, 6, 7, …, 40. We train the GPR kernel on the rest of the image, which has high SNR, and obtain a = 3.02, h = 0.72. For astropy.convolution, we apply a Gaussian kernel of σ = 0.95 to match the σ of the fitted PSF. Finally, we perform PSF photometry and astrometry on the fixed images and compute the deviations of the fitted centroids from the true centroid position (Δx and Δy, in units of pixels) and the fractional difference between the fitted flux and the true flux (ΔF/F). For both photometry and astrometry, we assume Gaussian PSFs and use IterativelySubtractedPSFPhotometry and IRAFStarFinder in the Python package Photutils with the default parameters. We repeat the procedure for 1000 random noise samples at each SNR level.
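The fitted quantities Δx, Δy, and ΔF/F can be illustrated with a simple least-squares Gaussian fit, standing in for Photutils' PSF photometry for brevity; the star parameters and noise level below are arbitrary:

```python
import numpy as np
from scipy.optimize import curve_fit

def gauss2d(coords, flux, x0, y0, sigma):
    """Flux-normalized 2D Gaussian PSF model, raveled for curve_fit."""
    x, y = coords
    g = np.exp(-((x - x0) ** 2 + (y - y0) ** 2) / (2 * sigma ** 2))
    return (flux / (2 * np.pi * sigma ** 2)) * g.ravel()

# Synthetic star: true flux 1000, centroid (12.3, 11.7), sigma 0.95.
yy, xx = np.mgrid[0:25, 0:25].astype(float)
true_img = gauss2d((xx, yy), 1000.0, 12.3, 11.7, 0.95).reshape(25, 25)
rng = np.random.default_rng(1)
img = true_img + rng.normal(0.0, 0.5, true_img.shape)  # white noise

# Fit flux, centroid, and width; compare to the truth.
popt, _ = curve_fit(gauss2d, (xx, yy), img.ravel(),
                    p0=[900.0, 12.0, 12.0, 1.0])
flux_fit, x_fit, y_fit, _ = popt
dx, dy = x_fit - 12.3, y_fit - 11.7       # centroid deviations (pixels)
dF_over_F = (flux_fit - 1000.0) / 1000.0  # fractional flux error
```

In the paper's test, these three quantities are computed from the Photutils fits on each repaired image and accumulated over 1000 noise realizations per SNR level; the sketch shows one realization at high SNR.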

Figure 11 shows the results for two different sets of bad pixels. astrofix produces much smaller systematic errors on all three fitted parameters, showing that the optimized GPR kernel is the more accurate model of the PSF, although at very small SNRs it generally produces a slightly larger spread in the fitted parameters. In other words, by training the kernel on high-SNR pixels we trade precision for accuracy, putting greater weight on modeling the PSF instead of averaging out the noise. This strategy maintains the scientific value of astrofix even for low-SNR images, and a high-SNR training set is almost always available from the brighter regions of the same image or from other images taken by the same instrument.

Refer to caption
Figure 11: Performance comparison between astrofix and astropy.convolution in the low-SNR limit in terms of PSF photometry and astrometry. The bad pixels (white in the top row) in the PSF of the truncated stellar image in Figure 1 are fixed by astrofix and astropy.convolution separately, and the differences between the PSF photometry and astrometry results on the fixed images and the true results are presented in the next three rows as a function of SNR. The GPR kernel was trained on the rest of the image, which has high SNR. The Gaussian kernel has σ = 0.95 to match the σ of the PSF. 1000 different noise samples are generated at each SNR level, and the thick lines enclose the 90% intervals. For both bad pixel configurations, astrofix produces much smaller systematic errors on all three fitted parameters, showing that the optimized GPR kernel is the more accurate model of the PSF, although at very small SNRs it generally produces a slightly larger spread (less precision) in the fitted parameters. This is because the GPR was trained to model the PSF instead of averaging out noise.

A fully uniform background would remove the bias seen in Figure 11. In this case, each neighboring pixel would contain the same information as the bad pixel, and the optimal approach would be a simple average of neighboring good pixels. Our approach with astrofix, however, will better retain small-scale structure atop a smoothly varying background.

6 Conclusions

We have presented astrofix, an image imputation algorithm based on Gaussian Process Regression. The ability to optimize the interpolation kernel within a parametric family of kernels, combined with the simple and natural handling of clusters of bad pixels and image edges, enables astrofix to perform several times better than median replacement and astropy.convolution. It adapts well to various instruments and image types, including both imaging and spectroscopy.

As discussed in Section 5.4, the performance of astrofix depends on the SNR of the image, and its optimization process requires a high-SNR training set. For a low-SNR image, the training set must be drawn from the image's brighter pixels or from other images taken by the same instrument, assuming that the instrument has negligible variation from image to image. This assumption is usually reliable, but it can sometimes fail due to changes in the seeing conditions, so it takes some extra effort to check these conditions before choosing a training set. The sGPR covariance function in astrofix should be used with caution because it may smooth out a point source in the slit direction. The default parameters in astrofix, such as the initial guess for the optimization and the cutoffs for the training set selection, were chosen carefully but have not been tested extensively on all image types. Nevertheless, they worked well in all of our examples.

It is worth noting that astrofix works best when there is sufficient structure on the scale of the kernel (1/2 ≤ h ≤ w). For a uniform background or a smooth, extended source varying only on a large scale h ≫ w, the correlation-to-noise ratio would be extremely low within the size of the kernel, meaning that neighboring pixels contribute almost no information to interpolate a pixel's value. In this case, astrofix may not be as good as a simple average of neighboring pixels. However, astrofix will still outperform other methods in correcting bad pixels if the extended source does have structure on the kernel scale for astrofix to learn, which is often the case. A better interpolation for each bad pixel then leads naturally to a better recovery of large-scale information.

We thank the anonymous referee for an insightful and detailed report. This work made use of observations from the LCOGT network. It is our pleasure to thank Curtis McCully for sharing the LCO images used in the examples and for a helpful conversation. We also thank Tim Morton for his feedback on this manuscript and insightful suggestions on the astrofix code. Finally, we would like to thank the entire UCSB Physics 240 class in the Spring 2020 quarter, where we discussed the very first ideas that led to this work.

Appendix A Computational Cost of astrofix

Because astrofix requires optimization and reconstruction of kernels, it has a greater computational cost than most existing image imputation approaches. Nevertheless, one can reduce the cost by using a smaller kernel width or by optimizing only once for each instrument without much loss in accuracy (see Figure 5). The computational cost also depends on the number of bad pixels in an image. In particular, the cost of training is largely independent of the bad pixel fraction, while the cost of fixing the image scales with the bad pixel fraction as follows:

C_{\rm fix} = A N_{\rm bad} + B N_{\rm bad} \left( 1 - (1-f)^{w^2-1} \right) \qquad (A1)

Here, the first term gives the total cost of convolving N_bad bad pixels with a kernel of width w, and the constant A is the computational cost of O(w²) floating point operations. The second term gives the total cost of reconstructing the kernel, where f = N_bad/N_tot is the bad pixel fraction and 1 − (1−f)^(w²−1) is the probability that a bad pixel has at least one bad pixel neighbor in the surrounding w × w region. The constant B is the average time of applying Equation (3), which consists of a matrix inversion of complexity O(w⁶) and a matrix-vector multiplication of complexity O(w⁴). As a consequence, B ≫ A, and the cost of kernel construction is greater than the cost of convolution for large values of f. Assuming the bad pixels to be randomly positioned throughout the detector, the scaling of C_fix with f depends on how f compares to w⁻²:

C_{\rm fix} \approx \begin{cases} A f N_{\rm tot} + B f^2 (w^2 - 1) N_{\rm tot} & f \ll w^{-2} \\ A f N_{\rm tot} + B f N_{\rm tot} & f \gg w^{-2} \end{cases} \qquad (A2)

In the limit f ≫ w⁻², (1−f)^(w²−1) → 0, and both terms in C_fix increase linearly with f. The cost per bad pixel stays constant because almost every bad pixel has at least one bad neighbor, which requires reconstructing the kernel. In the limit f ≪ w⁻², (1−f)^(w²−1) → 1 − f(w²−1), so the kernel-reconstruction term in C_fix depends quadratically on f. The cost per bad pixel becomes smaller because the kernel does not need to be recomputed for most bad pixels. The two limits are visible in Figure 12, which shows how the computational time of different components of astrofix scales with the fraction of bad pixels in the cluster image of Section 5.1. Each point is the median of 100 samples. The solid blue line fits C_fix using Equation (A1), while the dashed and dotted lines show the corresponding convolution cost and kernel construction cost. With w = 9, only for f < 1% does the scaling of the computational cost deviate from the linear relation of the large-f limit. This linear scaling will set in at much lower values of f if bad pixels tend to come in clusters (for example, due to cosmic rays).
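The two limits follow directly from the neighbor probability in Equation (A1) and can be checked numerically:

```python
def p_neighbor(f, w):
    """Probability that a randomly placed bad pixel has at least one
    bad neighbor in the surrounding w x w region (Equation A1)."""
    return 1.0 - (1.0 - f) ** (w ** 2 - 1)

w = 9                               # kernel width used in Figure 12

# Small-f limit: p ~ f (w^2 - 1), so the kernel-construction term
# B * N_bad * p = B * f * N_tot * p is quadratic in f.
p_small = p_neighbor(1e-4, w)       # ~ 1e-4 * 80 = 0.008

# Large-f limit: p -> 1, so both terms scale linearly with f.
p_large = p_neighbor(0.3, w)        # ~ 1
```

For clustered bad pixels (e.g., cosmic ray tracks), the probability of a bad neighbor is far higher than this random-placement estimate, which is why the linear regime then sets in at much lower f.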

For real images with real bad pixels, it usually takes ~1 second on our Intel Core i5-8300H 2.30 GHz CPU to correct 2000 bad pixels, and the total time ranges from a few seconds to almost a minute. For example, the CHARIS data in Section 5.2 took a total of 14.96 seconds, of which 8.56 seconds were spent fixing 26338 bad pixels on the 2048×2048 image and 6.33 seconds were spent on the optimization process with 438996 pixels in the training set. The rest of the time went to defining the training set.

Refer to caption
Figure 12: Computational cost of astrofix in correcting a 200×200 region from the cluster image in Section 5.1. Each point is the median of 100 samples. The cost of fixing the image (solid blue line) depends linearly on f for large f and quadratically on f for f < 1%. The cost of training stays roughly constant. Because of the complexity of matrix inversion, the cost of kernel construction (dotted blue line) is much greater than the cost of convolution (dashed blue line) for most values of f.

References

  • Ambikasaran et al. (2015) Ambikasaran, S., Foreman-Mackey, D., Greengard, L., Hogg, D. W., & O’Neil, M. 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence, 38, 252
  • Astropy Collaboration et al. (2018) Astropy Collaboration, Price-Whelan, A. M., Sipőcz, B. M., et al. 2018, AJ, 156, 123
  • Astropy Collaboration et al. (2013) Astropy Collaboration, Robitaille, T. P., Tollerud, E. J., et al. 2013, A&A, 558, A33
  • Beuzit et al. (2019) Beuzit, J. L., Vigan, A., Mouillet, D., et al. 2019, A&A, 631, A155
  • Bradley et al. (2020) Bradley, L., Sipőcz, B., Robitaille, T., et al. 2020, astropy/photutils: 1.0.0
  • Brandt et al. (2017) Brandt, T. D., Rizzo, M., Groff, T., et al. 2017, Journal of Astronomical Telescopes, Instruments, and Systems, 3, 048002
  • Claudi et al. (2008) Claudi, R. U., Turatto, M., Gratton, R. G., et al. 2008, in Society of Photo-Optical Instrumentation Engineers (SPIE) Conference Series, Vol. 7014, Ground-based and Airborne Instrumentation for Astronomy II, ed. I. S. McLean & M. M. Casali, 70143E
  • Farage & Pimbblet (2005) Farage, C. L., & Pimbblet, K. A. 2005, PASA, 22, 249
  • Harris et al. (2020) Harris, C. R., Millman, K. J., van der Walt, S. J., et al. 2020, Nature, 585, 357–362
  • Horne (1986) Horne, K. 1986, PASP, 98, 609
  • Hunter (2007) Hunter, J. D. 2007, Computing in Science Engineering, 9, 90
  • Jenkins et al. (2010) Jenkins, J. M., Caldwell, D. A., Chandrasekaran, H., et al. 2010, ApJ, 713, L87
  • Macintosh et al. (2014) Macintosh, B., Graham, J. R., Ingraham, P., et al. 2014, Proceedings of the National Academy of Science, 111, 12661
  • Macintosh et al. (2008) Macintosh, B. A., Graham, J. R., Palmer, D. W., et al. 2008, in Society of Photo-Optical Instrumentation Engineers (SPIE) Conference Series, Vol. 7015, Adaptive Optics Systems, ed. N. Hubin, C. E. Max, & P. L. Wizinowich, 701518
  • Perrin et al. (2014) Perrin, M. D., Maire, J., Ingraham, P., et al. 2014, in Society of Photo-Optical Instrumentation Engineers (SPIE) Conference Series, Vol. 9147, Ground-based and Airborne Instrumentation for Astronomy V, ed. S. K. Ramsay, I. S. McLean, & H. Takami, 91473J
  • Peters et al. (2012) Peters, M. A., Groff, T., Kasdin, N. J., et al. 2012, in Society of Photo-Optical Instrumentation Engineers (SPIE) Conference Series, Vol. 8446, Ground-based and Airborne Instrumentation for Astronomy IV, ed. I. S. McLean, S. K. Ramsay, & H. Takami, 84467U
  • Pych (2004) Pych, W. 2004, PASP, 116, 148
  • Rhoads (2000) Rhoads, J. E. 2000, PASP, 112, 703
  • Sand et al. (2011) Sand, D. J., Brown, T., Haynes, R., & Dubberley, M. 2011, in American Astronomical Society Meeting Abstracts, Vol. 218, American Astronomical Society Meeting Abstracts #218, 132.03
  • The GPy authors (2014) The GPy authors. 2012–2014, GPy: A Gaussian process framework in Python, http://github.com/SheffieldML/GPy
  • van Dokkum (2001) van Dokkum, P. G. 2001, PASP, 113, 1420
  • Virtanen et al. (2020) Virtanen, P., Gommers, R., Oliphant, T. E., et al. 2020, Nature Methods, 17, 261
  • Windhorst et al. (1994) Windhorst, R. A., Franklin, B. E., & Neuschaefer, L. W. 1994, PASP, 106, 798
  • Zhang & Brandt (2021) Zhang, H., & Brandt, T. D. 2021, Hengyuez/astrofix: v0.1.0
  • Zhang & Bloom (2020) Zhang, K., & Bloom, J. S. 2020, ApJ, 889, 24