Galaxy Image Deconvolution for Weak Gravitational Lensing with Unrolled Plug-and-Play ADMM

Tianao Li (李天骜),¹ Emma Alexander²
¹Department of Electronic Engineering, Tsinghua University, Beijing 100084, P.R.China
²Department of Computer Science, Northwestern University, Evanston, IL 60208, USA E-mail: [email protected]: [email protected]

(Accepted XXX. Received YYY; in original form ZZZ)

Abstract

Removing optical and atmospheric blur from galaxy images significantly improves galaxy shape measurements for weak gravitational lensing and galaxy evolution studies. This ill-posed linear inverse problem is usually solved with deconvolution algorithms enhanced by regularisation priors or deep learning. We introduce a so-called "physics-informed deep learning" approach to the Point Spread Function (PSF) deconvolution problem in galaxy surveys. We apply algorithm unrolling and the Plug-and-Play technique to the Alternating Direction Method of Multipliers (ADMM), in which a neural network learns appropriate hyperparameters and denoising priors from simulated galaxy images. We characterise the time-performance trade-off of several methods for galaxies of differing brightness levels as well as our method’s robustness to systematic PSF errors and network ablations. We show an improvement in reduced shear ellipticity error of 38.6% (SNR=20)/45.0% (SNR=200) compared to classic methods and 7.4% (SNR=20)/33.2% (SNR=200) compared to modern methods. \faGithub

keywords:

techniques: image processing - methods: data analysis - galaxies: general.

^†^†pubyear: 2022^†^†pagerange: Galaxy Image Deconvolution for Weak Gravitational Lensing with Unrolled Plug-and-Play ADMM–Galaxy Image Deconvolution for Weak Gravitational Lensing with Unrolled Plug-and-Play ADMM

1 Introduction

Images captured by real-world systems are degraded by nonidealities in the atmosphere, optics, and sensors. Processing images to correct these effects can significantly reduce errors in physical measurements drawn from astronomical images, such as the galaxy shape estimation used in cosmological weak lensing analysis to advance our understanding of cosmological models, the relationship between visible and dark matter, and cosmological-scale gravity (Kaiser et al., 1994; Bartelmann & Schneider, 2001; Mandelbaum et al., 2015; Mandelbaum, 2018). Conventionally, these nonidealities are modelled by convolution with a blur kernel called the Point Spread Function (PSF), followed by noise. Many computer vision algorithms have been developed to solve this ill-posed linear inverse problem, known as deconvolution. Here, we characterise the improvements that can be made to shear estimation through several deconvolution approaches, with a focus on a data-driven unrolled optimisation.

Two classic deconvolution algorithms still in widespread use are inverse filtering, such as Fourier division, and the iterative Richardson-Lucy algorithm (Richardson, 1972; Lucy, 1974), which optimizes the image estimate using gradient descent with an adaptive step size. These methods can amplify image noise, highlighting the importance of parametric models for stable estimation of galaxy shape, such as the Fourier Power Function Shapelets method (Li & Mandelbaum, 2022), which uses a pipeline of five shape estimation steps based on shapelet representation and bias correction.

With the recent flourishing of deep learning, many new galaxy image deconvolution algorithms have been developed. In computer vision, so-called physics-informed machine learning combines an explicit forward model of the image formation process with data-driven regularisation. Notably, Sureau et al. (2020) introduced two such deconvolution approaches enhanced by deep learning: Tikhonet, which post-processes the results of Tikhonov deconvolution with a deep neural net, and ADMMNet, an iterative deconvolution framework with learned denoising parameters. Nammour et al. (2022) then proposed ShapeNet, which retrains the Tikhonet network with a loss function designed to preserve galaxy shape. ADMMNet is based on the Alternating Direction Method of Multipliers (ADMM, Boyd et al., 2011), which finds the maximum-a-posteriori estimate of the ground truth image by alternating between tractable subproblems. After each subproblem, estimates are updated with a variable step size, where the optimal step size depends on the generally-unknown prior distribution of the dataset. ADMMNet takes advantage of the Plug-and-Play approach (Chan et al., 2016), in which the subproblems are seen to correspond to alternating deconvolution and denoising steps. This modular structure is exploited by including an off-the-shelf denoiser (typically a deep neural network) in place of the standard denoising step, for more accurate data-driven priors within the context of an interpretable and well-studied optimisation problem. In the case of ADMMNet, the deconvolution step is untrained and the denoising network is trained to remove additive noise from unblurred images, then the alternating steps are applied for a freely-varying number of iterations until convergence.

Refer to caption — Figure 1: Overview of the image processing pipeline. *Image simulations*: we simulated galaxy images for the upcoming LSST ground-based sky survey by modifying galaxy images from the COSMOS Dataset with the modular galaxy image simulation toolkit Galsim. The raw images were randomly sheared and rotated to simulate weak lensing effects. These images were convolved with randomised atmospheric and optical PSFs and scaled to varying brightnesses before Gaussian noise was added. Finally, all images were downsampled to the pixel scale of LSST. *Model*: we unrolled Plug-and-Play ADMM with a Gaussian likelihood into an N-layer neural network. The proposed method takes N=8 but separate networks were trained for N = 2 and 4 for full characterization of the method. Light grey boxes denote fixed computations while green boxes are neural networks (a small convolutional neural network for hyperparameter estimation and a ResUNet for denoising). Green outlines indicate the use of predicted hyperparameters in the fixed computations and horizontal green lines show that weights are shared across network layers. The network was trained with multiscale L1 loss but reconstruction quality is analysed using reduced shear error.

We propose an Unrolled Plug-and-Play ADMM network. In an "unrolled" or "unfolded" iterative optimisation, the iteration count is fixed a priori and each layer of a neural network performs one stage of the optimisation (Monga et al., 2021). Layers contain fixed computations reflecting built-in knowledge (the image formation model) as well as optimisation parameters (the deep denoising network and a small number of step size hyperparameters) that are treated as learnable weights. This approach allows efficient, end-to-end training of a limited number of weights and a predictable compute time. Fig. 1 shows the proposed network architecture, with fixed computations shown in light grey and learned parameters indicated in green. The green boxes are neural networks, green outlines denote the use of trained hyperparameters, and horizontal green lines indicate that learned weights are identical across layers. Details of network structure follow Sanghvi et al. (2022), including a subnetwork for parameter estimation so that the network can respond to the image noise level and PSF, though we use a Gaussian-noise-based update step in place of the Poisson step. We propose an 8-iteration network but report performance for 2- and 4-iteration networks as well.

In this paper, we benchmark classical and modern deconvolution methods on realistic simulated data for ellipticity error, compute time, and sensitivity to PSF errors, showing that unrolled Plug-and-Play ADMM, not previously applied to astronomical images, outperforms former methods robustly. We also provide source code and trained weights of the suggested model. Our user-friendly framework allows one to easily simulate their own data set and train the model under their settings. Our code and tutorials are all available on our GitHub repository.

2 Experiments

2.1 Image Simulations

To benchmark classical methods and train our network, clean ground truth images must be paired with realistic PSFs and noisy blurred images. We created our dataset by simulating ground-based observations with the modular galaxy image simulation toolkit Galsim¹¹1https://github.com/GalSim-developers/GalSim (Rowe et al., 2015) and the COSMOS Real Galaxy Dataset (Mandelbaum et al., 2012).

The image simulation process is illustrated in Fig. 1 and was based on the upcoming Legacy Survey of Space and Time (LSST, Ivezić et al., 2019). Raw images were picked randomly from the COSMOS dataset and then randomly sheared (uniform distribution of $[0.01,0.05]$ ) and rotated (uniform distribution of $[0,2\pi]$ ) to simulate weak lensing effects. We then applied both atmospheric and optical blurring. The atmospheric PSFs were simulated with the Kolmogorov function, whose seeing (FWHM) was sampled from a distribution (Connolly et al., 2010) as used in the GREAT3 Challenge (Mandelbaum et al., 2014). To simulate the atmospheric variation that leads to shape noise, we added shear (uniform distribution of $[0.01,0.03]$ ) and rotation (uniform distribution of $[0,2\pi]$ ) to the Kolmogorov functions. Our optical PSFs were simulated using Galsim, using the LSST diameter for diffraction and built-in ranges for obscuration, defocus, coma aberration and astigmatism. The two PSF components were convolved to get the overall PSF, which is saved for training and testing. The clean galaxy image (ground truth) was then convolved with this simulated PSF, then the brightness was scaled to fit the SNR of interest (SNR= $\sqrt{\Sigma I(x)^{2}/\sigma^{2}}$ ). Specifically, based on LSST read noise and a sky level of 350 ADUs/pixel²²2https://smtn-002.lsst.io, we added Gaussian noise with a standard deviation $\sigma=19.4$ ADUs. Finally, all images were downsampled to meet the pixel scale of the LSST (0.2 arcsec). While these images are mean-subtracted, a significant difference between our dataset and the GREAT3 simulations is that we do not normalise image brightness, so the hyperparameter subnetwork can readily adapt our method in response to SNR based on input brightness.

2.2 Network Training

We train on 40,000 samples using multi-scale L1 loss between ground truth and reconstructed images for 50 epochs with the Adam optimizer on an NVIDIA GeForce RTX 2080 Ti GPU. A separate network was trained for each iteration count (N=2,4,8). ResUNet (He et al., 2016) serves as the plugin denoiser (17,007,744 parameters), introducing skip connections in the neural network to avoid the problem of vanishing or exploding gradients as the network gets deeper with the increase of iterations. A CNN (80,236 parameters) was trained to learn the step size hyperparameters. We use the open source machine learning framework PyTorch, with which we also provide tools for dataset generation and benchmarking.

2.3 Performance across SNR levels

We tested the suggested method at different SNR levels and compared its performance to other algorithms. Sample images deconvolved by each of the algorithms are shown in Fig 2. At low SNR, the Richardson-Lucy algorithm results in noisier reconstructions in which the galaxy is gradually submerged in the noisy patterns in the background. Modern methods suppress background noise but can distort galaxy shape, particularly at low SNR. We use the FPFS toolkit for ellipticity estimation, in conjunction with the known PSF for inverse filtering (labelled as the FPFS method) or with a single-point PSF as a stable estimator without inverse filtering, which we apply to the output of all reported deconvolution methods throughout the paper. Errors are reported in terms of median reduced shear ( $\Delta g$ , Bartelmann & Schneider, 2001).

We characterise the reduced shear ellipticity error of each method across a dataset of 10,000 simulated galaxy images for 7 SNR levels from 20 to 200, shown in Fig. 3. We tested the Richardson-Lucy algorithm with 10 iterations, the built-in robust Fourier division method of FPFS, and a reimplementation and retraining of Tikhonet and ShapeNet on our dataset, and we also include reported errors from ADMMNet for comparison. The proposed method consistently outperforms classical and modern methods across SNR levels.

2.4 Time-Performance Trade-off

The improved performance of the proposed method comes at the cost of elevated computational demand. Fig. 4 shows the trade-off between compute time and ellipticity error at SNR levels of 20, 40 and 100 on the test dataset. Error bars show 5x the standard error of the ellipticity errors. Reducing the iterations in the proposed method reduces the compute time per galaxy at the cost of slightly increased error (purple). This is not a universal trade-off for iterative methods; increasing the number of iterations of the Richardson-Lucy algorithm boosts noise and degrades performance on dim galaxies (green). Performance for ADMMNet (blue) is as originally reported by the authors, who also report an average time per galaxy of 8.75 seconds, orders of magnitude higher than the other methods considered here. This is not likely due to hardware differences, as our GPU roughly matches their reported timing for Tikhonet. Rather, ADMMNet is allowed to iterate until convergence rather than undergoing a prefixed number of steps, enabling dramatically extended computation.

2.5 Robustness to Systematic Errors in PSF

In real-world observations, PSFs at the center of the galaxies are calculated through interpolation from PSFs measured at stars, yet the spatial variance of the PSF across the field of view is often difficult to model (Terry et al., 2022). As a result, interpolation cannot precisely reconstruct the effective PSF across galaxies. This requires the subsequent deconvolution algorithm to be robust to systematic errors in the PSFs.

Errors in PSF shape and size result in additive and multiplicative shear biases (Mandelbaum, 2018). In our PSF model, these two types of systematic errors correspond to errors in the PSF shear and the full-width half maximum (FWHM) of the Kolmogorov function, i.e., the atmospheric seeing. To test the robustness of the tested methods to PSF noise, we characterized the ellipticity errors caused by systematic errors (i.e. model mismatch) between the PSF used for image formation and the PSF provided to the deconvolution algorithms. Fig. 5 shows these errors, with an illustration of the type and level of error shown in the cartoon PSFs below the plots. Under reasonable PSF errors, the proposed method maintains best performance, but other modern methods are more robust to large mistakes in PSF shape and size. A 2-iteration version of the proposed method (dashed purple) is less sensitive to large model mismatch, showing that the iteration number for the proposed method should be reduced when high errors in PSF estimation are expected.

2.6 Ablation Studies

By altering the structure of the network and retraining, we can isolate the effects of individual design choices to the proposed method. Fig. 6 shows the effects of 3 types of alterations. Changes in the loss function are shown in teal. Changing from multiscale loss to MSE loss, as used in Tikhonet, reduces performance slightly, and to shape loss, as used in Shapenet, leads to comparable performance. Removal of the subnetwork that predicts hyperparameters (dashed purple), as in Tikhonet, Shapenet, and ADMMNet, reduces performance at high SNR, as does removal of iteration layers (see Fig. 4). ADMMNet-style training (denoiser-only training on noised images without PSFs, shown in dark blue) leads to a considerable loss in performance. It should be noted that ADMMNet is allowed to iterate to convergence, so that our fixed-iteration network structure does not reflect identical computation.

3 Discussion

The proposed method could be further improved in several ways. While we have primarily tested deconvolution algorithms in a non-blind (known-PSF) setting, in practice the PSFs cannot be known exactly and the process is at least partially blind. We present an initial characterisation of performance in response to simple errors in PSF, but higher order errors must be considered and addressed for best performance. Our simulated data can be improved with more accurate optical models for the LSST PSF (Terry et al., 2022) and more realistic generative models such as Peterson et al. (2005), which will improve further once observations begin. Additionally, to address spatial variations, PSF interpolation can be included in the pipeline and combined with low-rank deconvolution used in Farrens et al. (2017) and Yanny et al. (2020). This may be of particular interest for larger galaxies with fine internal structure that could be resolved using high-quality deconvolution. It would also be interesting to test the effect of training set size on the generalizability of the methods considered here. The shared weights across iteration layers in a unrolled network may confer an advantage in a low-training-data setting, in which parameter-efficient methods would be able to take advantage of data from novel instruments earlier in their collection periods.

4 Conclusions

We have simulated galaxy images to train a specific subset of the parameters in a well-studied optimisation method. In comparison with traditional and learned methods, the suggested model is more robust and effective, because it combines a powerful and well-analysed iterative method with data-driven priors learned by an end-to-end optimised deep neural network through algorithm unrolling. Previous work on data-driven ADMM, ADMMNet, trained only the denoising step and required a varying number of iterations to reach convergence, leading to orders of magnitude more time per galaxy to compute. Our fixed iteration count and joint training across both subproblems achieves better performance in much less time.

We tested the suggested method on data simulated with realistic galaxy images, and compared its performance to classical and modern methods in terms of the median and variance of reduced shear ellipticity error over a wide range of galaxy brightnesses, as well as the time consumption, the robustness to systematic PSF noise, and the influence of network ablations and iteration count. Our experiments showed that 8-iteration unrolled Plug-and-Play ADMM provides the most accurate ellipticity measurements across SNR levels at the cost of slightly elevated compute time.

Because image statistics and computational resources vary across applications and observational conditions, the networks we trained will not be the best method for every galaxy image deconvolution problem. Our simulations were modelled on a specific instrument currently in development, and deviations from these parameters will reduce performance. Because of this, we provide an open-source, user-friendly code base for generating and retraining on simulated datasets for astronomical deconvolution problems. Simulation results allow researchers to develop and select the most suitable algorithm for their performance and speed needs.

Acknowledgements

We would like to thank Adam Miller, Jason Wang, Keming Zhang, Nick Antipa, and an anonymous reviewer for their valuable advice, and the Computational Photography Lab (esp. Aniket Dashpute) at Northwestern University for support of computational resources.

Data Availability

Our GitHub repository (https://github.com/Lukeli0425/Galaxy-Deconvolution) includes the code needed to regenerate the simulated data, as well as a link to our dataset. We also provide the suggested unrolled ADMM model with trained weights under LSST settings along with tutorials for using it, allowing a convenient deconvolution with the suggested method for further research.

References

Bartelmann & Schneider (2001) Bartelmann M., Schneider P., 2001, Physics Reports, 340, 291
Boyd et al. (2011) Boyd S., Parikh N., Chu E., Peleato B., Eckstein J., et al., 2011, Foundations and Trends® in Machine learning, 3, 1
Chan et al. (2016) Chan S. H., Wang X., Elgendy O. A., 2016, IEEE Transactions on Computational Imaging, 3, 84
Connolly et al. (2010) Connolly A., et al., 2010, in Modeling, Systems Engineering, and Project Management for Astronomy IV. pp 612–621
Farrens et al. (2017) Farrens S., Mboula F. N., Starck J.-L., 2017, Astronomy & Astrophysics, 601, A66
He et al. (2016) He K., Zhang X., Ren S., Sun J., 2016, in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
Ivezić et al. (2019) Ivezić Ž., et al., 2019, The Astrophysical Journal, 873, 111
Kaiser et al. (1994) Kaiser N., Squires G., Broadhurst T., 1994, arXiv preprint astro-ph/9411005
Li & Mandelbaum (2022) Li X., Mandelbaum R., 2022, arXiv preprint arXiv:2208.10522
Lucy (1974) Lucy L. B., 1974, The astronomical journal, 79, 745
Mandelbaum (2018) Mandelbaum R., 2018, Annual Review of Astronomy and Astrophysics, 56, 393
Mandelbaum et al. (2012) Mandelbaum R., Lackner C., Leauthaud A., Rowe B., 2012, COSMOS real galaxy dataset, doi:10.5281/zenodo.3242143, https://doi.org/10.5281/zenodo.3242143
Mandelbaum et al. (2014) Mandelbaum R., et al., 2014, The Astrophysical Journal Supplement Series, 212, 5
Mandelbaum et al. (2015) Mandelbaum R., et al., 2015, Monthly Notices of the Royal Astronomical Society, 450, 2963
Monga et al. (2021) Monga V., Li Y., Eldar Y. C., 2021, IEEE Signal Processing Magazine, 38, 18
Nammour et al. (2022) Nammour F., Akhaury U., Girard J., Lanusse F., Sureau F., Ali C. B., Starck J.-L., 2022, Astronomy & Astrophysics
Peterson et al. (2005) Peterson J., et al., 2005, in American Astronomical Society Meeting Abstracts. pp 26–23
Richardson (1972) Richardson W. H., 1972, JoSA, 62, 55
Rowe et al. (2015) Rowe B., et al., 2015, Astronomy and Computing, 10, 121
Sanghvi et al. (2022) Sanghvi Y., Gnanasambandan A., Chan S. H., 2022, in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
Sureau et al. (2020) Sureau F., Lechat A., Starck J.-L., 2020, Astronomy & Astrophysics, 641, A67
Terry et al. (2022) Terry S. K., et al., 2022, AIROPA IV: Validating Point Spread Function Reconstruction on Various Science Cases, doi:10.48550/ARXIV.2209.05489, https://arxiv.org/abs/2209.05489
Yanny et al. (2020) Yanny K., et al., 2020, Light: Science & Applications, 9, 1