
Single-shot Phase Retrieval from
a Fractional Fourier Transform Perspective

Yixiao Yang, Ran Tao, Kaixuan Wei, and Jun Shi This work is supported in part by the National Natural Science Foundation of China under Grant 62027801 and Grant 61421001. (Corresponding author: Ran Tao.) Yixiao Yang and Ran Tao are with the Department of Information and Electronics, Beijing Institute of Technology, Beijing 100081, China (e-mail: [email protected] and [email protected]). Kaixuan Wei is with the Department of Computer Science, Princeton University, USA (e-mail: [email protected]). Jun Shi is with the Communication Research Center, Harbin Institute of Technology, Harbin 150001, China (e-mail: [email protected]).
Abstract

The realm of classical phase retrieval concerns itself with the arduous task of recovering a signal from its Fourier magnitude measurements, which are fraught with inherent ambiguities. A single-exposure intensity measurement is commonly deemed insufficient for the reconstruction of the primal signal, given that the absent phase component is imperative for the inverse transformation. In this work, we present a novel single-shot phase retrieval paradigm from a fractional Fourier transform (FrFT) perspective, which involves integrating the FrFT-based physical measurement model within a self-supervised reconstruction scheme. Specifically, the proposed FrFT-based measurement model addresses the aliasing artifacts problem in the numerical calculation of Fresnel diffraction, featuring adaptability to both short-distance and long-distance propagation scenarios. Moreover, the intensity measurement in the FrFT domain proves highly effective in alleviating the ambiguities of phase retrieval and relaxing the previous conditions on oversampled or multiple measurements in the Fourier domain. Furthermore, the proposed self-supervised reconstruction approach harnesses the fast discrete algorithm of FrFT alongside untrained neural network priors, thereby attaining preeminent results. Through numerical simulations, we demonstrate that both amplitude and phase objects can be effectively retrieved from a single-shot intensity measurement using the proposed approach, providing a promising technique for support-free coherent diffraction imaging.

Index Terms:
Single-shot phase retrieval, fractional Fourier transform, Fresnel diffraction, and untrained neural network.

I Introduction

Phase Retrieval (PR) is a long-standing challenge of estimating a signal from phase-less linear measurements, encountered in various science and engineering fields including X-ray crystallography [1], computational microscopy [2], computer-generated holography [3], and many more [4]. In optical systems, the direct measurement of the phase by electronic detectors is often difficult, so computational phase retrieval comes into play [5]. Furthermore, the observation in the far-field diffraction regime or at the focal plane of a lens can be formulated by the Fourier transform, which gives rise to Fourier PR.

Despite its popularity, the Fourier PR problem faces a significant obstacle as it is not uniquely solvable solely from the Fourier magnitude [6]. This is due to the inherent structure of the Fourier transform, leading to trivial ambiguities such as target translation and inversion, as well as non-trivial ambiguities that preserve the Fourier magnitude. As a result, designing efficient and convergent reconstruction algorithms becomes an extremely challenging task. To overcome this issue, researchers have explored various strategies, such as obtaining additional measurements, incorporating prior knowledge of the object, or a combination of both.

Figure 1: Inversion or shift of the object yields the same magnitude in the Fourier domain, but a different one in the FrFT domain. Here we adopt the 0.5th-order fractional Fourier transform (FrFT).

I-A Related Work

Numerous physical measurement techniques have been developed over the years to record redundant information and serve as a remedy to the ill-posed nature of the PR problem. A notable example is coherent diffraction imaging (CDI), which mitigates the ambiguities of PR by invoking pre-determined optical masks that offer valuable constraints [7]. Pioneering research focused on utilizing the non-zero support as an optical mask, thereby enabling oversampled Fourier transform measurements and incorporating prior information on the signal, such as positivity and real-valuedness [8]. Besides oversampling, the use of random masks has gained attention as a way to introduce multiple measurements within the optimization process [9]. This technique is founded on the principle that various masks can modulate the signal of interest and introduce information redundancy to reduce ambiguities [10]. It can be implemented through diverse means, including the utilization of masks [11] and oblique illuminations [12]. Another method to enrich observation diversity is ptychography, or scanning CDI, which employs lateral sample (or probe) shifting to illuminate new regions of the sample while ensuring adequate overlap [13]. The main idea is to seek uniqueness from the highly overlapped Fourier measurements [14]. Despite the above successes, the requirement of redundant (oversampled/multiple/overlapped) measurements significantly increases system complexity and data acquisition time, making these techniques unsuitable for dynamic imaging scenarios [15]. In addition, in 3D imaging applications it is extremely difficult to collect multiple observations, so reliable single-shot PR methods are urgently needed [16].

With the recent successes of deep learning techniques in computational imaging, there has been an increasing interest in employing data-driven methods for snapshot phase retrieval, whose goal is to retrieve the object from a single diffraction pattern. This approach circumvents multiple measurements by leveraging labeled data to train a neural network to learn the inverse mapping of the single-shot measurement function. Notable examples of this class include the work by Wang et al. [17], as well as SiSPRNet [18]. The former utilizes a trained convolutional neural network (CNN) to recover synthetic aperture images from a snapshot acquired on a 16-camera array, and the latter retrieves phase directly from a single Fourier-intensity measurement using a CNN. While these data-driven approaches yield promising results, they suffer from reduced interpretability and generalization because they rely solely on black-box neural networks and disregard the underlying physical model. In addition, these supervised methods still face data challenges, requiring pairs of measurements and their corresponding ground truth images as training datasets. In many scientific imaging applications, collecting such paired data is extremely expensive if not impossible.

The above optical settings typically assume that measurements are taken at the Fourier plane or in the Fraunhofer regime. In fact, the intensity pattern can be collected at an arbitrary plane between the object field and the far field, implying a new measurement model that differs from the Fourier transform. The regime of Fresnel (near-field) diffraction, described by the Fresnel integral, is the most notable case in this context. This has led to growing interest in the concept of Fresnel phase retrieval [19], including Fresnel CDI [20] and Fresnel ptychography [21]. Although experimentally validated, this approach is susceptible to aliasing artifacts in the numerical calculation of near-field diffraction, arising from the sampling of the non-band-limited chirp function [22]. Meanwhile, the Fractional Fourier Transform (FrFT), which employs the chirp function as its kernel, has garnered significant attention in the signal-processing community as a generalization of the Fourier transform [23]. Developments in the theoretical framework of FrFT have been rapid [24], covering sampling [25], filtering [26], and discrete algorithms [27]. In [28], the definitional concept of fractional Fourier optics was introduced, based on the equivalence between the FrFT and the Fresnel integral. This work revealed that the propagation of light between two spherical surfaces could be interpreted as a process of continual FrFT. This has given rise to a few works on the problem of phase retrieval from multiple FrFT magnitude measurements [29]. Despite the theoretical merits, the existing research in this domain has predominantly disregarded the integration of quantitative analysis with numerical diffraction calculations. In addition, its potential in single-shot PR is largely overlooked by the community.

I-B Motivation and Contributions

To fill this gap, the work presented in this paper revisits the power of FrFT and proposes a novel single-shot PR paradigm from a joint physics and mathematics perspective with several insights developed. Firstly, the FrFT, as a well-defined signal processing tool, can offer a new option for numerical diffraction calculations in the near field. Compared with the Fresnel integral, the FrFT benefits from the fractional Fourier sampling theory and fast discrete algorithms, avoiding aliasing artifacts in numerical calculations. Thus, the FrFT-based measurement model enables accurate and efficient computation of diffraction fields, bridging the gap between theory and practical applications. This finding supports that the single FrFT measurement can be accurately obtained through physical near-field diffraction, facilitating a snapshot operation without the need for extra optical components, such as masks.

Secondly, the measurement in the FrFT domain has specific theoretical properties that could benefit PR. In contrast to the severe phase loss in the Fourier domain, the magnitude of the FrFT contains both amplitude and phase information of the original signal. Specifically, the FrFT provides a space-frequency representation of the signal, containing both spatial and frequency details. Therefore, changes in spatial signal amplitude are also reflected in the FrFT amplitude, which can be leveraged to overcome some spatial ambiguities of PR, as depicted in Fig. 1. In addition, the modulus of the fractional Fourier space-frequency representation describes the frequency change of the signal in space and thus contains the signal's phase information. These theoretical foundations support the feasibility of recovering the original signal from a single FrFT measurement.

Based on the above insights, the contributions of this paper are summarized in the following:

  • To the best of our knowledge, we are the first to address the problem of reconstructing a two-dimensional image from a single intensity measurement in the FrFT domain, coined Single-shot Fractional Fourier Phase Retrieval (SFrFPR).

  • To this end, we formulate a detailed FrFT-based measurement model for near-field diffraction calculation, featuring adaptability to both short-distance and long-distance propagation scenarios. Furthermore, we propose a self-supervised reconstruction method, which harnesses the fast algorithm of FrFT alongside untrained neural network priors, thereby achieving superior results for recovering both amplitude and phase objects.

  • Moreover, we provide theoretical analyses based on the fractional Fourier time-frequency representation to clarify the rationality of the proposed SFrFPR. Through simulation, we demonstrate that the single FrFT-based measurement effectively improves the uniqueness of the solution, relaxing the conditions on oversampled or multiple measurements in the Fourier domain. Last but not least, we present a promising imaging capability empowered by the proposed method, i.e., support-free CDI.

I-C Outline

The remainder of this paper is organized as follows. Section II introduces the proposed SFrFPR, including the FrFT-based measurement model, the self-supervised reconstruction approach, and theoretical analysis. Numerical simulation results are presented in Section III. The conclusion is drawn in Section IV. A limited version of this work with preliminary results was presented as a conference paper at the 2023 IEEE ICASSP [30]. In this manuscript, we further address key issues such as fast discrete algorithms, scalable sampling, and self-supervised reconstruction methods, rendering the proposed SFrFPR suitable for practical applications.

II Method

In this section, we first recall near-field diffraction theory and introduce the proposed FrFT-based measurement model. Then we formulate the SFrFPR problem and propose a self-supervised reconstruction approach based on an untrained neural network (UNN) scheme. Finally, theoretical analyses based on the fractional Fourier time-frequency representation are provided.

II-A Near-field Measurement Model based on FrFT

The proliferation of near-field measurements and imaging in recent years can be attributed to advancements in numerical propagation models. Unlike the far-field measurement, which is formulated by the Fourier transform, it is very challenging to establish a unified framework that can effectively, precisely, and flexibly compute near-field diffraction. In this part, we present a novel near-field measurement model based on the FrFT to solve this dilemma.

II-A1 Near-field Diffraction

In the realm of optics, the Fresnel diffraction equation emerges as an approximation of the Kirchhoff–Fresnel diffraction formula, tailored for characterizing optical propagation in the near field. The diffraction pattern, denoted U_{d}, produced when light traverses an object aperture U_{0}, is calculated by

U_{d}(x,y)=\frac{e^{i\frac{2\pi}{\lambda}d}}{i\lambda d}\iint U_{0}(x^{\prime},y^{\prime})e^{i\frac{\pi}{\lambda d}[(x-x^{\prime})^{2}+(y-y^{\prime})^{2}]}dx^{\prime}dy^{\prime}, (1)

where d denotes the propagation distance, \lambda is the wavelength of light, and i is the imaginary unit.

Normally, this Fresnel integral (1) can be expressed via a single Fourier transform, called SFT-Fresnel, written as

U_{d}(x,y)=\frac{e^{i\frac{2\pi}{\lambda}d}}{i\lambda d}e^{i\frac{\pi(x^{2}+y^{2})}{\lambda d}}\mathcal{F}[U_{0}(x^{\prime},y^{\prime})e^{i\frac{\pi(x^{\prime 2}+y^{\prime 2})}{\lambda d}}](\frac{x}{\lambda d},\frac{y}{\lambda d}), (2)

where \mathcal{F} represents the two-dimensional Fourier transform.

Note that there is a quadratic phase factor in the Fourier transform and that the phase oscillates rapidly over short propagation distances, which poses a great challenge for accurate sampling and computing. Even worse, when employing Fast Fourier Transform (FFT) for discrete calculations, it suffers from serious aliasing artifacts [22].

As a dual version, this Fresnel integral (1) can also be seen as a convolution operation [31] and computed with the help of the Fourier transform, stated as

U_{d}(x,y)=\mathcal{F}^{-1}[\mathcal{F}[U_{0}(x^{\prime},y^{\prime})]\times H(f_{x},f_{y})], (3)

where H(f_{x},f_{y})=\mathcal{F}[\frac{e^{i\frac{2\pi}{\lambda}d}}{i\lambda d}e^{i\frac{\pi(x^{2}+y^{2})}{\lambda d}}] denotes the transfer function of Fresnel diffraction and (f_{x},f_{y}) are the Fourier coordinates conjugate to the real-space coordinates (x,y).

Moreover, this transfer function has the analytical expression H(f_{x},f_{y})=e^{i\pi d(\frac{2}{\lambda}-\lambda(f_{x}^{2}+f_{y}^{2}))}. However, it becomes increasingly challenging to appropriately sample the transfer function, whose phase component oscillates rapidly over long propagation distances. Numerical errors still arise in FFT-based calculations due to the fixed sampling pitch [31]. While this problem can be mitigated by the non-uniform FFT [32], the computational complexity also increases, sacrificing calculation efficiency.
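To make the two sampling regimes concrete, the sketch below computes the critical propagation distance at which the discretized quadratic phase is exactly Nyquist-sampled; the standard argument is that the transfer-function (convolution) model is adequately sampled below this distance and the single-FFT model above it. All numerical values here are illustrative assumptions, not parameters from this paper.

```python
import numpy as np

# Illustrative (assumed) system parameters.
wavelength = 500e-9   # 500 nm illumination
dx = 10e-6            # 10 um sampling pitch in the source plane
N = 512               # number of samples per axis

# Critical distance: the chirp is exactly Nyquist-sampled at d_crit.
d_crit = N * dx**2 / wavelength

def adequately_sampled(d, model):
    """Chirp-sampling condition for each discrete Fresnel model."""
    if model == "SFT-Fresnel":   # chirp sampled in the object plane
        return d >= d_crit       # suitable for long distances
    if model == "Fresnel-TF":    # chirp sampled in the frequency plane
        return d <= d_crit       # suitable for short distances
    raise ValueError(model)

print(f"critical distance: {d_crit * 100:.2f} cm")  # prints: critical distance: 10.24 cm
```

The FrFT-based model introduced below sidesteps this dichotomy, which is what the All-distance entry in Table I refers to.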

In summary, it is exceedingly difficult to formulate an efficient model that computes the whole near-field diffraction range without sampling problems. In the following, we present an innovative FrFT-based model that addresses the sampling problem of the Fresnel integral without aliasing errors in the treatment of the chirp function, featuring adaptability to both short-distance and long-distance propagation scenarios.

II-A2 FrFT-based Measurement Model

The fractional Fourier transform (FrFT) is a generalized form of the Fourier transform whose kernel function is a quadratic phase term. The pth-order FrFT of a continuous signal f(x) is defined as [33]

X_{\alpha}(u)=\mathcal{F}^{p}[f](u)\triangleq\int f(x)K_{\alpha}(u,x)dx, (4)

where \mathcal{F}^{p} denotes the FrFT operator and K_{\alpha}(u,x) is the transform kernel with \alpha=\frac{\pi}{2}p, given as follows

K_{\alpha}(u,x)\triangleq\left\{\begin{aligned} &A_{\alpha}e^{i\pi\left(\mathrm{cot}\alpha x^{2}-2\mathrm{csc}\alpha ux+\mathrm{cot}\alpha u^{2}\right)},\;\alpha\neq n\pi\\ &\delta(x-u),\;\alpha=2n\pi\\ &\delta(x+u),\;\alpha=(2n\pm 1)\pi\end{aligned}\right. (5)

with A_{\alpha} defined as A_{\alpha}\triangleq\sqrt{1-i\mathrm{cot}\alpha} and \delta(t) being the Dirac delta function. In particular, the FrFT reduces to the (inverse) Fourier transform when p=\pm 1.

Correspondingly, the inverse FrFT of X_{\alpha}(u) in (4) is

f(x)=\mathcal{F}^{-p}[X_{\alpha}](x)\triangleq\int X_{\alpha}(u)K_{-\alpha}(u,x)du, (6)

where \mathcal{F}^{-p} and K_{-\alpha}(u,x) denote the inverse of \mathcal{F}^{p} and the kernel obtained from (4) and (5), respectively.
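As a quick numerical sanity check (an illustration, not part of the paper's derivation), sampling the kernel (5) at the dimensionless pitch 1/\sqrt{N} and setting p = 1 should reproduce the ordinary centered DFT, since A_{\alpha} = 1 and the chirp terms vanish at \alpha = \pi/2:

```python
import numpy as np

N = 64
n = np.arange(-N // 2, N // 2)      # centered sample indices
delta = 1.0 / np.sqrt(N)            # dimensionless sampling pitch

alpha = np.pi / 2                   # p = 1: FrFT reduces to the Fourier transform
A = np.sqrt(1 - 1j / np.tan(alpha))  # A_alpha = sqrt(1 - i*cot(alpha)) ~= 1 here

# Sampled FrFT kernel K_alpha(u, x) on the centered grid.
u, x = np.meshgrid(n * delta, n * delta, indexing="ij")
K = A * np.exp(1j * np.pi * (u**2 / np.tan(alpha)
                             - 2 * u * x / np.sin(alpha)
                             + x**2 / np.tan(alpha)))

rng = np.random.default_rng(0)
f = rng.standard_normal(N) + 1j * rng.standard_normal(N)

frft1 = (K @ f) * delta  # Riemann-sum approximation of (4) with p = 1
# Centered DFT via the standard fftshift/ifftshift identity, normalized.
dft = np.fft.fftshift(np.fft.fft(np.fft.ifftshift(f))) / np.sqrt(N)
```

Here `frft1` and `dft` agree to machine precision, confirming that the discretized kernel behaves as expected at the Fourier-transform order.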

Figure 2: Illustration of the FrFT-based measurement model. Specifically, the numerical calculation of near-field diffraction can be implemented through the pth-order discrete fractional Fourier transform (DFrFT) and the scalable sampling \uparrow s_{2}.

Now, we consider the two-dimensional FrFT of the input aperture U_{0}, which can be defined as

\mathcal{F}^{p}[U_{0}](x,y)=\iint K_{\alpha}(x,x^{\prime})K_{\alpha}(y,y^{\prime})U_{0}(x^{\prime},y^{\prime})dx^{\prime}dy^{\prime}, (7)

where K_{\alpha}(x,x^{\prime}) and K_{\alpha}(y,y^{\prime}) are the transform kernels along the x-axis and y-axis, respectively.

To connect the Fresnel integral (1) and the FrFT (7), we introduce the scaled fields \hat{U}_{d}(x,y)\equiv U_{d}(s_{2}x,s_{2}y) and \hat{U}_{0}(x^{\prime},y^{\prime})\equiv U_{0}(s_{1}x^{\prime},s_{1}y^{\prime}) with s_{1}=\sqrt{\frac{\lambda d}{\mathrm{tan}\alpha}} and s_{2}=\sqrt{\frac{\lambda d}{\mathrm{sin}\alpha\mathrm{cos}\alpha}}.

Then we can obtain the scaled diffraction field

\hat{U}_{d}(x,y)=\frac{e^{i\frac{2\pi}{\lambda}d}}{i\lambda d}\iint\hat{U}_{0}(x^{\prime},y^{\prime})e^{i\frac{\pi}{\lambda d}[(x-x^{\prime})^{2}+(y-y^{\prime})^{2}]}dx^{\prime}dy^{\prime} (8)
=\frac{e^{i\frac{2\pi}{\lambda}d}}{i\mathrm{tan}\alpha}\iint\hat{U}_{0}(x^{\prime},y^{\prime})e^{i\pi[\frac{x^{2}+y^{2}}{\mathrm{sin}\alpha\mathrm{cos}\alpha}+\frac{x^{\prime 2}+y^{\prime 2}}{\mathrm{tan}\alpha}-\frac{2(xx^{\prime}+yy^{\prime})}{\mathrm{sin}\alpha}]}dx^{\prime}dy^{\prime} (9)
=\frac{e^{i\frac{2\pi}{\lambda}d}}{i\mathrm{tan}\alpha+1}e^{i\pi\mathrm{tan}\alpha(x^{2}+y^{2})}\mathcal{F}^{p}[\hat{U}_{0}](x,y). (10)

Considering the intensity-only measurement, the spherical phase factor in (10) vanishes and we have

|\hat{U}_{d}(x,y)|=\frac{1}{\sqrt{\mathrm{tan}^{2}\alpha+1}}|\mathcal{F}^{p}[\hat{U}_{0}](x,y)|. (11)

Thereby, we conclude that the scaled amplitude distribution of the near-field diffraction can be interpreted as a continuous FrFT magnitude. Given the scale factor s_{1} and the physical propagation distance d, we know exactly the other scale factor s_{2}=s_{1}\sqrt{1+(\lambda d)^{2}/s_{1}^{4}} and the corresponding fractional order p=\frac{2}{\pi}\mathrm{arctan}(\lambda d/s_{1}^{2}). It can be seen that as the propagation distance d increases, the corresponding fractional order p gradually grows within the range [0,1]. This is consistent with the Fourier transform corresponding to propagation into the far field.
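These relations are easy to visualize numerically. The sketch below (with an assumed wavelength and scale factor s_{1}, for illustration only) evaluates p and s_{2} as functions of d and checks the limiting behavior:

```python
import numpy as np

# Assumed example values, not parameters from the paper.
wavelength = 500e-9   # metres
s1 = 1e-3             # input scale factor, metres

def frft_order(d):
    """p = (2/pi) * arctan(lambda * d / s1^2), in (0, 1) for d > 0."""
    return 2 / np.pi * np.arctan(wavelength * d / s1**2)

def output_scale(d):
    """s2 = s1 * sqrt(1 + (lambda * d)^2 / s1^4)."""
    return s1 * np.sqrt(1 + (wavelength * d)**2 / s1**4)

d = np.logspace(-3, 3, 13)   # propagation distances from 1 mm to 1 km
p = frft_order(d)
s2 = output_scale(d)
```

The order p increases monotonically toward 1 (the Fourier transform) as d grows, matching the far-field limit, and the identity s_{2}=s_{1}/\mathrm{cos}\alpha ties the two quantities together.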

Discrete Computing. In practical scenarios, the acquisition of the diffraction field is inherently limited to sampling. Therefore, accurate discretization of both the input field and the transform kernel is crucial for numerical calculations to avoid aliasing artifacts. Benefiting from previous studies, various types of discrete fractional Fourier transform (DFrFT) have been developed with distinct strategies and properties [27] that could be applied here. Prominent examples include the eigenvector decomposition-type DFrFT (ED-DFrFT) [34], the improved sampling-type DFrFT (IP-DFrFT) [35], and the closed-form sampling-type DFrFT (CF-DFrFT) [36]. Specifically, ED-DFrFT offers orthogonality, additivity, and reversibility at the expense of high computational complexity, while IP-DFrFT enjoys high discrete accuracy and efficient implementation but lacks unitarity. CF-DFrFT achieves reversibility by carefully considering sampling interval limits and involves low-complexity calculations in O(N\log N) time. Nevertheless, it is essentially similar to SFT-Fresnel and also faces the problem of numerical errors. Therefore, we select the IP-DFrFT approach in this work and present some details as follows.

Specifically, the samples of the transformed function in (11), spaced at intervals \triangle x and \triangle y, are obtained as

|\hat{U}_{d}(m\triangle x,n\triangle y)|=\frac{1}{\sqrt{\mathrm{tan}^{2}\alpha+1}}|\mathcal{F}^{p}[\hat{U}_{0}](m\triangle x,n\triangle y)|, (12)

where m and n run from -N/2 to N/2, and N is the number of samples.

Following the computation scheme of IP-DFrFT, we can get

\begin{split}&|\mathcal{F}^{p}[\hat{U}_{0}](m\triangle x,n\triangle y)|=A_{\alpha}^{2}|\sum\limits_{m^{\prime}}\sum\limits_{n^{\prime}}\hat{U}_{0}(m^{\prime}\triangle x^{\prime},n^{\prime}\triangle y^{\prime})\\ &\times\triangle x^{\prime}\triangle y^{\prime}e^{i\pi\mathrm{csc}\alpha[(m-m^{\prime})^{2}\triangle x\triangle x^{\prime}+(n-n^{\prime})^{2}\triangle y\triangle y^{\prime}]}\\ &\times e^{i\pi(\mathrm{cot}\alpha-\mathrm{csc}\alpha)(m^{\prime 2}\triangle x^{\prime 2}+n^{\prime 2}\triangle y^{\prime 2})}|,\end{split} (13)

where \triangle x^{\prime}=\triangle x and \triangle y^{\prime}=\triangle y are the sampling intervals, and m^{\prime},n^{\prime} index the discrete grid in the source field.

This calculation can be viewed as follows: the discrete source field is first modulated by the chirp function e^{i\pi(\mathrm{cot}\alpha-\mathrm{csc}\alpha)(m^{\prime 2}\triangle x^{\prime 2}+n^{\prime 2}\triangle y^{\prime 2})} and then convolved with another chirp function e^{i\pi\mathrm{csc}\alpha[(m^{\prime}\triangle x^{\prime})^{2}+(n^{\prime}\triangle y^{\prime})^{2}]}, which can be efficiently implemented by the FFT. Note that aliasing errors caused by the rapid oscillations of the kernel are avoided by exploiting the periodicity and additivity of the continuous FrFT. (There is an assumption that 0.5\leq|p|\leq 1.5. Taking advantage of the additivity property of the FrFT, we can extend the range of the parameter p to cover all its values; for example, for the range 0<p<0.5, we have \mathcal{F}^{p}=\mathcal{F}^{p-1+1}=\mathcal{F}^{p-1}\mathcal{F}^{1}.) Given this, we obtain an efficient, accurate, and unified method to calculate the whole diffraction field. It is worth mentioning that the fractional order p and the scale factor s_{2} must be dimensionally normalized during the numerical calculations, in order to eliminate sampling-related influences:

p=\frac{2}{\pi}\mathrm{arctan}(\frac{\lambda d}{s_{1}^{2}}\times\frac{N}{L^{2}}),\quad s_{2}=s_{1}\sqrt{1+\frac{(\lambda d\times\frac{N}{L^{2}})^{2}}{s_{1}^{4}}}, (14)

where L is the length of the input aperture.
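The factorization behind (13) rests on the algebraic identity \mathrm{cot}\alpha\,x^{\prime 2}-2\mathrm{csc}\alpha\,ux^{\prime}+\mathrm{cot}\alpha\,u^{2}=\mathrm{csc}\alpha(u-x^{\prime})^{2}+(\mathrm{cot}\alpha-\mathrm{csc}\alpha)(u^{2}+x^{\prime 2}). The following one-dimensional sketch verifies this numerically by comparing a direct summation of the sampled kernel against the modulate–convolve–modulate form; the convolution is written as an explicit sum here for clarity, whereas in practice it is carried out with FFTs. Grid and order are illustrative choices.

```python
import numpy as np

N = 128
alpha = 0.5 * np.pi * 0.7            # p = 0.7, within the assumed 0.5 <= |p| <= 1.5
dx = 1.0 / np.sqrt(N)
x = np.arange(-N // 2, N // 2) * dx  # shared input/output grid

rng = np.random.default_rng(1)
f = rng.standard_normal(N) + 1j * rng.standard_normal(N)

cot, csc = 1 / np.tan(alpha), 1 / np.sin(alpha)
A = np.sqrt(1 - 1j * cot)

# (a) Direct Riemann-sum of the sampled kernel (5).
u, xp = np.meshgrid(x, x, indexing="ij")
direct = A * np.exp(1j * np.pi * (cot * u**2 - 2 * csc * u * xp
                                  + cot * xp**2)) @ f * dx

# (b) Chirp-decomposed form: modulate, chirp-convolve, modulate.
mod = np.exp(1j * np.pi * (cot - csc) * x**2)
conv = np.array([(np.exp(1j * np.pi * csc * (xm - x)**2) * (mod * f)).sum()
                 for xm in x]) * dx
decomposed = A * mod * conv          # output chirp drops out under |.| in (13)
```

Both forms agree to machine precision, and since the output chirp `mod` has unit modulus, taking magnitudes as in (13) lets it be dropped entirely.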

TABLE I: A summary of commonly used Fresnel propagation models as compared to the proposed FrFT measurement model.

Method           | Complexity | Pixel Pitch                                                                                      | Range
SFT-FR           | 1 FFT      | \triangle_{x}=\frac{\lambda d}{N\triangle_{x}^{\prime}}                                          | Long-distance
Fresnel-TF       | 2 FFTs     | \triangle_{x}=\triangle_{x}^{\prime}                                                             | Short-distance
FrFT (this work) | 2 FFTs     | \triangle_{x}=\sqrt{1+(\frac{\lambda d}{N\triangle_{x}^{\prime 2}})^{2}}\,\triangle_{x}^{\prime} | All-distance
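As a worked example of (14) and of the Table I pixel-pitch entry for the FrFT model (with assumed, illustrative parameters and s_{1}=1, so that L=N\triangle_{x}^{\prime}):

```python
import numpy as np

# Assumed example geometry, not taken from the paper.
wavelength = 500e-9    # metres
N = 512
dx_in = 10e-6          # source-plane pixel pitch
L = N * dx_in          # aperture length
d = 0.2                # propagation distance, metres
s1 = 1.0

chirp_rate = wavelength * d * N / L**2          # equals lambda*d/(N*dx_in**2)
p = 2 / np.pi * np.arctan(chirp_rate / s1**2)   # normalized order, Eq. (14)
s2 = s1 * np.sqrt(1 + chirp_rate**2 / s1**4)    # sampling conversion factor
dx_out = s2 * dx_in                             # Table I pixel pitch, FrFT row
```

For this geometry the chirp rate is about 1.95, giving a fractional order of roughly 0.7 and an observation-plane pitch about twice the source pitch.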

Scalable Sampling. The scaling operator introduced in (11) can be handled by rescaling the samples in practice. Specifically, the scale factor s_{1} can usually be set to 1 to be consistent with the real input field, i.e., \hat{U}_{0}=U_{0}. Thereby, the original near-field measurement can be obtained as

|U_{d}(x,y)|=\frac{1}{\sqrt{\mathrm{tan}^{2}\alpha+1}}|\mathcal{F}^{p}[U_{0}](x/s_{2},y/s_{2})|, (15)

where s_{2}=\sqrt{1+(\lambda dN/L^{2})^{2}} can be seen as a sampling rate conversion factor.

Consistent with the previous considerations, the sampling interval within the FrFT domain remains \triangle x^{\prime}, leading to a pixel pitch of s_{2}\triangle x^{\prime} in the real observation plane. This, in turn, enables us to directly acquire the intensity measurements of the FrFT through near-field diffraction. For clarity, Table I summarizes the methods mentioned in this work, including their computational complexity, pixel pitch, and suitable propagation range. In addition, we can also adopt digital computation in the FrFT domain to eliminate the scaling operator. Note that the scaling factor s_{2} grows as the propagation distance d increases and is always greater than 1, which corresponds to the upsampling case. Following digital sampling rate conversion, we can implement it through interpolation, low-pass filtering, and decimation in sequence, achieving flexible sampling intervals. Overall, the proposed FrFT-based measurement model is illustrated in Fig. 2.
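A minimal sketch of the digital rescaling step: a detector pixel at coordinate x corresponds to FrFT coordinate x/s_{2}, so with s_{2}>1 the FrFT pattern is evaluated on a denser grid (the upsampling case). Here simple linear interpolation stands in for the full interpolation/low-pass/decimation chain, and the Gaussian pattern and parameters are illustrative assumptions.

```python
import numpy as np

N = 256
dx = 1.0
s2 = 1.953                               # assumed conversion factor, > 1
u = (np.arange(N) - N // 2) * dx         # FrFT-domain sample positions

pattern = np.exp(-(u / 40.0)**2)         # smooth stand-in for |F^p[U_0]| samples

x_frft = u / s2                          # FrFT coordinates seen by the detector
resampled = np.interp(x_frft, u, pattern)  # linear-interpolation upsampling
```

Because the detector grid maps into a narrower region of the FrFT domain, the resampled pattern is effectively stretched relative to the raw DFrFT output.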

Figure 3: The schematic illustration of SFrFPR. The input to the neural network is a diffraction pattern of an amplitude or phase object, captured in a single snapshot. The neural network processes this input and generates an estimated object. Subsequently, the estimated object is numerically propagated to simulate the resulting diffraction pattern using the proposed FrFT measurement model. To guide the training process, we calculate the mean square error (MSE) between the real and estimated measurements. This MSE serves as the loss value, updating the parameters of the neural network via the auto-differentiation technique.

II-B Self-supervised Reconstruction Approach based on UNN

A surge of research activity over the last decade has focused on developing reconstruction algorithms for Fourier PR, including iterative optimization approaches [10, 37, 38, 39] and neural network approaches [40, 41, 42, 43]. In this part, we define the SFrFPR problem for the first time and propose an untrained neural network (UNN) based reconstruction approach for SFrFPR.

II-B1 Problem Formulation

Based on the proposed FrFT-based measurement model, the SFrFPR problem can be mathematically stated as

\mathrm{Given}\quad I=|\mathcal{F}^{p}O|,\qquad \mathrm{find}\quad O, (16)

where I, \mathcal{F}^{p}, and O represent the intensity of the diffraction pattern, the corresponding pth-order FrFT measurement model, and the underlying object, respectively.

Typically, the inverse problem of SFrFPR can be solved within a regularized optimization framework by minimizing the following cost functional:

\mathop{\mathrm{minimize}}_{O}\quad\frac{1}{2}||I-|\mathcal{F}^{p}O|||^{2}_{2}+\beta\mathcal{R}(O), (17)

where \mathcal{R}(O) is the regularization term associated with prior knowledge of the object, and \beta is a parameter controlling the weight of the regularizer.

This is a non-convex and non-linear ill-posed problem, mainly caused by the loss of phase. Existing iterative optimization PR methods such as Wirtinger gradient descent [10] and plug-and-play methods [38, 30], are expected to provide transferable ideas to solve this problem. However, these methods all rely on accurate forward and backward projections, resulting in the use of ED-DFrFT. The high computational complexity prevents its application to large-scale and real-time reconstruction. The fast discrete FrFT algorithm represented by IP-DFrFT can solve this dilemma, but it lacks unitarity and suffers from numerical errors during the inverse transformation [27], making it unsuitable for these iterative methods.

II-B2 UNN-based Method

To address this issue, we put forward an alternative solution, leveraging the concept of the deep image prior (DIP) [44] and employing an untrained neural network (UNN) for SFrFPR. The key idea is designing a neural network f_{NN}(I,\Theta) to directly perform the inverse mapping from the captured intensity measurement I to the desired object O by adjusting the network weights \Theta according to the following empirical risk:

\mathop{\mathrm{minimize}}_{\Theta}\quad\frac{1}{2}||I-|\mathcal{F}^{p}f_{NN}(I,\Theta)|||^{2}_{2}. (18)

Compared with (17), the main difference is that we optimize the neural network's parameters instead of the estimated object directly. This problem transformation brings many benefits. First of all, thanks to the development of neural network technology, this optimization can be well handled by the auto-differentiation technique. In this way, the proposed method only requires the differentiability of the forward function and avoids the need for a backward function. In addition, the proposed method operates in a self-supervised manner, reconstructing the desired amplitude or phase object by adjusting the network weights, guided by the captured intensity measurement and incorporating the FrFT-based measurement model. Therefore, the proposed method does not rely on a large number of paired ground truth data and corresponding observations.

Overall Pipeline. The overall pipeline of our method is outlined in Fig. 3. The input to the neural network is a diffraction pattern of an amplitude or phase object, captured in a single snapshot. The neural network processes this input and generates an estimated object. Subsequently, the estimated object is numerically propagated to simulate the resulting diffraction pattern using the proposed FrFT-based measurement model. Engaging a loss function computed between the measurement and the estimated diffraction pattern, the parameters of the neural network are adjusted via the auto-differentiation technique. Notably, this training process only involves the FrFT forward function and unlabeled simulated/measured diffraction patterns. After updating, the trained network can perform the direct inversion from a single intensity pattern to the real space object without requiring the iterative process.
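The pipeline can be caricatured in a few lines. In this deliberately tiny sketch, a free parameter vector stands in for the untrained network's output, the ordinary FFT (p = 1) stands in for the FrFT forward model, and a finite-difference gradient stands in for auto-differentiation; only the forward operator and the single intensity measurement drive the updates, with no labels and no inverse transform. All sizes and step sizes are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 8
truth = rng.random(N)
I = np.abs(np.fft.fft(truth))            # the single intensity measurement

def loss(O):
    """Self-supervised data-fidelity term, cf. Eq. (18)."""
    return 0.5 * np.sum((I - np.abs(np.fft.fft(O)))**2)

# Toy "network output": a parameter vector, warm-started for demonstration.
O = truth + 0.3 * rng.standard_normal(N)
history = [loss(O)]
eps, lr = 1e-6, 0.05
for _ in range(50):
    # Forward-difference gradient: only the forward operator is evaluated.
    grad = np.array([(loss(O + eps * np.eye(N)[k]) - loss(O)) / eps
                     for k in range(N)])
    O -= lr * grad
    history.append(loss(O))
```

The loss decreases steadily using forward evaluations alone, which is the property the full method exploits: auto-differentiation through the (differentiable) IP-DFrFT forward model removes any need for an inverse DFrFT.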

Network Implementation. In a UNN, the network architecture is crucial, because the structure itself acts as an implicit prior that regularizes the reconstruction. Motivated by the strong representation ability of Transformer networks, we explore their potential as untrained networks for SFrFPR. Specifically, we adopt a general U-shaped Transformer architecture, as delineated by [45], and build a small hierarchical encoder-decoder network with a modest computational burden. Given a snapshot pattern as the network input, a convolutional layer with LeakyReLU extracts low-level features. These features then traverse two encoder stages, each comprising a LeWin Transformer block and one down-sampling layer. The LeWin Transformer block captures long-range dependencies via non-overlapping windows instead of global self-attention, keeping the computational cost on high-resolution feature maps low [45]. A bottleneck stage consisting of a LeWin Transformer block concludes the encoding. Two decoder stages then recover the features, each containing an up-sampling layer and a LeWin Transformer block. Finally, a convolutional layer outputs the reconstructed object.
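The non-overlapping window mechanism amounts to reshaping the feature map into independent patches before attention and stitching them back afterwards. A minimal numpy sketch of this partition/merge step (helper names are ours; the actual LeWin blocks in [45] add attention, normalization, and further machinery on top):

```python
import numpy as np

def window_partition(x, w):
    """Split an (H, W, C) feature map into (num_windows, w*w, C) patches."""
    H, W, C = x.shape
    x = x.reshape(H // w, w, W // w, w, C)
    return x.transpose(0, 2, 1, 3, 4).reshape(-1, w * w, C)

def window_reverse(wins, w, H, W):
    """Inverse of window_partition: stitch windows back into (H, W, C)."""
    C = wins.shape[-1]
    x = wins.reshape(H // w, W // w, w, w, C)
    return x.transpose(0, 2, 1, 3, 4).reshape(H, W, C)
```

Self-attention is then applied within each (w*w, C) window, so the attention cost grows linearly with the number of windows rather than quadratically with H·W, which is what makes the block cheap on high-resolution feature maps.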

While Transformer models have undoubtedly demonstrated remarkable performance in numerous supervised learning tasks, their untapped potential as untrained neural networks remains to be further explored. To the best of our knowledge, this is the first attempt to introduce the Transformer-based architecture into an untrained neural network for solving the PR problem. In Section III, experimental results demonstrate that the Transformer structure priors leverage both local and global dependencies with better reconstruction performance than convolutional neural networks in a self-supervised scheme.

II-C Theoretical Analysis of SFrFPR

Next, we present theoretical analyses from the perspective of fractional Fourier space-frequency representation to clarify the rationale of the proposed SFrFPR compared to Fourier PR. Specifically, we take a one-dimensional finite-energy signal f(x) as an example (the generalization to the two-dimensional spatial signals considered in this paper is straightforward) and introduce the fractional Wigner–Ville distribution [46], defined as

\mathcal{W}_{\alpha}(x,u)=\frac{|\mathrm{csc}\alpha|}{2\pi}\int f^{*}(x-\frac{\tau}{2})f(x+\frac{\tau}{2})e^{-i\tau(u\mathrm{csc}\alpha-x\mathrm{cot}\alpha)}d\tau, (19)

where u denotes the fractional frequency, \alpha represents the fractional angle, and * indicates the conjugate operation.

In particular, the fractional Wigner–Ville distribution degenerates into the classical Wigner–Ville distribution when \alpha=\pi/2, depicted as

\mathcal{W}(x,w)=\frac{1}{2\pi}\int f^{*}(x-\frac{\tau}{2})f(x+\frac{\tau}{2})e^{-i\tau w}d\tau, (20)

where w is the frequency.
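As a numerical sanity check on (20), a discrete Wigner–Ville distribution can be evaluated directly. The convention below (our own, not the paper's) uses integer lags, so the continuous \tau/2 offsets become integer offsets, and drops the 1/2\pi constant; under this convention the frequency marginal \sum_k W[n,k] = N|f[n]|^2 holds, the discrete analog of \int\mathcal{W}(x,w)dw \propto |f(x)|^2.

```python
import numpy as np

def discrete_wvd(f):
    """Discrete Wigner-Ville distribution W[n, k] (integer-lag convention,
    constants dropped). O(N^3); intended only for small sanity checks."""
    N = len(f)
    W = np.zeros((N, N), dtype=complex)
    k = np.arange(N)
    for n in range(N):
        for t in range(-(N // 2), N // 2):
            if 0 <= n - t < N and 0 <= n + t < N:
                kernel = np.conj(f[n - t]) * f[n + t]
                W[n, :] += kernel * np.exp(-2j * np.pi * k * t / N)
    return W

n = np.arange(32)
f = np.exp(2j * np.pi * 0.1 * n**2 / 32)   # a discrete chirp
W = discrete_wvd(f)
```

Summing W over the frequency index k leaves only the zero-lag term, which is why the marginal recovers the instantaneous power |f[n]|^2 exactly.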

It can be seen that the fractional Wigner–Ville distribution represents the signal jointly over space x and fractional frequency u, whereas the classical Wigner–Ville distribution represents it jointly over space x and frequency w. Comparing the two definitions (19) and (20), we further obtain the relationship between the fractional and classical Wigner–Ville distributions as

\mathcal{W}_{\alpha}(x,u)=\frac{|\mathrm{csc}\alpha|}{2\pi}\mathcal{W}(x,u\mathrm{csc}\alpha-x\mathrm{cot}\alpha). (21)

Therefore, the relationship between the fractional frequency u, the space x, and the frequency w is

w=u\mathrm{csc}\alpha-x\mathrm{cot}\alpha. (22)

Solving (22) for u, we get

u=w\mathrm{sin}\alpha+x\mathrm{cos}\alpha. (23)

This shows that the variable of the fractional Fourier transform, i.e., the fractional frequency, inherently couples space and frequency and can thus depict how the signal's frequency content changes over space. In view of this, for any finite-energy signal, a new space-frequency representation can be defined based on its fractional Fourier transform as

\mathcal{T}_{\alpha}(x,u)\triangleq\mathcal{F}_{\alpha}(w\mathrm{sin}\alpha+x\mathrm{cos}\alpha), (24)

where \mathcal{T}_{\alpha}(x,u) is called the fractional Fourier space-frequency representation of the signal.

Based on the above analysis, the intensity observation of the FrFT is equivalent to the modulus value of the space-frequency representation. Given this, the FrFT measurement has many unique and useful properties that the Fourier measurement does not have, which are beneficial to phase retrieval.

On the one hand, the FrFT measurement contains amplitude information of the signal, owing to the space-frequency coupling characteristics of the FrFT [33]. In contrast to the pronounced discrepancy between the Fourier plane and the image plane, the FrFT domain exhibits data distributions closely related to those in the spatial domain. Therefore, certain changes of the signal in the spatial domain also affect the FrFT measurement. For example, space-shifted and conjugate-inverted signals produce different FrFT measurements according to the spatial-shift and reversal properties of the FrFT [47], respectively, as shown in Fig. 1. On the other hand, the FrFT measurement also contains phase information of the signal. According to (24), the modulus of the fractional Fourier space-frequency representation reveals how the frequency of the signal changes with space, which is exactly the information carried by the phase of the signal.

The above analysis shows that although we only collect the amplitude spectrum of the fractional Fourier transform and lose its phase spectrum, the amplitude and phase information about the original signal is not lost but is encoded in the FrFT measurement. Therefore, signal recovery from a single FrFT measurement is achievable through the reconstruction algorithm. By contrast, the Fourier magnitude is invariant to trivial transformations such as spatial shifts and conjugate inversions, so this information about the original signal is irretrievably lost, making single-shot Fourier PR ill-posed.
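The shift-ambiguity claim is easy to check numerically. Below, a toy discrete FrFT is built as a fractional power of the unitary DFT matrix, which is one of several discrete FrFT definitions (the paper instead uses a fast discrete algorithm, cf. [35]); it suffices to show that the Fourier magnitude ignores a circular shift while a fractional-order magnitude does not:

```python
import numpy as np

N = 64
n = np.arange(N)
# Unitary DFT matrix: F @ x matches np.fft.fft(x) / sqrt(N).
F = np.exp(-2j * np.pi * np.outer(n, n) / N) / np.sqrt(N)

# Toy discrete FrFT: fractional power of F via eigendecomposition.
w, V = np.linalg.eig(F)
V_inv = np.linalg.inv(V)

def frft_mat(p):
    return V @ np.diag(w**p) @ V_inv

x = np.exp(-0.05 * (n - N // 2) ** 2)   # a smooth test pulse
x_shift = np.roll(x, 7)                 # circularly space-shifted copy

# Largest magnitude-spectrum difference between the pulse and its shift:
fourier_gap = np.max(np.abs(np.abs(F @ x) - np.abs(F @ x_shift)))
frft_gap = np.max(np.abs(np.abs(frft_mat(0.5) @ x)
                         - np.abs(frft_mat(0.5) @ x_shift)))
```

Here fourier_gap is zero up to roundoff, while frft_gap is not: the fractional-order magnitude retains the position information that the Fourier magnitude discards. The construction also satisfies the index additivity F^{0.5}F^{0.5} = F.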

TABLE II: Physical parameter configurations used in the numerical diffraction calculations.
Parameter Value
Wavelength λ = 500 nm
Spatial Window Length L = 1000 um
Total Sampling Number N = 512
Rectangular Aperture Width W = 500 um

III Numerical Simulations

In this section, we evaluate the proposed method through numerical simulations. First, we validate the effectiveness of the proposed FrFT measurement model in simulated optical settings. Then we report the results of the proposed reconstruction method. Finally, we demonstrate the potential of the proposed method for practical applications.

Figure 4: Comparison of the accuracy, evaluated by PSNR, of the SFT-Fresnel, the Fresnel Transfer Function (Fresnel-TF), and the proposed FrFT method in numerical diffraction propagation.
Ground-Truth SFT-Fresnel Fresnel-TF FrFT
Figure 5: Diffraction intensity patterns with the propagation distance (d = 1 mm, d = 10 mm, d = 50 mm from top to bottom), calculated by the numerical integration (Ground-Truth), single Fourier-transform-based Fresnel model (SFT-Fresnel), Fresnel transfer function model (Fresnel-TF), and the proposed FrFT-based measurement model (FrFT).
TABLE III: Average PSNR/SSIM performance comparisons of various reconstruction methods for “amplitude” and “phase” objects with different fractional Fourier orders on Set12 and Cell8. The best results are labeled in bold and the second are underlined.
Datasets Method Type FrFT Measurement Fourier
p=0.2 p=0.4 p=0.5 p=0.6 p=0.8 p=1
Set12 WF “amplitude” 5.92/0.03 5.87/0.04 22.15/0.72 22.09/0.70 20.58/0.58 10.99/0.11
GAP-tv 6.22/0.01 9.04/0.15 24.13/0.78 23.26/0.76 20.42/0.65 11.21/0.12
prDeep 6.60/0.20 6.07/0.16 26.59/0.85 26.88/0.85 27.57/0.85 9.37/0.27
PhysenNet 29.40/0.92 28.03/0.92 29.88/0.93 29.96/0.89 27.58/0.86 10.95/0.24
DeepMMSE 29.51/0.92 29.67/0.92 29.85/0.92 29.65/0.92 29.27/0.92 10.36/0.30
Ours 39.28/0.99 37.75/0.98 36.93/0.98 35.81/0.97 29.98/0.90 11.52/0.29
WF “phase” 9.40/0.03 12.64/0.26 14.90/0.63 15.23/0.62 12.56/0.50 12.10/0.19
GAP-tv 8.50/0.02 8.49/0.13 12.73/0.56 12.85/0.51 8.94/0.35 10.77/0.27
PhysenNet 14.73/0.76 20.74/0.91 21.02/0.92 20.20/0.85 21.96/0.90 5.83/0.12
DeepMMSE 18.38/0.86 19.01/0.87 20.00/0.88 21.02/0.88 21.90/0.86 9.56/0.28
Ours 29.48/0.98 30.57/0.98 30.50/0.98 31.40/0.98 27.93/0.93 10.96/0.47
Cell8 WF “amplitude” 7.03/0.02 6.96/0.02 21.45/0.76 21.06/0.73 17.92/0.57 11.56/0.05
GAP-tv 7.34/0.02 10.26/0.15 20.78/0.73 19.85/0.69 16.97/0.55 12.05/0.06
prDeep 8.41/0.09 8.62/0.10 23.93/0.77 23.16/0.77 21.53/0.71 7.39/0.06
PhysenNet 25.21/0.85 28.12/0.93 29.54/0.94 29.30/0.94 22.65/0.74 11.71/0.15
DeepMMSE 25.21/0.83 25.31/0.83 25.49/0.84 25.48/0.84 24.45/0.81 11.70/0.20
Ours 37.57/0.99 35.41/0.98 33.66/0.97 30.64/0.93 24.78/0.81 11.90/0.17
WF “phase” 8.76/0.03 11.65/0.16 15.53/0.65 15.93/0.64 11.54/0.36 11.70/0.09
GAP-tv 8.65/0.02 8.68/0.08 14.31/0.48 14.36/0.46 10.83/0.30 11.76/0.13
PhysenNet 21.39/0.91 21.68/0.91 22.10/0.91 20.61/0.88 17.50/0.74 7.41/0.08
DeepMMSE 16.68/0.70 17.68/0.72 18.11/0.73 17.54/0.73 17.76/0.72 9.66/0.15
Ours 30.69/0.99 32.15/0.98 31.53/0.98 29.47/0.96 22.85/0.84 12.30/0.28
Measurement WF GAP-tv prDeep PhysenNet DeepMMSE Ours Ground-Truth
Fourier (p=1) 9.64 dB 10.57 dB 8.82 dB 9.16 dB 8.46 dB 10.08 dB PSNR
FrFT (p=0.6) 22.17 dB 23.90 dB 30.06 dB 32.73 dB 28.61 dB 33.79 dB PSNR
FrFT (p=0.4) 5.62 dB 9.49 dB 6.16 dB 30.17 dB 27.72 dB 38.42 dB PSNR
Figure 6: Reconstruction results (amplitude objects) of six PR methods on the Fourier measurement (p=1) and FrFT measurements with different orders (p=0.6 and p=0.4, respectively) from top to bottom.
Measurement WF GAP-tv PhysenNet DeepMMSE Ours Ground-Truth
Fourier (p=1) 14.28 dB 13.56 dB 5.17 dB 11.78 dB 12.00 dB PSNR
FrFT (p=0.5) 19.59 dB 16.25 dB 11.55 dB 16.81 dB 25.51 dB PSNR
FrFT (p=0.2) 9.78 dB 9.15 dB 20.22 dB 17.72 dB 21.85 dB PSNR
Figure 7: Reconstruction results (phase objects) of five PR methods on the Fourier measurement (p=1) and FrFT measurements with different orders (p=0.5 and p=0.2, respectively) from top to bottom.

III-A Evaluation for the FrFT-based measurement model

III-A1 Implementation details

To facilitate a comprehensive assessment of the proposed method, we conducted a computational simulation of the diffraction propagation of a two-dimensional rectangular aperture under typical physical parameter configurations. Specifically, the spatial length of the sampling window and the number of samples are 1000 um and 512 in both the source and destination planes, respectively. The width of the rectangular aperture and the wavelength are 500 um and 500 nm, respectively. The propagation distance ranges from 1 to 50 mm. The detailed parameter values used in the numerical calculations are listed in Table II. Moreover, the proposed FrFT-based measurement model was implemented in PyTorch, thus embracing modern GPU acceleration.

III-A2 Verify the accuracy of the proposed method

To verify the accuracy of the proposed FrFT-based measurement model, we conduct two-dimensional diffraction computations using three distinct methods, i.e., the single Fourier-transform-based Fresnel model (SFT-Fresnel), the Fresnel transfer function model (Fresnel-TF), and the proposed FrFT-based measurement model (FrFT). For quantitative analysis, the reference field is obtained by numerically integrating the Fresnel integral (1) via the trapezium rule, and scalable sampling is matched via post-processing to ensure consistent resolution and pixel pitch across methods.

The accuracy of each method is assessed by the peak signal-to-noise ratio (PSNR) of the diffraction pattern against the reference, presented as a function of propagation distance in Fig. 4. It can be observed that the accuracy of the Fresnel-TF method deteriorates with increasing propagation distance, while the SFT-Fresnel method is unsuitable for numerical propagation over short distances. In contrast, the proposed FrFT method demonstrates remarkable accuracy and is suitable for both short-distance and long-distance propagation. An illustrative examination of the diffraction intensity patterns computed by these methods is presented in Fig. 5. Notably, the SFT-Fresnel method manifests pronounced numerical errors at short propagation distances. Conversely, while the Fresnel-TF method performs well close to the source field, numerical errors become increasingly apparent at longer propagation distances. In contrast, the proposed FrFT-based model is consistent with the numerical integration, providing similar results over the entire propagation distance range without apparent aliasing errors in both short- and long-range scenarios.
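For reference, the Fresnel-TF baseline compared above is only a few lines. A minimal numpy sketch in the standard transfer-function form (parameter names are ours, and the sign convention of the chirp may differ from the paper's), using the Table II configuration:

```python
import numpy as np

def fresnel_tf(u1, wav, L, d):
    """Propagate field u1 (N x N, sampled on an L-by-L window) a distance d
    with the Fresnel transfer function H = exp(-i*pi*wav*d*(fx^2 + fy^2));
    the constant phase factor exp(ikd) is omitted."""
    N = u1.shape[0]
    fx = np.fft.fftfreq(N, d=L / N)                # spatial frequency grid
    FX, FY = np.meshgrid(fx, fx, indexing="ij")
    H = np.exp(-1j * np.pi * wav * d * (FX**2 + FY**2))
    return np.fft.ifft2(np.fft.fft2(u1) * H)

# Table II-style setup: 1000 um window, 500 um square aperture, 500 nm light.
N, L = 256, 1000e-6
xx = (np.arange(N) - N / 2) * (L / N)
X, Y = np.meshgrid(xx, xx, indexing="ij")
u1 = ((np.abs(X) <= 250e-6) & (np.abs(Y) <= 250e-6)).astype(complex)
u2 = fresnel_tf(u1, 500e-9, L, 10e-3)
```

Since |H| = 1, this propagator is unitary and conserves energy exactly; its long-distance errors arise from undersampling the increasingly chirped H, not from energy loss, which is the aliasing behavior discussed above.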

III-B Evaluation for the UNN-based reconstructing method

(a) The curve of measurement loss.
(b) The curve of reconstruction quality (PSNR).
Figure 8: Convergence behaviors of the FrFT measurements with different orders using the proposed UNN-based method. (a) shows the measurement loss and (b) presents the corresponding reconstruction quality (PSNR) over the training epochs.
Initialization Epoch-100 Epoch-200 Epoch-500
Fourier (p=1) 9.46 dB 9.21 dB 9.23 dB
FrFT (p=0.5) 14.02 dB 17.68 dB 20.40 dB
Fourier (p=1) 9.14 dB 9.09 dB 9.05 dB
FrFT (p=0.5) 14.07 dB 16.65 dB 20.56 dB
Figure 9: The effects of two initializations (the translation and inversion of the signal with some Gaussian noise) on the reconstruction process from the Fourier transform measurement (p=1) and the FrFT measurement (p=0.5) using the proposed UNN-based method.
Figure 10: The schematic illustration of support-free CDI. The X-ray laser produces plane waves that illuminate a 3D particle and project it into a 2D exitwave. Through near-field diffraction, a single-shot intensity pattern based on the FrFT measurement is obtained by a detector. Then, a feasible exitwave can be retrieved via the proposed UNN-based reconstruction method.

III-B1 Implementation Details

The proposed reconstruction method was implemented on the PyTorch 2.0.0 platform with Python 3.9 on a single Nvidia GeForce GTX 1080 Ti GPU. During training, the model was optimized with the Adam optimizer for a total of 10000 epochs, with the learning rate empirically set to 2×10^{-4}. Notably, our method does not require any external training data apart from the input measurements themselves. To evaluate the effectiveness of the proposed method, we utilized two testing datasets, Set12 [48] and Cell8. The Cell8 dataset comprises eight distinct cell images, namely Brown fat cell, Peritoneal macrophage, Intestinal epithelial cell, Epithelial cell, Blood cell, Auditory hair cell, Acinar cell, and Chromaffin cell; the original cell images are available from http://www.cellimagelibrary.org/browse/celltype. For the sake of unification, all images were resized to a uniform dimension of 256×256.

III-B2 Benchmark on SFrFPR

To verify the performance of the proposed method, we compare it against one classic PR approach, namely Wirtinger Flow (WF) [10], two plug-and-play approaches, GAP-tv [30] and prDeep [38], and two state-of-the-art untrained neural network (UNN) approaches, PhysenNet [43] and DeepMMSE [41], all of which we extend to solve the SFrFPR problem. The hyperparameters of these algorithms are set to their optimal values. We consider the reconstruction of amplitude-only and phase-only objects from measurements at different fractional Fourier orders (prDeep is limited to real-valued reconstruction, so phase-only objects are not considered for it [38]). Table III reports the average recovery accuracy in terms of mean peak signal-to-noise ratio (PSNR) and structural similarity index measure (SSIM) on the two testing datasets Set12 and Cell8. It can be observed that optimization-based iterative algorithms such as WF, GAP-tv, and prDeep can effectively reconstruct objects from some fractional Fourier measurements (p=0.5, 0.6, 0.8), while all fail to recover images from the Fourier transform measurement (p=1). Unfortunately, due to the inverse-transform error of the fast discrete fractional Fourier transform algorithm, these iterative algorithms cannot achieve good results at certain FrFT orders (p=0.2, 0.4). In contrast, all UNN methods consistently perform well on FrFT measurements of different fractional orders except p=1, which corresponds to the Fourier transform measurement. Moreover, the reconstruction performance is dramatically improved by the proposed Transformer-based UNN when using the FrFT measurement.

For visual comparison, Fig. 6 and Fig. 7 present the reconstruction results of different PR methods on an amplitude-only and a phase-only object from various FrFT measurements, respectively. All UNN methods satisfactorily reconstruct the amplitude and phase of objects from a single FrFT measurement but fail with the Fourier transform measurement. Note that the proposed method significantly improves the reconstruction quality. In addition, the classic iterative method WF can also reconstruct the object from a single FrFT measurement of a suitable order, which cannot be achieved in the Fourier case.

III-B3 Algorithmic Investigation

To further investigate the effectiveness of the proposed method, we examine its convergence behavior on different FrFT measurements. Fig. 8 shows how the measurement loss, computed as the mean square error (MSE), and the reconstruction quality, indicated by PSNR, vary with training epochs when testing an amplitude object. The measurement loss gradually decreases and converges to a very small value for every FrFT order. However, the proposed method suffers from serious stagnation and yields poor reconstruction quality (low PSNR) from the Fourier transform measurement. In contrast, the FrFT measurement combines well with the UNN priors, effectively avoiding the stagnation problem and improving the reconstruction quality with increasing PSNR.

To verify that the proposed FrFT measurement can effectively overcome the ambiguity problem, we use two special but representative initializations to illustrate it. Specifically, we shift and flip the original image and add Gaussian random noise to form the input of the reconstruction methods. We then use the proposed UNN-based method to reconstruct the original image from its Fourier transform (p=1) and FrFT (p=0.5) measurements, respectively. Fig. 9 presents the intermediate results of the reconstruction process. The proposed method gradually reconstructs the correct object from the FrFT measurement by eliminating the effect of translation or inversion. In contrast, the reconstruction from the Fourier transform measurement suffers from serious stagnation and produces an inaccurate solution. While such trivial ambiguities may be acceptable in principle, in practice they compete with the correct solution and confuse the reconstruction algorithms. Thus, the removal of ambiguities via the FrFT measurement greatly alleviates the stagnation problem and improves the reconstruction performance.

TABLE IV: Summary of numerical optical parameters for the X-ray CDI experiments. The X-ray laser energy and the source-plane information are fixed, including the field of view, the number of discrete points, and the pixel size. For each physical propagation distance, the corresponding detector-plane information is obtained through the proposed FrFT-based measurement model.
Source Plane Detector Plane
X-ray energy: 5 keV Propagation distance: d = 0.1 m Propagation distance: d = 0.25 m Propagation distance: d = 10 m
Corresponding wavelength: 0.248 nm Corresponding FrFT order: p = 0.7507 Corresponding FrFT order: p = 0.8958 Corresponding FrFT order: p ≈ 1
Corresponding scale factor: s2 = 2.6198 Corresponding scale factor: s2 = 6.1357 Corresponding scale factor: s2 = 242.1505
Field of view: 51.2 um × 51.2 um Field of view: 134.13 um × 134.13 um Field of view: 314.88 um × 314.88 um Field of view: 12.40 mm × 12.40 mm
Sampling number: 256×256 Sampling number: 256×256 Sampling number: 256×256 Sampling number: 256×256
Pixel size: 200 nm × 200 nm Pixel size: 0.52 um × 0.52 um Pixel size: 1.23 um × 1.23 um Pixel size: 48.43 um × 48.43 um

III-C Applications for coherent diffraction imaging

Coherent diffraction imaging (CDI) is a “lensless” technique for 2D or 3D imaging of nanoscale structures such as nanocrystals [49, 50], proteins [51, 52], and more [53]. Due to the ill-posed nature of Fourier PR, existing CDI technologies mainly rely on support constraints [54, 55] or coded modulation conditions [7] to achieve reconstruction. In this part, we unveil a novel imaging capability empowered by the proposed SFrFPR, i.e., support-free coherent diffraction imaging. To this end, we take practical experimental setups into account and perform numerical simulations to verify the proposed method.

The overall scheme of the proposed support-free CDI is illustrated in Fig. 10. The incident X-ray energy is 5 keV, corresponding to a wavelength of 0.248 nm. The sample is a 3D particle of bacterial RNA-free RNase P [56], whose volume data can be downloaded from the public Protein Data Bank (https://www.rcsb.org/structure/8SSG). After the plane wave illuminates the particle, we use the projection approximation method [57] to project the 3D volume data into a 2D exitwave represented as a phase object. Through near-field diffraction, a single-shot intensity pattern of the FrFT measurement is then collected by a detector. For clarity, we follow the scalable sampling of the proposed FrFT measurement model and simulate the corresponding intensity pattern via numerical integration of the Fresnel integral. More details of the optical settings can be found in Table IV. Finally, the latent exitwave is reconstructed from the single FrFT measurement using the proposed UNN-based method. It is worth emphasizing that the system requires neither a tight support of the object nor additional physical equipment such as mask modulators.
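The exact distance-to-order mapping behind Table IV is not spelled out in this excerpt. The formulas below, \alpha = \arctan(\lambda d N / L^2), p = 2\alpha/\pi, and magnification s = \sec\alpha (a standard Fresnel/FrFT correspondence), reproduce the Table IV values to within rounding, so we sketch them here as a plausible reconstruction rather than the paper's definitive model:

```python
import numpy as np

def frft_order_and_scale(wav, d, L, N):
    """Candidate mapping from propagation distance d to FrFT order p and
    output-window magnification s, for a source window of length L sampled
    at N points (our reconstruction of the Table IV mapping, not the
    paper's stated formula)."""
    alpha = np.arctan(wav * d * N / L**2)   # fractional angle
    p = 2 * alpha / np.pi                   # fractional order
    s = 1.0 / np.cos(alpha)                 # window magnification (sec alpha)
    return p, s

# Table IV source plane: lambda = 0.248 nm, 51.2 um window, 256 samples.
p1, s1 = frft_order_and_scale(0.248e-9, 0.1, 51.2e-6, 256)
```

With d = 0.1 m this gives p ≈ 0.7507 and s ≈ 2.620, matching the first detector-plane column (134.13 um / 51.2 um ≈ 2.6198); as d grows, p approaches 1, recovering the far-field Fourier regime of the last column.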

We conducted three groups of experiments and collected intensity observations in different diffraction zones. The number of training epochs is 2000 for all reconstructions. Fig. 11 presents the reconstructed results for d of 0.1 m, 0.25 m, and 10 m. The exitwave can be retrieved from the single intensity pattern in the FrFT regime but not in the Fourier regime. By exploiting the intrinsic physical constraints of SFrFPR, we mitigate the inherent ambiguities of the reconstruction and achieve a support-free CDI technique. In particular, the proposed method proves robust to the inconsistency between the FrFT-based measurement model and the numerical integration model. All results support the potential of the proposed method in real experiments.

FrFT Regime (d = 0.1 m) FrFT Regime (d = 0.25 m) Fourier Regime (d = 10 m) Ground-Truth
Figure 11: The first three images in the top row show the diffraction patterns at d values of 0.1 m, 0.25 m, and 10 m, respectively. The corresponding reconstructed objects are shown below. The ground-truth object is listed in the last column.

IV Conclusion

In this work, we have tackled the problem of single-shot phase retrieval from a fractional Fourier transform perspective. Specifically, we introduced the FrFT to resolve the perennial issue of numerical inaccuracies arising from the sampling constraints of the discretized transfer function in the Fresnel diffraction integral. The resulting FrFT-based measurement model thus emerges as a versatile solution that aptly handles wave propagation over both short and long distances. In addition, we have embraced a self-supervised reconstruction framework that combines the inherent constraints of the FrFT measurement with untrained neural network priors, relaxing the previous requirements of oversampled or multiple measurements in the Fourier domain. We have also justified the proposed SFrFPR paradigm from the perspective of the fractional Fourier space-frequency representation. Through numerical simulations, the results manifest a profound superiority of the single FrFT measurement over its Fourier transform counterpart. Moreover, the proposed SFrFPR paradigm unveils the potential to revolutionize imaging paradigms, particularly in support-free coherent diffraction imaging. In the future, we believe the single-shot imaging capability of the proposed SFrFPR will have the potential to study dynamic processes in materials and biological science utilizing pulsed sources, such as X-ray free-electron lasers.

References

  • [1] J. Miao, D. Sayre, and H. Chapman, “Phase retrieval from the magnitude of the fourier transforms of nonperiodic objects,” JOSA A, vol. 15, no. 6, pp. 1662–1669, 1998.
  • [2] G. Zheng, R. Horstmeyer, and C. Yang, “Wide-field, high-resolution fourier ptychographic microscopy,” Nature photonics, vol. 7, no. 9, pp. 739–745, 2013.
  • [3] P. Chakravarthula, E. Tseng, T. Srivastava, H. Fuchs, and F. Heide, “Learned hardware-in-the-loop phase retrieval for holographic near-eye displays,” ACM Transactions on Graphics (TOG), vol. 39, no. 6, pp. 1–18, 2020.
  • [4] O. Katz, P. Heidmann, M. Fink, and S. Gigan, “Non-invasive single-shot imaging through scattering layers and around corners via speckle correlations,” Nature photonics, vol. 8, no. 10, pp. 784–790, 2014.
  • [5] Y. Shechtman, Y. C. Eldar, O. Cohen, H. N. Chapman, J. Miao, and M. Segev, “Phase retrieval with application to optical imaging: a contemporary overview,” IEEE signal processing magazine, vol. 32, no. 3, pp. 87–109, 2015.
  • [6] T. Bendory, R. Beinert, and Y. C. Eldar, “Fourier phase retrieval: Uniqueness and algorithms,” in Compressed Sensing and its Applications: Second International MATHEON Conference 2015.   Springer, 2017, pp. 55–91.
  • [7] F. Zhang, B. Chen, G. R. Morrison, J. Vila-Comamala, M. Guizar-Sicairos, and I. K. Robinson, “Phase retrieval by coherent modulation imaging,” Nature communications, vol. 7, no. 1, p. 13367, 2016.
  • [8] J. R. Fienup, “Reconstruction of an object from the modulus of its fourier transform,” Optics letters, vol. 3, no. 1, pp. 27–29, 1978.
  • [9] E. J. Candes, X. Li, and M. Soltanolkotabi, “Phase retrieval from coded diffraction patterns,” Applied and Computational Harmonic Analysis, vol. 39, no. 2, pp. 277–299, 2015.
  • [10] ——, “Phase retrieval via wirtinger flow: Theory and algorithms,” IEEE Transactions on Information Theory, vol. 61, no. 4, pp. 1985–2007, 2015.
  • [11] M. H. Seaberg, J. J. Turner, and A. d’Aspremont, “Coherent diffractive imaging using randomly coded masks,” in Laser Science.   Optica Publishing Group, 2015, pp. LM1H–3.
  • [12] A. Faridian, D. Hopp, G. Pedrini, U. Eigenthaler, M. Hirscher, and W. Osten, “Nanoscale imaging using deep ultraviolet digital holographic microscopy,” Optics express, vol. 18, no. 13, pp. 14 159–14 164, 2010.
  • [13] A. M. Maiden, J. M. Rodenburg, and M. J. Humphry, “Optical ptychography: a practical implementation with useful resolution,” Optics letters, vol. 35, no. 15, pp. 2585–2587, 2010.
  • [14] K. Jaganathan, Y. C. Eldar, and B. Hassibi, “Stft phase retrieval: Uniqueness guarantees and recovery algorithms,” IEEE Journal of selected topics in signal processing, vol. 10, no. 4, pp. 770–781, 2016.