
Model-Guided Multi-Contrast Deep Unfolding Network for MRI Super-resolution Reconstruction

Gang Yang ([email protected], 0000-0001-9403-5818), University of Science and Technology of China; Li Zhang ([email protected], 0000-0003-1610-6056), University of Science and Technology of China; Man Zhou ([email protected], 0000-0003-2872-605X), University of Science and Technology of China; Aiping Liu ([email protected], 0000-0001-8849-5228), University of Science and Technology of China, USTC IAT-Huami Joint Laboratory for Brain-Machine Intelligence, Institute of Advanced Technology; Xun Chen ([email protected], 0000-0002-4922-8116), University of Science and Technology of China, USTC IAT-Huami Joint Laboratory for Brain-Machine Intelligence, Institute of Advanced Technology; Zhiwei Xiong ([email protected], 0000-0002-9787-7460), University of Science and Technology of China, Institute of Artificial Intelligence, Hefei Comprehensive National Science Center; and Feng Wu ([email protected]), University of Science and Technology of China, Institute of Artificial Intelligence, Hefei Comprehensive National Science Center
(2022)
Abstract.

Magnetic resonance imaging (MRI) with high resolution (HR) provides more detailed information for accurate diagnosis and quantitative image analysis. Despite significant advances, most existing super-resolution (SR) reconstruction networks for medical images have two flaws: 1) they are designed as black boxes, lacking sufficient interpretability and thus limiting their practical applications, whereas interpretable neural network models are of significant interest since they enhance the trustworthiness required in clinical practice when dealing with medical images; 2) most existing SR reconstruction approaches use only a single contrast or a simple multi-contrast fusion mechanism, neglecting the complex relationships between different contrasts that are critical for SR improvement. To address these issues, in this paper, a novel Model-Guided interpretable Deep Unfolding Network (MGDUN) for medical image SR reconstruction is proposed. The model-guided image SR reconstruction approach solves manually designed objective functions to reconstruct HR MRI. We show how to unfold an iterative MGDUN algorithm into a novel model-guided deep unfolding network by taking the MRI observation matrix and an explicit multi-contrast relationship matrix into account during end-to-end optimization. Extensive experiments on the multi-contrast IXI dataset and the BraTs 2019 dataset demonstrate the superiority of our proposed model.

Model-Guided Network, MRI Super-Resolution, Deep Unfolding Network.
copyright: acmcopyright; journalyear: 2022; conference: Proceedings of the 30th ACM International Conference on Multimedia, October 10–14, 2022, Lisboa, Portugal; booktitle: Proceedings of the 30th ACM International Conference on Multimedia (MM ’22), October 10–14, 2022, Lisboa, Portugal; price: 15.00; doi: 10.1145/3503161.3548068; isbn: 978-1-4503-9203-7/22/10; submissionid: 1308; ccs: Computing methodologies, Reconstruction

1. Introduction

Magnetic resonance imaging (MRI) has been widely adopted in clinical and medical research. In comparison to other imaging modalities such as computed tomography (CT) and nuclear imaging, MRI enjoys the advantage of delivering detailed images of tissue architecture without the use of ionizing radiation (Feng et al., 2021a). The MRI system can be configured in a number of ways using pulse sequences to provide multi-contrast images such as T1, T2, and proton density (PD) weighted images that include essential physiological and pathological features. However, in real-world cases, high-resolution (HR) MR images often come at the cost of a longer scanning time, lower signal-to-noise ratio, and smaller spatial coverage (Zhang et al., 2021; Plenge et al., 2012). Additionally, the quality of MR images acquired in clinical practice may be insufficient owing to patients’ involuntary physiological movements (e.g., heart pounding and breathing) during the acquisition process. This is particularly problematic when protocols requiring a long echo time (TE) or repetition time (TR) are used. These scans may lead to inaccurate diagnosis as limited structural and textural information is provided for subsequent quantitative medical image analysis (Feng et al., 2021d). As a result, there is emerging interest in developing super-resolution (SR) techniques for reconstructing HR outputs from low-resolution (LR) images to increase the spatial resolution of magnetic resonance imaging.

Super-resolution can improve the quality of MR images without modifying the hardware, thereby overcoming the challenges of obtaining HR MRI scans. MRI super-resolution approaches can be broadly classified into two categories depending on the number of imaging modalities involved: single-contrast super-resolution (SCSR) methods and multi-contrast super-resolution (MCSR) methods. SCSR approaches have been extensively studied over the last several decades, with the goal of reconstructing the high-resolution counterpart of a given low-resolution image in a single contrast mode, thereby ignoring the complementary multi-contrast information. In contrast to SCSR methods, MCSR methods recover a target modality by synthesizing information from multiple modalities. Clinically, MRI generates multi-contrast images under a variety of imaging settings but with the same anatomical structure, including T1 and T2 weighted images (T1WIs and T2WIs) as well as proton density and fat-suppressed proton density weighted images (PDWIs and FS-PDWIs), which provide complementary information to each other (Zeng et al., 2018; Mai et al., 2011). Noting that contrasts with shorter acquisition times are easier to obtain, they can be used to supplement a single LR image with extra information. For example, relevant HR information from T1WIs or PDWIs may be utilized as auxiliary contrasts to aid in the generation of target contrasts.

Existing techniques for image super-resolution reconstruction include model-based and learning-based approaches. Model-based techniques utilize domain knowledge when modeling the physical mechanism underlying the problem. Typical optimization algorithms include the alternating direction method of multipliers (ADMM) (Sun et al., 2016) and the iterative shrinkage-thresholding algorithm (ISTA) (Zhang and Ghanem, 2018). Despite their theoretical attractiveness, model-based approaches are incapable of performing end-to-end optimization, resulting in limited performance. Alternatively, deep learning-based SR approaches have gained growing attention in recent years. For example, various architectures such as residual networks (Chaudhari et al., 2018), generative adversarial networks (Lyu et al., 2020), and densely connected networks (Chen et al., 2018a) have been utilized to reconstruct MR images. Nevertheless, neural networks with generalized structures lack transparency (i.e., the black-box design), and it is unclear how domain knowledge can be incorporated. When dealing with medical images, the accuracy and trustworthiness of reconstruction are critical for discovery and diagnosis. Therefore, balancing accuracy and interpretability is a non-trivial problem. The objective of designing interpretable neural networks is to bridge the gap between model-based and learning-based methods, which can be accomplished by unfolding the iterations of an inference algorithm into deep neural networks, thus making the learning process interpretable.

In this paper, a novel Model-Guided interpretable Deep Unfolding Network (MGDUN) for medical image SR reconstruction is proposed. The motivation of our approach is twofold. On the one hand, to fully exploit the domain knowledge of MRI SR and improve prediction performance, we formulate two manually designed objective functions for reconstructing HR MRI, each corresponding to a recovery process and incorporating domain knowledge. We then show how to solve these functions iteratively and how to unfold the iterative MGDUN algorithm into a neural network form by implementing specially designed modules. In contrast to conventional neural networks, the model-guided design results in transparent network architectures that are well-aligned with the emerging interpretable machine learning framework. On the other hand, we formulate MGDUN as a network that reconstructs an HR image of a target contrast from an LR input with the aid of other guide contrasts, which can better exploit the complex relationships between different modalities.

The main contributions of this paper are as follows:

  • We design a novel Model-Guided Deep Unfolding Network (MGDUN) for medical image SR reconstruction, which models multi-contrast MRI SR with other contrast images in an interpretable manner.

  • We elaborate on how to solve the manually designed objective functions and how to unfold the iterative algorithm into a neural network by incorporating domain knowledge with specially designed modules.

  • We reconstruct an HR image of a target contrast from an LR input with the aid of other guide contrasts, providing a new strategy for multi-contrast fusion.

  • Extensive experiments on the multi-contrast IXI dataset and BraTs 2019 dataset demonstrate the superiority of MGDUN.

2. RELATED WORK

2.1. Multi-contrast MR image representation

Clinically, MR images are usually acquired with multiple contrasts under a variety of imaging settings for comprehensive evaluation (Lyu et al., 2020; Feng et al., 2021b), and each contrast provides unique and complementary structural information about tissues (Zeng et al., 2018; Brown and Semelka, 2011). As a result, multi-contrast information has been exploited to improve representation ability for a variety of MR image tasks, including segmentation and SR. For example, Huo et al. trained and evaluated MRI segmentation using T1WIs and T2WIs (Huo et al., 2018). Different from the multi-contrast MRI segmentation task, the SR task requires dividing the images into auxiliary and target contrasts. The auxiliary contrast, which is usually easier to obtain, is utilized to assist the reconstruction of the target contrast. Rousseau et al., for instance, super-resolved the target image using anatomical intermodality priors from a reference image (Rousseau et al., 2010). Meanwhile, similar anatomical structures in the auxiliary contrast may be utilized to reconstruct an SR image from its LR counterpart (Jafari-Khouzani, 2014; Zheng et al., 2017, 2018). However, most current methods fall short of constructing a model-based interpretable network, making the investigation of the relationship between multiple contrasts challenging.

2.2. Medical Image Super-Resolution

Medical image super-resolution methods are classified into two broad categories: single-contrast super-resolution (SCSR) (Lim et al., 2017; Zhao et al., 2018; Kuklisova-Murgasova et al., 2012; Pham et al., 2017; McDonagh et al., 2017; Zhao et al., 2019; Zhang et al., 2018; Feng et al., 2021c) and multi-contrast super-resolution (MCSR) (Feng et al., 2021d; Zeng et al., 2018; Zheng et al., 2017; Lu et al., 2015; Manjón et al., 2010). Traditionally, because of their simplicity, bicubic and B-spline interpolations are two of the most frequently used SCSR methods in MRI practice. However, both methods invariably generate fuzzy edges and block artifacts. To address these issues, EDSR and MDSR (Lim et al., 2017) employ multiple blocks with linear residuals, contributing to improved performance in image super-resolution, whereas RCAN (Zhang et al., 2018) employs numerous residual groups with long skip connections and several residual blocks with short skip connections within each residual group. Recently, T2Net (Feng et al., 2021c), with joint MRI reconstruction and SR, enables representation and feature sharing across tasks, producing higher-quality, super-resolved, and motion-artifact-free images. In another line of studies, MCSR methods have shown superior performance and the ability to make full use of domain knowledge from multiple modalities. For example, Feng et al. (Feng et al., 2021d) developed a successful solution called SANet, a separable attention network that explores foreground and background areas in forward and backward directions using the auxiliary contrast.

2.3. Deep unfolding network

As a pioneering work, deep unfolding was first reported in (Gregor and LeCun, 2010), which designs a learned version of the iterative soft-thresholding algorithm (ISTA) that can be unfolded into a neural network form. Since then, a series of works (Kokkinos and Lefkimmiatis, 2018; You et al., 2021; Song et al., 2021; Ning et al., 2020; Zhang et al., 2020) have demonstrated that deep unfolding is applicable to certain optimization algorithms, since it can not only optimize the parameters in an end-to-end manner by minimizing the loss function over a large training set, but also integrate model-based and learning-based methods well. For instance, a fast network (Afonso et al., 2010; Ning et al., 2020) based on half-quadratic splitting has been proposed to solve the unconstrained optimization problem in image restoration and reconstruction. Another important work in image SR (Zhang et al., 2020) proposes an end-to-end trainable unfolding network that integrates the flexibility of model-based methods with the advantages of learning-based methods. Additionally, CoISTA (Deng and Dragotti, 2019) introduces a novel joint multi-modal dictionary learning (JMDL) method for modeling cross-modal dependency, converting the JMDL model into a deep neural network by unfolding the iterative shrinkage and thresholding algorithm (ISTA).

3. METHOD

Figure 1. The overall architecture of MGDUN. It is a model-guided interpretable network with T stages.

3.1. Motivation

An HR image $I_{HR}$ yields its LR counterpart $I_{LR}$ through a down-sampling process $f$, which can be expressed as follows:

(1) $I_{LR} = f(I_{HR}) = \phi(I_{HR}) + \mathcal{N}$

where $\phi$ denotes the down-sampling or blurring function, and $\mathcal{N}$ denotes the system noise. The SR process theoretically aims at exploring the inverse solution $f^{-1}$ of the original down-sampling function $f$. Since the SR process is an ill-posed problem, an exact inverse solution cannot be obtained; only approximate solutions are possible. The goal of the SR imaging process is therefore to find the most desirable approximation $g$ to the theoretical inverse solution $f^{-1}$:

(2) $I_{SR} = g(I_{LR}) \approx I_{HR}$

where $I_{SR}$ denotes the corresponding SR image. To obtain such an approximate solution $g$, image priors are necessary. The prior information available from single-contrast images is limited, so we take advantage of multi-contrast MR images. Based on the prior information of multi-contrast MRI, we propose a novel Model-Guided interpretable Deep Unfolding Network (MGDUN) for medical image SR reconstruction.

3.2. Model-Guided MRI SR algorithm

3.2.1. The objective functions


Let $X\in\mathbb{R}^{n\times C}$ represent the degraded observation and $Z\in\mathbb{R}^{N\times C}$ represent the unknown original image, where $C$ denotes the number of channels, $n=h\times w$, and $N=H\times W$. It is assumed that an LR image is obtained by down-sampling and blurring an HR image, so the linear relationship between the observed image and the original HR image can be typically formulated as follows:

(3) $X = DKZ + \mathcal{N}_1$

where $D\in\mathbb{R}^{n\times N}$ and $K\in\mathbb{R}^{N\times N}$ represent the down-sampling and blurring processes, respectively, and $\mathcal{N}_1$ denotes the noise.

Transform modal relationship. Considering multi-contrast MR images in the MCSR task, the transform relationship between the guide image $Y\in\mathbb{R}^{N\times C}$ and the unknown original image can be formulated by:

(4) $Y = PZ + \mathcal{N}_2$

where $P\in\mathbb{R}^{N\times N}$ is the transform function, and $\mathcal{N}_2$ represents the noise in this process. As a result, $Z$ can be obtained by solving the following objective function:

(5) $Z = \underset{Z}{\arg\min}\ \frac{1}{2}\|X - DKZ\|_2^2 + \frac{\eta}{2}\|Y - PZ\|_2^2 + \lambda_1\mathfrak{R}_1(Z) + \lambda_2\mathfrak{R}_2(Z)$

where the hyper-parameters ($\eta$, $\lambda_1$, $\lambda_2$) are trade-off coefficients, and the last two regularization terms correspond to the prior domain knowledge of the MCSR task, namely the noise prior of the typical degradation process and the transform modal noise prior of the multi-contrast image, respectively. The choice of regularization functions reflects different ways of incorporating prior knowledge about the unknown original HR MR image.
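For concreteness, the following minimal sketch (ours, not the released implementation) evaluates the two data-fidelity terms of Eq. 5, with generic callables standing in for the operators $DK$ and $P$; the regularization terms are omitted since they are later replaced by learned modules.

```python
import torch
import torch.nn.functional as F

def data_fidelity(Z, X, Y, downsample, transform, eta):
    """Evaluate the two data-fidelity terms of Eq. 5.

    Z: current HR estimate, X: LR observation, Y: guide-contrast image.
    `downsample` stands in for DK and `transform` for P; both are placeholders.
    """
    term_obs = 0.5 * torch.sum((X - downsample(Z)) ** 2)          # ||X - DKZ||_2^2 / 2
    term_guide = 0.5 * eta * torch.sum((Y - transform(Z)) ** 2)   # eta/2 * ||Y - PZ||_2^2
    return term_obs + term_guide

# Toy usage with simple stand-in operators (bilinear down-sampling, identity transform).
down = lambda z: F.interpolate(z, scale_factor=0.5, mode="bilinear", align_corners=False)
Z = torch.rand(1, 1, 64, 64)
X = down(Z) + 0.01 * torch.randn(1, 1, 32, 32)
print(data_fidelity(Z, X, Z.clone(), down, lambda z: z, eta=0.5))
```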

In the next section, we describe how to solve the objective function with an iterative algorithm. Then, we unfold the iterative algorithm into neural networks for image SR, obtaining an end-to-end reconstruction architecture.

3.2.2. The Proximal Gradient Descent Algorithm


Following the framework of half-quadratic splitting (HQS), we introduce two auxiliary splitting variables $U$ and $V$ for $Z$, associated with the different priors of the MCSR task. Eq. 5 can then be formulated as an unconstrained optimization problem:

(6) $\underset{Z,U,V}{\arg\min}\ \frac{1}{2}\|X - DKZ\|_2^2 + \frac{\eta}{2}\|Y - PZ\|_2^2 + \frac{\beta_1}{2}\|U - Z\|_2^2 + \frac{\beta_2}{2}\|V - Z\|_2^2 + \lambda_1\mathfrak{R}_1(U) + \lambda_2\mathfrak{R}_2(V)$

where $\beta_1$, $\beta_2$, $\lambda_1$, and $\lambda_2$ are the penalty parameters. To obtain an unrolled inference, Eq. 6 can be divided into the following three sub-problems and solved alternately:

(7) $U^{(t)} = \underset{U}{\arg\min}\ \frac{\beta_1}{2}\|U - Z^{(t)}\|_2^2 + \lambda_1\mathfrak{R}_1(U)$
(8) $V^{(t)} = \underset{V}{\arg\min}\ \frac{\beta_2}{2}\|V - Z^{(t)}\|_2^2 + \lambda_2\mathfrak{R}_2(V)$
(9) $Z^{(t+1)} = \underset{Z}{\arg\min}\ \frac{1}{2}\|X - DKZ\|_2^2 + \frac{\eta}{2}\|Y - PZ\|_2^2 + \frac{\beta_1}{2}\|U^{(t)} - Z\|_2^2 + \frac{\beta_2}{2}\|V^{(t)} - Z\|_2^2$

here, $t$ denotes the HQS iteration index.

For the objective function of Eq. 5 with the prior knowledge of the MCSR task, we employ the efficient Proximal Gradient Descent (PGD) algorithm to solve the above three sub-problems:

(10) $U^{(t)} = \mathrm{Prox}_{\mathfrak{R}_1}\big(U^{(t-1)} - \delta_1\nabla_U\mathcal{F}(U^{(t-1)})\big) = \mathrm{Prox}_{\mathfrak{R}_1}\big(U^{(t-1)} - \delta_1\beta_1(U^{(t-1)} - Z^{(t)})\big)$
(11) $V^{(t)} = \mathrm{Prox}_{\mathfrak{R}_2}\big(V^{(t-1)} - \delta_2\nabla_V\mathcal{F}(V^{(t-1)})\big) = \mathrm{Prox}_{\mathfrak{R}_2}\big(V^{(t-1)} - \delta_2\beta_2(V^{(t-1)} - Z^{(t)})\big)$
(12) $Z^{(t+1)} = Z^{(t)} - \delta_3\nabla_Z\mathcal{F}(Z^{(t)})$

where $\mathrm{Prox}_{\mathfrak{R}_1}(\cdot)$ and $\mathrm{Prox}_{\mathfrak{R}_2}(\cdot)$ are the proximal operators corresponding to the penalties $\mathfrak{R}_1(\cdot)$ and $\mathfrak{R}_2(\cdot)$, which integrate the information coming from the target-contrast LR MRI and the guidance-contrast HR MRI, respectively. The gradient term is given by:

(13) $\nabla_Z\mathcal{F}(Z^{(t)}) = (DK)^T(DKZ^{(t)} - X) + \eta P^T(PZ^{(t)} - Y) + \beta_1(Z^{(t)} - U^{(t)}) + \beta_2(Z^{(t)} - V^{(t)})$

In summary, the iterative algorithm for solving the MCSR task of Eq. 5 is given above, where we initialize $Z^{(0)}$ with a bicubic interpolated version of $X$. Under the framework of the interpretable MGDUN model, the PGD algorithm usually requires dozens of iterations to converge.
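Before unfolding, the plain iterative solver can be summarized by the sketch below. This is a schematic rendition in which soft-thresholding stands in for the proximal operators $\mathrm{Prox}_{\mathfrak{R}_1}$ and $\mathrm{Prox}_{\mathfrak{R}_2}$ (which MGDUN later replaces with learned denoising modules); the operators DK, DK_T, P, and P_T are generic callables, the 2x bicubic initialization assumes the 2x SR setting, and initializing U and V with $Z^{(0)}$ is our choice.

```python
import torch
import torch.nn.functional as F

def soft_threshold(x, tau):
    # l1 proximal operator, used only as a stand-in for Prox_R1 / Prox_R2,
    # which MGDUN replaces with learned denoising modules.
    return torch.sign(x) * torch.clamp(torch.abs(x) - tau, min=0.0)

def pgd_mcsr(X, Y, DK, DK_T, P, P_T, eta=0.5, beta1=0.1, beta2=0.1,
             deltas=(0.5, 0.5, 0.1), n_iter=30, scale=2):
    """Plain PGD/HQS iterations of Eqs. 10-13 with generic linear operators."""
    d1, d2, d3 = deltas
    # Z^(0): bicubic interpolation of the LR observation X.
    Z = F.interpolate(X, scale_factor=scale, mode="bicubic", align_corners=False)
    U, V = Z.clone(), Z.clone()
    for _ in range(n_iter):
        U = soft_threshold(U - d1 * beta1 * (U - Z), 1e-3)        # Eq. 10
        V = soft_threshold(V - d2 * beta2 * (V - Z), 1e-3)        # Eq. 11
        grad = (DK_T(DK(Z) - X) + eta * P_T(P(Z) - Y)             # Eq. 13
                + beta1 * (Z - U) + beta2 * (Z - V))
        Z = Z - d3 * grad                                         # Eq. 12
    return Z
```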

Figure 2. Architectures of MGDUN's submodules. (a) Denoising module; (b) INN blocks in the cross-modal transform module; (c) up-sampling blocks ($\boldsymbol{Up}$); (d) down-sampling blocks ($\boldsymbol{Down}$).

3.3. Model-Guided deep unfolding network

In Sec. 3.2, we have proposed a general model-guided MRI SR algorithm for MCSR tasks. As in other model-based image restoration methods, Eq. 7 and Eq. 8 are difficult to optimize due to the nonlinearity of the regularizers. Meanwhile, in traditional model-based approaches (Boyd et al., 2011; He et al., 2016), alternately solving the above three optimization problems requires many iterations to converge, leading to prohibitive computational cost. An alternative approach is to unfold the iterative optimization into a series of network implementations, as demonstrated in recent years (Wisdom et al., 2017; Chen et al., 2018b; Dong et al., 2018; Bertocchi et al., 2020). The total number of unfolding stages naturally corresponds to that of PGD iterations.

3.3.1. Model representation and overview


The main idea behind deep unfolding networks is that a conventional iterative algorithm such as the iterative soft-thresholding algorithm (ISTA) can be implemented equivalently by a stack of recurrent neural networks (Wisdom et al., 2017). Inspired by the principle of model-driven deep learning, we generalize Eq. 10 to Eq. 12 into network blocks, translating each step into deep learning terminology.

In each stage, the two auxiliary variables ($U$ and $V$) are updated first. Due to the existence of noise, the $\mathrm{Prox}(\cdot)$ operator can be implemented by a deep denoising module ($\boldsymbol{DM}$). In Eq. 10, given the evaluated HR image $Z^{(t)}$ of the current stage and the auxiliary splitting variable $U^{(t-1)}$ of the previous stage, it generates the auxiliary splitting variable $U^{(t)}$ of the current stage; $V^{(t)}$ is obtained analogously following Eq. 11. In the network, these two steps are implemented by:

(14) $U^{(t)} = \boldsymbol{DM_1}(U^{(t-1)} + \xi_1 Z^{(t)};\ C, C', C)$
(15) $V^{(t)} = \boldsymbol{DM_2}(V^{(t-1)} + \xi_2 Z^{(t)};\ C, C', C)$

where $\boldsymbol{DM}(\cdot;\ C, C', C)$ is used to obtain more expressive auxiliary splitting variables, and $C'$ is the number of channels of the intermediate feature maps.

Then, we reconstruct the estimated HR image according to Eq. 12 and Eq. 13. In our network, the reconstruction process is implemented by a reconstruction module consisting of a cross-modal transform module, a down-sampling block, and an up-sampling block; it takes the auxiliary variables $U^{(t)}$ and $V^{(t)}$, the evaluated HR image $Z^{(t)}$, the LR MRI $X$, and the guide image $Y$ as inputs and outputs the reconstructed image $Z^{(t+1)}$. The reconstruction step in the network is therefore:

(16) $Z^{(t+1)} = Z^{(t)} - \delta_3\big(\boldsymbol{Up}(\boldsymbol{Down}(Z^{(t)}) - X) + \eta\boldsymbol{P}^T(\boldsymbol{P}(Z^{(t)}) - Y) + \beta_1(Z^{(t)} - U^{(t)}) + \beta_2(Z^{(t)} - V^{(t)})\big)$

where $\boldsymbol{Up}(\cdot)$ and $\boldsymbol{Down}(\cdot)$ denote the up-sampling and down-sampling operators in spatial resolution, respectively, and $\boldsymbol{P}(\cdot)$ and $\boldsymbol{P}^T(\cdot)$ perform the cross-modal transform functions.
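A minimal PyTorch sketch of this reconstruction step is given below. The Up, Down, P and P^T operators are passed in as modules (they are detailed in Secs. 3.3.3 and 3.3.4), and keeping the trade-off weights and step size as learnable scalars is an assumption on our part.

```python
import torch
import torch.nn as nn

class ReconstructionStep(nn.Module):
    """One application of Eq. 16; Up, Down, P and P^T are supplied as modules."""
    def __init__(self, up, down, p, p_t):
        super().__init__()
        self.up, self.down, self.p, self.p_t = up, down, p, p_t
        # Keeping the trade-off weights and step size learnable is an assumption.
        self.eta = nn.Parameter(torch.tensor(0.5))
        self.beta1 = nn.Parameter(torch.tensor(0.1))
        self.beta2 = nn.Parameter(torch.tensor(0.1))
        self.delta3 = nn.Parameter(torch.tensor(0.1))

    def forward(self, Z, U, V, X, Y):
        grad = (self.up(self.down(Z) - X)                # Up(Down(Z^(t)) - X)
                + self.eta * self.p_t(self.p(Z) - Y)     # eta * P^T(P(Z^(t)) - Y)
                + self.beta1 * (Z - U)                   # beta_1 (Z^(t) - U^(t))
                + self.beta2 * (Z - V))                  # beta_2 (Z^(t) - V^(t))
        return Z - self.delta3 * grad                    # Z^(t+1)
```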

Then, the updated $Z^{(t+1)}$ is fed into the next stage to refine the estimates $U$ and $V$ again. The denoising module and the reconstruction module are alternately updated $T$ times until the final reconstruction is reached.

The overall network architecture of MGDUN is shown in Fig. 1, which contains $T$ stages that are intentionally designed to correspond to the $T$ iterations of the PGD optimization algorithm. Each stage of MGDUN consists of three specified network modules: a deep denoising module for denoising and updating the auxiliary variables, a cross-modal transform module for the cross-modal transform function, and a reconstruction module for reconstructing and updating $Z$. We elaborate on each module next.
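To summarize the data flow, a skeleton of the $T$-stage forward pass might look as follows; the denoising modules and the reconstruction step are placeholders for the components detailed in the next subsections (e.g., the ReconstructionStep sketched above), and the weighting coefficients of Eqs. 14 and 15 are omitted for brevity.

```python
import torch.nn as nn
import torch.nn.functional as F

class MGDUNSketch(nn.Module):
    """Skeleton of the T-stage unfolded forward pass (illustrative only)."""
    def __init__(self, denoiser_u, denoiser_v, recon_step, T=4, scale=2):
        super().__init__()
        self.dm1, self.dm2, self.recon = denoiser_u, denoiser_v, recon_step
        self.T, self.scale = T, scale

    def forward(self, X, Y):
        # Z^(0): bicubic interpolation of the LR target-contrast input X.
        Z = F.interpolate(X, scale_factor=self.scale, mode="bicubic", align_corners=False)
        U, V = Z.clone(), Z.clone()
        for _ in range(self.T):
            U = self.dm1(U + Z)              # Eq. 14 (weighting coefficient omitted)
            V = self.dm2(V + Z)              # Eq. 15
            Z = self.recon(Z, U, V, X, Y)    # Eq. 16
        return Z
```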

3.3.2. The deep denoising module


The deep denoising modules compute the updated estimates $U^{(t)}$ and $V^{(t)}$ in Eq. 10 and Eq. 11, respectively. In general, any existing image denoising network can be used as the denoising module here.

As shown in Fig. 1, the intermediate estimates $U^{(t-1)}$ and $V^{(t-1)}$ are fed into the proximal operator after being weighted with the intermediate estimate $Z^{(t)}$ for further refinement. In this paper, we adopt a variant of U-Net as the backbone of the deep denoising module; other, more effective networks for medical image denoising could also be adopted. The U-Net denoising network consists of an encoder and a decoder. As shown in Fig. 2(a), the encoder consists of four encoding blocks, each containing two convolutional layers with $3\times 3$ kernels and ReLU nonlinearity. Correspondingly, the decoder also consists of four decoding blocks, each containing two convolutional layers with $3\times 3$ kernels and ReLU nonlinearity. Instead of predicting the refined auxiliary variables directly, the denoising module predicts the residual via a skip connection from the input to the output. To reduce the number of network parameters and the risk of overfitting, all denoising modules share the same network parameters.
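As an illustration, a compact residual U-Net in the spirit of this design is sketched below; for brevity it uses two encoder/decoder levels instead of the four described above, and the channel width is a placeholder choice.

```python
import torch
import torch.nn as nn

def conv_block(in_ch, out_ch):
    # Two 3x3 convolutions with ReLU, as in each encoding/decoding block.
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
    )

class ResidualUNetDenoiser(nn.Module):
    """Residual U-Net denoiser (two levels here instead of the four described above)."""
    def __init__(self, channels=1, width=32):
        super().__init__()
        self.enc1 = conv_block(channels, width)
        self.enc2 = conv_block(width, 2 * width)
        self.pool = nn.MaxPool2d(2)
        self.up = nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False)
        self.dec1 = conv_block(3 * width, width)   # concatenated skip features
        self.out = nn.Conv2d(width, channels, 3, padding=1)

    def forward(self, x):
        e1 = self.enc1(x)
        e2 = self.enc2(self.pool(e1))
        d1 = self.dec1(torch.cat([self.up(e2), e1], dim=1))
        # Input-to-output skip connection: the module predicts the residual.
        return x + self.out(d1)
```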

3.3.3. Cross-Modal Transform Module


Note that Eq. 16 involves the cross-modal transform matrices $\boldsymbol{P}$ and $\boldsymbol{P}^T$, which are expensive to calculate explicitly. We observe that $\boldsymbol{P}$ and $\boldsymbol{P}^T$ perform a pair of mutually inverse cross-modal transforms: one from the target-contrast image to the guide-contrast image, and the other from the guide-contrast image back to the target-contrast image. The process can be formulated as:

(17) $\mathfrak{T}^{(t)} = \boldsymbol{P}(Z^{(t)})$
(18) $Z^{(t)} = \boldsymbol{P}^T(\mathfrak{T}^{(t)})$

As a result, we design the cross-modal transform module using the principle of invertible neural networks (INNs). INNs have been adopted in various inference tasks and achieved excellent performance due to their flexibility (Kingma et al., 2016; Lu et al., 2021). We formulate an INN architecture to serve as the cross-modal transform operation. It consists of two pixel shuffling layers (Dinh et al., 2016) and several INN blocks. As shown in Figure 2(b), the relevant invertible modules are embedded in the cross-modal transform module.

For the $t$-th stage, given an estimated HR MRI $Z^{(t)}$ to be refined, we first pass it through one pixel shuffling layer to change its dimensions, then through several INN blocks that execute the cross-modal transform function, and finally restore the original dimensions through the other pixel shuffling layer.

For the forward operation, a pixel shuffling layer first increases the channel dimension. The input $Z^{(t)}$ is then divided into $Z_1^{(t)}$ and $Z_2^{(t)}$ along the channel axis, and the corresponding cross-modal transform outputs are $\mathfrak{T}_1^{(t)}$ and $\mathfrak{T}_2^{(t)}$ (the two components of $\mathfrak{T}^{(t)}$). This process corresponds to the operation of the cross-modal transform matrix $\boldsymbol{P}$, in which an INN block can be expressed as:

(21) $\begin{cases}\mathfrak{T}_1^{(t)} = Z_1^{(t)} + \varphi(Z_2^{(t)})\\ \mathfrak{T}_2^{(t)} = Z_2^{(t)}\odot\exp(\rho(\mathfrak{T}_1^{(t)})) + \eta(\mathfrak{T}_1^{(t)})\end{cases}$

where $\varphi(\cdot)$, $\eta(\cdot)$, and $\rho(\cdot)$ are arbitrary functions, $\exp(\cdot)$ is the exponential function, and $\odot$ is the Hadamard product.

Accordingly, for the backward operation, given [$\mathfrak{T}_1^{(t)}$, $\mathfrak{T}_2^{(t)}$], it is easy to calculate [$Z_1^{(t)}$, $Z_2^{(t)}$] as:

(24) $\begin{cases}Z_2^{(t)} = (\mathfrak{T}_2^{(t)} - \eta(\mathfrak{T}_1^{(t)}))\odot\exp(-\rho(\mathfrak{T}_1^{(t)}))\\ Z_1^{(t)} = \mathfrak{T}_1^{(t)} - \varphi(Z_2^{(t)})\end{cases}$

This process corresponds to the operation of the matrix $\boldsymbol{P}^T$ in Eq. 18. The forward and backward operations are shown in Fig. 2.
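For illustration, one such invertible block with its forward (Eq. 21) and backward (Eq. 24) passes can be written as follows; realizing $\varphi$, $\rho$, and $\eta$ as small convolutional stacks is an assumption on our part, since the paper leaves them as arbitrary functions.

```python
import torch
import torch.nn as nn

def small_net(ch):
    # Placeholder realization of the arbitrary functions phi, rho, and eta.
    return nn.Sequential(nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(inplace=True),
                         nn.Conv2d(ch, ch, 3, padding=1))

class AffineCouplingBlock(nn.Module):
    """One INN block of the cross-modal transform module (Eqs. 21 and 24)."""
    def __init__(self, channels):
        super().__init__()
        assert channels % 2 == 0
        half = channels // 2
        self.phi, self.rho, self.eta = small_net(half), small_net(half), small_net(half)

    def forward(self, z):                                    # corresponds to P
        z1, z2 = z.chunk(2, dim=1)
        t1 = z1 + self.phi(z2)                               # Eq. 21, first row
        t2 = z2 * torch.exp(self.rho(t1)) + self.eta(t1)     # Eq. 21, second row
        return torch.cat([t1, t2], dim=1)

    def inverse(self, t):                                    # corresponds to P^T
        t1, t2 = t.chunk(2, dim=1)
        z2 = (t2 - self.eta(t1)) * torch.exp(-self.rho(t1))  # Eq. 24, first row
        z1 = t1 - self.phi(z2)                               # Eq. 24, second row
        return torch.cat([z1, z2], dim=1)
```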

3.3.4. Reconstruction Module


The reconstruction module corresponds to the update of the intermediate estimate $Z^{(t)}$ as described in Eq. 16. With the outputs of the denoising modules ($U^{(t)}$, $V^{(t)}$), the evaluated HR image $Z^{(t)}$, the LR MRI $X$, and the guide image $Y$, we reconstruct the updated image $Z^{(t+1)}$. The architecture of the reconstruction module follows Fig. 1 and Eq. 16, which still involve the degradation operators $(DK)^T$ and $DK$. This pair of operators can be implemented by up-sampling and down-sampling layers for greater modeling capability.

The operators $(DK)^T$ and $DK$ are each simulated by a small convolutional network. Specifically, $DK$ is simulated by a down-sampling block ($Down$) consisting of a convolutional layer with $3\times 3$ kernels and 64 channels, one max-pooling layer to decrease the spatial resolution, and two convolutional layers with $3\times 3$ kernels for reprojection to the original dimension (as shown in Figure 2(c)). Similarly, $(DK)^T$ is simulated by an up-sampling block ($Up$) consisting of a convolutional layer with $3\times 3$ kernels and 64 channels, one upsampling layer to increase the spatial resolution, and two convolutional layers with $3\times 3$ kernels for reprojection to the original dimension (as shown in Figure 2(d)).
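The following sketch gives one possible realization of these two blocks; the placement of the ReLU activations, the upsampling mode, and the 2x scale factor are assumptions on our part.

```python
import torch.nn as nn

class DownBlock(nn.Module):
    """Simulates DK: 3x3 conv to 64 channels, max pooling, then reprojection."""
    def __init__(self, channels=1):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.MaxPool2d(2),                                  # decrease spatial resolution
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, channels, 3, padding=1),            # reproject to the original dimension
        )

    def forward(self, x):
        return self.body(x)

class UpBlock(nn.Module):
    """Simulates (DK)^T: 3x3 conv to 64 channels, upsampling, then reprojection."""
    def __init__(self, channels=1):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False),  # increase spatial resolution
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, channels, 3, padding=1),
        )

    def forward(self, x):
        return self.body(x)
```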

3.3.5. Network Training


We apply this model to the super-resolution reconstruction of LR T2WIs with the aid of HR PDWIs (or HR T1WIs).

Our MGDUN is supervised by the $\mathcal{L}_1$ loss between $Z^{(T)}$ and the ground truth $Z$. The overall network is trained by minimizing the following loss function:

(25) $\Theta = \underset{\Theta}{\arg\min}\sum_{i=1}^{N}\big\|g(X_i, Y_i; \Theta) - Z_i\big\|_1$

where $X_i$, $Y_i$, and $Z_i$ denote the $i$-th triplet of target-contrast LR T2WI, guide HR PDWI, and original target-contrast HR T2WI, respectively, and $g(\cdot;\Theta)$ denotes the HR T2WI reconstructed by the network with parameters $\Theta$.
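A schematic PyTorch training loop corresponding to Eq. 25 is given below; the model interface model(lr_t2, guide) and a data loader yielding ($X_i$, $Y_i$, $Z_i$) triplets are placeholder assumptions.

```python
import torch
import torch.nn as nn

def train_mgdun(model, loader, epochs=200, lr=1e-5, device="cuda"):
    """Minimize the L1 loss of Eq. 25 over the training set (illustrative loop)."""
    model = model.to(device)
    criterion = nn.L1Loss()
    optimizer = torch.optim.Adam(model.parameters(), lr=lr, betas=(0.9, 0.999))
    for _ in range(epochs):
        for lr_t2, guide, hr_t2 in loader:           # X_i, Y_i, Z_i
            lr_t2, guide, hr_t2 = (t.to(device) for t in (lr_t2, guide, hr_t2))
            pred = model(lr_t2, guide)               # g(X_i, Y_i; Theta)
            loss = criterion(pred, hr_t2)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return model
```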

Table 1. Comparisons of average PSNR, SSIM, and MSE on the IXI and BraTs datasets with $2\times$ and $4\times$ enlargements. The best and second best results are highlighted in red and blue, respectively. Up or down arrows indicate whether higher or lower values are better.
Datasets IXI BraTs
Scales 2× SR 4× SR 2× SR 4× SR
Metrics PSNR \uparrow SSIM \uparrow MSE \downarrow PSNR \uparrow SSIM \uparrow MSE \downarrow PSNR \uparrow SSIM \uparrow MSE \downarrow PSNR \uparrow SSIM \uparrow MSE \downarrow
Bicubic 24.5537 0.7584 15.4654 24.3658 0.7508 15.8045 26.5771 0.8413 12.1369 26.3637 0.8360 12.4472
EDSR (Lim et al., 2017) 31.4066 0.9290 7.0586 29.5377 0.9028 8.7625 33.2810 0.9557 5.6549 31.9166 0.9419 6.6225
MDSR (Lim et al., 2017) 30.9519 0.9242 7.5883 29.6663 0.9042 8.6254 33.3624 0.9567 5.5991 32.0175 0.9430 6.5372
RCAN (Zhang et al., 2018) 31.9391 0.9351 6.6395 31.3783 0.9301 7.0721 34.1138 0.9631 5.1123 32.7279 0.9504 6.0093
CoISTA (Deng and Dragotti, 2019) 31.4199 0.9121 7.0657 29.4435 0.8757 8.516 29.2042 0.9028 9.088 27.4678 0.8812 11.0865
T2Net (Feng et al., 2021c) 30.0556 0.9117 8.2676 29.4629 0.9014 8.8596 32.9922 0.9530 5.8623 31.7000 0.9389 6.8210
SANet (Feng et al., 2021d) 36.7565 0.9683 3.7998 35.2765 0.9616 4.5115 35.0775 0.9636 4.4661 33.5158 0.9578 4.9808
MGDUN(Ours) 37.3366 0.9691 3.5598 35.9786 0.9637 4.1639 35.9690 0.9703 4.1529 34.5577 0.9615 4.8933
Figure 3. Qualitative comparisons of all methods on IXI dataset. The first row visualizes the visual effects of different methods, and the last row visualizes the error map between the SR results and the ground truths.
Table 2. The results of different configurations on the IXI dataset. Up or down arrows indicate whether higher or lower values are better.
configuration Stages PSNR \uparrow SSIM \uparrow MSE \downarrow
I 3 36.7565 0.9683 3.7998
II(Ours) 4 37.3366 0.9691 3.5598
III 5 37.9190 0.9716 3.3378
IV 6 38.0198 0.9722 3.3206
configuration Guide image PSNR SSIM MSE
V 34.7081 0.9568 4.8183
VI(concat) 35.9719 0.9630 4.1703
VII(Ours) 37.3366 0.9691 3.5598
configuration INN blocks PSNR SSIM MSE
VIII 1 36.8480 0.9682 3.7681
IX(Ours) 2 37.3366 0.9691 3.5598
X 3 37.4740 0.9703 3.5038
configuration Denoiser PSNR SSIM MSE
XI(Ours) U-Net 37.3366 0.9691 3.5598
XII Resnet 37.1130 0.9686 3.6551
Table 3. The parameters and testing time results of $2\times$ enlargement.
Methods Bicubic EDSR MDSR RCAN CoISTA T2Net SANet MGDUN
# Params (M) - 1.37 0.34 12.46 16.43 0.68 11.41 1.68
# Testing time (s) - 4.16 2.71 1.60 1.50 1.65 2.21 1.97

4. EXPERIMENT

4.1. Datasets

IXI Dataset. The IXI dataset contains registered T2 weighted and PD weighted MR images of 578 patients. T2 weighted images were used for SR, and the PD weighted images served as the guidance. We excluded a few slices of each volume, as the frontal slices are much noisier than the others, making their distribution different (more details can be obtained from http://brain-development.org/ixi-dataset/). We split the IXI dataset patient-wise into a ratio of 7:1:2 for training/validation/testing. The size of the original HR images of both T2 and PD weighted images is $256\times 256$.

BraTs Dataset. The BraTs dataset (2019) contains multimodal brain data, including registered T1, T1ce, T2, and PD weighted images. Similar to the IXI dataset, we chose T2 weighted images for SR and T1 weighted images for guidance. The size of an original HR image is $240\times 240$. 3350 pairs were used for training, and 1250 paired images were used for validation.

Finally, each T2 image was blurred with a $3\times 3$ Gaussian filter and down-sampled, and an LR image of the desired dimensions was obtained after bicubic interpolation. Before training, all images were normalized to the range of $[0, 1]$ for numerical stability.
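A sketch of this degradation pipeline is shown below; the Gaussian sigma and the use of torchvision's gaussian_blur are our assumptions, since the paper specifies only the 3x3 kernel size.

```python
import torch.nn.functional as F
from torchvision.transforms.functional import gaussian_blur

def make_lr_input(hr, scale=2):
    """Blur with a 3x3 Gaussian and bicubically down-sample an HR image batch.

    `hr`: tensor of shape (B, 1, H, W). The Gaussian sigma is our assumption;
    the paper specifies only the 3x3 kernel size.
    """
    blurred = gaussian_blur(hr, kernel_size=3, sigma=0.5)
    lr = F.interpolate(blurred, scale_factor=1.0 / scale, mode="bicubic", align_corners=False)
    # Normalize each image to [0, 1] for numerical stability.
    lo = lr.amin(dim=(-2, -1), keepdim=True)
    hi = lr.amax(dim=(-2, -1), keepdim=True)
    return (lr - lo) / (hi - lo + 1e-8)
```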

Metrics: The peak signal-to-noise ratio (PSNR), structural similarity (SSIM), and mean-square error (MSE) were used to evaluate the image quality of the MCSR results (higher PSNR and SSIM and lower MSE indicate better performance).

4.2. Implementation Details

We implement MGDUN in the PyTorch framework with an NVIDIA GeForce RTX 3080Ti GPU. In the training phase, our model is trained using the ADAM optimizer with a learning rate of $1\times 10^{-5}$ for 200 epochs; the ADAM parameters $\beta_1$ and $\beta_2$ are empirically set to 0.9 and 0.999, and the batch size is set to 4. During testing, the unfolded network is applied to the whole image. The default stage number $T$ is set to 4, and the number of INN blocks in the cross-modal transform module is set to 2. The source code is available at https://github.com/yggame/MGDUN.

4.3. Results

We compare our results with a number of models including classical methods, single-contrast methods, multi-contrast methods, and unfolding methods. More specifically, the proposed method is compared with Bicubic, EDSR (Lim et al., 2017), MDSR (Lim et al., 2017), RCAN (Zhang et al., 2018), CoISTA (Deng and Dragotti, 2019), T2Net (Feng et al., 2021c), and SANet (Feng et al., 2021d). The benchmark methods are trained in the same way and on the same training data as described in their corresponding works.

4.3.1. Quantitative results


For quantitative evaluation, we use the average PSNR, SSIM, and MSE. Tab. 1 reports the target-contrast reconstruction performance on the different datasets under $2\times$ and $4\times$ enlargements, where the best and second best values are highlighted in red and blue, respectively. It is clearly noted that our method achieves the best performance on all enlargements of both datasets. The results demonstrate that our model can effectively fuse the two contrasts, which is beneficial to the SR reconstruction of the target contrast. This substantiates the effectiveness and flexibility of our method with a certain degree of generalization.

Figure 4. Qualitative comparison of all methods on BraTs dataset. The first row visualizes the visual effects of different methods, and the last row visualizes the error map between the SR results and the ground truths.

4.3.2. Qualitative results


We provide qualitative comparison results on the IXI dataset as well as the BraTs dataset, together with their corresponding error maps, in Fig. 3 and Fig. 4. The texture of the error maps represents the restoration error; the smoother the texture, the better the reconstruction. As we can see, the input has significant aliasing artifacts and lacks anatomical details. Our model recovers the image with fewer visible artifacts and reconstructs more details than the other competing methods. The quality improvement achieved by MGDUN may be associated with the full usage of the feature maps from the former stages to refine the final results.

4.4. Ablation Study

To further verify the performance of the proposed model under different configurations, a series of ablation studies are carried out, including 1) the effect of the number of stages; 2) the influence of the guide images; 3) the effect of the number of INN blocks in the cross-modal transform module; and 4) the effect of the denoising modules. In this section, only the IXI dataset is used.

Effect of the number of stages     To explore the impact of the number of unfolding stages on performance, we report results for different realizations of the proposed model with varying numbers of unfolding stages, as described in Eq. 10 to Eq. 12. Tab. 2 (I-IV) shows the performance for 3 to 6 stages. It can be observed that the performance increases as the number of stages increases. We choose $T=4$ in our implementation to balance performance and computational complexity.

Influence of the guide images     The guide images provide complementary information for recovering the target modality. To verify the effectiveness of the guide images, we conduct a series of ablation studies (e.g., removing the guide image and using a simple multi-contrast fusion mechanism instead). The results are shown in Tab. 2 (V-VII). As we can see, our fusion strategy is effective and improves performance.

Effect of the number of INN blocks in the cross-modal transform module     We additionally perform a comparative experiment to verify the effectiveness of the INN blocks in the cross-modal transform module. As shown in Tab. 2 (VIII-X), the performance increases as the number of INN blocks increases. In other words, when dealing with cross-modal transform functions, the reconstruction capability of our method can be improved by appropriately increasing the number of blocks.

Effect of the denoising modules     To verify the effectiveness of the U-Net denoising modules, we further implement an ablation study in which the U-Net denoising modules are replaced by ResNet denoising modules containing a similar number of parameters. As shown in Tab. 2 (XI-XII), the proposed method with the U-Net denoiser outperforms the ResNet variant.

4.5. Cost-performance Trade-off

To demonstrate the trade-off between cost and performance, we compare our model against several existing SR methods in Tab. 3. It is noted that our model achieves better performance than the others with a comparable model size. Although it does not achieve the best testing time, the proposed method still has notable advantages over other competing methods. The results demonstrate that our method yields satisfying performance with a good trade-off between cost and performance compared to other deep learning-based methods.

5. CONCLUSION

Interpretable deep learning models are a promising approach for the recovery of medical images, as trustworthiness is required in clinical practice. In this paper, we propose a novel Model-Guided interpretable Deep Unfolding Network (MGDUN) for medical image SR reconstruction and show how to unfold it into a deep convolutional network implementation for multi-contrast medical image SR reconstruction. Our MGDUN is capable of better exploring domain knowledge in MCSR tasks and optimizing the model-guided SR reconstruction algorithm in an interpretable manner.

Acknowledgements.
This work was supported by the National Natural Science Foundation of China (NSFC) under Grant 61922075, the USTC Research Funds of the Double First-Class Initiative under Grants YD2100002004 and KY2100000123, as well as the University Synergy Innovation Program of Anhui Province No. GXXT-2019-025. We acknowledge the support of the GPU cluster built by the MCC Lab of the Information Science and Technology Institution, USTC.

References

  • Afonso et al. (2010) Manya V Afonso, José M Bioucas-Dias, and Mário AT Figueiredo. 2010. Fast image recovery using variable splitting and constrained optimization. IEEE transactions on image processing 19, 9 (2010), 2345–2356.
  • Bertocchi et al. (2020) Carla Bertocchi, Emilie Chouzenoux, Marie-Caroline Corbineau, Jean-Christophe Pesquet, and Marco Prato. 2020. Deep unfolding of a proximal interior point method for image restoration. Inverse Problems 36, 3 (2020), 034005.
  • Boyd et al. (2011) Stephen Boyd, Neal Parikh, Eric Chu, Borja Peleato, Jonathan Eckstein, et al. 2011. Distributed optimization and statistical learning via the alternating direction method of multipliers. Foundations and Trends® in Machine learning 3, 1 (2011), 1–122.
  • Brown and Semelka (2011) Mark A Brown and Richard C Semelka. 2011. MRI: basic principles and applications. John Wiley & Sons.
  • Chaudhari et al. (2018) Akshay S Chaudhari, Zhongnan Fang, Feliks Kogan, Jeff Wood, Kathryn J Stevens, Eric K Gibbons, Jin Hyung Lee, Garry E Gold, and Brian A Hargreaves. 2018. Super-resolution musculoskeletal MRI using deep learning. Magnetic resonance in medicine 80, 5 (2018), 2139–2154.
  • Chen et al. (2018b) Chang Chen, Zhiwei Xiong, Xinmei Tian, and Feng Wu. 2018b. Deep boosting for image denoising. In Proceedings of the European Conference on Computer Vision (ECCV). 3–18.
  • Chen et al. (2018a) Yuhua Chen, Yibin Xie, Zhengwei Zhou, Feng Shi, Anthony G Christodoulou, and Debiao Li. 2018a. Brain MRI super resolution using 3D deep densely connected neural networks. In 2018 IEEE 15th International Symposium on Biomedical Imaging (ISBI 2018). IEEE, 739–742.
  • Deng and Dragotti (2019) Xin Deng and Pier Luigi Dragotti. 2019. Deep coupled ISTA network for multi-modal image super-resolution. IEEE Transactions on Image Processing 29 (2019), 1683–1698.
  • Dinh et al. (2016) Laurent Dinh, Jascha Sohl-Dickstein, and Samy Bengio. 2016. Density estimation using real nvp. arXiv preprint arXiv:1605.08803 (2016).
  • Dong et al. (2018) Weisheng Dong, Peiyao Wang, Wotao Yin, Guangming Shi, Fangfang Wu, and Xiaotong Lu. 2018. Denoising prior driven deep neural network for image restoration. IEEE transactions on pattern analysis and machine intelligence 41, 10 (2018), 2305–2318.
  • Feng et al. (2021a) Chun-Mei Feng, Huazhu Fu, Shuhao Yuan, and Yong Xu. 2021a. Multi-Contrast MRI Super-Resolution via a Multi-Stage Integration Network. In International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI).
  • Feng et al. (2021b) Chun-Mei Feng, Huazhu Fu, Shuhao Yuan, and Yong Xu. 2021b. Multi-contrast mri super-resolution via a multi-stage integration network. In International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, 140–149.
  • Feng et al. (2021c) Chun-Mei Feng, Yunlu Yan, Huazhu Fu, Li Chen, and Yong Xu. 2021c. Task Transformer Network for Joint MRI Reconstruction and Super-Resolution. In International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI).
  • Feng et al. (2021d) Chun-Mei Feng, Yunlu Yan, Chengliang Liu, Huazhu Fu, Yong Xu, and Ling Shao. 2021d. Exploring Separable Attention for Multi-Contrast MR Image Super-Resolution. arXiv preprint arXiv:2109.01664 (2021).
  • Gregor and LeCun (2010) Karol Gregor and Yann LeCun. 2010. Learning fast approximations of sparse coding. In Proceedings of the 27th international conference on international conference on machine learning. 399–406.
  • He et al. (2016) Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. 2016. Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition. 770–778.
  • Huo et al. (2018) Yuankai Huo, Zhoubing Xu, Shunxing Bao, Camilo Bermudez, Hyeonsoo Moon, Prasanna Parvathaneni, Tamara K Moyo, Michael R Savona, Albert Assad, Richard G Abramson, et al. 2018. Splenomegaly segmentation on multi-modal MRI using deep convolutional networks. IEEE transactions on medical imaging 38, 5 (2018), 1185–1196.
  • Jafari-Khouzani (2014) Kourosh Jafari-Khouzani. 2014. MRI upsampling using feature-based nonlocal means approach. IEEE transactions on medical imaging 33, 10 (2014), 1969–1985.
  • Kingma et al. (2016) Durk P Kingma, Tim Salimans, Rafal Jozefowicz, Xi Chen, Ilya Sutskever, and Max Welling. 2016. Improved variational inference with inverse autoregressive flow. Advances in neural information processing systems 29 (2016), 4743–4751.
  • Kokkinos and Lefkimmiatis (2018) Filippos Kokkinos and Stamatios Lefkimmiatis. 2018. Deep image demosaicking using a cascade of convolutional residual denoising networks. In Proceedings of the European Conference on Computer Vision (ECCV). 303–319.
  • Kuklisova-Murgasova et al. (2012) Maria Kuklisova-Murgasova, Gerardine Quaghebeur, Mary A. Rutherford, Joseph V. Hajnal, and Julia A. Schnabel. 2012. Reconstruction of fetal brain MRI with intensity matching and complete outlier removal. Medical Image Analysis 16, 8 (2012), 1550–1564. https://doi.org/10.1016/j.media.2012.07.004
  • Lim et al. (2017) Bee Lim, Sanghyun Son, Heewon Kim, Seungjun Nah, and Kyoung Mu Lee. 2017. Enhanced deep residual networks for single image super-resolution. In Proceedings of the IEEE conference on computer vision and pattern recognition workshops. 136–144.
  • Lu et al. (2021) Shao-Ping Lu, Rong Wang, Tao Zhong, and Paul L Rosin. 2021. Large-capacity image steganography based on invertible neural networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 10816–10825.
  • Lu et al. (2015) Xiaoqiang Lu, Zihan Huang, and Yuan Yuan. 2015. MR image super-resolution via manifold regularized sparse learning. Neurocomputing 162 (2015), 96–104.
  • Lyu et al. (2020) Qing Lyu, Hongming Shan, Cole Steber, Corbin Helis, Chris Whitlow, Michael Chan, and Ge Wang. 2020. Multi-contrast super-resolution MRI through a progressive network. IEEE transactions on medical imaging 39, 9 (2020), 2738–2749.
  • Mai et al. (2011) Zhenhua Mai, Jeny Rajan, Marleen Verhoye, and Jan Sijbers. 2011. Robust edge-directed interpolation of magnetic resonance images. Physics in Medicine & Biology 56, 22 (2011), 7287.
  • Manjón et al. (2010) José V Manjón, Pierrick Coupé, Antonio Buades, D Louis Collins, and Montserrat Robles. 2010. MRI superresolution using self-similarity and image priors. International journal of biomedical imaging 2010 (2010).
  • McDonagh et al. (2017) Steven McDonagh, Benjamin Hou, Amir Alansary, Ozan Oktay, Konstantinos Kamnitsas, Mary Rutherford, Jo V Hajnal, and Bernhard Kainz. 2017. Context-sensitive super-resolution for fast fetal magnetic resonance imaging. In Molecular Imaging, Reconstruction and Analysis of Moving Body Organs, and Stroke Imaging and Treatment. Springer, 116–126.
  • Ning et al. (2020) Qian Ning, Weisheng Dong, Guangming Shi, Leida Li, and Xin Li. 2020. Accurate and lightweight image super-resolution with model-guided deep unfolding network. IEEE Journal of Selected Topics in Signal Processing 15, 2 (2020), 240–252.
  • Pham et al. (2017) Chi-Hieu Pham, Aurélien Ducournau, Ronan Fablet, and François Rousseau. 2017. Brain MRI super-resolution using deep 3D convolutional networks. In 2017 IEEE 14th International Symposium on Biomedical Imaging (ISBI 2017). IEEE, 197–200.
  • Plenge et al. (2012) Esben Plenge, Dirk HJ Poot, Monique Bernsen, Gyula Kotek, Gavin Houston, Piotr Wielopolski, Louise van der Weerd, Wiro J Niessen, and Erik Meijering. 2012. Super-resolution methods in MRI: can they improve the trade-off between resolution, signal-to-noise ratio, and acquisition time? Magnetic resonance in medicine 68, 6 (2012), 1983–1993.
  • Rousseau et al. (2010) François Rousseau, Alzheimer’s Disease Neuroimaging Initiative, et al. 2010. A non-local approach for image super-resolution using intermodality priors. Medical image analysis 14, 4 (2010), 594–605.
  • Song et al. (2021) Jiechong Song, Bin Chen, and Jian Zhang. 2021. Memory-Augmented Deep Unfolding Network for Compressive Sensing. In Proceedings of the 29th ACM International Conference on Multimedia. 4249–4258.
  • Sun et al. (2016) Jian Sun, Huibin Li, Zongben Xu, et al. 2016. Deep ADMM-Net for compressive sensing MRI. Advances in neural information processing systems 29 (2016).
  • Wisdom et al. (2017) Scott Wisdom, Thomas Powers, James Pitton, and Les Atlas. 2017. Building recurrent networks by unfolding iterative thresholding for sequential sparse recovery. In 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 4346–4350.
  • You et al. (2021) Di You, Jingfen Xie, and Jian Zhang. 2021. ISTA-Net++: flexible deep unfolding network for compressive sensing. In 2021 IEEE International Conference on Multimedia and Expo (ICME). IEEE, 1–6.
  • Zeng et al. (2018) Kun Zeng, Hong Zheng, Congbo Cai, Yu Yang, Kaihua Zhang, and Zhong Chen. 2018. Simultaneous single-and multi-contrast super-resolution for brain MRI images based on a convolutional neural network. Computers in biology and medicine 99 (2018), 133–141.
  • Zhang and Ghanem (2018) Jian Zhang and Bernard Ghanem. 2018. ISTA-Net: Interpretable optimization-inspired deep network for image compressive sensing. In Proceedings of the IEEE conference on computer vision and pattern recognition. 1828–1837.
  • Zhang et al. (2020) Kai Zhang, Luc Van Gool, and Radu Timofte. 2020. Deep unfolding network for image super-resolution. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 3217–3226.
  • Zhang et al. (2021) Yulun Zhang, Kai Li, Kunpeng Li, and Yun Fu. 2021. MR Image Super-Resolution with Squeeze and Excitation Reasoning Attention Network. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 13425–13434.
  • Zhang et al. (2018) Yulun Zhang, Kunpeng Li, Kai Li, Lichen Wang, Bineng Zhong, and Yun Fu. 2018. Image super-resolution using very deep residual channel attention networks. In Proceedings of the European conference on computer vision (ECCV). 286–301.
  • Zhao et al. (2018) Can Zhao, Aaron Carass, Blake E Dewey, Jonghye Woo, Jiwon Oh, Peter A Calabresi, Daniel S Reich, Pascal Sati, Dzung L Pham, and Jerry L Prince. 2018. A deep learning based anti-aliasing self super-resolution algorithm for MRI. In International conference on medical image computing and computer-assisted intervention. Springer, 100–108.
  • Zhao et al. (2019) Xiaole Zhao, Yulun Zhang, Tao Zhang, and Xueming Zou. 2019. Channel splitting network for single MR image super-resolution. IEEE Transactions on Image Processing 28, 11 (2019), 5649–5662.
  • Zheng et al. (2017) Hong Zheng, Xiaobo Qu, Zhengjian Bai, Yunsong Liu, Di Guo, Jiyang Dong, Xi Peng, and Zhong Chen. 2017. Multi-contrast brain magnetic resonance image super-resolution using the local weight similarity. BMC medical imaging 17, 1 (2017), 1–13.
  • Zheng et al. (2018) Hong Zheng, Kun Zeng, Di Guo, Jiaxi Ying, Yu Yang, Xi Peng, Feng Huang, Zhong Chen, and Xiaobo Qu. 2018. Multi-contrast brain MRI image super-resolution with gradient-guided edge enhancement. IEEE Access 6 (2018), 57856–57867.