Recovering high-quality FODs from a reduced number of diffusion-weighted images using a model-driven deep learning architecture

Joseph Bartlett, Catherine E. Davey, Leigh A. Johnston, and Jinming Duan

J. Bartlett and J. Duan are with the School of Computer Science, the University of Birmingham, Birmingham, UK. J. Duan is with the Alan Turing Institute, London, UK. J. Bartlett, C. E. Davey and L. A. Johnston are with the Department of Biomedical Engineering, the Melbourne Brain Centre Imaging Unit and the Graeme Clark Institute, the University of Melbourne, Melbourne, Australia. The corresponding author is J. Duan ([email protected]).

Abstract

Fibre orientation distribution (FOD) reconstruction using deep learning has the potential to produce accurate FODs from a reduced number of diffusion-weighted images (DWIs), decreasing total imaging time. Diffusion acquisition invariant representations of the DWI signals are typically used as input to these methods to ensure that they can be applied flexibly to data with different b-vectors and b-values; however, this means the network cannot condition its output directly on the DWI signal. In this work, we propose a spherical deconvolution network, a model-driven deep learning FOD reconstruction architecture, that ensures intermediate and output FODs produced by the network are consistent with the input DWI signals. Furthermore, we implement a fixel classification penalty within our loss function, encouraging the network to produce FODs that can subsequently be segmented into the correct number of fixels and improve downstream fixel-based analysis. Our results show that the model-based deep learning architecture achieves competitive performance compared to a state-of-the-art FOD super-resolution network, FOD-Net. Moreover, we show that the fixel classification penalty can be tuned to offer improved performance with respect to metrics that rely on accurately segmented FODs. Our code is publicly available at https://github.com/Jbartlett6/SDNet.

Index Terms:
Diffusion MRI, model-based deep learning, FOD reconstruction

I Introduction

Fibre orientation distributions (FODs) relate signal attenuation in diffusion-weighted magnetic resonance images to the volume fractions and orientations of fibre populations in the brain [1, 2, 3]. Their flexibility and capacity to discern intra-voxel fibre populations facilitate a range of subsequent quantitative analyses; tractography algorithms can be used to obtain tractograms, and FOD segmentation can provide discrete fibre bundle elements (fixels) [4, 5].

Multi-shell, high angular resolution diffusion imaging datasets are required to fit FODs with sufficient angular detail, and for separating the contribution of different tissue types [6, 3]. The approximately linear relationship between the time a subject spends in the scanner and the number of diffusion-weighted images (DWIs) collected means acquiring such datasets is time consuming.

Deep learning can help to alleviate this issue by performing FOD reconstruction, the task of fitting high-fidelity FODs to a reduced number of DWI signals. To ensure their flexibility, such deep learning methods should be invariant to changes in diffusion MRI acquisition arising due to inter-facility variability or DWI volume corruption. Resampling techniques such as spherical harmonics (SH) [7, 8, 9] and nearest neighbour [10] interpolation have been explored to resample arbitrary DWI acquisitions onto a pre-defined spherical grid. Alternatively, an SH representation of the signal can be used as input to the network [11, 12, 13, 14]. FOD super-resolution methods [15, 16, 17] perform constrained spherical deconvolution (CSD) as a pre-processing step and take the SH representation of the FOD as input. Results in the literature vary due to the range of acquisitions and CSD algorithms used to fit the FODs such as: single-shell-single-tissue [15], two-tissue [16] and single-shell-three-tissue [17] FODs.

High computational costs and the risk of overfitting mean it is not feasible to process all signals in the spatial and diffusion-acquisition dimensions concurrently. By predicting the central FOD from a limited spatial neighbourhood of the input [11, 17, 13], a compromise can be found between reducing the computational burden and exploiting the abundance of spatial correlations present in the data. Such methods commonly utilise a 3D convolutional neural network (CNN) for feature extraction, followed by fully connected or transformer layers for FOD prediction [9].

It is common practice for FODs to be fit using CSD with a maximum SH order of eight [17, 11] in order to capture the angular frequency content of the DWI signal at a maximum b-value of $3000\text{ s/mm}^{2}$ [6]. Some tractography algorithms require only the orientations of fibre populations in each voxel as input, so a number of FOD reconstruction algorithms predict only these quantities [7, 10]. Alternatively, an unsupervised loss function with sparsity inducing regularisation can be used to reconstruct FODs with an increased maximum order of 20 [8]. Whilst improving the angular separation, these methods change the FOD model, meaning it is likely that fixel-derived scalars, such as apparent fibre density and peak amplitude, also deviate. Therefore, it would be infeasible to apply such methods within a fixel-based analysis pipeline.

Model-based deep learning exploits domain knowledge of a process to inspire neural network architectures. Many approaches alternate between CNN-based denoising and data consistency blocks [18, 19, 20, 21]. Data consistency blocks use prior knowledge of an appropriate forward model to ensure a network produces solutions consistent with the input signal.

When calculating acquisition invariant representations of the DWI signal, fitting errors are incurred. We conjecture that such errors degrade FOD reconstruction performance, since the subsequently applied neural networks cannot directly condition their output on the true DWI signal. Model-based deep learning has the potential to lessen the impact of these errors by ensuring intermediate and output FODs are consistent with the DWI signal. In the context of FOD reconstruction, data consistency blocks minimise a linear combination of the CSD data consistency and an additional, deep learning based, regularisation term. Current implementations use a pre-trained autoencoder based regularisation term [15]; however, this means the network is not optimised for FOD reconstruction performance. Model-based deep learning has, to this point, not been combined with techniques proven successful in end-to-end FOD reconstruction architectures.

In this paper, the Spherical Deconvolution Network (SDNet) is introduced: a model-based deep learning architecture that utilises spatial information from surrounding voxels and is optimised to perform FOD reconstruction of multi-shell data. Additionally, we propose a fixel classification penalty within our loss function to improve angular separation without distorting the shape of the reconstructed FODs, which can be tuned to suit the requirements of the reconstructed FODs. The efficacy of the method is evaluated by extensive comparisons with a state-of-the-art FOD super-resolution method, FOD-Net, as well as an ablation study. Our results show that including model-based deep learning improves the performance of the network.

II Method

II-A Network Architecture

Constrained spherical deconvolution is used to fit FODs to DWI signal by optimising the following objective function:

$$\min_{\mathbf{c}}\ \frac{1}{2m}\left\|\mathcal{A}\mathcal{Q}\mathbf{c}-\mathbf{b}\right\|_{2}^{2}+\mathcal{R}\left(\mathbf{c}\right) \tag{1}$$

Figure 1: SDNet architecture, made up of alternating deep regularisation blocks and DWI consistency blocks. Each deep regularisation block is made up of 3D convolution blocks; the values above each set of layers represent the number of channels, which increases as follows: $\{94,128,192,256,320,384,448\}$. The DWI consistency block shows the matrix inversion that is solved for each voxel independently.

where $\mathbf{c}\in\mathbb{R}^{n}$ are the SH coefficients of the FOD, $\mathbf{b}\in\mathbb{R}^{m}$ are the DWI signals, and $\mathcal{A}\mathcal{Q}\in\mathbb{R}^{m\times n}$ spherically convolves the FOD with the response functions of the tissue types being modelled. To facilitate a data-driven regularisation term, optimised for FOD reconstruction, we consider an arbitrary regularisation term, $\mathcal{R}(\cdot)$, in place of the ubiquitous non-negativity constraint. In the following we outline how the variable splitting methods used in Jia et al. [20] and Duan et al. [21] can be adapted to solve (1).

First, we introduce an auxiliary splitting variable $\mathbf{w}\in\mathbb{R}^{n}$, converting (1) into the following equivalent form:

$$\min_{\mathbf{c},\mathbf{w}}\ \frac{1}{2m}\left\|\mathcal{A}\mathcal{Q}\mathbf{c}-\mathbf{b}\right\|_{2}^{2}+\mathcal{R}\left(\mathbf{w}\right)\quad \text{s.t.}\quad \mathbf{c}=\mathbf{w} \tag{2}$$

Using the penalty function method, we add these constraints back into the model and minimise the joint objective:

$$\min_{\mathbf{c},\mathbf{w}}\ \frac{1}{2m}\left\|\mathcal{A}\mathcal{Q}\mathbf{c}-\mathbf{b}\right\|_{2}^{2}+\mathcal{R}\left(\mathbf{w}\right)+\frac{\lambda}{2}\left\|\mathbf{c}-\mathbf{w}\right\|_{2}^{2} \tag{3}$$

Eq. (3) can be solved for $\mathbf{c}$ and $\mathbf{w}$ using an alternating optimisation scheme:

$$\left\{\begin{array}{l}\mathbf{c}^{k+1}=\arg\min\limits_{\mathbf{c}}\ \frac{1}{2m}\left\|\mathcal{A}\mathcal{Q}\mathbf{c}-\mathbf{b}\right\|_{2}^{2}+\frac{\lambda}{2}\left\|\mathbf{c}-\mathbf{w}^{k}\right\|_{2}^{2}\\ \mathbf{w}^{k+1}=\arg\min\limits_{\mathbf{w}}\ \frac{\lambda}{2}\left\|\mathbf{c}^{k+1}-\mathbf{w}\right\|_{2}^{2}+\mathcal{R}\left(\mathbf{w}\right)\end{array}\right. \tag{4}$$

The first convex optimisation can be solved using matrix inversion. The second equation is a denoising problem with arbitrary regularisation, the optimal form of which is unknown. In order to learn the regularisation to improve FOD reconstruction performance, the iterative process can be unrolled and the denoising step solved using a neural network, $\mathcal{NN}(\cdot)$:

$$\left\{\begin{array}{l}\mathbf{c}^{k+1}=\left(\frac{1}{m}\mathcal{Q}^{T}\mathcal{A}^{T}\mathcal{A}\mathcal{Q}+\lambda\mathcal{I}\right)^{-1}\left(\frac{1}{m}\mathcal{Q}^{T}\mathcal{A}^{T}\mathbf{b}+\lambda\mathbf{w}^{k}\right)\\ \mathbf{w}^{k+1}=\mathcal{NN}\left(\mathbf{c}^{k+1}\right)\end{array}\right. \tag{5}$$
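
The $\mathbf{c}$-update in Eq. (5) can be illustrated for a single voxel with a minimal NumPy sketch. The matrix `forward` is a random stand-in for $\mathcal{A}\mathcal{Q}$, and the names `dwi_consistency`, `forward` and `lam`, as well as the value of `lam`, are illustrative assumptions rather than identifiers or settings from the SDNet code.

```python
import numpy as np


def dwi_consistency(forward, b, w, lam):
    """Closed-form c-update of Eq. (5) for one voxel.

    forward: (m, n) matrix standing in for AQ (random values in this sketch).
    b: (m,) DWI signals; w: (n,) current regularised FOD; lam: penalty weight.
    """
    m, n = forward.shape
    lhs = forward.T @ forward / m + lam * np.eye(n)
    rhs = forward.T @ b / m + lam * w
    # Solve the linear system instead of forming the inverse explicitly.
    return np.linalg.solve(lhs, rhs)


rng = np.random.default_rng(0)
m, n = 30, 47  # 30 DWI signals and 47 SH coefficients, as in the text
forward = rng.standard_normal((m, n))
b = rng.standard_normal(m)
w = rng.standard_normal(n)
lam = 0.1  # hypothetical value; in the paper lambda is learned

c = dwi_consistency(forward, b, w, lam)

# The gradient of the c-subproblem in Eq. (4) should vanish at the solution.
grad = forward.T @ (forward @ c - b) / m + lam * (c - w)
```

With $m<n$ the data term alone is rank deficient, so for $\lambda>0$ it is the $\lambda\mathcal{I}$ term that makes the system positive definite and the solve well posed.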

The network architecture (Fig. 1) takes nine voxels in each spatial dimension for 30 different diffusion gradients, resulting in a $9\times 9\times 9\times 30$ volume of DWI signals as input, and passes them through alternating DWI consistency and deep regularisation blocks. The network outputs a vector $\hat{\mathbf{c}}\in\mathbb{R}^{n}$, a high-fidelity prediction of the FOD from the central voxel of the $9\times 9\times 9$ input patch.

II-A1 DWI Consistency

Each DWI consistency block solves the matrix inversion in (5) independently for each voxel, maintaining spatial resolution. The initial DWI consistency block optimises only for the first three even orders of spherical harmonic coefficients ($l_{max}=4$) to ensure robustness to aggressive DWI undersampling.

II-A2 Deep Regularisation

Each deep regularisation block is applied to a concatenation of the outputs of the previous two DWI consistency blocks, meaning the block is conditioned on both earlier representations. Validation tests showed these connections improve network performance (data not shown). The initial $3\times 3\times 3$ convolution kernels are applied with one layer of zero padding in each dimension, so as to maintain spatial resolution, and are followed by 3D batch normalisation layers and parametric rectified linear unit (PReLU) activation functions. The number of channels is increased in this manner until it has reached 448 (Fig. 1). No padding is applied in the final $3\times 3\times 3$ convolution kernel, which is followed by a PReLU function, reducing the resolution in each spatial dimension by two. Finally, a $1\times 1\times 1$ convolution kernel is applied to the 512-channel feature maps to obtain a 94-channel input to a gated linear unit (GLU) activation function, which is the output of the block. Residual connections, referencing the output of the previous DWI consistency block, are used to improve gradient flow through the network. Overall, the deep regularisation block reduces each spatial dimension of its input by two.
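
As a rough illustration of the final activation, the sketch below applies a gated linear unit to a 94-channel feature map. A GLU splits its input in half along the channel axis and gates one half with the sigmoid of the other, so 94 channels become 47, which matches the number of SH coefficients stated later in the text. The function name `glu`, the random input and the $7\times 7\times 7$ spatial size are assumptions made for this example only.

```python
import numpy as np


def glu(x, axis=0):
    """Gated linear unit: split x in two along `axis`, return a * sigmoid(b)."""
    a, b = np.split(x, 2, axis=axis)
    return a * (1.0 / (1.0 + np.exp(-b)))


# Hypothetical feature map: 94 channels over a 7x7x7 patch (sizes assumed).
features = np.random.default_rng(0).standard_normal((94, 7, 7, 7))
out = glu(features, axis=0)  # channel axis is halved: 94 -> 47
```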

II-B Loss Functions

In addition to the customary MSE loss, a fixel classification penalty is proposed to give greater control over the angular separation of the reconstructed FODs. The mechanics of this method can be considered similar to the microstructure sensitive loss proposed for DWI signal reconstruction in [22]. To overcome the inherent non-differentiable nature of the fast marching level set FOD segmentation algorithm [23], a fixel classification network is applied to predict the number of fixels each voxel contains. The output is passed into a cross-entropy component of the loss function. Since we are concerned with the white matter components of the FODs, the loss function and performance metrics are not functions of the grey matter and cerebrospinal fluid components of the FOD. For notational simplicity, from this point onwards $\mathbf{c}$ refers only to the white matter component of the FOD. The overall loss function is as follows:

$$\mathcal{L}(\hat{\mathbf{c}})=\frac{1}{N_{batch}}\sum_{i=1}^{N_{batch}}\left(\left\|\hat{\mathbf{c}}_{i}-\mathbf{c}_{i}\right\|_{2}^{2}+\kappa\,\mathcal{E}\left(\hat{\mathbf{f}}(\hat{\mathbf{c}}_{i}),\mathbf{f}_{i}\right)\right) \tag{6}$$

where $N_{batch}$ is the number of data points in the mini-batch, $\hat{\mathbf{c}}_{i},\;\mathbf{c}_{i}\in\mathbb{R}^{n-2}$ are the reconstructed and fully sampled white matter FODs, $\mathcal{E}(\cdot,\cdot)$ is the cross-entropy, $\hat{\mathbf{f}}(\hat{\mathbf{c}}_{i}),\;\mathbf{f}_{i}\in\mathbb{R}^{5}$ are the predicted logits and the one-hot encoding of the number of fixels, respectively, and $\kappa$ is a hyperparameter to balance the two components of the loss function.
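
A minimal NumPy sketch of Eq. (6) may help make the two terms concrete. In the paper the logits come from the fixel classification network applied to $\hat{\mathbf{c}}_{i}$; here they are passed in directly, and the function names `cross_entropy` and `sdnet_loss` and the toy inputs are assumptions for illustration.

```python
import numpy as np


def cross_entropy(logits, one_hot):
    """Cross-entropy between softmax(logits) and a one-hot target."""
    shifted = logits - logits.max()  # subtract the max for numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum())
    return -float((one_hot * log_probs).sum())


def sdnet_loss(c_hat, c, logits, targets, kappa):
    """Eq. (6): batch mean of squared SH error plus kappa * cross-entropy."""
    sq_err = ((c_hat - c) ** 2).sum(axis=1)
    ce = np.array([cross_entropy(l, t) for l, t in zip(logits, targets)])
    return float((sq_err + kappa * ce).mean())


# Toy batch of two white matter FODs with 45 coefficients each.
c = np.zeros((2, 45))
c_hat = c.copy()
c_hat[0, 0] = 1.0                  # squared errors are [1, 0]
logits = np.zeros((2, 5))          # uniform prediction over 5 classes
targets = np.eye(5)[[1, 2]]        # one-hot fixel counts (illustrative)

loss_mse_only = sdnet_loss(c_hat, c, logits, targets, kappa=0.0)
loss_with_penalty = sdnet_loss(c_hat, c, logits, targets, kappa=1.6e-4)
```

With uniform logits the cross-entropy of each sample is $\log 5$, so the second value exceeds the first by $\kappa\log 5$.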

TABLE I: Count and percentage values of fixels in white matter voxels of an individual from the HCP dataset before and after thresholding at 4 fixels. Before thresholding there is a severe class imbalance.
    Number of Fixels     Count     Percentage before thresholding     Percentage after thresholding    
    1     310994     49%     49%    
    2     200673     32%     32%    
    3     76672     12%     12%    
    4     24095     4%     6.7%    
    5     10975     2%     -    
    6     4979     0.8%     -    
    7     1800     0.3%     -    

When training the fixel classification network, the number of fixels in each voxel was thresholded to four (Tab. I), reducing the inclusion of spurious peaks and class imbalance. A simple, fully-connected architecture was used, with layers containing $\{45, 1000, 800, 600, 400, 200, 100, 5\}$ neurons. Between each layer there are ReLU activation and 1D batch normalisation functions, other than between the penultimate and final layer where the batch normalisation is omitted. A softmax activation function, followed by cross-entropy loss, was then applied to the output of the network. The classification network was trained using the same training set as SDNet. Fully sampled FODs were used as the input, and the ground truth targets were calculated using the fast marching level set algorithm [23].
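
The target preparation described above can be sketched in a few lines of plain Python. The helper names are hypothetical, and the mapping of the five output classes to fixel counts 0 to 4 is an assumption inferred from the five-dimensional output and the threshold of four.

```python
def threshold_fixel_counts(counts, max_fixels=4):
    """Cap each voxel's fixel count at `max_fixels`."""
    return [min(count, max_fixels) for count in counts]


def one_hot(label, num_classes=5):
    """One-hot encode a class label (classes assumed to be 0..4 fixels)."""
    return [1.0 if i == label else 0.0 for i in range(num_classes)]


counts = [1, 2, 7, 4, 0, 5]            # illustrative per-voxel fixel counts
labels = threshold_fixel_counts(counts)
targets = [one_hot(label) for label in labels]
```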

Figure 2: Qualitative results showing reconstructed FODs for HCP subject 130821 centred at voxel [38,98,70]. The top row consists of the a. Fully Sampled, b. SDNet, c. $\text{SDNet}_{\kappa}$, d. FOD-Net, and e. MSMT CSD FODs. The bottom row shows a zoomed-in area of FODs, corresponding to the region highlighted by the white square, consisting of the f. Fully Sampled, g. SDNet, h. $\text{SDNet}_{\kappa}$, i. FOD-Net, and j. MSMT CSD FODs. Here, $\kappa$ is the hyperparameter that balances the SH error and fixel classification penalty terms in the loss function as per Eq. 6.

II-C Implementation Details

To demonstrate the impact of the fixel classification penalty, experiments were carried out with $\kappa=0$ and $\kappa=1.6\times 10^{-4}$. The ADAM optimiser [24], with learning rate warm-up, was used for parameter optimisation, with an initial learning rate of $10^{-6}$, increasing to $10^{-4}$ after 10,000 iterations. To minimise hyperparameter tuning, $\lambda$ was optimised simultaneously with the network weights. From validation experiments (data not included), we found that the most effective way to utilise the classification loss to train SDNet was to initially train the model with $\kappa=0$ and then to increase $\kappa$ to its final value after this initial training stage. To do so we trained SDNet with only MSE loss until convergence, then trained the network until convergence with $\kappa=1.6\times 10^{-4}$.
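
The warm-up can be pictured with the short sketch below. The text gives only the end points ($10^{-6}$ rising to $10^{-4}$ over 10,000 iterations), so the linear ramp and the function name `warmup_lr` are assumptions.

```python
def warmup_lr(iteration, start_lr=1e-6, end_lr=1e-4, warmup_iters=10_000):
    """Learning rate at `iteration`, assuming a linear ramp between end points."""
    if iteration >= warmup_iters:
        return end_lr
    frac = iteration / warmup_iters
    return start_lr + frac * (end_lr - start_lr)


lr_start = warmup_lr(0)
lr_mid = warmup_lr(5_000)
lr_end = warmup_lr(10_000)
```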

III Experiments

III-A Dataset

A subset of the WU-Minn Human Connectome Project (HCP) dataset [25], consisting of 30 subjects, was split 20/3/7 and used for training, validation, and testing, respectively. The HCP images have $1.25\text{ mm}$ isotropic resolution with 90 gradient directions for each of $b=1000$, $2000$ and $3000\text{ s/mm}^{2}$, and 18 $b_{0}$ images. The HCP dataset was minimally pre-processed in accordance with [26].

Additionally, prior to applying SDNet, each subject’s data was normalised using MRtrix3’s dwinormalise function. The fully sampled FODs were fit to all 288 DWIs; first, the response functions were calculated using the method proposed in [27], then the FODs were calculated using MSMT-CSD [3]. White matter response functions and FODs were modelled with $l_{max}=8$ and the grey matter and cerebrospinal fluid component response functions and FODs were modelled with $l_{max}=0$, resulting in a total of 47 SH coefficients.
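
The coefficient count follows from the fact that each even SH order $l$ contributes $2l+1$ coefficients. A small sketch of the arithmetic (the helper name is hypothetical):

```python
def num_even_sh_coeffs(l_max):
    """Number of SH coefficients for even orders 0, 2, ..., l_max."""
    if l_max < 0 or l_max % 2:
        raise ValueError("l_max must be a non-negative even integer")
    return sum(2 * l + 1 for l in range(0, l_max + 1, 2))


wm = num_even_sh_coeffs(8)    # white matter: 1 + 5 + 9 + 13 + 17
gm = num_even_sh_coeffs(0)    # grey matter: isotropic term only
csf = num_even_sh_coeffs(0)   # cerebrospinal fluid: isotropic term only
total = wm + gm + csf
```

The same function gives 15 coefficients for $l_{max}=4$, the order used for the white matter component in the initial DWI consistency block.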

The sampling pattern of Caruyer et al. [28] utilised in the HCP is such that, for any $k$, selecting the first $k$ DWI volumes results in evenly spread b-vectors. To prepare the input data, the first 9 DWIs from each non-zero shell were selected along with 3 $b_{0}$ images, resulting in a total of 30 DWI signals.
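
The selection rule can be sketched as follows. The function name, the matching tolerance `tol`, the choice of the first three $b_{0}$ volumes, and the synthetic b-value list are assumptions for illustration; they are not taken from the SDNet code or from the actual HCP acquisition order.

```python
def select_dwi_subset(bvals, n_b0=3, n_per_shell=9,
                      shells=(1000, 2000, 3000), tol=100):
    """Indices of the first `n_b0` b0 volumes and the first `n_per_shell`
    volumes of each shell, kept in acquisition order."""
    quota = {0: n_b0, **{shell: n_per_shell for shell in shells}}
    taken = {key: 0 for key in quota}
    selected = []
    for idx, bval in enumerate(bvals):
        for key in quota:
            # Match nominal b-values within a tolerance (assumed here).
            if abs(bval - key) <= tol and taken[key] < quota[key]:
                selected.append(idx)
                taken[key] += 1
                break
    return selected


# Synthetic 288-volume acquisition: 18 b0 volumes and 90 per non-zero shell.
bvals = []
for _ in range(18):
    bvals += [5] + [995, 2005, 2990] * 5

subset = select_dwi_subset(bvals)
```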

Figure 3: FODs taken from HCP subject 130821, centred at voxel [84,110,70]. The top row shows the a. Fully Sampled, b. SDNet, and c. $\text{SDNet}_{\kappa}$ FODs. The second row shows a zoomed-in region of FODs, corresponding to the region highlighted by the white square, containing the d. Fully Sampled, e. SDNet, and f. $\text{SDNet}_{\kappa}$ FODs. Here, $\kappa$ balances the SH error and fixel classification penalty terms in the loss function as per Eq. 6.

Only patches in which the central voxel is classified as grey matter or white matter are used for training. The grey matter voxels were included to improve performance near the boundary of the two tissue types, as highlighted in [17]. The grey and white matter masks were calculated using the method outlined in [29], which is implemented using the FSL software package [30].

From this point onwards, for notational convenience, SDNet ($\kappa=0$) will be referred to as SDNet and SDNet ($\kappa=1.6\times 10^{-4}$) will be referred to as $\text{SDNet}_{\kappa}$. To evaluate the performance of the introduced methods, SDNet, $\text{SDNet}_{\kappa}$, FOD-Net [17], and super-resolved MSMT CSD, referred to as MSMT CSD for notational simplicity, were all compared. In the original implementation, FOD-Net maps FODs, fit to 32 DWIs (4 $b_{0}$ and 28 $b=1000/2000/3000\text{ s/mm}^{2}$) using the single-shell three-tissue CSD algorithm [27], to the desired FODs obtained with MSMT CSD. To allow a fair comparison between FOD-Net and the proposed networks, FOD-Net was trained using the same training set as SDNet. Since the final block in the SDNet architecture is a DWI consistency block, it cannot map to normalised FODs; therefore, the target training data is not normalised. It should be noted that the normalisation can still be performed as a post-processing step. Otherwise, the same configuration settings found in the GitHub repository released by the FOD-Net authors were used.

III-B Performance Metrics

To evaluate the performance of the FOD reconstruction algorithms, performance metrics were calculated voxel-wise then averaged over regions of interest. The regions considered were the white matter and intersections of individual tracts within the white matter. The tracts considered were: the corpus callosum (CC), the middle cerebellar peduncle (MCP), the corticospinal tract (CST), and the superior longitudinal fascicle (SLF). To understand how the algorithm performs in voxels containing different numbers of fibres, we considered the intersections of these tracts as in [17]. For voxels containing a single fibre, we considered voxels in the CC containing a single fixel, which we refer to as ROI-1-CC. For two crossing fibres, we considered voxels in the intersection of the MCP and CST containing two fixels, which we refer to as ROI-2-MCP. For three crossing fibres, we considered voxels in the intersection of the SLF, CST and CC containing three fixels, which we refer to as ROI-3-SLF. The white matter mask was calculated using the FSL five tissue type segmentation algorithm in MRtrix3. The segmentation masks for the white matter fibre tracts were obtained using TractSeg [31].

The SSE between the reconstructed FODs, $\hat{\mathbf{c}}$, and the fully sampled FODs, $\mathbf{c}$, was computed as follows:

$$\text{SSE}\left(\mathbf{c},\hat{\mathbf{c}}\right)=\left\|\mathbf{c}-\hat{\mathbf{c}}\right\|_{2}^{2} \tag{7}$$

The angular correlation coefficient (ACC) [32] was computed as follows:

$$\text{ACC}(\mathbf{c},\hat{\mathbf{c}})=\frac{\sum\limits_{i=1}^{4}\sum\limits_{j=-2i}^{2i}\mathbf{c}_{2i,j}\hat{\mathbf{c}}_{2i,j}}{\sqrt{\left(\sum\limits_{i=1}^{4}\sum\limits_{j=-2i}^{2i}\mathbf{c}_{2i,j}^{2}\right)\left(\sum\limits_{i=1}^{4}\sum\limits_{j=-2i}^{2i}\hat{\mathbf{c}}_{2i,j}^{2}\right)}} \tag{8}$$

We refer to SSE and ACC as FOD-based performance metrics, since they compare the SH representation of the FODs prior to any further processing.
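
Both FOD-based metrics are short to write down in NumPy. The sketch below assumes the 45 white matter coefficients are stored in order of increasing SH order, so that index 0 holds the single $l=0$ term excluded from the sums in Eq. (8); this storage order and the function names are assumptions.

```python
import numpy as np


def sse(c, c_hat):
    """Eq. (7): sum of squared differences between SH coefficient vectors."""
    return float(np.sum((c - c_hat) ** 2))


def acc(c, c_hat):
    """Eq. (8): correlation over the l = 2, 4, 6, 8 coefficients.

    Assumes index 0 is the l = 0 coefficient, which Eq. (8) skips.
    """
    u, v = c[1:], c_hat[1:]
    return float(u @ v / np.sqrt((u @ u) * (v @ v)))


rng = np.random.default_rng(0)
c = rng.standard_normal(45)   # illustrative white matter FOD coefficients
same = acc(c, c)              # identical FODs
scaled = acc(c, 2.0 * c)      # ACC ignores overall amplitude
flipped = acc(c, -c)          # sign-inverted anisotropic part
```

Because ACC is invariant to a positive rescaling of either FOD, it measures agreement in angular shape, whereas SSE is also sensitive to amplitude.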

Fixel-based analysis requires each FOD to be segmented into fixels, each of which has an associated apparent fibre density and peak amplitude [23]. To calculate the associated error metrics, peak amplitude and apparent fibre density vectors must be assembled. Each vector consists of the respective scalar for each fixel, ordered according to the peak amplitude, and is padded to a fixed length. The remaining metrics are referred to as fixel-based performance metrics since they require the FOD to be segmented into fixels prior to evaluation.

Fixel accuracy was defined for a region of interest as the proportion of voxels in which the FOD is segmented into the correct number of fixels.

The peak amplitude error (PAE) was calculated between the peak amplitude vectors of the reconstructed FOD, $\hat{\bm{f}}^{P}$, and the fully sampled FOD, $\bm{f}^{P}$:

$$\text{PAE}\left(\bm{f}^{P},\hat{\bm{f}}^{P}\right)=\sum\limits_{i}\left|f^{P}_{i}-\hat{f}^{P}_{i}\right| \tag{9}$$

The apparent fibre density error (AFDE) was calculated between the apparent fibre density vectors of the reconstructed FOD, $\hat{\bm{f}}^{A}$, and the fully sampled FOD, $\bm{f}^{A}$:

$$\text{AFDE}\left(\bm{f}^{A},\hat{\bm{f}}^{A}\right)=\sum\limits_{i}\left|f^{A}_{i}-\hat{f}^{A}_{i}\right| \tag{10}$$
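
A small plain-Python sketch of how Eqs. (9) and (10) might be evaluated for one voxel is given here. The text states that the vectors are ordered by peak amplitude and padded to a fixed length; the descending order, zero padding, vector length of four, and the helper names are assumptions.

```python
def fixel_vector(peak_amps, values, length=4):
    """Order `values` by descending peak amplitude and zero-pad to `length`."""
    order = sorted(range(len(peak_amps)),
                   key=lambda i: peak_amps[i], reverse=True)
    ordered = [values[i] for i in order][:length]
    return ordered + [0.0] * (length - len(ordered))


def l1_error(reference, estimate):
    """Sum of absolute differences, as in Eqs. (9) and (10)."""
    return sum(abs(r - e) for r, e in zip(reference, estimate))


# Illustrative voxel: two fixels in the fully sampled FOD, one reconstructed.
ref_peaks, ref_afd = [0.5, 0.9], [0.3, 0.6]
est_peaks, est_afd = [0.8], [0.5]

pae = l1_error(fixel_vector(ref_peaks, ref_peaks),
               fixel_vector(est_peaks, est_peaks))
afde = l1_error(fixel_vector(ref_peaks, ref_afd),
                fixel_vector(est_peaks, est_afd))
```

Under these assumptions a missed fixel contributes its full peak amplitude or fibre density to the error, since it is compared against a padded zero.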

III-C Ablation Study

To investigate the impact of the DWI consistency block on the performance of the network, an ablation study was conducted. The network was trained without the DWI consistency blocks, and all other aspects of the architecture and network training remained the same. We compared this model to SDNet with the DWI consistency blocks included.

Figure 4: SSE and fixel difference error maps for slice 72 from HCP subject 130821. Top row: SSE error maps between the fully sampled FODs and the FODs reconstructed by a. SDNet, b. $\text{SDNet}_{\kappa}$, c. FOD-Net and d. MSMT CSD. Bottom row: Number of fixels calculated for the fully sampled FOD minus the number of fixels calculated from the FODs reconstructed by e. SDNet, f. $\text{SDNet}_{\kappa}$, g. FOD-Net, and h. MSMT CSD. Here, $\kappa$ balances the SH error and fixel classification penalty terms in the loss function as per Eq. 6. Blue voxels indicate underestimates, and red voxels overestimates, of the number of fixels. Large SSE and fixel differences are highlighted by the black and red arrows, respectively.

III-D Statistical Analysis

Shapiro-Wilk tests for normality ($\alpha=0.05$) were applied for each performance metric and method; unless otherwise stated there is insufficient evidence to reject the null hypothesis that the groups are normally distributed.

Since the data was normally distributed, and each method was applied to the same set of test subjects, a repeated measures one-way ANOVA ($\alpha=0.05$) was applied to each performance metric to determine whether there was a main effect between the conditions. Finally, to determine which methods contributed to the main effect, post-hoc t-tests with Bonferroni correction (adjusted for $\alpha=0.05$) were used to identify effects between the FOD reconstruction algorithms.
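
The Bonferroni adjustment amounts to dividing $\alpha$ by the number of comparisons. The text does not state how many comparisons were corrected for; the sketch below assumes all six pairwise comparisons among the four methods, and the helper name `is_significant` is hypothetical.

```python
from itertools import combinations

methods = ["SDNet", "SDNet_kappa", "FOD-Net", "MSMT CSD"]
pairs = list(combinations(methods, 2))   # six pairwise comparisons (assumed)

alpha = 0.05
adjusted_alpha = alpha / len(pairs)      # Bonferroni-corrected threshold


def is_significant(p_value):
    """True if a post-hoc p-value survives the assumed Bonferroni threshold."""
    return p_value < adjusted_alpha
```

Under this assumption the threshold is roughly 0.0083, so a p-value of 0.005 would be counted as significant and one of 0.034 would not.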

IV Results

IV-A Qualitative Results

The qualitative results comparing all methods (Fig. 2) show that the deep learning methods reconstructed FODs that more closely resembled the ground truth when compared to MSMT CSD. The primary difference is the presence of spurious peaks produced by MSMT CSD, whereas the deep learning based algorithms coherently captured the major tracts in this region due to their denoising effect.

The highlighted region in Fig. 2 (panels f.-j.) shows an area where FOD-Net produced distorted FODs compared to SDNet and $\text{SDNet}_{\kappa}$. MSMT CSD reconstructed particularly noisy FODs in this area, and the FODs obtained by FOD-Net show some similarities to them. The FODs produced by SDNet underestimated the amplitude in this region but more accurately distinguished between fibre populations and captured their directions. In this region, which contains dominant fibre populations with large angular separation, the impact of increasing $\kappa$ on the reconstructed FODs is minimal; only a small change in the direction of the fibres is observed. In the larger tracts in Fig. 2 (panels a.-e.), such as the green fibre population going upwards in the bottom left corner, all deep learning methods performed similarly.

The qualitative results comparing SDNet with $\text{SDNet}_{\kappa}$ (Fig. 3) illustrate that $\text{SDNet}_{\kappa}$ better separated fibre populations. The fibre populations going from the lower left to the upper right of Fig. 3 (panels d.-f.) are separated from the larger fibre population by $\text{SDNet}_{\kappa}$ but not by SDNet without the fixel classification penalty. The FODs reconstructed in the broader region, captured in Fig. 3 (panels a.-c.), show that larger fibre populations are reconstructed similarly by both SDNet and $\text{SDNet}_{\kappa}$.

IV-B FOD-based Results

Figure 5: Mean test-time FOD-based performance (Left: ACC, Right: SSE) in the white matter (WM), ROI-1-CC: corpus callosum containing a single fixel, ROI-2-MCP: intersection between the middle cerebellar peduncle and the corticospinal tract containing 2 fixels, and ROI-3-SLF: intersection between the superior longitudinal fascicle, corticospinal tract and the corpus callosum containing 3 fixels. Here, $\kappa$ balances the SH error and fixel classification penalty terms in the loss function as per Eq. 6. Error bars indicate the standard error of the metrics, which have been averaged over the 7 test subjects.

The SSE error maps (Fig. 4) show that lower SSE is achieved throughout the brain by all deep learning methods compared to MSMT CSD. SDNet generally achieved smaller errors than the other deep learning methods. This is particularly evident in, but not restricted to, the areas highlighted by the red arrows. The error maps produced by $\text{SDNet}_{\kappa}$ and FOD-Net are similar.

The average FOD-based performance results (Fig. 5 and Tab. II) show that SDNet reconstructed FODs with significantly lower SSE and higher ACC than the compared methods in all regions of interest considered. The training curves (Fig. 6) show that increasing $\kappa$ caused the ACC over the validation set to decrease.

In the white matter voxels, SDNet achieved the lowest SSE by a statistically significant margin over all compared methods, followed by $\text{SDNet}_{\kappa}$ and FOD-Net, between which there was no statistically significant difference in SSE. SDNet also achieved the strongest ACC performance in the white matter, where it improved over all other methods by a statistically significant margin. There was no statistically significant difference between $\text{SDNet}_{\kappa}$ and FOD-Net with respect to ACC in the white matter.

In all of ROI-1-CC, ROI-2-MCP, and ROI-3-SLF, SDNet achieved the strongest SSE and ACC results (Fig. 5 and Tab. II) by a statistically significant margin. FOD-Net and $\text{SDNet}_{\kappa}$ showed no statistically significant differences with respect to SSE and ACC in ROI-1-CC and ROI-2-MCP, but in ROI-3-SLF $\text{SDNet}_{\kappa}$ achieved a statistically significant improvement over FOD-Net with respect to both SSE and ACC. In all regions, all deep learning based FOD reconstruction methods outperformed MSMT CSD with respect to SSE and ACC by a statistically significant margin.

IV-C Fixel-based Results

The fixel-based performance results (Fig. 7 and Tab. II) show greater variation between regions and an increased dependence on $\kappa$. The training curves (Fig. 6) show that increasing $\kappa$ caused the fixel accuracy over the validation set to increase. In the white matter, $\text{SDNet}_{\kappa}$ achieved the strongest fixel accuracy by a significant margin, followed by SDNet and FOD-Net, between which there was no statistically significant difference.

In ROI-1-CC, ROI-2-MCP, and ROI-3-SLF, we see that the fixel accuracy of the deep learning FOD reconstruction methods decreased as the number of fixels increased. In ROI-1-CC, SDNet achieved the strongest performance by a statistically significant margin, followed by FOD-Net and $\text{SDNet}_{\kappa}$, between which there was no statistically significant difference in fixel accuracy.

As the number of fixels in the ROIs increased, the fixel accuracy of $\text{SDNet}_{\kappa}$ increased relative to other methods. In ROI-2-MCP, $\text{SDNet}_{\kappa}$ achieved the highest fixel accuracy but not by a statistically significant margin over FOD-Net. Both methods outperformed SDNet by a statistically significant margin. In ROI-3-SLF this pattern continued as the performance of $\text{SDNet}_{\kappa}$ further improved, and it achieved a statistically significant fixel accuracy increase over the other deep learning methods. There was no statistically significant difference in fixel accuracy between FOD-Net and SDNet in ROI-3-SLF. In all regions other than ROI-3-SLF, MSMT CSD performed worse than all other methods by a statistically significant margin.

For AFDE in the white matter, $\text{SDNet}_{\kappa}$ achieved the lowest error by a statistically significant margin, followed by FOD-Net and SDNet, between which there was no statistically significant difference. For PAE in the white matter, $\text{SDNet}_{\kappa}$ achieved the lowest error, which was a statistically significant improvement over SDNet but not FOD-Net. For both AFDE and PAE in the white matter, MSMT CSD achieved a higher error than all compared methods by a statistically significant margin.

TABLE II: Quantitative results for all performance metrics, methods and regions of interest. The arrows beside each metric represent the direction of improved performance. The p-values for the paired t-test results between the respective methods and SDNet and SDNet_κ are indicated by p_SD and p_SDκ, respectively. The performance metric of the strongest method in each row is bold, significant p-values (Bonferroni corrected, adjusted for α=0.05) are also bold.
Metric | Region | SDNet | SDNet_κ (p_SD) | FOD-Net (p_SD, p_SDκ) | MSMT CSD (p_SD, p_SDκ)
SSE (↓) | White Matter | 0.011±0.001 | 0.012±0.001 (<0.001) | 0.013±0.001 (<0.001, 0.034) | 0.041±0.002 (<0.001, <0.001)
 | ROI-1-CC | 0.007±0.001 | 0.008±0.001 (<0.001) | 0.008±0.001 (<0.001, 0.295) | 0.028±0.002 (<0.001, <0.001)
 | ROI-2-MCP | 0.016±0.001 | 0.018±0.001 (<0.001) | 0.017±0.001 (0.001, 0.175) | 0.045±0.002 (<0.001, <0.001)
 | ROI-3-SLF | 0.014±0.001 | 0.015±0.001 (0.005) | 0.017±0.001 (<0.001, <0.001) | 0.063±0.002 (<0.001, <0.003)
ACC (↑) | White Matter | 92.209±0.003 | 91.152±0.003 (<0.001) | 91.184±0.003 (<0.001, 0.484) | 79.268±0.005 (<0.001, <0.001)
 | ROI-1-CC | 95.090±0.002 | 93.994±0.002 (<0.001) | 94.297±0.002 (<0.001, 0.009) | 84.662±0.004 (<0.001, <0.001)
 | ROI-2-MCP | 92.762±0.003 | 91.746±0.003 (<0.001) | 92.046±0.003 (<0.001, 0.032) | 79.796±0.006 (<0.001, <0.001)
 | ROI-3-SLF | 94.577±0.005 | 94.233±0.005 (0.001) | 93.291±0.005 (<0.001, <0.001) | 74.844±0.011 (<0.001, <0.001)
Fix Acc (↑) | White Matter | 0.640±0.011 | 0.664±0.008 (<0.001) | 0.645±0.009 (0.037, <0.001) | 0.536±0.006 (<0.001, <0.001)
 | ROI-1-CC | 0.901±0.002 | 0.851±0.005 (<0.001) | 0.867±0.003 (<0.001, 0.036) | 0.469±0.010 (<0.001, <0.001)
 | ROI-2-MCP | 0.754±0.011 | 0.791±0.010 (<0.001) | 0.772±0.009 (0.001, 0.018) | 0.548±0.009 (<0.001, <0.001)
 | ROI-3-SLF | 0.606±0.032 | 0.648±0.031 (0.001) | 0.588±0.029 (0.023, <0.001) | 0.548±0.009 (0.076, 0.163)
PAE (↓) | White Matter | 0.155±0.006 | 0.147±0.005 (0.001) | 0.152±0.005 (0.065, 0.011) | 0.244±0.007 (<0.001, <0.001)
 | ROI-1-CC | 0.062±0.002 | 0.072±0.002 (<0.001) | 0.069±0.002 (<0.001, 0.053) | 0.210±0.007 (<0.001, <0.001)
 | ROI-2-MCP | 0.135±0.003 | 0.136±0.003 (0.393) | 0.136±0.002 (0.843, 0.973) | 0.219±0.004 (<0.001, <0.001)
 | ROI-3-SLF | 0.179±0.009 | 0.178±0.007 (0.779) | 0.194±0.010 (0.001, 0.002) | 0.278±0.006 (<0.001, <0.001)
AFDE (↓) | White Matter | 0.164±0.005 | 0.151±0.004 (<0.001) | 0.160±0.005 (0.012, 0.002) | 0.208±0.006 (<0.001, <0.001)
 | ROI-1-CC | 0.065±0.001 | 0.074±0.001 (<0.001) | 0.073±0.002 (0.002, 0.711) | 0.187±0.007 (<0.001, <0.001)
 | ROI-2-MCP | 0.107±0.002 | 0.105±0.001 (0.526) | 0.106±0.001 (0.713, 0.489) | 0.171±0.003 (<0.001, <0.001)
 | ROI-3-SLF | 0.151±0.007 | 0.149±0.006 (0.462) | 0.165±0.007 (<0.001, <0.001) | 0.230±0.006 (<0.001, <0.001)
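
The Bonferroni-corrected significance decisions referred to in the caption of Tab. II can be sketched in a few lines of Python. The number of comparisons per row is not stated in this section, so the value of 5 used here (one per reported p-value in a row) is an assumption, and the function name is hypothetical. Only p-values that are reported exactly in the table are used.

```python
def bonferroni_significant(p_values, alpha=0.05, n_comparisons=5):
    """Flag p-values that remain significant after Bonferroni correction.

    Assumption: n_comparisons=5 is illustrative; the corrected
    per-comparison threshold is alpha / n_comparisons.
    """
    threshold = alpha / n_comparisons
    return {name: p < threshold for name, p in p_values.items()}


# White-matter p-values copied from Tab. II ("SDNet_k" denotes SDNet_κ).
white_matter_p = {
    "AFDE: FOD-Net vs SDNet": 0.012,
    "AFDE: FOD-Net vs SDNet_k": 0.002,
    "PAE: FOD-Net vs SDNet": 0.065,
    "PAE: FOD-Net vs SDNet_k": 0.011,
}
flags = bonferroni_significant(white_matter_p)
```

Under this assumed comparison count, only the AFDE difference between FOD-Net and SDNet_κ is flagged as significant, which agrees with the white-matter comparisons described in the text.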

In ROI-1-CC, ROI-2-MCP and ROI-3-SLF, both AFDE and PAE generally increased as the number of fixels increased. In ROI-1-CC, SDNet achieved the strongest results with respect to both AFDE and PAE, and in ROI-2-MCP all three deep learning methods performed similarly with respect to both metrics. In ROI-3-SLF, SDNet and SDNet_κ achieved similar AFDE and PAE, with no statistically significant difference between them, but both achieved a statistically significant improvement compared to FOD-Net.

IV-D Ablation Study

TABLE III: The results of all five performance metrics (mean ± standard error), averaged over all white matter voxels in all 7 test subjects. 2nd column: SDNet, 3rd column: SDNet without the DWI consistency blocks, 4th column: percentage difference between SDNet with and without the DWI consistency blocks, 5th column: pairwise t-test p-values. Bold p-values indicate a significant (α=0.05) effect.
Metric | SDNet | SDNet w/o DC | Percentage Change | p value
SSE (↓) | 0.011±0.001 | 0.012±0.001 | 9.1% | <0.05
ACC (↑) | 92.209±0.003 | 91.679±0.003 | 0.57% | <0.05
Fix Acc (↑) | 0.640±0.011 | 0.625±0.010 | 2.3% | <0.05
AFDE (↓) | 0.164±0.005 | 0.177±0.005 | 1.3% | <0.05
PAE (↓) | 0.155±0.006 | 0.163±0.006 | 0.8% | <0.05

The results of the ablation study (Tab. III) demonstrate that removing the DWI consistency blocks from the SDNet architecture caused the performance of the network to degrade significantly with respect to all metrics. The greatest relative degradation occurred with respect to SSE; however, consistent reductions in performance were also observed for all other metrics.

Figure 6: Validation training curves for SDNet. The red cross marks the point at which κ is increased from 0 to 1.6×10^{-4}.

V Discussion

SDNet is a model-based deep learning architecture that employs DWI consistency blocks to ensure intermediate FODs are consistent with the DWI signal, whilst making use of spatial information and multi-shell DWI data to reconstruct FODs. We compared our network to FOD-Net [17], a FOD super-resolution network, which fits FODs to the DWI signal prior to the network’s forward pass. Our results show that SDNet improved over FOD-Net in terms of FOD-based performance, and performed similarly with respect to most fixel-based metrics. We conjecture that FOD-Net loses some details of the DWI signal in the FOD fitting stage. Our qualitative results (Fig. 2) support this since the FODs reconstructed by FOD-Net more closely resembled the unstable input MSMT-CSD FODs, whereas by ensuring consistency with the DWI signal, SDNet more robustly reconstructed FODs which closely resembled the ground truth. The quantitative results collected from our comparison and ablation studies highlighted the improvement in FOD-based performance enabled by including DWI consistency blocks.

The ultimate goal of deep learning based FOD reconstruction is to produce FODs that are useful for quantitative analysis. FOD registration [33], a key component of longitudinal and group FOD analyses, relies on the L_2 distance between SH coefficients to capture FOD similarity. By achieving a low SSE, the SH representations will bear increased similarity to the ground truth FODs. We therefore anticipate that SDNet will help ensure that FOD registration, and so too the subsequent analysis, is minimally impacted by DWI undersampling.

Figure 7: Mean test-time fixel-based performance (Left: Fixel Accuracy, Centre: Apparent Fibre Density Error, Right: Peak Amplitude Error) in the white matter (WM); ROI-1-CC, voxels in the corpus callosum containing single fixels; ROI-2-MCP, voxels in the intersection between the middle cerebellar peduncle and superior longitudinal fascicle containing 2 fixels; and ROI-3-SLF, voxels in the intersection between the superior longitudinal fascicle, corticospinal tract and the corpus callosum containing 3 fixels. Error bars indicate the standard error of the metrics, which have been averaged over the 7 test subjects.

Another factor that may impact such analyses is the presence of abnormalities, such as pathologies, in the data. Such data are unlikely to be abundant in the datasets used for training deep learning based FOD reconstruction networks and, as a consequence, reduced performance caused by overfitting becomes probable. Since the DWI consistency blocks ensure that solutions are consistent with the measured DWI data, we expect SDNet to be less prone to overfitting and therefore to perform well relative to networks without DWI consistency blocks. However, further investigation is beyond the scope of the current work.

The outcome of such quantitative analysis also depends on the post-registration steps in the pipeline, which, in the case of a fixel-based analysis [4], will be predominantly affected by fixel-based performance. Comparing multiple FOD reconstruction algorithms revealed that strong FOD-based performance does not directly translate to strong fixel-based performance. This disconnect is evident in the white matter, where the difference in SSE between SDNet and FOD-Net is statistically significant but the difference in fixel accuracy over the same set of voxels is not. This effect can be attributed to the dependence of FOD segmentation on the angular separation of the FOD lobes. Angular separation is in turn determined by the higher order SH coefficients, which contribute only a small amount to the SSE. This highlights that SSE loss alone may not be optimal for reconstructing FODs that are to be used in a fixel-based analysis pipeline.
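
One way to inspect how much each SH order contributes to the SSE is to split the squared error by order, as in the following Python sketch. It assumes an even-order SH basis with coefficients stored in order of increasing l (2l+1 coefficients per order) and a default maximum order of 8; these conventions, the function name and the toy coefficient vectors are assumptions made for illustration.

```python
def sse_by_sh_order(pred, truth, lmax=8):
    """Split the SSE between two SH coefficient vectors by harmonic order.

    Assumption: only even orders l = 0, 2, ..., lmax are present, stored in
    increasing l with 2l + 1 coefficients per order.
    """
    expected = (lmax + 1) * (lmax + 2) // 2  # coefficient count for even orders
    if len(pred) != expected or len(truth) != expected:
        raise ValueError(f"expected {expected} coefficients for lmax={lmax}")
    contributions = {}
    start = 0
    for l in range(0, lmax + 1, 2):
        n = 2 * l + 1
        block = zip(pred[start:start + n], truth[start:start + n])
        contributions[l] = sum((p - t) ** 2 for p, t in block)
        start += n
    return contributions


# Hypothetical toy example with lmax=2 (1 + 5 = 6 coefficients).
pred = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
truth = [0.0, 0.0, 0.0, 0.0, 0.0, 1.0]
per_order = sse_by_sh_order(pred, truth, lmax=2)
total_sse = sum(per_order.values())
```

The per-order terms sum to the total SSE, so a reconstruction can have a low total error while still carrying a comparatively large error in the higher orders.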

By introducing an additional loss component, which penalises reconstructed FODs judged to be made up of the incorrect number of fixels, we have demonstrated that fixel-based performance can be improved. The impact of the proposed loss function is illustrated by the statistically significant increase in fixel accuracy in the white matter achieved by SDNet_κ compared to SDNet and FOD-Net. The qualitative results (Fig. 3) highlighted the improved separation of fibres with low angular separation. It is also evident that the overall shape of the FOD is captured, as opposed to discrete, or Dirac-like, FODs [8, 7, 10]. Furthermore, SDNet_κ recorded statistically significant improvements over SDNet in fixel accuracy, PAE and AFDE across the white matter.

However, the introduction of the fixel classification penalty led to a reduction in fixel-based performance in ROI-1-CC. This highlighted a potential bias of SDNet towards over-estimating the number of fixels in each voxel. The inputs to FOD reconstruction networks are necessarily derived from a DWI acquisition with low angular resolution, and so do not contain sufficient information to reconstruct FODs that contain all fixels, as observed in Fig. 4. Therefore, the effect of the fixel classification penalty will generally be to correct these underestimations by encouraging the network to increase the number of fixels. Since ROI-1-CC contains only single fixel voxels, the fixel classification penalty may have increased the number of over-estimations in this region, which, when combined with the already strong performance of SDNet and FOD-Net, led to the observed decrease in performance. On the other hand, in ROI-3-SLF, a region containing 3 crossing fibres, the use of the fixel classification penalty improved performance compared to the other two deep learning methods. Despite worse performance in ROI-1-CC, SDNet_κ improved performance over the white matter voxels for all fixel-based performance metrics.

In the current work, the fixel classification network is trained on the ground truth data alone, which, depending on the efficacy of the FOD reconstruction algorithm, will have a different distribution to the reconstructed FODs. One possible approach to further improving performance is to devise an algorithm to jointly train the FOD reconstruction network and the fixel classification network, similar to the method used to train generative adversarial networks [34].

The fixel classification penalty component of the loss function appears to share some characteristics with the regularisation terms that are ubiquitous in model-based methods for solving ill-posed inverse problems. In particular, to minimise a combination of SSE loss and the fixel classification penalty, an increase in SSE was incurred, and we have identified in our validation experiments that the extent of such a sacrifice can be controlled by adjusting κ (data not included). This suggests that the solution that obtains the lowest SSE may fail to capture certain desirable features of the FOD. In this work, we have highlighted this impact on the separation of fibre populations with similar orientations, but it is possible that other features, such as the continuity of fibre populations through space, could also be improved using similar methods.
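
The trade-off controlled by κ can be sketched as a weighted sum of two terms. The Python example below assumes, for illustration only, that the fixel classification penalty is a cross-entropy between the output of a fixel classification network (given here as raw logits over fixel-count classes) and the true class. The exact form of the penalty is not given in this section, and all function names and toy values are hypothetical.

```python
import math


def sse(pred, truth):
    """Sum of squared errors between two SH coefficient vectors."""
    return sum((p - t) ** 2 for p, t in zip(pred, truth))


def cross_entropy(logits, true_class):
    """Cross-entropy of softmax(logits) against the true class index."""
    peak = max(logits)  # subtract the maximum for numerical stability
    log_norm = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_norm - logits[true_class]


def total_loss(pred_fod, true_fod, fixel_logits, true_class, kappa):
    """SSE plus a kappa-weighted fixel classification penalty (assumed form)."""
    return sse(pred_fod, true_fod) + kappa * cross_entropy(fixel_logits, true_class)


# Hypothetical toy values; kappa = 1.6e-4 is the value shown in Fig. 6.
pred_fod = [0.9, 0.1, 0.0]
true_fod = [1.0, 0.0, 0.0]
fixel_logits = [0.0, 0.0]
loss_without_penalty = total_loss(pred_fod, true_fod, fixel_logits, 0, kappa=0.0)
loss_with_penalty = total_loss(pred_fod, true_fod, fixel_logits, 0, kappa=1.6e-4)
```

Setting κ to 0 recovers the plain SSE loss, and larger values of κ place more weight on obtaining the correct number of fixels at the expense of SSE.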

VI Conclusion

In this work we have proposed SDNet, a model-based deep learning architecture optimised for FOD reconstruction. In addition to the learned regularisation blocks, which are trained directly in an end-to-end fashion and are therefore optimised for the task of FOD reconstruction, the network also takes a neighbourhood of multi-shell DWI signals as input to an architecture containing multiple cascades. We further show that there is a trade-off between FOD-based and fixel-based performance, and propose a fixel classification penalty term in our loss function, as implemented in SDNet_κ, as a method of controlling the trade-off between these performance metrics. We show that, when compared to a state-of-the-art FOD super-resolution network, FOD-Net, gains in FOD-based and fixel-based performance were achieved by SDNet and SDNet_κ, respectively.

Acknowledgment

We would like to thank Xi Jia from University of Birmingham for the fruitful discussion on network architecture and parameter tuning in this research. The computations described in this research were performed using the Baskerville Tier 2 HPC service (https://www.baskerville.ac.uk/). Baskerville was funded by the EPSRC and UKRI through the World Class Labs scheme (EP/T022221/1) and the Digital Research Infrastructure programme (EP/W032244/1) and is operated by Advanced Research Computing at the University of Birmingham.

References

  • Tournier et al. [2004] J.-D. Tournier, F. Calamante, D. G. Gadian, and A. Connelly, “Direct estimation of the fiber orientation density function from diffusion-weighted MRI data using spherical deconvolution,” Neuroimage, vol. 23, no. 3, pp. 1176–1185, 2004.
  • Tournier et al. [2007] J.-D. Tournier, F. Calamante, and A. Connelly, “Robust determination of the fibre orientation distribution in diffusion MRI: non-negativity constrained super-resolved spherical deconvolution,” Neuroimage, vol. 35, no. 4, pp. 1459–1472, 2007.
  • Jeurissen et al. [2014] B. Jeurissen, J.-D. Tournier, T. Dhollander, A. Connelly, and J. Sijbers, “Multi-tissue constrained spherical deconvolution for improved analysis of multi-shell diffusion MRI data,” NeuroImage, vol. 103, pp. 411–426, 2014.
  • Raffelt et al. [2012] D. Raffelt et al., “Apparent fibre density: a novel measure for the analysis of diffusion-weighted magnetic resonance images,” Neuroimage, vol. 59, no. 4, pp. 3976–3994, 2012.
  • Raffelt et al. [2017] D. A. Raffelt, J. D. Tournier, R. E. Smith, D. N. Vaughan, G. Jackson, G. R. Ridgway, and A. Connelly, “Investigating white matter fibre density and morphology using fixel-based analysis,” NeuroImage, vol. 144, pp. 58–73, 1 2017.
  • Tournier et al. [2013] J. D. Tournier, F. Calamante, and A. Connelly, “Determination of the appropriate b value and number of gradient directions for high-angular-resolution diffusion-weighted imaging,” NMR in Biomedicine, vol. 26, pp. 1775–1786, 12 2013.
  • Koppers and Merhof [2016] S. Koppers and D. Merhof, “Direct estimation of fiber orientations using deep learning in diffusion imaging,” in International Workshop on Machine Learning in Medical Imaging.   Springer, 2016, pp. 53–60.
  • Elaldi et al. [2021] A. Elaldi, N. Dey, H. Kim, and G. Gerig, “Equivariant spherical deconvolution: Learning sparse orientation distribution functions from spherical data,” in International Conference on Information Processing in Medical Imaging.   Springer, 2021, pp. 267–278.
  • Hosseini et al. [2022] S. Hosseini, M. Hassanpour, S. Masoudnia, S. Iraji, S. Raminfard, and M. Nazem-Zadeh, “Cttrack: A CNN+ transformer-based framework for fiber orientation estimation & tractography,” Neuroscience Informatics, vol. 2, no. 4, p. 100099, 2022.
  • Karimi et al. [2021] D. Karimi, L. Vasung, C. Jaimes, F. Machado-Rivas, S. K. Warfield, and A. Gholipour, “Learning to estimate the fiber orientation distribution function from diffusion-weighted MRI,” NeuroImage, vol. 239, p. 118316, 2021.
  • Lin et al. [2019] Z. Lin, T. Gong, K. Wang, Z. Li, H. He, Q. Tong, F. Yu, and J. Zhong, “Fast learning of fiber orientation distribution function for MR tractography using convolutional neural network,” Medical physics, vol. 46, no. 7, pp. 3101–3116, 2019.
  • Nath et al. [2020] V. Nath, S. K. Pathak, K. G. Schilling, W. Schneider, and B. A. Landman, “Deep learning estimation of multi-tissue constrained spherical deconvolution with limited single shell DW-MRI,” in Medical Imaging 2020: Image Processing, vol. 11313.   SPIE, 2020, pp. 162–171.
  • kop [2017] “Reconstruction of diffusion anisotropies using 3D deep convolutional neural networks in diffusion imaging,” in Modeling, Analysis, and Visualization of Anisotropy.   Springer, 2017, pp. 393–404.
  • Jha et al. [2022] R. R. Jha, S. K. Pathak, V. Nath, W. Schneider, B. R. Kumar, A. Bhavsar, and A. Nigam, “VRfRNet: Volumetric ROI fODF reconstruction network for estimation of multi-tissue constrained spherical deconvolution with only single shell dMRI,” Magnetic Resonance Imaging, vol. 90, pp. 1–16, 2022.
  • Patel et al. [2018] K. Patel, S. Groeschel, and T. Schultz, “Better fiber ODFs from suboptimal data with autoencoder based regularization,” in International Conference on Medical Image Computing and Computer-Assisted Intervention.   Springer, 2018, pp. 55–62.
  • Lucena et al. [2021] O. Lucena, S. B. Vos, V. Vakharia, J. Duncan, K. Ashkan, R. Sparks, and S. Ourselin, “Enhancing the estimation of fiber orientation distributions using convolutional neural networks,” Computers in Biology and Medicine, vol. 135, p. 104643, 2021.
  • Zeng et al. [2022] R. Zeng, J. Lv, H. Wang, L. Zhou, M. Barnett, F. Calamante, and C. Wang, “FOD-Net: A deep learning method for fiber orientation distribution angular super resolution,” Medical Image Analysis, vol. 79, p. 102431, 2022.
  • Aggarwal et al. [2018] H. K. Aggarwal, M. P. Mani, and M. Jacob, “MoDL: Model-based deep learning architecture for inverse problems,” IEEE Transactions on Medical Imaging, vol. 38, no. 2, pp. 394–405, 2018.
  • Schlemper et al. [2017] J. Schlemper, J. Caballero, J. V. Hajnal, A. Price, and D. Rueckert, “A deep cascade of convolutional neural networks for MR image reconstruction,” in International Conference on Information Processing in Medical Imaging.   Springer, 2017, pp. 647–658.
  • Jia et al. [2021] X. Jia, A. Thorley, W. Chen, H. Qiu, L. Shen, I. B. Styles, H. J. Chang, A. Leonardis, A. De Marvao, D. P. O’Regan et al., “Learning a model-driven variational network for deformable image registration,” IEEE Transactions on Medical Imaging, vol. 41, no. 1, pp. 199–212, 2021.
  • Duan et al. [2019] J. Duan, J. Schlemper, C. Qin, C. Ouyang, W. Bai, C. Biffi, G. Bello, B. Statton, D. P. O’Regan, and D. Rueckert, “VS-Net: Variable splitting network for accelerated parallel MRI reconstruction,” in International Conference on Medical Image Computing and Computer-assisted Intervention. Springer, 2019, pp. 713–722.
  • Chen et al. [2023] G. Chen, Y. Hong, K. M. Huynh, and P.-T. Yap, “Deep learning prediction of diffusion MRI data with microstructure-sensitive loss functions,” Medical Image Analysis, p. 102742, 2023.
  • Smith et al. [2013] R. E. Smith, J.-D. Tournier, F. Calamante, and A. Connelly, “SIFT: Spherical-deconvolution informed filtering of tractograms,” Neuroimage, vol. 67, pp. 298–312, 2013.
  • Kingma and Ba [2014] D. P. Kingma and J. Ba, “Adam: A method for stochastic optimization,” arXiv preprint arXiv:1412.6980, 2014.
  • Van Essen et al. [2013] D. C. Van Essen, S. M. Smith, D. M. Barch, T. E. Behrens, E. Yacoub, K. Ugurbil, W.-M. H. Consortium et al., “The WU-Minn human connectome project: an overview,” Neuroimage, vol. 80, pp. 62–79, 2013.
  • Sotiropoulos et al. [2013] S. N. Sotiropoulos, S. Jbabdi, J. Xu, J. L. Andersson, S. Moeller, E. J. Auerbach, M. F. Glasser, M. Hernandez, G. Sapiro, M. Jenkinson et al., “Advances in diffusion MRI acquisition and processing in the human connectome project,” Neuroimage, vol. 80, pp. 125–143, 2013.
  • Dhollander et al. [2019] T. Dhollander, R. Mito, D. Raffelt, and A. Connelly, “Improved white matter response function estimation for 3-tissue constrained spherical deconvolution,” in Proc. Intl. Soc. Mag. Reson. Med, vol. 555, no. 10, 2019.
  • Caruyer et al. [2013] E. Caruyer, C. Lenglet, G. Sapiro, and R. Deriche, “Design of multishell sampling schemes with uniform coverage in diffusion MRI,” Magnetic Resonance in Medicine, vol. 69, no. 6, pp. 1534–1540, 2013.
  • Zhang et al. [2001] Y. Zhang, M. Brady, and S. Smith, “Segmentation of brain MR images through a hidden Markov random field model and the expectation-maximization algorithm,” IEEE Transactions on Medical Imaging, vol. 20, no. 1, pp. 45–57, 2001.
  • Jenkinson et al. [2012] M. Jenkinson, C. F. Beckmann, T. E. Behrens, M. W. Woolrich, and S. M. Smith, “FSL,” Neuroimage, vol. 62, no. 2, pp. 782–790, 2012.
  • Wasserthal et al. [2018] J. Wasserthal, P. Neher, and K. H. Maier-Hein, “TractSeg-Fast and accurate white matter tract segmentation,” NeuroImage, vol. 183, pp. 239–253, 2018.
  • Anderson [2005] A. W. Anderson, “Measurement of fiber orientation distributions using high angular resolution diffusion imaging,” Magnetic Resonance in Medicine: An Official Journal of the International Society for Magnetic Resonance in Medicine, vol. 54, no. 5, pp. 1194–1206, 2005.
  • Raffelt et al. [2011] D. Raffelt, J.-D. Tournier, J. Fripp, S. Crozier, A. Connelly, and O. Salvado, “Symmetric diffeomorphic registration of fibre orientation distributions,” Neuroimage, vol. 56, no. 3, pp. 1171–1180, 2011.
  • Goodfellow et al. [2014] I. J. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio, “Generative adversarial networks,” 6 2014. [Online]. Available: http://arxiv.org/abs/1406.2661