Regression Networks For Calculating Englacial Layer Thickness

Abstract

Ice thickness estimation is an important aspect of ice sheet modelling. In this work, we use convolutional neural networks (CNN) with multiple output nodes to regress and learn the thickness of internal ice layers in Snow Radar images captured over northwest Greenland. We experiment with some state-of-the-art CNNs to obtain a mean absolute error of 1.251 pixels of thickness estimation over the test set. Such regression-based networks can further be improved by embedding domain knowledge and radar information in the neural network in order to reduce the requirement of manual annotations.

Index Terms— Regression, Englacial Ice Thickness, Radar, Convolutional Neural Networks

1 Introduction

A rapidly changing climate is driving a negative ice sheet mass balance, a primary contributor to sea level rise. The Intergovernmental Panel on Climate Change (IPCC) reports [1] that sea level rise over the next century could cause floods that put millions of livelihoods at risk. It is therefore imperative to estimate seasonal changes in ice sheets accurately and to build systems that help mitigate the possible calamity.

Annual snow accumulation rates can be calculated from the thickness of englacial ice layers [2]. The internal ice layers can be detected with ground penetrating radar sensors, such as the Snow Radar flown for the NASA Operation IceBridge (OIB) mission. This sensor captures the vertical profile of an ice sheet, where the varying contrast in radar reflections can be annually dated [2]. The captured images show the most recent layer, the ice surface, at the top, with layers from preceding years appearing underneath it. These images are noisy, and the multiple ice layers cannot be easily extracted even with modern image processing techniques [3, 4].

Convolutional Neural Networks (CNNs) are deep learning models that have recently shown advantages for computer vision tasks such as image classification, object detection and semantic segmentation. In this paper, we use them for numerical regression to predict the thickness of each internal ice layer present in a Snow Radar image. This work is an improvement over [5], where ice thickness was calculated as a separate post-processing step after semantic segmentation of Snow Radar images. The rest of the paper is organized as follows. Section 2 describes the related work in ice layer detection with ground penetrating radar images, Section 3 describes the dataset, and Section 4 describes the CNNs we use for regression. Section 5 presents the results and Section 6 concludes the paper.

2 Related Work

In the past, techniques such as [4, 3, 6] were proposed to detect all the internal ice sheet layers. These were semi-supervised methods which required human correction, and their outputs were sparse in nature [5]. Recent efforts such as [7, 8, 9, 5] applied multi-scale deep learning techniques to track and identify internal ice layers in a pixel-wise manner. The purpose of tracking all internal layers is ultimately to calculate changes in accumulation rate over time. In our work, rather than creating image-level pixel outputs, we build regression networks to directly learn and predict the thickness of ice layers, given an input radar image. With well-trained, generalized networks, these models can be scaled to larger datasets captured from a variety of regions. The methodology of how we build these regression-based CNNs is described in Section 4.

Some popular works on image regression are [10, 11], where the former used a VGG-16 architecture to predict the real and apparent age of a person from their facial image, and the latter created cell density maps through fully convolutional regression networks. For ice thickness estimation, works such as [12, 13, 14, 15] were proposed. [12] used a deep residual network to regress Raman spectral data, the output of which is later used for sea ice thickness measurements. [13] used a simple 3-layer CNN with a single output node to regress Antarctic sea ice thickness from lidar data, whereas [14] used decision trees on CryoSat-2 and MODIS images to detect sea ice freeboard, which is useful in calculating sea ice thickness. [15] also built a deep CNN with a single output node to calculate sea ice concentration, and compared its performance with a random forest and a linear regressor.

In this work, we set up multi-output regression networks where each output node can estimate the thickness of each individual internal ice layer.

3 Dataset

We use ultra-wideband Snow Radar data captured over northwest Greenland by the Center for Remote Sensing of Ice Sheets (CReSIS) in 2012 during the OIB Arctic campaign. This instrument has a vertical resolution of 4 cm per pixel in snow and is designed to detect shallow annual layers in the ice sheet [3]. This dataset contains 2361 training images along with 260 test images. Ice layers in this dataset were sparsely detected by [3], and were further processed by [5] to remove incomplete layers. We use the latter work’s processed layers to calculate the mean thickness of each layer, and treat these thickness values as our ‘ground truth’ for regression. A sample Snow Radar image, its layers from [5] and their mean thickness values are shown in Figure 1. This dataset contains a maximum of 27 detected layers, a value pre-computed from the labelled dataset.

4 Methodology

Regression networks predict continuous values of a variable, rather than the discrete labels predicted by classification networks. The predicted variables can take any value, including decimal values. To prepare our ground truth for regression, we calculated the thickness of each layer in every image from the ground truth (Figure 1(b)). For each layer, we first counted the number of pixels carrying that label in the ground truth image, and then divided this count by the total number of columns (width) of the image. This gives the mean number of rows per column for every label, which corresponds to the average thickness of each layer (Figure 1(c)). We also pre-calculated the maximum number of layers in our ground truth, which was found to be 27.
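The thickness calculation described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code: the function name `mean_layer_thickness`, the label encoding (0 for background, 1 to 27 for layers) and the zero-filling of absent layers are assumptions made for the example.

```python
import numpy as np

MAX_LAYERS = 27  # maximum number of layers reported for this dataset


def mean_layer_thickness(label_mask, max_layers=MAX_LAYERS):
    """Return the mean thickness (in pixels) of each layer in a 2-D label mask.

    Assumes label 0 is background and labels 1..max_layers are ice layers
    (a hypothetical encoding; the paper does not specify one).
    Layers absent from the image get a thickness of 0 (also an assumption).
    """
    label_mask = np.asarray(label_mask)
    width = label_mask.shape[1]
    # Count pixels per label; minlength guarantees one slot per possible layer.
    counts = np.bincount(label_mask.ravel(), minlength=max_layers + 1)
    # Drop the background slot, then divide pixel counts by the image width.
    return counts[1:max_layers + 1] / width


# Toy 4x4 mask: layer 1 covers 6 pixels, layer 2 covers 8 pixels.
mask = np.array([
    [1, 1, 1, 1],
    [1, 1, 2, 2],
    [2, 2, 2, 2],
    [2, 2, 0, 0],
])
thickness = mean_layer_thickness(mask)
print(thickness[:2])  # mean rows per column for layers 1 and 2
```

For the toy mask, layer 1 has 6 pixels over 4 columns (1.5 pixels) and layer 2 has 8 pixels over 4 columns (2.0 pixels); the remaining 25 entries are zero.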

Once we had the ground truth thickness values prepared, we built our regression network. We used classification networks such as InceptionV3 [16], DenseNet [17], ResNet50 [18], Xception [19] and MobileNetV2 [20] as baseline models and added a global average pooling layer, along with a fully connected layer of 1024 neurons, to the output of the baseline models. Finally, we added a fully connected layer of 27 nodes to each model, which corresponds to the number of layers and thickness values we want to predict in every image during inference. The overall architecture is shown in Figure 2. Here, ‘DCNN’ is a deep convolutional neural network, such as InceptionV3, DenseNet, Xception etc. ‘FC’ represents fully connected layers, with the number of nodes shown in parentheses. FC1 contains 1024 nodes and FC2 contains 27 nodes. ‘GT’ is the ground truth in terms of thickness estimates. Because this is a regression problem rather than a classification problem, we used a ReLU activation in place of Softmax. We also compared our work with the output of [5], where thickness was calculated as a post-processing step.
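To make the shapes in this head concrete, the following is a minimal NumPy sketch of its forward pass: global average pooling, FC1 with 1024 nodes, then FC2 with 27 nodes and a ReLU output. The backbone feature map size (7 x 7 x 2048), the random weights, and the ReLU on FC1 are assumptions for illustration only; the paper does not state them.

```python
import numpy as np

rng = np.random.default_rng(0)


def regression_head(feature_map, w1, b1, w2, b2):
    """Global average pooling -> FC1 (1024 nodes) -> FC2 (27 nodes, ReLU)."""
    # Average over the two spatial axes: (batch, H, W, C) -> (batch, C).
    pooled = feature_map.mean(axis=(1, 2))
    # FC1; the ReLU here is an assumption, the paper does not name it.
    hidden = np.maximum(pooled @ w1 + b1, 0.0)
    # FC2 with ReLU output, so predicted thicknesses are non-negative.
    return np.maximum(hidden @ w2 + b2, 0.0)


# Hypothetical backbone output: batch of 2, 7x7 spatial grid, 2048 channels.
features = rng.standard_normal((2, 7, 7, 2048))
w1 = rng.standard_normal((2048, 1024)) * 0.01
b1 = np.zeros(1024)
w2 = rng.standard_normal((1024, 27)) * 0.01
b2 = np.zeros(27)

pred = regression_head(features, w1, b1, w2, b2)
print(pred.shape)  # one thickness estimate per layer, per image
```

Each row of `pred` holds 27 non-negative values, one per possible layer, which is the form the ground truth thickness vectors take.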

4.1 Baseline CNN models

Each of the networks InceptionV3, DenseNet, ResNet50, Xception, and MobileNetV2 has significantly contributed to the field of computer vision by introducing a unique design element, such as the skip connections and residual blocks in ResNet, the multi-scale architecture in Inception and depthwise separable convolutions in Xception. DenseNet contains shorter connections between layers close to the input and layers close to the output. For each layer, the feature maps of all preceding layers are used as input, which helps feature propagation and reduces the vanishing gradient problem. MobileNetV2 takes a low-dimensional compressed representation of the input and filters it with lightweight depthwise separable convolutions. Such an architecture makes it efficient for processing images on mobile devices.
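As a rough illustration of why depthwise separable convolutions are considered lightweight, the sketch below compares weight counts for a standard convolution and its depthwise separable counterpart. The layer sizes (3 x 3 kernel, 128 input and 256 output channels) are arbitrary example values, and biases are ignored.

```python
def standard_conv_params(kernel, c_in, c_out):
    """Weights in a standard k x k convolution (biases ignored)."""
    return kernel * kernel * c_in * c_out


def depthwise_separable_params(kernel, c_in, c_out):
    """Weights in a depthwise k x k plus pointwise 1 x 1 convolution."""
    depthwise = kernel * kernel * c_in  # one k x k filter per input channel
    pointwise = c_in * c_out            # 1 x 1 convolution mixing channels
    return depthwise + pointwise


std = standard_conv_params(3, 128, 256)
sep = depthwise_separable_params(3, 128, 256)
print(std, sep, round(std / sep, 2))
```

For these example sizes the standard convolution needs 294912 weights while the separable version needs 33920, roughly an 8.7-fold reduction.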

4.2 Experimental Setup

All networks were trained for 100 epochs with the Adam optimizer [21], using a learning rate of 0.0001 and a batch size of 32. They were trained to minimize the mean absolute error (MAE) given by Equation 1, where $p_{i}$ is the thickness (in pixels) predicted by the neural network and $t_{i}$ is the corresponding thickness (in pixels) from the ground truth labels. The number of labels, $k$, is 27 in our case. We used an NVIDIA GeForce RTX 2080 Ti GPU with an Intel Core i9 processor for our experiments.

$$\mathrm{MAE}=\frac{\sum\limits_{i=1}^{k}\left|p_{i}-t_{i}\right|}{k} \tag{1}$$
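Equation 1 can be computed directly with NumPy. The sketch below is illustrative only: the function name and the three-element toy vectors are assumptions (the paper uses $k=27$), and averaging over the last axis is a choice made so that the same function also works on a batch of predictions.

```python
import numpy as np


def mean_absolute_error(pred, target):
    """MAE over the last axis: mean of |p_i - t_i| across the k outputs."""
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    return np.abs(pred - target).mean(axis=-1)


# Toy example with k = 3 layers (thickness in pixels).
t = np.array([10.0, 12.0, 8.0])   # ground truth thickness
p = np.array([11.0, 11.5, 8.0])   # predicted thickness
mae = mean_absolute_error(p, t)
print(mae)  # (1.0 + 0.5 + 0.0) / 3
```

Here the absolute errors are 1.0, 0.5 and 0.0 pixels, giving an MAE of 0.5 pixels.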

5 Results and Discussion

Baseline Model (DCNN)	Train	Test
InceptionV3 [16]	1.645	2.145
DenseNet121 [17]	0.642	1.307
ResNet50 [18]	0.595	1.251
Xception [19]	1.472	1.966
MobileNetV2 [20]	1.490	2.132
post-DeepLabv3+ [5]	2.36	3.59

Table 1: Mean absolute error (MAE) in pixels computed over the training and test set through our regression networks.

This section highlights the results that we achieved and gives a brief analysis. The loss curves of the five networks we trained are shown in Figure 3 and the MAE achieved by each of them is shown in Table 1. This table also lists the MAE achieved by [5] as the post-DeepLabv3+ model. From Figure 3 we see that ResNet50 trained much more quickly, reaching a lower MAE before any other network. Table 1 shows that InceptionV3 gave the largest test MAE, followed by MobileNetV2, Xception, DenseNet121, and then ResNet50. InceptionV3 was built to handle images in which the same object appears at different sizes. Hence, its architecture applies multiple 5 $\times$ 5 and 3 $\times$ 3 convolutions to the same input image. We note that although such a feature might be helpful for a dataset such as ImageNet [22], it is not useful for the CReSIS dataset, where ice layers are generally of the same size and width. We see that ResNet50, with its skip connections in the residual blocks, and DenseNet, with its densely connected network, give the lowest (best) MAE values. Further, the depthwise convolutions present in the Xception and MobileNetV2 architectures gave lower MAE than the inception modules of InceptionV3. Thus, residual learning and densely connected networks, where each layer is connected with all of its succeeding layers, help in network learning and in reducing the number of weights required. Such network strategies help in feature propagation and efficient extraction of image features.

Overall, obtaining a mean absolute error of approximately 0.6 to 1.2 pixels is good, since it translates to just 2 to 5 centimeters of error in the thickness of ice layers that are on the order of 1 meter thick. Thus, using regression networks has been especially useful, since they directly learn the layer thickness values, rather than first learning the spatio-contextual (pixel-wise) distribution of ice layers as in [5].
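The pixel-to-centimeter conversion behind this statement is a single multiplication by the radar's vertical resolution of 4 cm per pixel given in Section 3. A small sketch, using the ResNet50 values from Table 1 (the function name is illustrative):

```python
CM_PER_PIXEL = 4.0  # vertical resolution in snow, from the Dataset section


def pixels_to_cm(mae_pixels, cm_per_pixel=CM_PER_PIXEL):
    """Convert an error in pixels to centimeters of ice layer thickness."""
    return mae_pixels * cm_per_pixel


train_cm = pixels_to_cm(0.595)  # ResNet50 training MAE from Table 1
test_cm = pixels_to_cm(1.251)   # ResNet50 test MAE from Table 1
print(round(train_cm, 2), round(test_cm, 2))
```

This gives about 2.4 cm on the training set and about 5.0 cm on the test set, matching the 2 to 5 centimeter range quoted above.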

6 Conclusion

This work demonstrated the successful use of CNNs in regressing ice layer thickness values, which is an important requirement for climate studies. Further, the skip or dense connections from ResNet and DenseNet were found to be more useful than depthwise convolutions or the inception module. Directly regressing layer thickness values gave lower error than performing semantic segmentation and calculating thickness as a post-processing step. For future work, such regression networks can be combined with domain knowledge in order to retrieve accurate information with fewer labels.

References

[1] T. Stocker, D. Qin, G. Plattner, M. Tignor, S. Allen, J. Boschung, A. Nauels, Y. Xia, V. Bex, and P. Midgley, “Summary for policymakers,” 2014.
[2] B. Medley, I. Joughin, S. B. Das, E. J. Steig, H. Conway, S. Gogineni, A. S. Criscitiello, J. R. McConnell, B. E. Smith, M. R. van den Broeke, J. T. M. Lenaerts, D. H. Bromwich, and J. P. Nicolas, “Airborne-radar and ice-core observations of annual snow accumulation over Thwaites Glacier, West Antarctica confirm the spatiotemporal variability of global and regional atmospheric models,” Geophysical Research Letters, vol. 40, no. 14, pp. 3649–3654, 2013.
[3] L. S. Koenig, A. Ivanoff, P. M. Alexander, J. A. MacGregor, X. Fettweis, B. Panzer, J. D. Paden, R. R. Forster, I. Das, J. R. McConnell, M. Tedesco, C. Leuschen, and P. Gogineni, “Annual Greenland accumulation rates (2009–2012) from airborne snow radar,” The Cryosphere, vol. 10, no. 4, pp. 1739–1752, 2016.
[4] J. A. MacGregor, M. A. Fahnestock, G. A. Catania, J. D. Paden, S. Gogineni, S. K. Young, S. C. Rybarski, A. N. Mabrey, B. M. Wagman, and M. Morlighem, “Radiostratigraphy and age structure of the Greenland Ice Sheet,” Journal of Geophysical Research: Earth Surface, vol. 120, no. 2, pp. 212–241, 2015.
[5] D. Varshney, M. Rahnemoonfar, M. Yari, and J. Paden, “Deep Ice Layer Tracking and Thickness Estimation using Fully Convolutional Networks,” in 2020 IEEE International Conference on Big Data (Big Data). IEEE, 2020, pp. 3943–3952.
[6] V. d. P. Onana, L. S. Koenig, J. Ruth, M. Studinger, and J. P. Harbeck, “A Semiautomated Multilayer Picking Algorithm for Ice-Sheet Radar Echograms Applied to Ground-Based Near-Surface Data,” IEEE Transactions on Geoscience and Remote Sensing, vol. 53, no. 1, pp. 51–69, 2015.
[7] M. Yari, M. Rahnemoonfar, J. Paden, I. Oluwanisola, L. Koenig, and L. Montgomery, “Smart Tracking of Internal Layers of Ice in Radar Data via Multi-Scale Learning,” in 2019 IEEE International Conference on Big Data (Big Data), 2019, pp. 5462–5468.
[8] M. Yari, M. Rahnemoonfar, J. Paden, L. Koenig, and I. Oluwanisola, “Multi-Scale and Temporal Transfer Learning for Automatic Tracking of Internal Ice Layers,” in IEEE International Geoscience and Remote Sensing Symposium. IEEE, 2020, p. in press.
[9] M. Rahnemoonfar, M. Yari, and J. Paden, “Radar Sensor Simulation With Generative Adversarial Network,” in IEEE International Geoscience and Remote Sensing Symposium. IEEE, 2020, p. in press.
[10] R. Rothe, R. Timofte, and L. Van Gool, “Deep expectation of real and apparent age from a single image without facial landmarks,” International Journal of Computer Vision, vol. 126, no. 2-4, pp. 144–157, 2018.
[11] W. Xie, J. A. Noble, and A. Zisserman, “Microscopy cell counting and detection with fully convolutional regression networks,” Computer Methods in Biomechanics and Biomedical Engineering: Imaging & Visualization, vol. 6, no. 3, pp. 283–292, 2018.
[12] M. Shan, Q. Cheng, Z. Zhong, B. Liu, and Y. Zhang, “Deep-learning-enhanced ice thickness measurement using Raman scattering,” Opt. Express, vol. 28, no. 1, pp. 48–56, Jan 2020.
[13] M. J. Mei, T. Maksym, B. Weissling, and H. Singh, “Estimating early-winter Antarctic sea ice thickness from deformed ice morphology,” The Cryosphere, vol. 13, no. 11, pp. 2915–2934, 2019.
[14] S. Lee, J. Im, J. Kim, M. Kim, M. Shin, H. Kim, and L. J. Quackenbush, “Arctic sea ice thickness estimation from CryoSat-2 satellite data using machine learning-based lead detection,” Remote Sensing, vol. 8, no. 9, 2016.
[15] Y. Kim, H. Kim, D. Han, S. Lee, and J. Im, “Prediction of monthly Arctic sea ice concentrations using satellite and reanalysis data based on convolutional neural networks,” The Cryosphere, vol. 14, no. 3, pp. 1083–1104, 2020.
[16] C. Szegedy, V. Vanhoucke, S. Ioffe, J. Shlens, and Z. Wojna, “Rethinking the inception architecture for computer vision,” in 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016, pp. 2818–2826.
[17] G. Huang, Z. Liu, L. Van Der Maaten, and K. Q. Weinberger, “Densely connected convolutional networks,” in 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017, pp. 2261–2269.
[18] K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016, pp. 770–778.
[19] F. Chollet, “Xception: Deep learning with depthwise separable convolutions,” in 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017, pp. 1800–1807.
[20] M. Sandler, A. Howard, M. Zhu, A. Zhmoginov, and L. Chen, “MobileNetV2: Inverted residuals and linear bottlenecks,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018, pp. 4510–4520.
[21] D. P. Kingma and J. L. Ba, “Adam: A method for stochastic optimization,” in ICLR (Poster), 2015.
[22] J. Deng, W. Dong, R. Socher, L. Li, K. Li, and L. Fei-Fei, “ImageNet: A large-scale hierarchical image database,” in 2009 IEEE Conference on Computer Vision and Pattern Recognition, 2009, pp. 248–255.