This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

\headers

Covid-19 Analysis Using Tensor MethodsD. Dulal, R. Goudarzi Karim, and C. Navasca

Covid-19 Pandemic Data Analysis using Tensor Methods thanks: Submitted to the editors December 29, 2022. \fundingThis work was funded by National Science Foundation under Grant No. DMS-1439786 and Grant No. MCB-2126374.

Dipak Dulal Department of Mathematics, University of Alabama at Birmingham, Birmingham, AL 35294 (. [email protected])    Ramin Goudarzi Karim Department of Computational and Information Sciences, Stillman College, Tuscaloosa, AL 35401 (. [email protected])    Carmeliza Navasca Department of Mathematics, University of Alabama at Birmingham, Birmingham, AL 35294 (. [email protected])
Abstract

In this paper, we use tensor models to analyze Covid-19 pandemic data. First, we use tensor models, canonical polyadic and higher-order Tucker decompositions, to extract patterns over multiple modes. Second, we implement a tensor completion algorithm using canonical polyadic tensor decomposition to predict spatiotemporal data from multiple spatial sources and to identify Covid-19 hotspots. We apply a regularized iterative tensor completion technique with a practical regularization parameter estimator to predict the spread of Covid-19 cases and to find and identify hotspots. Our method can predict weekly and quarterly Covid-19 spreads with high accuracy. Third, we analyze Covid-19 data in the US using a novel sampling method for alternating least-squares. Moreover, we compare the algorithms with standard tensor decompositions it terms of their interpretability, visualization and cost analysis. Finally, we demonstrate the efficacy of the methods by applying the techniques to New Jersey’s Covid-19 case tensor data.

keywords:
tensor, tensor completion, tensor decomposition, Covid-19, spatio-temporal data

1 Introduction

Tensor decomposition is a powerful tool in data analysis, computer vision, scientific computing, machine learning and many other fields. In fact, tensor models for dimensionality reduction have been highly effective in machine learning applications like classification and regression. Tensor decomposition on its own have been successful in many modern big data analysis [11, 8, 24, 31].

In this work, we focus on analyzing Covid-19 pandemic data [28, 1] by using tensor decomposition models. Our goals are to predict future infection, locate and identify hotspots from data-set coming from multiple sources across different spatial regions over time. One is interested in determining where and when changes occur in the pattern. The strategy is set up an optimization which separates the data in following: let 𝒴=+𝒮\mathcal{Y}=\mathcal{L}+\mathcal{S} where 𝒴\mathcal{Y} is the given tensor, \mathcal{L} is a low rank reconstructed tensor of 𝒴\mathcal{Y} and 𝒮\mathcal{S} is the sparse tensor. In video processing, the original video is separated into its background and foreground subspace to detect anomalous activities [11, 32]. The tensor \mathcal{L} is the background, and 𝒮\mathcal{S} is the foreground. The sparse tensor 𝒮\mathcal{S} can provide anomalous activities. Similarly, 𝒮\mathcal{S} will contain hotspot occurrence. To achieve this separation, we formulate the following:

(1) min𝐚𝐫,𝐛𝐫,𝐜𝐫,𝜶𝐫𝒞F2+σ𝜶1\displaystyle\min_{\bf a_{r},\bf b_{r},\bf c_{r},\bm{\alpha}_{r}}\|\mathcal{C}-\mathcal{L}\|_{F}^{2}+\sigma\|\bm{\alpha}\|_{\ell_{1}}

where =r=1R𝜶r𝐚𝐫𝐛𝐫𝐜𝐫\mathcal{L}=\sum^{R}_{r=1}\bm{\alpha}_{r}\bf a_{r}\circ\bf b_{r}\circ\bf c_{r}, F\|\cdot\|_{F} is the tensor Frobenius norm, and 1\|\cdot\|_{\ell_{1}} is the vector one norm. The optimization problem (1) is neither convex nor differentiable [12]. In addition, this formulation is amenable to tensor completion problems where one can find missing data. This is basis for predicting future Covid-19 infection cases in presence of previous pandemic data.

Most of the algorithms suggested for solving (1) are based on ALS (alternating least squares) [12]. ALS is fast, easy to program and effective. However, ALS has some drawbacks. There is a need for more efficient and accurate methods. Thus, we propose a sampling method for ALS to leverage the efficiency of ALS and to allow bigger tensor data.

1.1 Previous Work on Covid-19 Data Analysis

Here, we mention a non-exhaustive collection of literature on Covid-19 analysis. The work varies from PDE models to machine learning methods for the prediction of Covid-19 hotspots, patterns, and outbreaks. A parabolic PDE-based predictive model with learned parameters from training data of preceding Covid-19 cases has been proposed to predict Covid-19 infections in Arizona[29]. A tensor train global optimization is used to explore the parameter space to locate the starting points, and the Nelder-Mead simplex local optimization algorithm is used for the local optimization of the covid-19 modeling problem and eventually, Runge-Kutta method was applied to solve the PDE for one step forward prediction. There are several machine learning methods. In supervised machine learning, time series forecasting via Holt’s Winter model was used to analyze global Covid-19 data and predict the sum of global Covid-19 cases in comparison to linear regression and support vector algorithms. Dictionary learning [19] through non-negative matrix factorization to identify the pattern and to predict future covid-19 outbreaks. In addition, machine learning has been applied to a susceptible-exposed-infected-removed (SEIR) model with SARS 2003 training data to predict covid-19 outbreaks [33]. Deep learning models like attention-based encoder-decoder model [20] have been implemented to forecast covid-19 outbreak. There needs to be more work on mathematical modeling to detect covid-19 hotspots; however, a tensor-based anomaly detection method for spatial-temporal data[34] has been proposed to determine hotspots based on an anomaly in pattern.

1.2 Contributions.

In this work, we focus on analyzing Covid-19 infection data [28, 1] of New Jersey from the period of 04/01/2020 to 12/26/2021 (NJ-Covid-19). The state of New Jersey was initially chosen since we would like to study the spread of the disease in the most densely populated. The raw data collected by New York Times [28, 1] was the daily basis cumulative data. Pre-processing techniques were applied to the raw spatio-temporal data and formatting the data in a tensor structure. Standard low-rank tensor decomposition models such as canonical polyadic (CP), Higher Order Orthogonal Iteration (HOOI), as well as tensor rank revealing methods like the tensor rank approximation method called low-rank approximation of tensor (LRAT) [8] and LRAT with flexible Golub-Kahan [31] are used to approximate the pattern and flow of covid infections. Novel stochastic canonical polyadic (STOCP) was also applied to the NJ-Covid-19 data. Some of the other contributions are the following:

  • Converted each entry of the Covid-19 tensor into the rate of increment in each week and applied the CP-ALS algorithm and plotted error with practical threshold using standard deviation and mean. The spike above the threshold line exactly replicates to the original spike of the covid-19 infections.

  • The tensor completion optimization was formulated to predict future covid-19 infections as far as the following week and following quarter using LRAT with flexible Golub-Kahan.

  • The proposed STOCP was implemented on the NJ-Covid-19 Data. To see its efficacy, we compare its numerical results with the standard CP decomposition model via the alternating least-squares method. We tested our spatiotemporal data tensor and other image data of different sizes.

1.3 Outline of the paper

The paper is organized as follows. In Section 2, we provide some tensor backgrounds, standard tensor decompositions, CP, Higher-Order Tucker, and well-known numerical techniques like Alternating Least-Squares and Higher-Order Orthogonal Iteration. Then, in Section 3, we include some discussions of the Covid-19 tensor as well as CP and HOT tensor model analysis. Section 4 describes a sampling method for ALS with some numerical results comparison with ALS. Section 5 deals with the tensor sparse model which is implemented with LRAT with Golub-Kahan [31]. We discuss its application to predicting Covid-19 infection cases and locating and identifying covid-19 hotspots. Finally, we provide concluding remarks and some future outlooks in Section 6.

2 Preliminaries

We denote a vector by a bold lower-case letter 𝐚\mathbf{a}. The bold upper-case letter 𝐀\mathbf{A} represents a matrix and the symbol of a tensor is a calligraphic letter 𝒜\mathcal{A}. Throughout this paper, we focus on third-order tensors 𝒜=(aijk)I×J×K\mathcal{A}=(a_{ijk})\in\mathbb{R}^{I\times J\times K} of three indices 1iI,1jJ1\leq i\leq I,1\leq j\leq J and 1kK1\leq k\leq K, but these can be easily extended to tensors of arbitrary order greater or equal to three.

The three kinds of matricization for third-order 𝒜\mathcal{A} are 𝐀(1),𝐀(2)\mathbf{A}_{(1)},\mathbf{A}_{(2)} and 𝐀(3)\mathbf{A}_{(3)}, according to respectively arranging the column, row, and tube fibers to be columns of matrices, which are defined by fixing every index but one and denoted by 𝐚:jk\mathbf{a}_{:jk}, 𝐚i:k\mathbf{a}_{i:k} and 𝐚ij:\mathbf{a}_{ij:} respectively. We also consider the vectorization for 𝒜\mathcal{A} to obtain a row vector 𝐚\mathbf{a} such the elements of 𝒜\mathcal{A} are arranged according to kk varying faster than jj and jj varying faster than ii, i.e., 𝐚=(a111,,a11K,a121,,a12K,,a1J1,,a1JK,)\mathbf{a}=(a_{111},\cdots,a_{11K},a_{121},\cdots,a_{12K},\cdots,a_{1J1},\cdots,a_{1JK},\cdots). Kronecker Product of two matrices 𝐀m×n\mathbf{A}\in\mathbb{R}^{m\times n} and 𝐁p×q\mathbf{B}\in\mathbb{R}^{p\times q} is denoted as 𝐀𝐁mp×nq\mathbf{A}\otimes\mathbf{B}\in\mathbb{R}^{mp\times nq} and obtained as the product of each element of 𝐀\mathbf{A} and the matrix𝐁\mathbf{B}. Khatri-Rao Product of two matrices 𝐀\mathbf{A} and 𝐁\mathbf{B} with same columns is the column-wise Kronecker product: 𝐀𝐁=[𝐚1𝐛1𝐚R𝐛R].\mathbf{A\odot B}=[\mathbf{a}_{1}\otimes\mathbf{b}_{1}\hskip 4.26773pt\ldots\hskip 4.26773pt\mathbf{a}_{R}\otimes\mathbf{b}_{R}]. The outer product of a rank-one third order tensor is denoted as 𝐚𝐛𝐜I×J×K\mathbf{a}\circ\mathbf{b}\circ\mathbf{c}\in\mathbb{R}^{I\times J\times K} of three nonzero vectors 𝐚,𝐛\mathbf{a},\mathbf{b} and 𝐜\mathbf{c} is a rank-one tensor with elements aibjcka_{i}b_{j}c_{k} for all the indices, i.e. the matricizations of 𝐚𝐛𝐜I×J×K\mathbf{a}\circ\mathbf{b}\circ\mathbf{c}\in\mathbb{R}^{I\times J\times K} are rank-one matrices. A canonical polyadic (CP) decomposition of 𝒜I×J×K\mathcal{A}\in\mathbb{R}^{I\times J\times K} expresses 𝒜\mathcal{A} as a sum of rank-one outer products:

(2) 𝒜=r=1R𝐚r𝐛r𝐜r\mathcal{A}=\sum_{r=1}^{R}\mathbf{a}_{r}\circ\mathbf{b}_{r}\circ\mathbf{c}_{r}

where 𝐚rI,𝐛rJ,𝐜rK\mathbf{a}_{r}\in\mathbb{R}^{I},\mathbf{b}_{r}\in\mathbb{R}^{J},\mathbf{c}_{r}\in\mathbb{R}^{K} for 1rR1\leq r\leq R and and \circ is the outer product. The outer product 𝐚r𝐛r𝐜r\mathbf{a}_{r}\circ\mathbf{b}_{r}\circ\mathbf{c}_{r} is a rank-one component and the integer RR is the number of rank-one components in tensor 𝒜\mathcal{A}. The minimal number RR such that the decomposition (2) holds is the rank of tensor 𝒜\mathcal{A}, which is denoted by rank(𝒜)\mbox{rank}(\mathcal{A}). For any tensor 𝒜I×J×K\mathcal{A}\in\mathbb{R}^{I\times J\times K}, rank(𝒜)\mbox{rank}(\mathcal{A}) has an upper bound min{IJ,JK,IK}\min\{IJ,JK,IK\} [13]. In fact, tensor rank is NP-hard over \mathbb{R} and \mathbb{C} [7]. Another standard tensor decomposition is higher-order Tucker (HOT) decomposition. HOT is a generalization of matrix SVD where m×nm\times n matrix 𝐌\mathbf{M} has a factorizaion 𝐔𝚺𝐕T\mathbf{U}\mathbf{\Sigma}\mathbf{V}^{T}, where 𝐔\mathbf{U} and 𝐕\mathbf{V} are orthogonal matrices and Σ\Sigma is a diagonal matrix with σij=0\sigma_{ij}=0 if i0i\neq 0 and otherwise σii>=0\sigma_{ii}>=0. Its generalization in third-order tensors is

𝒯=𝒢×1𝐔(1)×2𝐔(2)×3𝐔(3)\displaystyle\mathcal{T}=\mathcal{G}\times_{1}\mathbf{U}^{(1)}\times_{2}\mathbf{U}^{(2)}\times_{3}\mathbf{U}^{(3)}

where 𝒯I1×I2×I3\mathcal{T}\in\mathbb{R}^{I_{1}\times I_{2}\times I_{3}} is the given tensor, 𝒢R1×R2×R3\mathcal{G}\in\mathbb{R}^{R_{1}\times R_{2}\times R_{3}} is the core tensor and 𝐔(𝐢)Ii×Ri\mathbf{U^{(i)}}\in\mathbb{R}^{I_{i}\times R_{i}} for i=1,2,3i=1,2,3 is an orthogonal matrix. The Tucker contracted product ×1\times_{1} is defined as 𝒢×1𝐔(1)=r1𝒢r1r2r3𝐔i1r1(1)I1×R2×R3\mathcal{G}\times_{1}\mathbf{U}^{(1)}=\sum_{r_{1}}\mathcal{G}_{r_{1}r_{2}r_{3}}\mathbf{U}^{(1)}_{i_{1}r_{1}}\in\mathbb{R}^{I_{1}\times R_{2}\times R_{3}}.

2.1 Standard Least-Squares Optimization for Tensor Decomposition

Tensor decompositions like CP and HOT [3, 12] are considered to be generalizations of the singular value decomposition (SVD) and principal component analysis (PCA) of a matrix. To achieve CP from a given third order 𝒳\mathcal{X}, an optimization problem is solved to find the finite sum of rank one tensor of 3rd3^{rd} order approximating 𝒳\mathcal{X}:

(3) min𝐀,𝐁,𝐂𝒳𝒳^F\displaystyle\min_{\mathbf{A,B,C}}\mid\mid\mathcal{X}-\mathcal{\hat{X}}\mid\mid_{F}

with

(4) 𝒳^=r=1R𝐚𝐫𝐛𝐫𝐜𝐫\displaystyle\mathcal{\hat{X}}=\sum^{R}_{r=1}\mathbf{a_{r}}\circ\mathbf{b_{r}}\circ\mathbf{c_{r}}

where 𝐀,𝐁,𝐂\mathbf{A,B,C} are factor matrices with 𝐚𝐫,𝐛𝐫,𝐜𝐫\mathbf{a_{r},b_{r},c_{r}} are their respective column vectors. This nonlinear optimization can be divided into subproblems of linear least squares with respect to the factor matrices. This is called the Alternating Least Squares (ALS).

Refer to caption
Figure 1: CP-Decomposition Architecture

ALS is an iterative method for finding the CP decomposition of a given tensor. The nonlinear optimization problem is:

(5) min𝐀,𝐁,𝐂𝒳r=1R𝐚𝐫𝐛𝐫𝐜𝐫F2\displaystyle\min_{\mathbf{A,B,C}}||\mathcal{X}-\sum_{r=1}^{R}\mathbf{a_{r}\circ b_{r}\circ\ c_{r}}||_{F}^{2}

where 𝐀\mathbf{A}, 𝐁\mathbf{B} and 𝐂\mathbf{C} are factor matrices containing the columns 𝐚𝐫I\mathbf{a_{r}}\in\mathcal{R}^{I}, 𝐛𝐫J\mathbf{b_{r}}\in\mathcal{R}^{J} and 𝐜𝐫K\mathbf{c_{r}}\in\mathcal{R}^{K} respectively. The problem can be reduced to linear least squares problems at each iteration with an initial guess 𝐀𝟎,𝐁𝟎,𝐂𝟎\mathbf{A^{0},B^{0},C^{0}}, the sequences 𝐀𝐤,𝐁𝐤,𝐂𝐤\mathbf{A^{k},B^{k},C^{k}} are generated by solving each sub-problems[15, 4]. Given the initial guess 𝐀0\mathbf{A}^{0},𝐁0\mathbf{B}^{0},𝐂0\mathbf{C}^{0}, the factor matrices are updated by the following: update 𝐀\mathbf{A} via

(6) 𝐀k+1=argmin𝐀I×R12𝐗(𝟏)𝐀(𝐂𝐤𝐁𝐤)𝐓F2\displaystyle\mathbf{A}^{k+1}=\operatorname*{arg\,min}_{\mathbf{A}\in\mathcal{R}^{I\times R}}{\frac{1}{2}\mid\mid\mathbf{X_{(1)}-\mathbf{A(C^{k}\odot B^{k})^{T}}}\mid\mid}^{2}_{F}

update 𝐁\mathbf{B} via

(7) 𝐁k+1=argmin𝐁J×R12𝐗(𝟐)𝐁(𝐂𝐤𝐀𝐤)𝐓F2\displaystyle\mathbf{B}^{k+1}=\operatorname*{arg\,min}_{\mathbf{B}\in\mathcal{R}^{J\times R}}{\frac{1}{2}\mid\mid\mathbf{X_{(2)}-\mathbf{B(C^{k}\odot A^{k})^{T}}}\mid\mid}^{2}_{F}

and update 𝐂\mathbf{C} via

(8) 𝐂k+1=argmin𝐂K×R12𝐗(𝟑)𝐂(𝐁𝐤𝐀𝐤)𝐓F2\displaystyle\mathbf{C}^{k+1}=\operatorname*{arg\,min}_{\mathbf{C}\in\mathcal{R}^{K\times R}}{\frac{1}{2}\mid\mid\mathbf{X_{(3)}-\mathbf{C(B^{k}\odot A^{k})^{T}}}\mid\mid}^{2}_{F}

These updating schemes are repeated until convergence; See figure 1, Algorithm 1. The local convergence and uniqueness of the ALS technique were discussed in this work [23].

Algorithm 1 [22]CP-ALS for 3-Way Tensor Decomposition
Tensor 𝒜I×J×K\mathcal{A}\in\mathbb{R}^{I\times J\times K}rank R,Maximum Iterations N.
CP Decomposition λRR×1\mathbf{\lambda}\in\mathrm{R}^{R\times 1}, 𝐀RI×R\mathbf{A}\in\mathrm{R}^{I\times R},𝐁RJ×R\mathbf{B}\in\mathrm{R}^{J\times R}, 𝐂RK×R\mathbf{C}\in\mathrm{R}^{K\times R}.
1: Initialize 𝐀,𝐁,𝐂;\mathbf{A,B,C};
2: for i = 1 … N DO
3:       𝐀𝐗(1)(𝐂𝐁)(𝐂𝐓𝐂𝐁𝐓𝐁)\mathbf{A}\leftarrow\mathbf{X}_{(1)}(\mathbf{C\odot B})(\mathbf{C^{T}C*B^{T}B})^{\dagger}
4:       Normalize the columns of 𝐀\mathbf{A}(storing norms in vector λ.)\lambda.)
5:       𝐁𝐗(2)(𝐂𝐀)(𝐂𝐓𝐂𝐀𝐓𝐀)\mathbf{B}\leftarrow\mathbf{X}_{(2)}(\mathbf{C\odot A})(\mathbf{C^{T}C*A^{T}A})^{\dagger}
6:       Normalize the columns of 𝐀\mathbf{A}(storing norms in vector λ.)\lambda.)
7:       𝐂𝐗(3)(𝐁𝐀)(𝐁𝐓𝐁𝐀𝐓𝐁)\mathbf{C}\leftarrow\mathbf{X}_{(3)}(\mathbf{B\odot A})(\mathbf{B^{T}B*A^{T}B})^{\dagger}
8:       Normalize the columns of 𝐀\mathbf{A}(storing norms in vector λ.)\lambda.)
9:      If convergence is met then,
10:           break for loop
11:      end if
12: end for
13: return λ,𝐀,𝐁,𝐂\mathbf{\lambda,A,B,C};
Algorithm 2 The CP-ALS Algorithm for General Case
Initialize with factor matrics 𝐀𝟎I×R,𝐁0J×R,𝐂0K×R\mathbf{A^{0}}\in\mathbb{R}^{I\times R},\mathbf{B}^{0}\in\mathbb{R}^{J\times R},\mathbf{C}^{0}\in\mathbb{R}^{K\times R}
Factor matrices 𝐀(𝐧)In×R\mathbf{A^{(n)}}\in\mathbb{R}^{I_{n}\times R} for n = 1,2,,N.1,2,\ldots,N.
1: for i = k … N DO
2:
(𝐀(n))k+1\displaystyle(\mathbf{A}^{(n)})^{k+1} =argmin𝐀(n)12||𝐗(n)𝐀((𝐀(𝐍))𝐤(𝐀(𝐧+𝟏))𝐤(𝐀(𝐧𝟏)k+1\displaystyle=\operatorname*{arg\,min}_{\mathbf{A}^{(n)}}\frac{1}{2}||\mathbf{X}_{(n)}-\mathbf{A((A^{(N)})^{k}\odot\ldots\odot(A^{(n+1)})^{k}\circ(\mathbf{A}^{(n-1)}}^{k+1}
(𝐀(1))k+1)T||F2\displaystyle\hskip 142.26378pt\odot\ldots\odot(\mathbf{A}^{(1)})^{k+1})^{T}||_{F}^{2}
3: end for
4: return 𝐀(𝐧)In×R\mathbf{A^{(n)}}\in\mathbb{R}^{I_{n}\times R};

2.2 Higher Order Orthogonal Iteration (HOOI)

HOOI is an iterative algorithm that computes low rank decomposition of a given tensor [25]. To achieve the reconstruction of 𝒯\mathcal{T} in the HOOI format, i.e. 𝒯=𝒢×1𝐔(1)×2𝐔(2)×N𝐔(N)\mathcal{T}=\mathcal{G}\times_{1}\mathbf{U}^{(1)}\times_{2}\mathbf{U}^{(2)}...\times_{N}\mathbf{U}^{(N)}, the problem formulation is

min𝐔(𝟏),𝐔(𝟐),,𝐔(𝐍)𝒯𝒢×1𝐔(1)×2𝐔(2)×N𝐔(N)F2\displaystyle\min_{\mathbf{U^{(1)},U^{(2)},...,U^{(N)}}}{\mid\mid\mathcal{T}-\mathcal{G}\times_{1}\mathbf{U}^{(1)}\times_{2}\mathbf{U}^{(2)}...\times_{N}\mathbf{U}^{(N)}\mid\mid_{F}^{2}}

where 𝒯I1××IN\mathcal{T}\in\mathbb{R}^{I_{1}\times...\times I_{N}} and 𝐔(𝐢)Ii×Ri\mathbf{U^{(i)}}\in\mathbb{R}^{I_{i}\times R_{i}} is a low rank matrix for i=1,,Ni=1,\ldots,N with IiRiI_{i}\geq R_{i}. Equivalently, the optimization can be reformulated as a maximization of 𝒢F2\mid\mid\mathcal{G}\mid\mid_{F}^{2} with 𝒢\mathcal{G} is given by

(9) 𝒢=𝒜×1𝐔(1)T×2𝐔(2)T×N𝐔(N)T\displaystyle\mathcal{G}=\mathcal{A}\times_{1}\mathbf{U}^{(1)^{T}}\times_{2}\mathbf{U}^{(2)^{T}}...\times_{N}\mathbf{U}^{(N)^{T}}

where the core tensor 𝒢R1×RN\mathcal{G}\in\mathbb{R}^{R_{1}\times\ldots R_{N}} tensor for i=1,,Ni=1,\ldots,N. See Figures 2.

Refer to caption
Figure 2: HOOI Architecture of 3-Way Tensor[18]
Algorithm 3 [21]Higher-Order Orthogonal Iteration(HOOI)
:Tensor 𝒳I1××IN\mathcal{X}\in\mathbb{R}^{I_{1}\times...\times I_{N}} and ranks R1,,RNR_{1},...,R_{N}
:Tucker Factors:𝐔𝟏𝐈𝟏×𝐑𝟏,𝐔𝟐𝐈𝟐×𝐑𝟐,,𝐔𝐍𝐈𝐍×𝐑𝐍\mathbf{U_{1}\in\mathbb{R}^{I_{1}\times R_{1}},U_{2}\in\mathbb{R}^{I_{2}\times R_{2}},...,U_{N}\in\mathbb{R}^{I_{N}\times R_{N}}}
and Core tensor 𝒢R1×R2××RN\mathcal{G}\in\mathbb{R}^{R_{1}\times R_{2}\times...\times R_{N}}
1: Initialize 𝐔𝟏,,𝐔𝐍;\mathbf{U_{1},...,U_{N}};(random or given by HSVD)
2: while convergence criterion not met do
3:       for n = 1,…,N do
4:          𝒲𝒳×N𝐔𝐍𝐓×𝐧+𝟏𝐔𝐧+𝟏𝐓×𝐧𝟏𝐔𝐧𝟏𝐓×𝟏𝐔𝟏𝐓\mathcal{W}\leftarrow\mathcal{X}\times_{N}\mathbf{U_{N}^{T}...\times_{n+1}U_{n+1}^{T}\times_{n-1}U_{n-1}^{T}...\times_{1}U_{1}^{T}}
5:          [𝐔𝚺𝐕]SVD(W(n))\mathbf{[U\Sigma V]}\leftarrow SVD(W_{(n)})
6:          𝐔𝐧𝐔(:,𝟏:𝐑𝐧)\mathbf{U_{n}\leftarrow U(:,1:R_{n})}
7:       end for
8: end while
9: 𝒢𝒳×N𝐔𝐍𝐓×𝐍𝟏𝐔𝐍𝟏𝐓×𝟏𝐔𝟏𝐓\mathcal{G}\leftarrow\mathcal{X}\times_{N}\mathbf{U_{N}^{T}\times_{N-1}U_{N-1}^{T}...\times_{1}U_{1}^{T}}

3 Covid-19 Tensor Analysis Using ALS and HOOI

3.1 Covid-19 Data Preparation

We focus on the daily Covid-19 infection data of New Jersey State from the New York Times’s Covid-19 Data Depository from the period of 04/01/2020 to 12/26/2021 [28, 1]. The state of New Jersey was initially chosen since we would like to investigate the spread of the disease in the most densely populated and affected state. The raw data collected by New York Times was a daily basis cumulative data. We restructured the data table into weekly total Covid-19 infections in each county. We stacked 21 matrices representing the counties of New Jersey of size 13 weeks ×\times 7 quarters. Thus, we construct a tensor data, 𝒞13×7×21\mathcal{C}\in\mathbb{R}^{13\times 7\times 21}. Each element of the tensor represents the total infection in a week of a particular quarter in a county. See Figure 3 below.

77 Days\underbrace{\hskip 56.9055pt}1313 Weeks\underbrace{\hskip 56.9055pt}2121 Counties\underbrace{\hskip 31.2982pt}𝒞\mathcal{C}
Figure 3: Covid-19 Tensor: New Jersey’s number of weekly Covid-19 cases every quarter per county.

To increase the accuracy and efficacy of the algorithms, we normalized the data tensor converting it into relative cases tensor with respect to the population of the respective county. We constructed the population tensor of New Jersey collecting population data from US Census 2020 [1] and divided the tensor-data tensor by the population data tensor. Each element of our new normalized tensor is the following:

𝐂(i,j,k)\displaystyle\mathbf{C}(i,j,k) =Total infections in jth week of ith quarter in the kth countyRespective Mid-year Population\displaystyle=\frac{\text{Total infections in }j^{th}\text{ week of }i^{th}\text{ quarter in the }k^{th}\text{ county}}{\text{Respective Mid-year Population}}

Specifically, we have

𝐂(1,1,1)\displaystyle\mathbf{C}(1,1,1) =Total covid infections of Atlantic County in 1st week of Aprilmid year population of Atlantic county in 2020\displaystyle=\frac{\text{Total covid infections of Atlantic County in 1st week of April}}{\text{mid year population of Atlantic county in 2020}}
=𝐀(1,1,1)𝐏(1,1,1)=67274534=2.44e04\displaystyle=\frac{\mathbf{A}(1,1,1)}{\mathbf{P}(1,1,1)}=\frac{67}{274534}=2.44e-04

The tensor 𝒞\mathcal{C} is rescaled by dividing by the population of respective counties of mid-year 2020 [1]. The new tensor data 𝒞¯\bar{\mathcal{C}} has greatly improved efficiency in the numerical implementations. See sections 3.2.1,3.2.2,4.1, and 5.2.1 for some experimental results.

3.2 Numerical Experiments

3.2.1 CP vs HOOI

One main advantage of tensor decomposition, namely, ALS and HOOI, is that it provides analytic tools for higher-order data in several modes. We implemented the algorithms, CP and HOOI, on our tensor 𝒞¯\bar{\mathcal{C}} of size 13×7×613\times 7\times 6. First, we ran the ALS algorithm to construct the CP decomposition into three-factor matrices. Then we analyze through the visualization of the three-factor matrices from the estimated constructed tensor. We are able to estimate the same evolutionary patterns of Covid-19 cases as the original data tensor; see Figures 4 and 5. Collectively, we plot the cases in Figures 6 and 7 based on the intensity levels. HOOI is faster in time but it gives lower accuracy results than CP.

Refer to caption
Figure 4: Original and Approximated using CP
Refer to caption
Figure 5: Original and Approximated using HOOI

Geo-Plot of CP approximation

Refer to caption
Figure 6: Original Covid-19 Cases Geo plot: 3rd Week of Jan 2021
Refer to caption
Figure 7: Approximated Lower Rank CP with R = 20, 3rd Week Jan 2021

3.2.2 Extracting Patterns in Covid-19 via Factor Matrices

We further explore from our output factor matrices of the CP decomposition. We multiply different factor matrices to observe the county-wise, week-wise, and quarter-wise pattern of the Covid-19 cases. The first visualization of Figure 8 shows which quarter has the highest cases increment in 2nd week. Then the second visualization of the quarter, Sep 27-Dec-26 2020, indicates that the fourth week has the highest number of cases. Our algorithm estimates that the week, of Oct 4-10,2020, is the week of the highest increment. Furthermore, we found that the same identical patterns on the original Covid-19 data tensor are consistent with our findings. Similarly, Figure 9 shows the identical pattern of our estimated and original Covid-19 infections in Essex county from June 27 - Sept 25, 2021.

Refer to caption
Figure 8: Cross-Pattern on Sep 27-Dec-26 2020
Refer to caption
Figure 9: This is the Covid-19 weekly consecutive increment (relative cases) of Essex county.7th quarter of June 27 - Sept 25, 2021.

4 Sampling Method for Alternating Least-Squares (SMALS)

Given a tensor 𝒳I×J×K\mathcal{X}\in\mathbb{R}^{I\times J\times K} and a fixed positive integer RR, the ALS algorithm solves three independent least squares problems alternatively. The least squares problems can be solved via various methods such as QR factorization, Cholesky factorization and etc. However, it requires of an 𝒪(R3)\mathcal{O}(R^{3}) floating point operations per second. To reduce this we let S{1,R}S\subset\{1,\ldots R\} be the set of sample indices and As,BsA_{s},B_{s} and CsC_{s} represent the sub-matrices obtained by choosing the columns of A,BA,B, and CC according to the index set SS respectively. The partial derivatives of the objective function ff with respect to the blocks As,BsA_{s},B_{s} and CsC_{s} are

(10) fAS=X(1)(CSBS)+A(CB)T(CSBS),\frac{\partial f}{\partial A_{S}}=-X_{(1)}(C_{S}\odot B_{S})+A(C\odot B)^{T}(C_{S}\odot B_{S}),
(11) fBS=X(2)(CSAS)+B(CA)T(CSAS),\frac{\partial f}{\partial B_{S}}=-X_{(2)}(C_{S}\odot A_{S})+B(C\odot A)^{T}(C_{S}\odot A_{S}),

and

(12) fCS=X(3)(BSAS)+C(BA)T(BSAS).\frac{\partial f}{\partial C_{S}}=-X_{(3)}(B_{S}\odot A_{S})+C(B\odot A)^{T}(B_{S}\odot A_{S}).

the stationary points of the above equations can be obtained by setting each gradient equal to zero. For instance ASf=0\nabla_{A_{S}}f=0 implies the following normal equation

(13) AS((CB)T(CSBS))S\displaystyle A_{S}\left((C\odot B)^{T}(C_{S}\odot B_{S})\right)^{S} =ASC((CB)T(CSBS))SC\displaystyle=-A_{S^{C}}\left((C\odot B)^{T}(C_{S}\odot B_{S})\right)^{S^{C}}
+X(1)(CSBs)\displaystyle+X_{(1)}(C_{S}\odot B_{s})

where ASA^{S} represents the sub matrix of AA obtained by sampling the rows of AA corresponding to the sampling set SS. Similar results can be derived by solving the equations BSf=0\nabla_{B_{S}}f=0 and CSf=0\nabla_{C_{S}}f=0. This reduces the latter computational complexity to 𝒪(max{|S|3})\mathcal{O}(\max\{|S|^{3}\}). When I,JandKI,J\,\text{and}\,K are relatively large, the reduction could be significant. The sampling set SS is chosen based on the performance of each block variable in each iteration. For instance, if the update for the block AiA_{i} yields more decrease in the objective function compared to the block AjA_{j}, the index ii is replaced by jj in the next iteration. The differentiation of ALS can be found here [26]. The derivation of SMALS and further analysis can be found [10].

4.1 Numerical Results of SMALS vs ALS

We generate some numerical results to show the efficacy of SMALS and to compare with standard ALS on Covid-19 tensor data and random color image data.

Refer to caption
Figure 10: SMALS vs ALS on Real Covid-19 Tensor
Refer to caption
Figure 11: SMALS vs ALS on Random Color Images
Refer to caption
Figure 12: Error Evolution of ALS vs SMALS on random images in fig -8

Efficacy Comparison Between ALS and SMALS

Refer to caption
Figure 13: ALS vs SMALS Comparison

We implement both CP-ALS and CP-SMALS algorithms on different sizes of varying tensors. We run the codes twenty times for each tensor case and average their results. Our result indicates SMALS is more efficient in terms of time cost, see figure 13

5 Tensor Sparse Optimization

In [31], an iterative method based on proximal algorithms called low-rank approximation of tensors (LRAT) was proposed to solve the minimum rank optimization:

(14) minar,br,cr,αr𝒞F2+σ𝜶1\displaystyle\min_{a_{r},b_{r},c_{r},\alpha_{r}}\|\mathcal{C}-\mathcal{L}\|_{F}^{2}+\sigma\|\bm{\alpha}\|_{\ell_{1}}

where =r=1Rαr𝐚𝐫𝐛𝐫𝐜𝐫\mathcal{L}=\sum^{R}_{r=1}\alpha_{r}\mathbf{a_{r}}\circ\mathbf{b_{r}}\circ\mathbf{c_{r}}, 𝒞\mathcal{C}, a given data tensor. The implementation is in Algorithm 4. In a recent work, [8], a more practical choice of the parameter σ\sigma led to a more efficient and accurate algorithm. The practical regularization method is based on a flexible Golub-Kahan (fGK) process. In Figures 14 and 15, we implemented the LRAT + fGK algorithm with the convergence plot; the sparse model predicts the general phenomena of the Covid-19 cases.

Refer to caption
Figure 14: LRAT + fGK Estimation
Refer to caption
Figure 15: LRAT + fGK Convergence Plot

.
In figure 77, the error plot shows the difference between the original data and estimation. The horizontal line detects an anomaly and this spot coincides with a spike in the number of infections.

5.1 Hotspot Identification

[14, 2] Recently, “hotspots” in infectious disease epidemiology have been increasingly used, and they dictate the implementation of appropriate control measures for the specific place. Despite “Hotspots “has not a concrete definition, it is described variously as per area of elevated incidence, prevalence, higher transmission efficiency, or higher chance of disease emergence. Our research is also on Covid-19 pandemic-infected population data, so we defined “hotspots” as the geographical area where the higher intensity of disease prevalence and transmission rate as per population density and flow in the specific area. More specifically, we have defined a threshold in the specific area as per population and its activities which precisely indicates the hotspots there. Identification of hotspots attracts the attention of authorities so that more efficient control measures may be implemented by targeting these areas to sustain further transmission.

5.1.1 Sparse Optimization for Hotspot Identification

Our goal is to detect hotspots rapidly. Our goal is to have the following decomposition: 𝒴=+𝒮\mathcal{Y}=\mathcal{L}+\mathcal{S} where 𝒴\mathcal{Y} is the given tensor, \mathcal{L} is a low rank reconstructed tensor of 𝒴\mathcal{Y} and 𝒮\mathcal{S} is the sparse tensor. In video processing, to detect anomalous activities, the original video is separated into its background and foreground subspaces. The tensor \mathcal{L} is the background and 𝒮\mathcal{S} is the foreground. The sparse tensor 𝒮\mathcal{S} can provide anomalous activities. Similarly, 𝒮\mathcal{S} will contain hotspot occurrence. Thus, we use the sparse tensor model and implement LRAT 4.

5.1.2 Hotspot Detection with Practical Threshold

First, we convert the Covid-19 data tensor into a tensor with each entry as the rate of change in the number of infections from the prior week by computing

Number of infections this weekLast week’s total infectionsLast week’s number of infections\frac{\text{Number of infections this week}-\text{Last week's total infections}}{\text{Last week's number of infections}}

for each entry. Then we apply LRAT algorithm to the new tensor. On the newly converted tensor, we apply the practical threshold mean + 5*standard deviation. The resulting plot from LRAT spikes above the threshold line in the week considered to have hotspots. We compare it with the original graph to see that it is consistent, see fig(4.3) below.

Refer to caption
Figure 16: Error Plot for Atlantic County: error between actual vs projected relative data with margin line (mean + 5* standard deviation)

5.2 Tensor Completion for Predicting Covid-19 Infection Cases

The tensor completion is the problem of completing the missing or unobserved entries of the partially observed tensor. Tensor completion algorithms have a wide range of applications in the field of big data [27], computer vision such as image completion [8, 30] which focus on filling the missing entries in a presence of noise. Other important applications of tensor completion are link prediction[17, 5] and recommendation system [9, 6] and video completion [16]. With the given the tensor \mathcal{L} of order nn with missing entries for a given rank, the tensor completion optimization problem can be formulated as the following:

minimizerank()\displaystyle\text{minimize}_{\bf\mathcal{L}}\>\,\text{rank}(\bf\mathcal{L})
subject to(Ω)=𝒞(Ω)\displaystyle\text{subject to}\>\,\mathcal{L}(\Omega)=\mathcal{C}(\Omega)

In the work of Wang and Navasca [31], this optimization is reformulated to the tensor sparse model. To apply the tensor sparse model with constraints in prediction of Covid-19 infection cases, we set up our data by removing some column data from the original tensor; see Figure 17. We implement Algorithm 4 to complete the Covid-19 tensor with the observed data constraints. The reconstructed tensor exhibits the same pattern of the tensor cases even though there is some dissimilarities in particular numerical data.

Refer to caption
Figure 17: Our Tensor Completion Architecture on Covid-19 data Tensor
Algorithm 4 [31, 8, 24]Tensor Completion Via CP with ISTA
Tensor 𝒞I×J×K\mathcal{C}\in\mathbb{R}^{I\times J\times K}rank of the tensor R,a regularization parameter λ\lambda and a scale t>>0.
An approximated tensor 𝒳\mathcal{X}
1: Initialize a tensor 𝒞0=[𝝈0;𝐀0,𝐁0,𝐂0]T\mathcal{C}^{0}=[\bm{\sigma}^{0};\mathbf{A}^{0},\mathbf{B}^{0},\mathbf{C}^{0}]_{T}.
2:Update steps:
I) Update factor matrices 𝐀,𝐁,𝐂\mathbf{A,B,C}:
     Compute 𝐔n\mathbf{U}^{n} = 𝐃𝐧(𝐂𝐁)𝐓\mathbf{D^{n}(C\odot B)^{T}} and let dn=max{𝐔n𝐔nTF,1}d_{n}=\max\{\|\mathbf{U}^{n}{\mathbf{U}^{n}}^{T}\|_{F},1\}.
     Compute 𝐃n\mathbf{D}^{n} and 𝐀n+1\mathbf{A}^{n+1} by
𝐃n=𝐀n1tdn𝐀f(𝐀n,𝐁n,𝐂n,𝝈n),\displaystyle\mathbf{D}^{n}=\mathbf{A}^{n}-\frac{1}{td_{n}}\nabla_{\mathbf{A}}f(\mathbf{A}^{n},\mathbf{B}^{n},\mathbf{C}^{n},\bm{\sigma}^{n}),
𝐀n+1=𝐃ndiag(𝐝1n,,𝐝Tn)1\displaystyle\mathbf{A}^{n+1}=\mathbf{D}^{n}\text{diag}(\|\mathbf{d}_{1}^{n}\|,\cdots,\|\mathbf{d}_{T}^{n}\|)^{-1}
     where 𝐝in\mathbf{d}_{i}^{n} is the ii-th column of 𝐃n\mathbf{D}^{n} for i=1,,Ti=1,\cdots,T.
     Compute 𝐕n\mathbf{V}^{n} as 𝐔𝐧\mathbf{U^{n}} and let en=max{𝐕n𝐕nTF,1}e_{n}=\max\{\|\mathbf{V}^{n}{\mathbf{V}^{n}}^{T}\|_{F},1\}.
     Compute 𝐄n\mathbf{E}^{n} and 𝐁n+1\mathbf{B}^{n+1} by
𝐄n=𝐁n1ten𝐁f(𝐀n+1,𝐁n,𝐂n,𝝈n),\displaystyle\mathbf{E}^{n}=\mathbf{B}^{n}-\frac{1}{te_{n}}\nabla_{\mathbf{B}}f(\mathbf{A}^{n+1},\mathbf{B}^{n},\mathbf{C}^{n},\bm{\sigma}^{n}),
𝐁n+1=𝐄ndiag(𝐞1n,,𝐞Tn)1\displaystyle\mathbf{B}^{n+1}=\mathbf{E}^{n}\text{diag}(\|\mathbf{e}_{1}^{n}\|,\cdots,\|\mathbf{e}_{T}^{n}\|)^{-1}
     where 𝐞in\mathbf{e}_{i}^{n} is the ii-th column of 𝐄n\mathbf{E}^{n} for i=1,,Ti=1,\cdots,T.
     Compute 𝐖n\mathbf{W}^{n} by (1) and let fn=max{𝐖n𝐖nTF,1}f_{n}=\max\{\|\mathbf{W}^{n}{\mathbf{W}^{n}}^{T}\|_{F},1\}.
     Compute 𝐅n\mathbf{F}^{n} and 𝐂n+1\mathbf{C}^{n+1} by
𝐅n=𝐂n1tfn𝐂f(𝐀n+1,𝐁n+1,𝐂n,𝝈n),\displaystyle\mathbf{F}^{n}=\mathbf{C}^{n}-\frac{1}{tf_{n}}\nabla_{\mathbf{C}}f(\mathbf{A}^{n+1},\mathbf{B}^{n+1},\mathbf{C}^{n},\bm{\sigma}^{n}),
𝐂n+1=𝐅ndiag(𝐟1n,,𝐟Tn)1\displaystyle\mathbf{C}^{n+1}=\mathbf{F}^{n}\text{diag}(\|\mathbf{f}_{1}^{n}\|,\cdots,\|\mathbf{f}_{T}^{n}\|)^{-1}
     where 𝐟in\mathbf{f}_{i}^{n} is the ii-th column of 𝐅n\mathbf{F}^{n} for i=1,,Ri=1,\cdots,R.
II) Update the row vector 𝝈\bm{\sigma}:
‘     Compute 𝐐n+1\mathbf{Q}^{n+1} by (2) and let ηn=max{𝐐n+1𝐐n+1TF,1}\eta_{n}=\max\{\|\mathbf{Q}^{n+1}{\mathbf{Q}^{n+1}}^{T}\|_{F},1\}.
     Compute 𝜷n+1\bm{\beta}^{n+1} by (3) and use the soft thresholding:      
𝝈n+1=𝒯λtηn(𝜷n+1).\bm{\sigma}^{n+1}=\mathcal{T}_{\frac{\lambda}{t\eta_{n}}}(\bm{\beta}^{n+1}).
3: Denote the limitations by 𝐀^,𝐁^,𝐂^,𝝈^\mathbf{\hat{A}},\mathbf{\hat{B}},\mathbf{\hat{C}},\bm{\hat{\sigma}}, compute 𝒯^=[𝝈^;𝐀^,𝐁^,𝐂^]R\mathcal{\hat{T}}=[\bm{\hat{\sigma}};\mathbf{\hat{A}},\mathbf{\hat{B}},\mathbf{\hat{C}}]_{R} and count the number T^\hat{T} of nonzero entries in 𝝈^\bm{\hat{\sigma}}.
4: Impose constraints 𝒞(Ω)=𝒯(Ω)\mathcal{C}(\Omega)=\mathcal{T}(\Omega).
5: Return The tensor 𝒯^\mathcal{\hat{T}} with the estimated rank R^\hat{R}.

5.2.1 Numerical Experiments on Prediction of Infected Cases

We implement a tensor completion algorithm to our tensor data. We replace the last (most current) week’s data of the Atlantic and Warren counties with the mean of the remaining data. Then, the missing values on the tensor is completed via low rank approximation of the Covid-19 tensor. We have the following very nice prediction; see Figures 18-22.

Refer to caption
Figure 18: After Tensor Completion Applied to Atlantic County
Refer to caption
Figure 19: After Tensor Completion Applied to Warren County
Refer to caption
Figure 20: Prediction Covid-19 cases of last quarter based tensor completion formulation: 7th7^{th} quarter of Warren county cases is replaced by the mean of the remaining data and applied the LRAT 4 algorithm.
Refer to caption
Figure 21: 7th Qtr Prediction of Covid Infections: this is the total number of cases prediction of Atlantic county for 7th7^{th} Quarter (June27-Sep25
Refer to caption
Figure 22: Evolution of error for each epoch

6 Future Works and Conclusion

In this work, we apply various tensor models and tensor algorithms to analyze Covid-19 data. The standard tensor models, CP and HOOI, with their off-the-shelves algorithms, ALS and HOOI, are tested against a new sampling method for ALS (SMALS). The numerical results are very promising as it cuts the down time while keeping the Frobenious norm errors relatively consistent with ALS. Here tensor sparse model [31, 8] is used as the model for prediction of future Covid-19 infection cases as a tensor completion problem. The numerical results are impressive as the tensor completion algorithm can predict infection a week ahead as well as a quarter ahead. Moreover, the tensor sparse model can locate which counties exhibit hotspots. The sparse tensor model is based on proximal algorithms and flexible hybrid method by Golub-Kahan methods for efficient practical implementation.

In our future work, we would like to have a more mathematical and methodical technique in locating and detecting hotspots. We have used l1l_{1} minimization in the tensor completion; we will work on the efficacy of l0l_{0} minimization of low-rank approximation of CP decomposition and tensor completion algorithm in the proximal framework.

Acknowledgments

This material is based upon work supported by the National Science Foundation under Grant No. DMS-1439786 while the author, C. Navasca, was in residence at the Institute for Computational and Experimental Research in Mathematics in Providence, RI, during the Model and Dimension Reduction in Uncertain and Dynamic Systems Program. C. Navasca is also in part supported by National Science Foundation No. MCB-2126374.

References

  • [1] United states census bureau: New jersey population, GitHub repository, (2020).
  • [2] P. Bejon, T. N. Williams, A. Liljander, A. M. Noor, J. Wambua, E. Ogada, A. Olotu, F. H. Osier, S. I. Hay, A. Färnert, and K. Marsh, Stable and unstable malaria hotspots in longitudinal cohort studies in Kenya, PLoS Med, 7 (2010), p. e1000304.
  • [3] R. Bro, Parafac. tutorial and applications, Chemometrics and Intelligent Laboratory Systems, 38 (1997), pp. 149–171, https://doi.org/https://doi.org/10.1016/S0169-7439(97)00032-4, https://www.sciencedirect.com/science/article/pii/S0169743997000324.
  • [4] L. De Lathauwer, B. De Moor, and J. Vandewalle, A multilinear singular value decomposition, SIAM Journal on Matrix Analysis and Applications, 21 (2000), pp. 1253–1278, https://doi.org/10.1137/S0895479896305696, https://doi.org/10.1137/S0895479896305696, https://arxiv.org/abs/https://doi.org/10.1137/S0895479896305696.
  • [5] D. M. Dunlavy, T. G. Kolda, and E. Acar, Temporal link prediction using matrix and tensor factorizations, 5 (2011).
  • [6] H. Ge, J. Caverlee, and H. Lu, Taper: A contextual tensor-based approach for personalized expert recommendation, in Proceedings of the 10th ACM Conference on Recommender Systems, RecSys ’16, New York, NY, USA, 2016, Association for Computing Machinery, p. 261–268.
  • [7] C. J. Hillar and L.-H. Lim, Most tensor problems are np-hard, J. ACM, 60 (2013), https://doi.org/10.1145/2512329, https://doi.org/10.1145/2512329.
  • [8] J. Jiang, F. Sanogo, and C. Navasca, Low-CP-rank tensor completion via practical regularization, J. Sci. Comput., 91 (2022), pp. Paper No. 18, 20, https://doi.org/10.1007/s10915-022-01789-9, https://doi.org/10.1007/s10915-022-01789-9.
  • [9] A. Karatzoglou, X. Amatriain, L. Baltrunas, and N. Oliver, Multiverse recommendation: N-dimensional tensor factorization for context-aware collaborative filtering, in Proceedings of the Fourth ACM Conference on Recommender Systems, RecSys ’10, New York, NY, USA, 2010, Association for Computing Machinery, p. 79–86.
  • [10] R. G. Karim, D. Dulal, and C. Navasca, Tensor decomposition via sampling methods, (2023).
  • [11] R. G. Karim, G. Guo, D. Yan, and C. Navasca, Accurate tensor decomposition with simultaneous rank approximation for surveillance videos, in 2020 54th Asilomar Conference on Signals, Systems, and Computers, IEEE, 2020, pp. 842–846.
  • [12] T. G. Kolda and B. W. Bader, Tensor decompositions and applications, SIAM Review, 51 (2009), pp. 455–500, https://doi.org/10.1137/07070111X.
  • [13] J. Kruskal, Three-way arrays: rank and uniqueness of trilinear decompositions, with applications to arithmetic complexity and statistics, Linear Algebra Appl., 18 (1977), pp. 95–138.
  • [14] J. Lessler, A. S. Azman, H. S. McKay, and S. M. Moore, What is a Hotspot Anyway?, Am J Trop Med Hyg, 96 (2017), pp. 1270–1273.
  • [15] N. Li, S. Kindermann, and C. Navasca, Some convergence results on the regularized alternating least-squares method for tensor decomposition, 2011, https://doi.org/10.48550/ARXIV.1109.3831, https://arxiv.org/abs/1109.3831.
  • [16] J. Liu, P. Musialski, P. Wonka, and J. Ye, Tensor completion for estimating missing values in visual data, in 2009 IEEE 12th International Conference on Computer Vision, 2009, pp. 2114–2121, https://doi.org/10.1109/ICCV.2009.5459463.
  • [17] Y. Liu, F. Shang, H. Cheng, J. Cheng, and H. Tong, Factor matrix trace norm minimization for low-rank tensor completion, in SDM, 2014.
  • [18] F. Liu , J. Chen , W. Tan , and C. Cai , A multi-modal fusion method based on higher-order orthogonal iteration decomposition, Entropy, 23 (2021), https://doi.org/10.3390/e23101349, https://www.mdpi.com/1099-4300/23/10/1349.
  • [19] H. Lyu, C. Strohmeier, G. Menz, and D. Needell, COVID-19 time-series prediction by joint dictionary learning and online NMF, CoRR, abs/2004.09112 (2020), https://arxiv.org/abs/2004.09112, https://arxiv.org/abs/2004.09112.
  • [20] J. Mathew, R. K. Behera, Z. E. Panthakkalakath, et al., A deep learning framework for covid outbreak prediction, arXiv preprint arXiv:2010.00382, (2020).
  • [21] E. E. Papalexakis, C. Faloutsos, and N. D. Sidiropoulos, Tensors for data mining and data fusion: Models, applications, and scalable algorithms, ACM Trans. Intell. Syst. Technol., 8 (2016), https://doi.org/10.1145/2915921, https://doi.org/10.1145/2915921.
  • [22] E. E. Papalexakis, C. Faloutsos, and N. D. Sidiropoulos, Tensors for data mining and data fusion: Models, applications, and scalable algorithms, ACM Trans. Intell. Syst. Technol., 8 (2017), pp. 16:1–16:44, https://doi.org/10.1145/2915921, https://doi.org/10.1145/2915921.
  • [23] A. Preprint, Local convergence of the alternating least squares algorithm for canonical tensor approximation, SIAM Journal on Matrix Analysis and Applications, 33 (2012), https://doi.org/10.1137/110843587.
  • [24] F. Sanogo and C. Navasca, Tensor completion via the cp decomposition, in 2018 52nd Asilomar Conference on Signals, Systems, and Computers, 2018, pp. 845–849, https://doi.org/10.1109/ACSSC.2018.8645405.
  • [25] B. N. Sheehan and Y. Saad, Higher order orthogonal iteration of tensors (hooi) and its relation to pca and glram, in SDM, 2007.
  • [26] N. D. Sidiropoulos, L. De Lathauwer, X. Fu, K. Huang, E. E. Papalexakis, and C. Faloutsos, Tensor decomposition for signal processing and machine learning, IEEE Transactions on Signal Processing, 65 (2017), pp. 3551–3582, https://doi.org/10.1109/TSP.2017.2690524.
  • [27] Q. Song, H. Ge, J. Caverlee, and X. Hu, Tensor completion algorithms in big data analytics, 13 (2019), https://doi.org/10.1145/3278607, https://doi.org/10.1145/3278607.
  • [28] N. Times, New jersey counties covid cases. https://github.com/nytimes/covid-19-data/blob/master/us-counties.csv.
  • [29] H. Wang and N. Yamamoto, Using a partial differential equation with google mobility data to predict covid-19 in arizona, Mathematical Biosciences and Engineering, 17 (2020), pp. 4891–4904, https://doi.org/10.3934/mbe.2020266, https://www.aimspress.com/article/doi/10.3934/mbe.2020266.
  • [30] X. Wang and C. Navasca, Adaptive low rank approximation of tensors, in Proceedings of the IEEE International Conference on Computer Vision Workshop (ICCVW, Santiago, Chile, 2015.
  • [31] X. Wang and C. Navasca, Low-rank approximation of tensors via sparse optimization, Numer. Linear Algebra Appl., 25 (2018), pp. e2136, 17, https://doi.org/10.1002/nla.2136, https://doi.org/10.1002/nla.2136.
  • [32] X. Wang and C. Navasca, Low-rank approximation of tensors via sparse optimization, Numerical Linear Algebra Appl., 25 (2018), pp. 2183–2202, https://doi.org/http://dx.doi.org/10.1002/andp.19053221004.
  • [33] Z. Yang, Z. Zeng, K. Wang, S.-S. Wong, W. Liang, M. Zanin, P. Liu, X. Cao, Z. Gao, Z. Mai, J. Liang, X. Liu, S. Li, Y. Li, F. Ye, W. Guan, Y. Yang, F. Li, S. Luo, Y. Xie, B. Liu, Z. Wang, S. Zhang, Y. Wang, N. Zhong, and J. He, Modified seir and ai prediction of the epidemics trend of covid-19 in china under public health interventions, Journal of Thoracic Disease, 12 (2020), https://jtd.amegroups.com/article/view/36385.
  • [34] Y. Zhao, H. Yan, S. E. Holte, R. P. Kerani, and Y. Mei, Rapid detection of hot-spot by tensor decomposition with application to weekly gonorrhea data, in International Workshop on Intelligent Statistical Quality Control, Springer, 2019, pp. 265–286.