Norms of structured random matrices
Abstract.
For , let be a random matrix, a real deterministic matrix, and the corresponding structured random matrix. We study the expected operator norm of considered as a random operator between and for . We prove optimal bounds up to logarithmic terms when the underlying random matrix has i.i.d. Gaussian entries, independent mean-zero bounded entries, or independent mean-zero () entries. In certain cases, we determine the precise order of the expected norm up to constants. Our results are expressed through a sum of operator norms of Hadamard products and .
Key words and phrases:
Gaussian random matrix, operator norm, structured random matrix, random variable.
2020 Mathematics Subject Classification:
Primary 60B20; Secondary 46B09; 52A23; 60G15; 60E15.
1. Introduction and main results
With his work on the statistical analysis of large samples [69], Wishart initiated the systematic study of large random matrices. Ever since, random matrices have continuously entered more and more areas of mathematics and applied sciences beyond probability theory and statistics, for instance, in numerical analysis through the work of Goldstine and von Neumann [65, 20] and in quantum physics through the works of Wigner [66, 67, 68] on his famous semicircle law, which resulted in significant effort to understand spectral statistics of random matrices from an asymptotic point of view. Today, random matrix theory has grown into a vital area of probability theory and statistics, and within the last two decades, random matrices have come to play a major role in many areas of (algorithmic) computational mathematics, for instance, in questions related to sparsification methods [1, 54] and sparse approximation [57, 58], dimension reduction [4, 12, 44], or combinatorial optimization [46, 53]. We refer the reader to [5, 6, 60] for more information.
In this paper, we are interested in the non-asymptotic theory of (large) random matrices. This theory has played a fundamental role in geometric functional analysis at least since the ’70s, the connection coming in various flavors. It is of particular importance in the geometry of Banach spaces and the theory of operator algebras [9, 10, 15, 18, 21, 30] and their applications to high-dimensional problems, for instance, in convex geometry [17, 22], compressed sensing [14, 16, 48, 63], information-based complexity [27, 28], or statistical learning theory [50, 64]. On the other hand, geometric functional analysis has had, and still has, an enduring influence on random matrix theory, as is witnessed, for instance, through applications of measure concentration techniques; we refer to [15, 42] and the references cited therein. The quantity we study and focus on here concerns the expected operator norm of random matrices considered as operators between finite-dimensional spaces; recall that denotes the space equipped with the (quasi-)norm , given by for and if . We address the following problem: for and , determine the right order (up to constants that may depend on the parameters and ) of
where, given a deterministic real matrix and a random matrix , we denote by
the structured random matrix; the symbol stands for the Hadamard product of matrices (i.e., entrywise multiplication). The bounds on the expected operator norm should be of optimal order and expressed in terms of the coefficients , . Understanding such expressions and related quantities is important, for instance, when studying the worst-case error of optimal algorithms which are based on random information in function approximation problems [28] (see also [33]) or the quality of random information for the recovery of vectors from an -ellipsoid, where (the radius of) optimal information is given by Gelfand numbers of a diagonal operator [29].
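To make the object of study concrete, the following small Python sketch estimates the expected operator norm of a structured Gaussian matrix by Monte Carlo simulation. It is illustrative only: it is restricted to the source space being an ℓ₁-space, where the operator norm of a matrix equals the largest ℓ_q-norm of a column, so each sample can be evaluated exactly. The dimensions and the coefficient matrix below are hypothetical choices, not tied to any example from the text.

```python
import numpy as np

rng = np.random.default_rng(0)

def op_norm_1_to_q(B, q):
    # ||B : l_1^n -> l_q^m|| equals the largest l_q-norm of a column of B,
    # since the extreme points of the l_1-ball are the signed unit vectors.
    return np.linalg.norm(B, ord=q, axis=0).max()

def expected_norm(A, q, n_samples=2000):
    # Monte Carlo estimate of E ||A o G : l_1^n -> l_q^m|| for i.i.d. standard
    # Gaussian G, where "o" denotes the entrywise (Hadamard) product.
    m, n = A.shape
    vals = [op_norm_1_to_q(A * rng.standard_normal((m, n)), q)
            for _ in range(n_samples)]
    return float(np.mean(vals))

# Hypothetical structured coefficient matrix.
m, n, q = 30, 50, 2
A = rng.uniform(size=(m, n))
print(expected_norm(A, q))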
In the case where the random entries of are i.i.d. standard Gaussians (then we write instead of ) and , we will show the following bound, which is sharp up to logarithmic terms:
(1.1) |
where , , and denotes the Hölder conjugate of defined by the relation . As will be explained later, we obtain sharp estimates in certain cases and derive results similar to (1.1) for other models of randomness.
1.1. History of the problem and known results
In what follows, is a real deterministic matrix and always stands for a random matrix with i.i.d. standard Gaussian entries (usually the matrices are of size unless explicitly stated otherwise). We use , , etc. for positive constants which may depend only on the parameters given in brackets and write for positive absolute constants. The symbols , , , etc. denote that the inequality holds up to multiplicative constants depending only on the parameters given in the subscripts; we write if and , and , , etc. if the constants may depend on the parameters given in the subscript.
In 1975, Bennett, Goodman, and Newman [9] proved that if is an random matrix with independent, mean-zero entries taking values in , and , then
(1.2) |
In fact, up to constants, this estimate is best possible: for any matrix with entries one readily sees that ; just use standard unit vectors and operator duality. Moreover, in this ‘unstructured’ case, where for all , it is easy to extend (1.2) to the whole range of (see [8, 13] or Remark 4.2 below). Also, if all entries are i.i.d. Rademacher random variables, then the bounds are two-sided, i.e., the expected operator norm is, up to constants, the same as the minimal norm for all , (see [8, Proposition 3.2] or [13, Satz 2]).
The case most studied in the literature is the one of the spectral norm, i.e., the operator norm. Seginer [51] proved in 2000 that if is an random matrix with i.i.d. mean-zero entries, then its operator norm is of the same order as the sum of expectations of the maximum Euclidean norm of rows and columns of , i.e.,
(1.3) |
Riemer and Schütt [49] proved that, up to a logarithmic factor , the same holds true for any random matrix with independent but not necessarily identically distributed mean-zero entries. Let us also mention that in the Gaussian setting one can use a non-commutative Khintchine bound (see, e.g., [59, Equation (4.9)]) to show that, up to a factor , the expected spectral norm is of the order of the largest Euclidean norm of its rows and columns.
In the very same setting that was considered by Riemer and Schütt, Latała [37] had obtained a few years earlier the dimension-free estimate
This bound is superior to the Riemer–Schütt bound in the case of matrices with all entries equal to and is optimal for Wigner matrices. In other cases, like the one of diagonal matrices, the Riemer–Schütt bound is better.
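As a quick numerical illustration of Seginer's comparison (1.3), the following sketch compares the expected spectral norm of an i.i.d. Rademacher matrix with the expected sum of the maximal Euclidean norms of its rows and columns. The dimension and sample size are hypothetical, and the check is only meant to make the two-sidedness plausible, not to prove anything.

```python
import numpy as np

rng = np.random.default_rng(1)
n, n_samples = 100, 200

spec, rowcol = [], []
for _ in range(n_samples):
    X = rng.choice([-1.0, 1.0], size=(n, n))   # i.i.d. Rademacher entries
    spec.append(np.linalg.norm(X, 2))          # spectral norm ||X||_{2->2}
    rows = np.linalg.norm(X, axis=1).max()     # largest Euclidean row norm
    cols = np.linalg.norm(X, axis=0).max()     # largest Euclidean column norm
    rowcol.append(rows + cols)

# Seginer's result says these quantities are comparable up to universal constants.
print(np.mean(spec), np.mean(rowcol), np.mean(spec) / np.mean(rowcol))
```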
In the case of structured Gaussian matrices, Latała, van Handel, and Youssef [40], building upon earlier work of Bandeira and van Handel [7] (which combined the moment method with combinatorial considerations) as well as results proved by van Handel in [61] (which used Slepian’s lemma), obtained the precise behavior without any logarithmic terms in the dimension, namely
(1.4)
Their proof is based on a clever block decomposition of the underlying matrix (see [40, Figure 3.1]). This result finally answered in the affirmative a conjecture made by Latała more than a decade before. We also refer the reader to the survey [62] discussing in quite some detail results prior to [40] and [61] — the latter work discusses the conjectures of Latała and van Handel and shows their equivalence.
Very recently, Latała and Świątkowski [39] investigated a similar problem when the underlying random matrix has Rademacher entries. They proved a lower bound which, up to a factor, can be reversed for randomized circulant matrices.
In [23], Guédon, Hinrichs, Litvak, and Prochno studied our main and motivating question on the order of the expected operator norm of structured random matrices considered as operators between and in the special case where and the random entries are Gaussian. In this situation, where we are not dealing with the spectral norm, the moment method cannot be employed. The approach in [23] was therefore different and based on a majorizing measure construction combining the works [24] and [25]. In [23, Theorem 1.1], the authors proved that if , then
(1.5) |
where for a standard Gaussian random variable . Moreover, for and , it was noted in [23, Remark 1.4] (see also [45, Twierdzenie 2]) that
(1.6) |
Later, an extension of (1.5) to the case of matrices with i.i.d. isotropic log-concave rows was obtained by Strzelecka in [55].
Trying to extend the upper bound for to the whole range one encounters two difficulties. First of all, the methods used in order to prove (1.5) fail if or , because the majorizing measure construction used in [23] is restricted to the case and the assumption is required in a Hölder bound. Moreover, when or the result cannot hold with the right-hand side of the same form as in (1.5) (see Remark 4.2 below for counterexamples to (1.5) in the cases and ; by Jensen’s inequality, the expected norm of a matrix with i.i.d. Rademacher entries is less than or equal to times the expected norm of the matrix with Gaussian entries, so (1.5) for or would imply the same (up to a constant) bound for matrices, which does not hold in this range of , as we explain in Remark 4.2). This explains the different form of the expressions and in (1.1), which in the range reduce to the maxima of norms on the right-hand side of (1.5) — see (1.9) below.
1.2. Lower bounds and conjectures
By arguments similar to the ones used in order to prove the lower bound in (1.4), one can check that in the range considered in [23, 45] (i.e., ) one has
(1.7)
Note that for ,
which explains the simplified form of (1.6).
We remark that the proof of (1.7) is based merely on the observation that the operator norm is greater than the maximum entry of the matrix and the appropriate maximum norms of its rows and columns, combined with a comparison of moments for Gaussian random vectors. Another, related way to proceed, valid for all , is to exchange the expectation and the suprema over the and balls in the definition of the operator norm. We present the details in Subsection 5.1. In particular, Proposition 5.1 and Corollary 5.2 imply (using also the trivial observation that ) that, for ,
(1.8) |
It is an easy observation (see Lemma 2.1 below) that for ,
(1.9) |
Thus, in the range considered in [23, 45], the lower bounds (1.7) and (1.8) coincide.
Although it would be natural to conjecture at this point that the bound (1.8) may be reversed up to a multiplicative constant depending only on , such a reverse bound turns out not to be true in the case (and in the dual one ) as we shall show in Subsection 5.3.
In order to conjecture the right asymptotic behavior of , one may take a look at the boundary values of and , i.e., or . Note that (1.6) provides an asymptotic behavior of on a part of this boundary (i.e., for and and in the dual case and ). We provide sharp results on the remaining parts of the boundary of (see dense lines on the boundary of Figure 1 below):
where
and with denoting the non-increasing rearrangement of for a given . (For the precise formulation see Propositions 1.8 and 1.10, and Corollary 1.11 below.)
Moreover, in Subsection 5.1 we generalize the lower bounds from the boundary into the whole range (see Figure 1 below), i.e., we prove
(1.10)
northeast lines:
horizontal lines:
vertical lines:
northwest lines:
Let us now discuss the relation between the terms appearing above. We postpone the proofs of all the following claims to Section 5.
In the case , we have
(1.11)
where the matrices and are obtained by permuting the columns and rows, respectively, of the matrix in such a way that and . Therefore, in the range the right-hand side of (1.10) changes continuously with and (for a fixed matrix ).
Obviously, and, in general, the former quantity may be of larger order than the latter one. In Subsection 5.3 we shall present a more subtle relation: for every we shall give an example showing that the right-hand side of (1.10) may be of larger order than . Note that by duality, i.e., the fact that
(1.12) |
the same holds in the case . This suggests that the behavior of is different in the regions with horizontal or vertical lines than in the region with northeast lines.
Moreover, we have
(1.13) |
(see Subsection 5.2). Note that this is not the case for , as one can easily see by considering, e.g., equal to the identity matrix. This suggests a different, simplified behavior of in the region with northwest lines compared with the other regions.
Given the discussion above, the lower bounds presented in (1.10), and the fact that they can be reversed for all , (and for all , ), it is natural to conjecture the following.
Conjecture 1.
For all , we conjecture that
(1.14) |
Remark 1.1.
One could pose another natural conjecture, based on the potential generalization of the first line of the bound (1.4), namely that the inequality
(1.15) |
holds for all . Indeed, the lower bound is true with constant , since for every deterministic matrix one has
However, as we prove in Subsection 5.4, this conjecture is wrong: although the right-hand sides of (1.14) and (1.15) are comparable in the range , for every pair of outside this range the right-hand side of (1.15) may be of smaller order than the left-hand side.
Let us now present a conjecture concerning the boundedness of linear operators given by infinite dimensional matrices. In what follows, we say that a matrix defines a bounded operator from to if for all the product is well defined, belongs to and the corresponding linear operator is bounded.
Conjecture 2.
Let be an infinite matrix with real coefficients and let . We conjecture that the matrix defines a bounded linear operator between and almost surely if and only if the matrix defines a bounded linear operator between and , the matrix defines a bounded linear operator between and , and
• in the case , ,
• in the case , , and , where , ,
• in the case , , and , where , ,
• (in the case we do not need to assume any additional conditions).
Proposition 1.2.
We postpone the proof of this proposition to Subsection 5.5.
In this article, in addition to the cases obtained in [40] and proved in [23, 45], we confirm Conjecture 1 when , and when , . In all the other cases, we are able to prove the upper bounds only up to logarithmic (in the dimensions ) multiplicative factors (see Corollary 1.4 below). In particular, Proposition 1.2 implies that Conjecture 2 holds for all , and for all , .
Note that in the structured case we work with, interpolating the results obtained for the boundary cases or gives a bound with polynomial (in the dimensions) multiplicative constants which are much worse than logarithmic constants from Corollary 1.4 below. However, as we shall see in Remark 4.2 below, interpolation techniques work well in the non-structured case.
1.3. Main results valid for
We start with general theorems valid for the whole range of , . Results which are based on methods working only for specific values of , , but yielding better logarithmic terms, are presented in the next subsection. A brief summary and comparison of all results can be found in the summary table.
Before stating our main results, we need to introduce additional notation. For a non-empty set , and , we define
By we denote the space equipped with the norm
whose unit ball is . Obviously, the space can be identified with a subspace of . If is a linear operator, the notation means that is restricted to the space and composed with a projection onto . Moreover, for , , where the supremum is taken over all with , and is the non-increasing rearrangement of .
Theorem 1.3 (Main theorem in a general version with sets , ).
Assume that , , , and has i.i.d. standard Gaussian entries. Then
where the suprema are taken over all sets , such that , .
The above theorem gives an estimate on the largest operator norm among all submatrices of of fixed size. Let us remark that apart from being of intrinsic interest, quantities of this type (for ) have appeared in connection with the study of the restricted isometry property of random matrices with independent rows [2] or in the analysis of entropic uncertainty principles for random quantum measurements [3, 47].
Let us now give an outline of the proof of Theorem 1.3. Note that
(1.16) |
In the first step of our proof, we find polytopes and approximating (with accuracy depending logarithmically on the dimension) the unit balls in and , respectively. The extreme points of the sets and have a special and simple structure: absolute values of their non-zero coordinates are all equal to a constant depending only on the size of the support of a given point. Since is close to and is close to , we may consider only in (1.16). Since non-zero coordinates of and , respectively, are all equal up to a sign we may use a symmetrization argument and the contraction principle to remove and in the sum on the right-hand side of (1.16). Thus, in the next step of the proof we only need to estimate the expected value of
where and represent the potential supports of points in and . To deal with this quantity, we first consider the suprema over the subsets of fixed sizes and use Slepian’s lemma to compare the supremum of the Gaussian process above with the supremum of another Gaussian process, which may be bounded easily. Then we make use of the term , which allows us to go back to suprema over the sets and . At the end, we use the Gaussian concentration inequality to unfix the sizes of sets and and complete the proof.
Applying Theorem 1.3 with , immediately yields the following result, which confirms Conjecture 1 up to some logarithmic terms.
Corollary 1.4 (Main theorem – to version).
Assume that and has i.i.d. standard Gaussian entries. Then,
Moreover, we easily recover the same bound in the case of independent bounded entries. We state and prove a general version with sets and akin to Theorem 1.3 in Subsection 3.2.
Corollary 1.5.
Assume that and has independent mean-zero entries taking values in . Then
We use the two results above to obtain their analogue in the case of entries for ; these random variables are defined by (1.17). This class contains, among others,
• log-concave random variables (which are ),
• heavy-tailed Weibull random variables (of shape parameter , i.e., for ); see the sampling sketch after this list,
• random variables satisfying the condition
These random variables are with . They were considered recently in [38].
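The heavy-tailed Weibull variables from the list above can be sampled directly by inverting the tail function. A minimal sketch follows; the shape parameter r and the sample size are hypothetical choices made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)

def symmetric_weibull(r, size):
    # If P(|X| >= t) = exp(-t^r) for t >= 0, then |X| has the same law as
    # (-log U)^(1/r) for U uniform on (0, 1]; an independent random sign
    # makes the variable symmetric.
    u = 1.0 - rng.random(size)                 # uniform on (0, 1], avoids log(0)
    signs = rng.choice([-1.0, 1.0], size=size)
    return signs * (-np.log(u)) ** (1.0 / r)

# For r < 2 these variables have heavier tails than Gaussians.
X = symmetric_weibull(r=0.5, size=10_000)
print(np.mean(np.abs(X)), np.max(np.abs(X)))
```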
A general version of the following Corollary 1.6 with sets and is stated and proved in Subsection 3.2.
Corollary 1.6.
Assume that , , , and has independent mean-zero entries satisfying
(1.17) |
Then
1.4. Results for particular ranges of ,
We continue with results for some specific ranges of , , where we are able to prove estimates with better logarithmic dependence (results which follow from them by duality (1.12) are stated in the summary table to keep the presentation short). We postpone their proofs to Section 4. We start with the case of Gaussian random variables. Recall that , where is a standard Gaussian random variable.
Proposition 1.7.
For all and , we have
(1.18)
If or , then we are able to get a result without logarithmic terms. Recall that for a sequence we denote by the non-increasing rearrangement of .
Proposition 1.8.
(i) For , we have
(ii) Moreover,
where , .
Note that (ii) shows in particular that a blow-up of the constant in the upper estimate (i) for is necessary, since the rightmost summands in (i) and (ii) are not comparable.
Remark 1.9.
It will be clear from the proof that the upper bound in part (i) of Proposition 1.8 remains valid for any random matrix (instead of ) with independent isotropic rows (i.e., rows with mean zero and covariance matrix equal to the identity) such that
(1.19) |
Note that the independence and isotropy of the rows imply that the columns of are isotropic as well (since the coordinates of every column are independent and have mean zero and variance ). Therefore, whenever , condition (1.19) is always satisfied (because the -integral norm is bounded above by the -integral norm, which is then equal to the right-hand side of (1.19), since the covariance matrix of each column is equal to the identity matrix).
The following proposition generalizes part (ii) of Proposition 1.8 to an arbitrary . We list it separately since we present a proof using different arguments. Recall that the case , was established before, see (1.6).
Proposition 1.10.
If , then
where for .
Proposition 1.10 immediately implies its dual version.
Corollary 1.11.
If , then
where for .
Remark 1.12.
Proposition 1.7 implies the following estimate for matrices with independent entries, in the same way as Corollary 1.4 implies Corollary 1.6 (see Subsection 3.2).
Corollary 1.13.
Assume that , , and has independent mean-zero entries satisfying
(1.20) |
Then, for , ,
By Hoeffding’s inequality (i.e., Lemma 2.13) we know that matrices with independent mean-zero entries taking values in satisfy (1.20) with and . In this special case of independent bounded random variables one can also adapt the methods of [9] to prove, in the smaller range , the following result with explicit numerical constants and an improved dependence on (note that the second logarithmic term is better than in Corollary 1.13, where the exponent equals ).
Proposition 1.14.
Assume that has independent mean-zero entries taking values in . Then, for ,
where .
Finally, we have the following general result for matrices with independent entries (cf. Corollary 1.6).
Theorem 1.15.
Let , , and assume that has independent mean-zero entries satisfying
Then, for all and ,
Having in mind the strategy of proof described after Theorem 1.3, let us elaborate on the idea of the proof of Theorem 1.15. We shall split the matrix into two parts and , which we treat separately. In our decomposition, all entries of are bounded by and the probability that is very small. Then we shall deal with using a crude bound (Lemma 4.3) and the fact that the probability that is small enough to compensate for it. In order to bound the expectation of the norm of , we require a cut-off version of Theorem 1.15 (Lemma 4.4). To obtain it, we shall replace in the expression for the operator norm with a suitable polytope (and leave as it is) and then apply a Gaussian-type concentration inequality to the function for .
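A schematic version of the splitting described above: a matrix with symmetric entries is decomposed into a part whose entries are bounded by a truncation level and a (typically sparse) exceedance part. The truncation level and the test matrix below are hypothetical placeholders, not the quantities used in the actual proof.

```python
import numpy as np

def truncation_split(X, level):
    # X1 keeps the entries of X that are at most `level` in absolute value
    # (others are set to 0), X2 keeps the large entries; X = X1 + X2.
    # For symmetric entries both parts remain symmetric, so no re-centering is needed.
    mask = np.abs(X) <= level
    X1 = np.where(mask, X, 0.0)
    X2 = np.where(mask, 0.0, X)
    return X1, X2

rng = np.random.default_rng(3)
X = rng.standard_t(df=5, size=(20, 30))     # heavy-ish tails, just for illustration
X1, X2 = truncation_split(X, level=2.0)
print(np.abs(X1).max(), (X2 != 0).mean())   # bounded part and fraction of large entries
```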
1.5. Tail bounds
All the bounds for provided in this work for random matrices also yield a tail bound for . (It is clear from the proof of Proposition 1.16 — see Subsection 3.2 — that the same applies to the estimates for , but we omit the details to keep the presentation clear.)
Proposition 1.16 (Tail bound).
Assume that , , , and . Fix a deterministic matrix and assume that
If for all random matrices with independent mean-zero entries satisfying
(1.21) |
we have
(1.22) |
then, for all random matrices with independent mean-zero entries satisfying (1.21), we also have
(1.23) |
and, for all ,
(1.24) |
1.6. Organization of the paper
In Section 2 we gather various preliminary results we shall use in the sequel. Section 3 contains the proofs of the main results valid for all , (i.e., Theorem 1.3 and its corollaries) and the tail bound from Proposition 1.16. In Section 4 we prove the results for specific choices/ranges of , . In Section 5 we prove lower bounds on expected operator norms, showing in particular that our estimates are optimal up to logarithmic factors. We also prove other results justifying the proposed form of Conjecture 1. The last subsection of Section 5 is devoted to infinite dimensional Gaussian operators.
2. Preliminaries
2.1. General facts
We start with some easy lemmas which will be used repeatedly throughout the paper.
Lemma 2.1.
For any real matrix and , we have
Furthermore, for a real matrix and , ,
Proof.
Since , we have , where denotes the convex hull of the set . Moreover, the extreme points of are the signed standard unit vectors, i.e., , and is a convex function (since ). Thus,
This immediately implies the result for the Hadamard product if .
If, on the other hand, , then by the subadditivity of the function ,
where in the last equality we used the first part of the Lemma. Since we clearly have
we thus obtain
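The extreme-point argument used in this proof can be checked numerically: the supremum of the convex map x ↦ ||Bx||_q over the ℓ₁-ball is attained at a signed standard unit vector, i.e., it equals the largest ℓ_q-norm of a column. The sketch below uses hypothetical dimensions and a random matrix B purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(4)
m, n, q = 8, 12, 3.0
B = rng.standard_normal((m, n))

# Value at the extreme points of the l_1-ball (signed standard unit vectors):
# this is the largest l_q-norm of a column of B.
ext_val = np.linalg.norm(B, ord=q, axis=0).max()

# Random points of the l_1-ball never beat the extreme points,
# since x -> ||Bx||_q is convex.
best_random = 0.0
for _ in range(20_000):
    x = rng.standard_normal(n)
    x = x / np.abs(x).sum() * rng.uniform()   # a random point of the l_1-ball
    best_random = max(best_random, np.linalg.norm(B @ x, ord=q))

print(ext_val, best_random)   # best_random <= ext_val
```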
Definition 2.2.
A set is called unconditional, if for every and every we have .
We shall use the following version of [49, Lemma 2.1].
Lemma 2.3.
Assume that , , and define the convex set
Then .
Proof.
Fix a vector . We want to prove that , where
denotes the norm generated by , i.e., its Minkowski gauge. Since both and are permutationally invariant and unconditional (see Definition 2.2), we may and will assume that . If we put , then
Since for (indeed, , so ; on the other hand, , so ), the triangle and Hölder inequalities yield
where we also used the elementary estimates and . This completes the proof. ∎
Remark 2.4.
The term can be replaced by by writing in the above proof
Here we used the estimates for (which follows from the concavity of the function ) and the trivial one .
Remark 2.5.
The constant in Lemma 2.3 is sharp up to a constant depending on for every (when , and the constant depending on degenerates as ). More precisely, we shall prove that if , then . Note that if and only if
(2.1) |
where is norm dual to .
Let be the set of extreme points of , and let be the non-increasing rearrangement of . For every ,
Assume that and put . We get
whereas
so inequality (2.1) yields that .
We shall also need the following standard lemma (see, e.g., [41, Section 1.3]). We will use the versions with and .
Lemma 2.6.
Let be a nonnegative random variable. If there exist , , and such that
then
Proof.
Integration by parts yields
2.2. Contraction principles
Below we recall the well-known contraction principle due to Kahane and its extension by Talagrand (see, e.g., [64, Exercise 6.7.7] and [43, Theorem 4.4 and the proof of Theorem 4.12]).
Lemma 2.7 (Contraction principle).
Let be a normed space, , and . Assume that and . Then, if are independent Rademacher random variables, we have
Lemma 2.8 (Contraction principle).
Let be a bounded subset of . Assume that are -Lipschitz and for . Then, if are independent Rademacher random variables, we have
2.3. Gaussian random variables
The following result is fundamental to the theory of Gaussian processes and referred to as Slepian’s inequality or Slepian’s lemma [52]. We use the following (slightly adapted) version taken from [11, Theorem 13.3].
Lemma 2.9 (Slepian’s lemma).
Let and be two Gaussian random vectors satisfying for all . Assume that, for all , we have . Then
The next lemma is folklore. We include a short proof of an estimate with specific constants for the sake of completeness.
Lemma 2.10.
Assume that and let , , be standard Gaussian random variables (not necessarily independent). Then
Proof.
Since the moment generating function of a Gaussian random variable is given by , it follows from Jensen’s inequality that
By taking , we get the first assertion. We apply this inequality with random variables to get the second assertion, namely
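For orientation, here is the standard moment-generating-function computation behind bounds of this type, stated for the maximum of $N$ standard Gaussian variables $g_1,\dots,g_N$ (the exact constants in Lemma 2.10 may differ):
\[
\exp\Bigl(\lambda\,\mathbb{E}\max_{i\le N} g_i\Bigr)
\;\le\; \mathbb{E}\exp\Bigl(\lambda \max_{i\le N} g_i\Bigr)
\;\le\; \sum_{i=1}^{N} \mathbb{E} e^{\lambda g_i}
\;=\; N e^{\lambda^2/2}, \qquad \lambda>0,
\]
so that $\mathbb{E}\max_{i\le N} g_i \le \frac{\ln N}{\lambda} + \frac{\lambda}{2}$; choosing $\lambda=\sqrt{2\ln N}$ gives $\mathbb{E}\max_{i\le N} g_i \le \sqrt{2\ln N}$, and applying this bound to the $2N$ variables $\pm g_i$ yields $\mathbb{E}\max_{i\le N} |g_i| \le \sqrt{2\ln(2N)}$.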
The next two lemmas are taken from [61]. Recall that is the non-increasing rearrangement of .
Lemma 2.11 ([61, Lemma 2.3]).
Assume that and let be random variables (not necessarily independent) satisfying
Then
Lemma 2.12 ([61, Lemma 2.4]).
Assume that and let be independent random variables with for . Then
Lemma 2.13 (Hoeffding’s inequality, [32, Theorem 2]).
Assume that and let , , be independent mean-zero random variables such that a.s. Then, for all ,
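For the reader's convenience, the classical form of this inequality reads as follows (the normalization in the version of Lemma 2.13 used above may differ slightly): if $X_1,\dots,X_n$ are independent, mean-zero, and $|X_i|\le a_i$ almost surely, then for all $t\ge 0$,
\[
\mathbb{P}\Bigl(\Bigl|\sum_{i=1}^n X_i\Bigr| \ge t\Bigr)
\;\le\; 2\exp\Bigl(-\frac{t^2}{2\sum_{i=1}^n a_i^2}\Bigr).
\]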
2.4. Random variables with heavy tails
The following lemma is a special case of [34, Theorem 1].
Lemma 2.14 (Contraction principle).
Let and assume that and are two sequences of independent symmetric random variables satisfying for every and ,
Then, for every convex function and every ,
Lemma 2.15 ([31, Theorem 6.2]).
Assume that are independent symmetric Weibull random variables with shape parameter and scale parameter , i.e., for . Then, for every and ,
Remark 2.16 (Moments of Weibull random variables).
Note that if is a symmetric random variable such that , , then has (symmetric) exponential distribution with parameter , so by Stirling’s formula we obtain, for all ,
with .
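A sketch of the underlying computation, under the assumption that the tail is exactly $\mathbb{P}(|Y|\ge t)=e^{-t^{r}}$ for $t\ge 0$: for every $p\ge 1$,
\[
\mathbb{E}|Y|^p \;=\; \int_0^\infty \mathbb{P}\bigl(|Y|^p \ge s\bigr)\,ds
\;=\; \int_0^\infty e^{-s^{r/p}}\,ds
\;=\; \Gamma\Bigl(\frac{p}{r}+1\Bigr),
\]
so that the $p$-th integral norm of $Y$ equals $\Gamma(p/r+1)^{1/p}$, which by Stirling's formula is of order $(p/r)^{1/r}$ up to constants depending only on $r$.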
The three previous results easily imply the following estimate for integral norms of linear combinations of independent random variables.
Proposition 2.17.
Let , and assume that are independent symmetric random variables satisfying for all and . Then, for every and ,
Proof.
The first inequality is an immediate consequence of Lemma 2.14 (applied with , independent Weibull variables with shape parameter and scale parameter , and with the convex function ), Lemma 2.15, and Remark 2.16. The second inequality follows from
where in the last step we used the inequality between weighted arithmetic and geometric means. ∎
The next lemma is standard and provides us with several equivalent formulations of the property expressed through tail bounds, growth of moments, and the exponential moments, respectively. We provide a brief proof, since in the literature one usually finds versions for only.
Lemma 2.18.
Proof.
Property (i) implies (ii) by Lemma 2.14 (applied with , and an independent Weibull variable with parameter ) and Remark 2.16. Property (iii) implies (i) by Chebyshev’s inequality:
Assume now that (ii) holds and denote . Then, for every , we have and
while for , we have and, hence, property (ii) yields
Hence, by Stirling’s formula we have for ,
The next lemma states that a linear combination of independent random variables is a random variable.
Lemma 2.19.
Assume that , , and let be independent symmetric random variables satisfying for all . Then for every the random variable satisfies, for all ,
where , depend only on , , and .
Proof.
Lemma 2.20.
Assume that , , is a non-negative random variable such that for all , and is independent of . Then, for every ,
where .
Proof.
In the case we have and then almost surely and the assertion is trivial. Assume now that . By our assumptions . Let . Note that is equivalent to . Thus,
where we used and chose . ∎
Lemma 2.21.
Assume that , and that is a random variable satisfying for all . Let , , and be as in Lemma 2.20. Then there exist random variables and such that
Proof.
For we have , so , and thus . We use our assumptions, the inequality , and Lemma 2.20 to obtain for any ,
Consider the version of and the version of defined on the (common) probability space equipped with Lebesgue measure, constructed as the (generalised) inverses of cumulative distribution functions of and , respectively. Then , which implies the assertion. ∎
Lemma 2.22.
Let , and , and assume that , are random variables satisfying for all . Then
and
Proof.
By a union bound and the assumptions we get, for every ,
where we used in the last step. We integrate by parts, change the variables, and use the above bound to obtain the second part of the assertion, i.e.,
3. Proofs of the main results
After the preparation in the previous section, we shall now present the proofs of our main results.
3.1. General bound via Slepian’s lemma
In order to obtain Theorem 1.3 we first prove its weaker version, for and only. After that we shall use the polytope from Lemma 2.3 and Gaussian concentration to show how Proposition 3.1 implies the general bound. The proof of this proposition relies on symmetrization together with the contraction principle, which allow us to get rid of and , and on Slepian's lemma.
Proposition 3.1.
Assume that has i.i.d. standard Gaussian entries and , . Then
where the suprema are taken over all sets , such that , .
Proof.
Throughout the proof, and are fixed and the suprema are taken over all index sets satisfying , and , .
Let us denote by an independent copy of . Using the duality , centering the expression, noticing that is a Gaussian random variable with variance , and using Jensen’s inequality, we see that
(3.1) |
To estimate the expected value on the right-hand side, we use a symmetrization trick together with the contraction principle (Lemma 2.8). Let be a sequence of independent Rademacher random variables independent of all others. Since the random vectors
(where ) are independent and symmetric, has the same distribution as . Therefore,
(3.2) |
Applying (conditionally, with the values of ’s fixed) the contraction principle (i.e., Lemma 2.8) with the set
and the function (which is 1-Lipschitz and takes the value at the origin), we get
(3.3) |
By proceeding similarly as in (3.1), we obtain
(3.4) |
Observe that, using symmetrization and the contraction principle similarly as in (3.2) and (3.1), we can estimate the first summand on the right-hand side of (3.4) as follows,
(3.5) |
Altogether, the inequalities in (3.1) – (3.5) yield that
(3.6) |
We shall now estimate the first summand on the right-hand side of (3.6) using Slepian’s lemma (i.e., Lemma 2.9). Denote
where , are independent standard Gaussian variables. The random variables clearly have zero mean. Thus, we only need to calculate and compare and . In the calculations below it will be evident over which sets the index (resp. ) runs, so in order to shorten the notation and improve readability, we use the notational convention
By independence,
By independence and the inequality (valid for ),
Thus, we clearly have
(cf. Remark 3.2 below). Hence, by Slepian’s lemma (Lemma 2.9) and Lemma 2.10 on the expected maxima of standard Gaussian random variables,
Recalling the estimate (3.6), we arrive at
which completes the proof of Proposition 3.1. ∎
Remark 3.2.
In the above proof, we also have
Therefore, by Slepian’s lemma (Lemma 2.9) we may reverse the estimate from the proof as follows:
Proof of Theorem 1.3.
Recall that stands for the supremum taken over all sets , with , . Given such sets , , we introduce the sets
Then, by Lemma 2.3, and . Therefore,
(3.7) |
where we denoted
with the suprema here (and later on in this proof) being always taken over all sets and .
By Proposition 3.1, we only know that for all and ,
(3.8) |
but we shall use the Gaussian concentration and the union bound to obtain an estimate for
Note first that and , provided that , , , . Therefore,
and, similarly,
This together with the estimate in (3.1) gives
(3.9) |
Note that by the Cauchy–Schwarz inequality, the function
is -Lipschitz with
where in the last inequality we used the fact that and . In order to estimate the right-hand side of the latter inequality, we consider the following two cases:
Case 1. If , then and . Consequently,
(3.10) |
Case 2. If , then and . Thus,
(3.11) |
In both cases we have
so the Gaussian concentration inequality (see, e.g., [41, Chapter 5.1]) implies that for all , , and ,
so
This, together with the union bound, implies that for , we have
Hence, by Lemma 2.6 and the estimate in (3.1),
Recalling (3.1) yields the assertion. ∎
3.2. Coupling
In this subsection we use contraction principles and the coupling described in Lemma 2.21 to prove Corollaries 1.5 and 1.6, and Proposition 1.16. Below we state more general versions of the corollaries akin to Theorem 1.3 (the versions from the introduction follow by setting , ).
Theorem 3.3 (General version of Corollary 1.5).
Assume that , , , and has independent mean-zero entries taking values in . Then
where the suprema are taken over all sets , such that , .
Remark 3.4 (Symmetrization of entries of a random matrix).
Let be an independent copy of a random matrix with mean entries. Then for any norm , including the operator norm from to , we have by Jensen’s inequality
Therefore, in many cases we may simply assume that we deal with matrices with symmetric (not only mean ) entries. For example, in the setting of Theorem 3.3, the entries of are symmetric and take values in , so it suffices to prove the assertion of this theorem (with a two times smaller constant on the right-hand side) under the additional assumption that the entries of the given random matrix are symmetric.
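In symbols, the standard symmetrization estimate used here reads as follows (written in generic notation, not tied to the symbols above, for a random matrix $X$ with independent mean-zero entries and an independent copy $X'$):
\[
\mathbb{E}\|A\circ X\|
\;=\; \mathbb{E}\bigl\|A\circ X - \mathbb{E}[A\circ X']\bigr\|
\;\le\; \mathbb{E}\bigl\|A\circ (X-X')\bigr\|
\;\le\; 2\,\mathbb{E}\|A\circ X\|,
\]
where the first inequality is Jensen's inequality and the matrix $X-X'$ has independent symmetric entries.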
Proof of Theorem 3.3.
By Remark 3.4 we may and do assume that the entries of are symmetric — in this case we need to prove the assertion with a two times smaller constant.
Since the entries of are independent and symmetric, has the same distribution as , where is a random matrix with i.i.d. Rademacher entries, independent of all other random variables. Thus, the contraction principle (see Lemma 2.7) applied conditionally yields (below the suprema are taken over all sets , such that , , and over all , and the sums run over all and )
and the assertion follows from Theorem 1.3. ∎
Theorem 3.5 (General version of Corollary 1.6).
Assume that , , , , , and has independent mean-zero entries satisfying
Then
where the suprema are taken over all sets , such that , .
Proof.
Let be an independent copy of . Then
This means that the symmetric matrix satisfies the assumptions of Theorem 3.5. Hence, due to Remark 3.4, we may and do assume that the entries of are symmetric.
Take the unique positive parameter satisfying . For , , let be i.i.d. standard Gaussian variables, independent of all other variables, and let be i.i.d. non-negative Weibull random variables with shape parameter and scale parameter (i.e., for ), independent of all other variables. (In the case , we have and then almost surely.) Take
as in Lemma 2.21 (we pick a pair separately for every , and then take versions of the pairs such that the system of random pairs is independent).
Let be a random matrix with i.i.d. Rademacher entries, independent of all other random variables. Since the entries of are symmetric and independent, has the same distribution as . By Lemma 2.21 we know that
We use the contraction principle conditionally for , i.e., for ’s and ’s fixed. More precisely, we apply Lemma 2.7 to the space of all matrices with real coefficients, equipped with the norm
(where the first supremum is taken over all sets , such that , ; recall that the second supremum is taken over all sets as in the first supremum, and over all , and the sum runs over all and ); note that we identify with (and plays the role of from Lemma 2.7). We apply the contraction principle of Lemma 2.7 (conditionally, with the values of ’s and ’s fixed) with coefficients and points to get
(3.12) |
We may estimate the first term using Theorem 3.3 applied to the matrix as follows,
(3.13) |
Recall that and that almost surely. Next we again use the contraction principle (applied conditionally for , i.e. for fixed ’s and ’s) and get
(3.14) |
Moreover, Theorem 1.3 and Lemma 2.22 (applied with , , , and ), imply
(3.15) |
Combining the estimates (3.12)–(3.15) yields the assertion. ∎
Finally, we prove that these estimates of the operator norms translate into tail bounds.
Proof of Proposition 1.16.
Since (1.23) implies (1.24) (by Lemma 2.18), it suffices to prove inequality (1.23). By a symmetrization argument similar to the one from the first paragraph of the proof of Theorem 3.5, we may and will assume that has independent and symmetric entries satisfying (1.21). By assumption (1.21) and the inequality , we have, for every ,
so (as in the proof of Lemma 2.21) there exists a random matrix with i.i.d. entries with the symmetric Weibull distribution with shape parameter and scale parameter (i.e., for ) satisfying
(3.16) |
Let be a matrix of independent Rademacher random variables independent of all others, and let denote the operator norm from to . Let be a matrix with at the intersection of th row and th column and with other entries . The contraction principle (i.e., Lemma 2.7) applied conditionally, (3.16), and the triangle inequality yield for any ,
Therefore, it suffices to prove (1.23) for random matrices and instead of .
Since by assumption , both random matrices and satisfy (1.21), so for them inequality (1.22) holds. By the comparison of weak and strong moments [38, Theorem 1.1] (note that the random variables satisfy the assumption for all with by [38, Remark 1.5]), we have
(3.17) |
Because of inequality (1.22), the first summand on the right-hand side may be estimated by . Lemma 2.19 and the implication iii from Lemma 2.18 yield
Moreover, by (3.10) and (3.11) (used with and ) and our assumption that ,
so the second summand on the right-hand side of (3.17) is bounded above (up to a multiplicative constant depending only on , , and ) by . Thus, (1.23) indeed holds for the random matrix instead of . A similar reasoning shows that the same inequality holds also for the random matrix (one may also simply use the Khintchine–Kahane inequality and assumption (1.22)). ∎
4. Proofs of further results
4.1. Gaussian random variables
Proof of Proposition 1.7.
Fix and . Let be the set defined in Lemma 2.3 for which . Then
(4.1) |
where is the set of extreme points of . We shall now estimate the expected value of the right-hand side of (4.1).
To this end, we first consider a fixed . Then there exists a non-empty index set of cardinality such that for and for . We have
(4.2) |
Let us estimate the Lipschitz constant of the function
(4.3) |
It follows from the Cauchy–Schwarz inequality (used in ) that
(4.4) |
where we put
This shows that the function defined by (4.3) is -Lipschitz continuous. Therefore, by the Gaussian concentration inequality (see, e.g., [41, Chapter 5.1]), for any ,
(4.5) |
We shall transform this inequality into a form which is more convenient to work with. We want to estimate independently of and get rid of the dependence on and on the right-hand side. By (4.2) and the fact that , we obtain
We use the definition of , then interchange the sums, use the triangle inequality, and then the inequality between the arithmetic mean and the power mean of order (recall that and ) to obtain
(4.6) |
The two inequalities above, together with inequality (4.5) (applied with ), imply that
(4.7)
holds for any and all with support of cardinality .
We now turn to the special case .
Proof of Proposition 1.8.
Since the first part of this proof works for general , we do not restrict our attention to for now. First of all,
where is the -th row of the matrix . Centering this expression gives
(4.8)
We first take care of the second term on the right-hand side of (4.8). We have
(4.9) |
In order to deal with the first term on the right-hand side of (4.8), we use a symmetrization trick together with the contraction principle. The latter is the reason that we need to work with here. We start with the symmetrization. Denoting by independent copies of and by a sequence of Rademacher random variables independent of all others, we obtain by Jensen’s and the triangle inequalities that
(4.10) |
If , we may use the contraction principle (i.e., Lemma 2.8 applied with functions ) conditionally to obtain
(4.11) |
For , we have
(4.12) |
Moreover, we have
(4.13) |
Inequalities (4.10)–(4.13) give the estimate of the first term on the right-hand side of (4.8). This ends the proof of the upper bound for .
Now we deal with another special case, the one where .
Proof of Proposition 1.10.
Recall that we deal with the range . Using the structure of extreme points of we get
Denote . By well-known tail estimates of norms of Gaussian variables with values in Banach spaces (see, e.g., [36, Corollary 1] for a more general formulation) we get for all ,
(4.15)
(4.16) |
where are universal positive constants, and
Inequality (4.15) shows in particular that the random variables satisfy
for all , thus by Lemma 2.11 we get
which together with the observation (following from Lemma 2.1 and the fact that ) that
proves the upper estimate of the proposition.
Using comparison of moments of norms of Gaussian random vectors, we also get
(4.17) |
so to end the proof it is enough to show that
(4.18) |
This will follow by a straightforward adaptation of the argument from the proof of Lemma 2.12. We may and do assume that the sequence is non-increasing in . By (4.16) we have for any and ,
Thus, since for all , we have for any ,
Thus,
Taking maximum over gives (4.18) and ends the proof. ∎
4.2. Bounded random variables
Here we show how one can adapt the methods of [9] to prove Proposition 1.14, i.e., a version of Corollary 1.13 in the special case of bounded random variables with better logarithmic terms and with explicit numerical constants. Following [9], we start with a lemma.
Lemma 4.1.
Assume that is as in Proposition 1.14. Let and suppose that is such that almost surely. Then, for all and ,
(4.19) |
where .
Proof.
Without loss of generality we may and do assume that .
Since , for and we have . Thus, integration by parts, our assumption a.s., and Hoeffding’s inequality (i.e., Lemma 2.13) yield
Proof of Proposition 1.14.
We start with several reductions. Set
(The equalities follow from Lemma 2.1, since and .) Let be the set defined in Lemma 2.3, so that . Then
(4.20) |
where is the set of extreme points of .
Consider first a fixed . We have
(4.21) |
Denote
Then, by the boundedness of and by Hölder’s inequality, for every ,
We can now apply, for every , Lemma 4.1 (with and as above and with coefficients ). Since the random variables , , are independent, using Lemma 4.1 yields
where in the last step we used the definition of (and the fact that ). By Chebyshev’s inequality and (4.21), we have, for every ,
Combining this with the previous estimate yields, for every ,
Recall that . Thus, there exists an index set of cardinality , such that for and for . We use the definition of and the inequality between the arithmetic mean and the power mean of order (recall that and ) to get
Putting everything together, we obtain
(4.22) |
for all and all with support of cardinality .
Remark 4.2.
In the unstructured case, for which are independent, mean-zero, and take values in , it is easy to extend (1.2) to the whole range of (see [8, 13]). Indeed, for and ,
Thus, for and ,
Suppose now that and (i.e., ). Choose and so that and , i.e., and . Using the Riesz–Thorin interpolation theorem, the fact that (since the entries take values in ), and Jensen’s inequality, we arrive at
The estimates in the remaining ranges of follow by duality (1.12). Moreover, up to constants, all these estimates are optimal, as they can be reversed for matrices with entries (see [8, Proposition 3.2] or [13, Satz 2]).
4.3. random variables
In this section, we prove Theorem 1.15. To this end we shall split the matrix into two parts and such that all entries of are bounded by . Then, we shall deal with using the following crude bound and the fact that the probability that is very small. In order to bound the expectation of the norm of we need a cut-off version of Theorem 1.15 – see Lemma 4.4 below.
Lemma 4.3.
Let . Assume that satisfies the assumptions of Theorem 1.15. Then
Proof.
By a standard volumetric estimate (see, e.g., [64, Corollary 4.2.13]), we know that there exists (in the metric ) a -net in of size at most . In other words, for any there exists such that . Thus, for any ,
Hence,
(4.23) |
Likewise, if we denote by the -net in (in the metric ) of size at most , then
(4.24) |
Combining these two estimates, we see that
(4.25)
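For orientation, the standard two-net reduction behind estimates of this type is the following (written generically; the specific balls, metrics, and constants used in the proof above may differ):
\[
\|B\| \;\le\; \frac{1}{1-\varepsilon}\,\max_{x\in\mathcal N}\|Bx\|
\;\le\; \frac{1}{(1-\varepsilon)(1-\varepsilon')}\,\max_{x\in\mathcal N,\;y\in\mathcal M}\langle y, Bx\rangle ,
\]
where $\mathcal N$ is an $\varepsilon$-net of the unit ball of the source space and $\mathcal M$ an $\varepsilon'$-net of the unit ball of the dual of the target space (each in its own norm). For $\varepsilon=\varepsilon'=\tfrac12$ this gives $\|B\|\le 4\max_{x\in\mathcal N,\,y\in\mathcal M}\langle y,Bx\rangle$, and a volumetric argument provides such nets of cardinality at most $5^{d}$ in dimension $d$.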
Lemma 4.4.
Let and . Assume is a random matrix with independent symmetric entries taking values in and satisfying the condition
(4.27) |
Then, for and , we have
Proof.
Fix and . Let be the set defined in Lemma 2.3 so that . Then
(4.28) |
where is the set of extreme points of . We shall now estimate the expected value of the right-hand side of (4.28).
To this end, we consider a fixed . This means that there exists a non-empty index set of cardinality such that for and for . We know from (4.4) that the Lipschitz constant of the convex function
is less than or equal to
Thus, Talagrand’s concentration inequality for convex functions of random vectors with independent bounded coordinates (see [56, Theorem 6.6 and Equation (6.18)]), together with the inequality , implies
(4.29) |
Similarly to the proof in the Gaussian case (i.e., the proof of Proposition 1.7), we shall transform this into a more convenient form by getting rid of and estimating . Let us denote, for each ,
From our assumption (4.27) as well as Lemmas 2.19 and 2.18, we obtain that . Hence,
From (4.1), we see that
The above two inequalities together with estimate (4.29) (applied with ), imply that
(4.30) |
for every and any with support of cardinality .
Proof of Theorem 1.15.
By a symmetrization argument (as in the first paragraph of the proof of Theorem 3.5), we may and do assume that all the entries are symmetric. Set . Denote and let be the matrix with entries . We have
The random matrix satisfies the assumptions of Lemma 4.4. Thus, the first summand above can be estimated as follows:
For the second summand we write, using the Cauchy–Schwarz inequality and then Lemma 4.3 and Lemma 2.22 (with and ; recall that ),
Combining the above three inequalities ends the proof. ∎
5. Lower bounds and further discussion of conjectures
5.1. Lower bounds
Let us first provide lower bounds showing that the upper bounds obtained above are indeed sharp (up to logarithms).
Proposition 5.1.
Let be a random matrix with independent mean-zero entries satisfying for some . Then, for all ,
Using duality (1.12) we immediately obtain the following corollary.
Corollary 5.2.
Let be as in Proposition 5.1. Then, for all ,
Proof of Proposition 5.1.
Let denote the operator norm from to . For and , let us denote by the matrix with entry at the intersection of th row and th column and with all other entries . By the symmetrization trick described in Remark 3.4, it suffices to consider matrices with symmetric entries and prove the assertion with a twice better constant (note that, also by Remark 3.4, the lower bound for the absolute first moment of the symmetrized entries does not change and is still equal to ).
If has symmetric independent entries, it has the same distribution as , where , , , are i.i.d. Rademacher random variables, independent of all other random variables. Hence, by Jensen’s inequality and the contraction principle (Lemma 2.7 applied with and ), we get
(5.1) |
Thus, it suffices to estimate from below .
Since , it suffices to prove the following proposition in order to provide the lower bound in Conjecture 1.
Proposition 5.3.
For the Gaussian matrix , we have
(5.2) |
where and .
5.2. The proof of Inequalities (1.13) and (1.11)
Let us now show that in the case , the third term on the right-hand side in Conjecture 1 is not needed. To this end it suffices to prove (1.13) only in the case , since the case follows by duality (1.12).
Proposition 5.4.
Whenever and , we have
(5.3) |
where .
Proof.
Since the right-hand side of (5.3) does not depend on , and the left-hand side is non-decreasing with , we may consider only the case . By permuting the columns of we may and do assume without loss of generality that the sequence is non-increasing.
Fix . Let be the midpoint of the non-empty interval . Take with . Since , we have
so . Therefore, the inequality and the facts that for all , and that imply
Taking the maximum over all completes the proof. ∎
Now we turn to the proof of (1.11). Note that it suffices to prove only the first two-sided inequality in (1.11), since the second one follows from it by duality (1.12).
Proposition 5.5.
For all , we have
(5.4) |
where the matrix is obtained by permuting the columns of the matrix in such a way that .
Proof.
By permuting the columns of the matrix , we can assume that the sequence is non-increasing. We have
(5.5) |
The function is -Lipschitz with respect to the Euclidean norm on , so by Gaussian concentration (see, e.g., [41, Chapter 5.1]),
for all , . Thus, Lemma 2.11 and inequality (5.5) imply
(5.6) |
We have
which, together with (5.6), provides the asserted upper bound.
On the other hand, if denotes the non-increasing rearrangement of the sequence of all absolute values of entries of , then Lemma 2.12 implies
which provides the asserted lower bound. ∎
Note that the above proof shows in fact that
so
(5.7) |
where the matrix is obtained by permuting the rows of the matrix in such a way that .
5.3. Counterexample to a seemingly natural conjecture
In this subsection we provide an example showing that for any the bound
(5.8) |
cannot hold. By duality (1.12), it also cannot hold for any . This explains why Conjecture 1 cannot be simplified to a form like the right-hand side of (1.8).
Let , , and let be matrices with all entries equal to one. Consider a block matrix
of size , with blocks on the diagonal and with all other entries equal to .
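The counterexample matrix can be assembled as follows; this is a minimal sketch, and the block size and the number of blocks are hypothetical placeholders for the parameters chosen in the actual example.

```python
import numpy as np

def diagonal_ones_blocks(k, b):
    # A (k*b) x (k*b) block-diagonal matrix with k all-ones blocks of size b x b
    # on the diagonal and zeros elsewhere: the Kronecker product of I_k with
    # the all-ones block.
    return np.kron(np.eye(k), np.ones((b, b)))

A = diagonal_ones_blocks(k=4, b=3)
print(A.shape, int(A.sum()))   # (12, 12), containing 4 * 3 * 3 ones
```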
5.4. Discussion of another natural conjecture
In this subsection we prove all the assertions of Remark 1.1. We begin by showing that for every ,
(5.11) |
and, in the case ,
(5.12) |
where , , and . In other words, (5.11) shows that Conjecture 1 is equivalent to (1.15) as long as .
Proof of (5.11) and (5.12).
Fix and let for . For we have . Thus is Lipschitz continuous with constant equal to
Therefore, the Gaussian concentration inequality (see, e.g., [41, Chapter 5.1]) implies that for every and every ,
so by Lemma 2.11 we get
(5.13) |
where the matrix is obtained by permuting the rows of the matrix in such a way that .
Moreover, by Jensen’s inequality,
This together with the triangle inequality and (5.13) implies
and, by duality,
where , and the matrix is obtained by permuting the columns of the matrix in such a way that . This, together with Lemma 2.1 and (5.4), yields in the case ,
which implies the lower bound of (5.11). In the case we additionally use (5.7) and the simple observation that
to get (5.12).
Next, for every pair which does not satisfy the condition we shall give examples of , and matrices , for which
(5.14) |
when . This shows that the natural conjecture (1.15) is wrong outside the range . The case , when (1.15) is valid (cf. (1.4)), is in a sense a boundary case, for which (1.15) (i.e., a natural generalization of (1.4)) may hold.
Example 5.6 (for (5.14) in the case .).
5.5. Infinite dimensional Gaussian operators
In this subsection we prove Proposition 1.2 concerning infinite dimensional Gaussian operators. It allows us to see that Conjecture 1 implies Conjecture 2.
Proof of Proposition 1.2.
We adapt the proof of [40, Corollary 1.2] to prove Proposition 1.2 in the case ; the remaining cases may be proven similarly. Fix for which (1.14) holds and a deterministic infinite matrix . Using the monotone convergence theorem one can show that a matrix defines a bounded operator between and if and only if . Interpreting as infinity for matrices which do not define a bounded operator, we have
and similarly
and
Therefore, (1.14) implies the following: if and only if , , and . It thus suffices to prove the following claim: almost surely if and only if .
If , then , so .
Acknowledgments
R. Adamczak is partially supported by the National Science Center, Poland, via the Sonata Bis grant no. 2015/18/E/ST1/00214. R. Adamczak was partially supported by the WTZ Grant PL 06/2018 of the OeAD. J. Prochno and M. Strzelecka are — and M. Strzelecki was — supported by the Austrian Science Fund (FWF) Project P32405 Asymptotic Geometric Analysis and Applications. M. Strzelecka was partially supported by the National Science Center, Poland, via the Maestro grant no. 2015/18/A/ST1/00553.
References
- [1] D. Achlioptas and F. Mcsherry, Fast computation of low-rank matrix approximations, J. ACM 54 (2007), no. 2, 9–es.
- [2] R. Adamczak, R. Latała, A. E. Litvak, A. Pajor, and N. Tomczak-Jaegermann, Chevet type inequality and norms of submatrices, Studia Math. 210 (2012), no. 1, 35–56. MR 2949869
- [3] R. Adamczak, R. Latała, Z. Puchała, and K. Życzkowski, Asymptotic entropic uncertainty relations, J. Math. Phys. 57 (2016), no. 3, 032204, 24. MR 3478525
- [4] N. Ailon and B. Chazelle, The fast Johnson-Lindenstrauss transform and approximate nearest neighbors, SIAM J. Comput. 39 (2009), no. 1, 302–322. MR 2506527
- [5] G. Akemann, J. Baik, and P. Di Francesco (eds.), The Oxford handbook of random matrix theory, Oxford University Press, Oxford, 2015.
- [6] G. W. Anderson, A. Guionnet, and O. Zeitouni, An introduction to random matrices, Cambridge Studies in Advanced Mathematics, vol. 118, Cambridge University Press, Cambridge, 2010. MR 2760897
- [7] A. S. Bandeira and R. van Handel, Sharp nonasymptotic bounds on the norm of random matrices with independent entries, Ann. Probab. 44 (2016), no. 4, 2479–2506. MR 3531673
- [8] G. Bennett, Schur multipliers, Duke Math. J. 44 (1977), no. 3, 603–639. MR 493490
- [9] G. Bennett, V. Goodman, and C. M. Newman, Norms of random matrices, Pacific J. Math. 59 (1975), no. 2, 359–365. MR 393085
- [10] Y. Benyamini and Y. Gordon, Random factorization of operators between Banach spaces, J. Analyse Math. 39 (1981), 45–74. MR 632456
- [11] S. Boucheron, G. Lugosi, and P. Massart, Concentration inequalities, Oxford University Press, Oxford, 2013, A nonasymptotic theory of independence, With a foreword by Michel Ledoux. MR 3185193
- [12] J. Bourgain, S. Dirksen, and J. Nelson, Toward a unified theory of sparse dimensionality reduction in Euclidean space, Geom. Funct. Anal. 25 (2015), no. 4, 1009–1088. MR 3385629
- [13] B. Carl, B. Maurey, and J. Puhl, Grenzordnungen von absolut--summierenden Operatoren, Math. Nachr. 82 (1978), 205–218. MR 498116
- [14] D. Chafaï, O. Guédon, G. Lecué, and A. Pajor, Interactions between compressed sensing random matrices and high dimensional geometry, Panoramas et Synthèses [Panoramas and Syntheses], vol. 37, Société Mathématique de France, Paris, 2012. MR 3113826
- [15] K. R. Davidson and S. J. Szarek, Local operator theory, random matrices and Banach spaces, Handbook of the geometry of Banach spaces, Vol. I, North-Holland, Amsterdam, 2001, pp. 317–366. MR 1863696
- [16] S. Foucart and H. Rauhut, A mathematical introduction to compressive sensing, Applied and Numerical Harmonic Analysis, Birkhäuser/Springer, New York, 2013. MR 3100033
- [17] O. Friedland and P. Youssef, Approximating matrices and convex bodies, Int. Math. Res. Not. IMRN (2019), no. 8, 2519–2537. MR 3942169
- [18] E. D. Gluskin, Norms of random matrices and diameters of finite-dimensional sets, Mat. Sb. (N.S.) 120(162) (1983), no. 2, 180–189, 286. MR 687610
- [19] E. D. Gluskin and S. Kwapień, Tail and moment estimates for sums of independent random variables with logarithmically concave tails, Studia Math. 114 (1995), no. 3, 303–309. MR 1338834
- [20] H. H. Goldstine and J. von Neumann, Numerical inverting of matrices of high order. II, Proc. Amer. Math. Soc. 2 (1951), 188–202. MR 41539
- [21] Y. Gordon, Some inequalities for Gaussian processes and applications, Israel J. Math. 50 (1985), no. 4, 265–289. MR 800188
- [22] Y. Gordon, A. E. Litvak, C. Schütt, and E. M. Werner, Geometry of spaces between polytopes and related zonotopes, Bull. Sci. Math. 126 (2002), no. 9, 733–762. MR 1941083
- [23] O. Guédon, A. Hinrichs, A. E. Litvak, and J. Prochno, On the expectation of operator norms of random matrices, Geometric aspects of functional analysis, Lecture Notes in Math., vol. 2169, Springer, Cham, 2017, pp. 151–162. MR 3645120
- [24] O. Guédon, S. Mendelson, A. Pajor, and N. Tomczak-Jaegermann, Majorizing measures and proportional subsets of bounded orthonormal systems, Rev. Mat. Iberoam. 24 (2008), no. 3, 1075–1095. MR 2490210
- [25] O. Guédon and M. Rudelson, -moments of random vectors via majorizing measures, Adv. Math. 208 (2007), no. 2, 798–823. MR 2304336
- [26] U. Haagerup, The best constants in the Khintchine inequality, Studia Math. 70 (1981), no. 3, 231–283 (1982). MR 654838
- [27] A. Hinrichs, D. Krieg, E. Novak, J. Prochno, and M. Ullrich, On the power of random information, Multivariate Algorithms and Information-Based Complexity (F. J. Hickernell and P. Kritzer, eds.), De Gruyter, Berlin/Boston, 2020, pp. 43–64.
- [28] by same author, Random sections of ellipsoids and the power of random information, Trans. Amer. Math. Soc. 374 (2021), no. 12, 8691–8713. MR 4337926
- [29] A. Hinrichs, J. Prochno, and M. Sonnleitner, Random sections of -ellipsoids, optimal recovery and Gelfand numbers of diagonal operators, 2021.
- [30] A. Hinrichs, J. Prochno, and J. Vybíral, Gelfand numbers of embeddings of Schatten classes, Math. Ann. 380 (2021), no. 3-4, 1563–1593. MR 4297193
- [31] P. Hitczenko, S. J. Montgomery-Smith, and K. Oleszkiewicz, Moment inequalities for sums of certain independent symmetric random variables, Studia Math. 123 (1997), no. 1, 15–42. MR 1438303
- [32] W. Hoeffding, Probability inequalities for sums of bounded random variables, J. Amer. Statist. Assoc. 58 (1963), 13–30. MR 144363
- [33] D. Krieg and M. Ullrich, Function values are enough for -approximation, Found. Comput. Math. 21 (2021), no. 4, 1141–1151. MR 4298242
- [34] S. Kwapień, Decoupling inequalities for polynomial chaos, Ann. Probab. 15 (1987), no. 3, 1062–1071. MR 893914
- [35] H. J. Landau and L. A. Shepp, On the supremum of a Gaussian process, Sankhyā Ser. A 32 (1970), 369–378. MR 286167
- [36] R. Latała, Tail and moment estimates for sums of independent random vectors with logarithmically concave tails, Studia Math. 118 (1996), no. 3, 301–304. MR 1388035
- [37] by same author, Some estimates of norms of random matrices, Proc. Amer. Math. Soc. 133 (2005), no. 5, 1273–1282. MR 2111932
- [38] R. Latała and M. Strzelecka, Comparison of weak and strong moments for vectors with independent coordinates, Mathematika 64 (2018), no. 1, 211–229. MR 3778221
- [39] R. Latała and W. Świątkowski, Norms of randomized circulant matrices, Electron. J. Probab. 27 (2022), Paper No. 80, 23. MR 4441144
- [40] R. Latała, R. van Handel, and P. Youssef, The dimension-free structure of nonhomogeneous random matrices, Invent. Math. 214 (2018), no. 3, 1031–1080. MR 3878726
- [41] M. Ledoux, The concentration of measure phenomenon, Mathematical Surveys and Monographs, vol. 89, American Mathematical Society, Providence, RI, 2001. MR 1849347
- [42] by same author, Deviation inequalities on largest eigenvalues, Geometric aspects of functional analysis, Lecture Notes in Math., vol. 1910, Springer, Berlin, 2007, pp. 167–219. MR 2349607
- [43] M. Ledoux and M. Talagrand, Probability in Banach spaces, Ergebnisse der Mathematik und ihrer Grenzgebiete (3) [Results in Mathematics and Related Areas (3)], vol. 23, Springer-Verlag, Berlin, 1991, Isoperimetry and processes. MR 1102015
- [44] N. Linial, E. London, and Y. Rabinovich, The geometry of graphs and some of its algorithmic applications, Combinatorica 15 (1995), no. 2, 215–245. MR 1337355
- [45] D. Matlak, Oszacowania norm macierzy losowych [Estimates of norms of random matrices], Master’s thesis, Uniwersytet Warszawski, 2017.
- [46] A. Naor, O. Regev, and T. Vidick, Efficient rounding for the noncommutative Grothendieck inequality, Theory Comput. 10 (2014), 257–295. MR 3267842
- [47] Z. Puchała, Ł. Rudnicki, and K. Życzkowski, Majorization entropic uncertainty relations, J. Phys. A 46 (2013), no. 27, 272002, 12. MR 3081910
- [48] H. Rauhut, Compressive sensing and structured random matrices, Theoretical foundations and numerical methods for sparse recovery, Radon Ser. Comput. Appl. Math., vol. 9, Walter de Gruyter, Berlin, 2010, pp. 1–92. MR 2731597
- [49] S. Riemer and C. Schütt, On the expectation of the norm of random matrices with non-identically distributed entries, Electron. J. Probab. 18 (2013), no. 29, 13. MR 3035757
- [50] M. Rudelson and R. Vershynin, Sampling from large matrices: an approach through geometric functional analysis, J. ACM 54 (2007), no. 4, Art. 21, 19. MR 2351844
- [51] Y. Seginer, The expected norm of random matrices, Combin. Probab. Comput. 9 (2000), no. 2, 149–166. MR 1762786
- [52] D. Slepian, The one-sided barrier problem for Gaussian noise, Bell System Tech. J. 41 (1962), 463–501. MR 133183
- [53] A. M.-C. So, Moment inequalities for sums of random matrices and their applications in optimization, Math. Program. 130 (2011), no. 1, Ser. A, 125–151. MR 2853163
- [54] D. A. Spielman and S.-H. Teng, Nearly-linear time algorithms for graph partitioning, graph sparsification, and solving linear systems, Proceedings of the Thirty-Sixth Annual ACM Symposium on Theory of Computing (New York, NY, USA), STOC ’04, Association for Computing Machinery, 2004, pp. 81–90.
- [55] M. Strzelecka, Estimates of norms of log-concave random matrices with dependent entries, Electron. J. Probab. 24 (2019), Paper No. 107, 15. MR 4017125
- [56] M. Talagrand, A new look at independence, Ann. Probab. 24 (1996), no. 1, 1–34. MR 1387624
- [57] J. A. Tropp, Norms of random submatrices and sparse approximation, C. R. Math. Acad. Sci. Paris 346 (2008), no. 23-24, 1271–1274. MR 2473306
- [58] by same author, On the conditioning of random subdictionaries, Appl. Comput. Harmon. Anal. 25 (2008), no. 1, 1–24. MR 2419702
- [59] by same author, User-friendly tail bounds for sums of random matrices, Found. Comput. Math. 12 (2012), no. 4, 389–434.
- [60] by same author, An introduction to matrix concentration inequalities, Found. Trends Mach. Learn. 8 (2015), no. 1-2, 1–230.
- [61] R. van Handel, On the spectral norm of Gaussian random matrices, Trans. Amer. Math. Soc. 369 (2017), no. 11, 8161–8178. MR 3695857
- [62] by same author, Structured random matrices, Convexity and concentration, IMA Vol. Math. Appl., vol. 161, Springer, New York, 2017, pp. 107–156. MR 3837269
- [63] R. Vershynin, Introduction to the non-asymptotic analysis of random matrices, Compressed sensing, Cambridge Univ. Press, Cambridge, 2012, pp. 210–268. MR 2963170
- [64] by same author, High-dimensional probability, Cambridge Series in Statistical and Probabilistic Mathematics, vol. 47, Cambridge University Press, Cambridge, 2018, An introduction with applications in data science, With a foreword by Sara van de Geer. MR 3837109
- [65] J. von Neumann and H. H. Goldstine, Numerical inverting of matrices of high order, Bull. Amer. Math. Soc. 53 (1947), no. 11, 1021–1099.
- [66] E. P. Wigner, Characteristic vectors of bordered matrices with infinite dimensions, Ann. of Math. (2) 62 (1955), 548–564. MR 77805
- [67] by same author, Characteristic vectors of bordered matrices with infinite dimensions. II, Ann. of Math. (2) 65 (1957), 203–207. MR 83848
- [68] by same author, On the distribution of the roots of certain symmetric matrices, Ann. of Math. (2) 67 (1958), 325–327. MR 95527
- [69] J. Wishart, The generalised product moment distribution in samples from a normal multivariate population, Biometrika 20A (1928), no. 1/2, 32–52.