This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

A Sharp Fourier Inequality
and the Epanechnikov Kernel

Sean Richardson Sean Richardson, Department of Mathematics, University of Washington, Seattle, WA [email protected]
Abstract.

We consider functions f:f:\mathbb{Z}\to\mathbb{R} and kernels u:{n,,n}u:\{-n,\cdots,n\}\to\mathbb{R} normalized by =nnu()=1\sum_{\ell=-n}^{n}u(\ell)=1, making the convolution ufu\ast f a “smoother” local average of ff. We identify which choice of uu most effectively smooths the second derivative in the following sense. For each uu, basic Fourier analysis implies there is a constant C(u)C(u) so Δ(uf)2()C(u)f2()\|\Delta(u\ast f)\|_{\ell^{2}(\mathbb{Z})}\leq C(u)\|f\|_{\ell^{2}(\mathbb{Z})} for all f:f:\mathbb{Z}\to\mathbb{R}. By compactness, there is some uu that minimizes C(u)C(u) and in this paper, we find explicit expressions for both this minimal C(u)C(u) and the minimizing kernel uu for every nn. The minimizing kernel is remarkably close to the Epanechnikov kernel in Statistics. This solves a problem of Kravitz-Steinerberger and an extremal problem for polynomials is solved as a byproduct.

1. Introduction

We are interested in the study of functions f:f:\mathbb{Z}\rightarrow\mathbb{R}. These functions appear naturally in many applications (for example as “time series”) and a natural problem that arises frequently is to take local averages at a fixed, given scale. A popular way is to fix a kernel u:{n,n+1,,n1,n}u:\{-n,-n+1,\cdots,n-1,n\}\to\mathbb{R} and consider the convolution

(uf)(k)==nnu()f(k).(u\ast f)(k)=\sum_{\ell=-n}^{n}u(\ell)f(k-\ell).

A natural question is now which kernel u:{n,,n}u:\{-n,\cdots,n\}\to\mathbb{R} one should pick. We will always assume that the kernel is normalized =nnu()=1\sum_{\ell=-n}^{n}u(\ell)=1 so that the convolution ufu\ast f is indeed a local average. There is no “right choice” of a kernel uu as different choices of weights are optimal in different ways. For example, in image processing theory [16] one is interested in kernels uu so that for any ff, the convolution ufu\ast f has fewer local extrema than ff; this property together with the other “scale-space axioms” uniquely characterizes the Gaussian kernel [1, 12, 23]. In kernel density estimation, one is interested in a kernel that minimizes least-squared error, which uniquely characterizes the Epanechnikov kernel [7]. In this paper, we build on the work of Kravitz and Steinerberger [15] and take the approach of asking that the convolution ufu\ast f be as smooth as possible, which we show uniquely characterizes yet another kernel in the second derivative case.

To make this precise, first define the discrete derivative Df:Df:\mathbb{Z}\rightarrow\mathbb{R} by Df(k)=f(k+1)f(k)Df(k)=f(k+1)-f(k) and define higher-order derivatives inductively by Dmf=D(Dm1f)D^{m}f=D(D^{m-1}f). Then it follows from basic Fourier analysis (see Section 3.3) that for every kernel u:{n,,n}u:\{-n,\cdots,n\}\to\mathbb{R} and every mm\in\mathbb{N}, there exists a constant Cm(u)<C_{m}(u)<\infty so that

f2(),Dm(uf)2()Cm(u)f2().\forall~{}f\in\ell^{2}(\mathbb{Z}),\qquad\|D^{m}(u\ast f)\|_{\ell^{2}(\mathbb{Z})}\leq C_{m}(u)\|f\|_{\ell^{2}(\mathbb{Z})}.

We can now ask a natural question.

Question. Given a positive integer mm, how small can Cm(u)C_{m}(u) be and which convolution kernels uu attain the optimal constant?

Such a kernel would then be the “canonical” kernel producing the smallest mmth derivatives and is a natural candidate for use in practice. The problem has been solved by Kravitz and Steinerberger when m=1m=1.

Theorem (Kravitz-Steinerberger [15]).

For any normalized u:{n,,n}u:\{-n,\cdots,n\}\to\mathbb{R},

C1(u)22n+1C_{1}(u)\geq\frac{2}{2n+1}

with equality if and only if u(k)=1/(2n+1)u(k)=1/(2n+1) is the constant kernel.

That is, averaging by convolving with the characteristic function of an interval best minimizes the first derivative. Kravitz and Steinerberger also studied the m=2m=2 case under the assumption uu has non-negative Fourier transform.

Theorem (Kravitz-Steinerberger [15]).

For any normalized u:{n,,n}u:\{-n,\cdots,n\}\to\mathbb{R} with nonnegative Fourier transform,

C2(u)4(n+1)2C_{2}(u)\geq\frac{4}{(n+1)^{2}}

with equality if and only if uu is the triangle function u(k)=(n+1|k|)/(n+1)2.u(k)=(n+1-|k|)/(n+1)^{2}.

The main result of this paper resolves the m=2m=2 case without additional assumptions on the kernel, providing the optimal constant and optimal kernel for all nn.

Theorem (Main Result).

For any normalized u:{n,,n},u:\{-n,\cdots,n\}\to\mathbb{R},

C2(u)4n+1sin(π2n+2)1+cos(π2n+2)C_{2}(u)\geq\frac{4}{n+1}\frac{\sin(\frac{\pi}{2n+2})}{1+\cos(\frac{\pi}{2n+2})}

with equality if and only if uu is given by the kernel in (3).

A quick computation shows the optimal kernels un:{n,,n}u_{n}:\{-n,\cdots,n\}\to\mathbb{R} satisfy, as nn\to\infty, the following asymptotic equivalence; notice the asymptotic improvement from 44 to π\pi when removing the nonnegative Fourier transform restriction:

(1) C2(un)π(n+1)2.C_{2}(u_{n})\sim\frac{\pi}{(n+1)^{2}}.

The optimal kernel unu_{n} for each nn is given as an integral expression by (3) in the following section, and Figure 1 pictures the optimal kernels u10u_{10} and u1000u_{1000}. Figure 2 depicts a time series function f:f:\mathbb{Z}\to\mathbb{R} encoding the water level of Lake Chelan over a two week period as well as the smoothed data u10fu_{10}\ast f. Notice this smoothing reduces noise and clarifies long-term trends.

Refer to caption
(a) u10(k)u_{10}(k)
Refer to caption
(b) u1000(k)u_{1000}(k)
Figure 1. The optimal kernel un(k)u_{n}(k)
Refer to caption
(a) Original data ff
Refer to caption
(b) Smoothed data u10fu_{10}\ast f
Figure 2. Water level of Lake Chelan over time [21]

As seen in Figure 1, the optimal kernel resembles a parabola, but it turns out the points do not quite lie on any parabola. However, choosing weights by sampling from a parabola results in the discrete Epanechnikov kernel EnE_{n} for each nn, which is a simple and effective approximation of the optimal kernel. Indeed, we show the parabolic Epanechnikov kernel has constant C2(En)C_{2}(E_{n}) within 2%2\% of the optimal constant C2(un)C_{2}(u_{n}) for large nn, providing a new reason to use the Epanechnikov kernel.

2. Results

2.1. A Sharp Fourier Inequality

We can rewrite the main result discussed in the Introduction as the following Fourier inequality. As usual, we only consider kernels u:{n,n+1,,n1,n}u:\{-n,-n+1,\cdots,n-1,n\}\to\mathbb{R} normalized so that =nnu()=1\sum_{\ell=-n}^{n}u(\ell)=1.

Theorem 1 (Main Result, restated).

For any normalized u:{n,,n}u:\left\{-n,\dots,n\right\}\rightarrow\mathbb{R},

(2) sup0f2()Δ(uf)2()f2()4n+1sin(π2n+2)1+cos(π2n+2).\sup_{0\neq f\in\ell^{2}(\mathbb{Z})}\frac{\|\Delta(u\ast f)\|_{\ell^{2}(\mathbb{Z})}}{\|f\|_{\ell^{2}(\mathbb{Z})}}\geq\frac{4}{n+1}\cdot\frac{\sin\left(\frac{\pi}{2n+2}\right)}{1+\cos\left(\frac{\pi}{2n+2}\right)}.

With equality if and only if u(k)u(k) is as in (3).

Note the discrete Laplacian (Δf)(k)f(k+2)2f(k+1)+f(k)(\Delta f)(k)\coloneqq f(k+2)-2f(k+1)+f(k) is precisely the discrete second derivative (D2f)(k)(D^{2}f)(k) defined in the Introduction. The optimal un:{n,,n}u_{n}:\{-n,\cdots,n\}\to\mathbb{R} that yields equality in Theorem 1 for any nn can be written explicitly by defining unu_{n} to be symmetric un(k)=un(k)u_{n}(k)=u_{n}(-k), then for k0k\geq 0 setting

(3) un(k)=1π11Sn(x)Tk(x)dx1x2u_{n}(k)=\frac{1}{\pi}\int_{-1}^{1}S_{n}(x)T_{k}(x)\frac{dx}{\sqrt{1-x^{2}}}

where Tk(x)T_{k}(x) is the kkth Chebyshev polynomial (as defined in Section 3.1), and

(4) Sn1(x)=1x12sin(π2n)n(1+cos(π2n))Tn(1+cos(π2n)2(x+1)1).S_{n-1}(x)=\frac{1}{x-1}\cdot\frac{2\sin\left(\frac{\pi}{2n}\right)}{n\left(1+\cos\left(\frac{\pi}{2n}\right)\right)}\cdot T_{n}\left(\frac{1+\cos\left(\frac{\pi}{2n}\right)}{2}(x+1)-1\right).

This result extends the work of Kravitz and Steinerberger [15] and adds to the recent research activity on sharp Fourier inequalities. For example, there is current research on sharp Fourier restriction and extension inequalities [5, 4, 17, 19, 20], on sharp Strichartz inequalities [8, 11], on sharp Hausdorff-Young inequalities [13, 14], and other Fourier inequalities [2, 3, 6, 9]

2.2. The Epanechnikov Kernel

The optimal kernel unu_{n} has a complicated expression as given in (3). However, as seen in Figure 1, this optimal unu_{n} resembles a parabola, and conversely we find that choosing weights by sampling from a parabola does extraordinarily well in smoothing the Laplacian of a given function. This choice of weights yields the discrete normalized Epanechnikov kernel En:{n,,n}E_{n}:\{-n,\cdots,n\}\to\mathbb{R} defined by

(5) En(k)=3n(4n21)(n2k2).E_{n}(k)=\frac{3}{n(4n^{2}-1)}\left(n^{2}-k^{2}\right).

The Epanechnikov kernel is widely used [10, 18, 22] in both theory and applications. This popularity stems from its computational efficiency and from Epanechnikov’s proof [7] that it is the optimal kernel to use in kernel density estimation (KDE) in terms of minimizing expected mean integrated square error. The following theorem reveals the Epanechnikov kernel is less than 2% worse than optimal in smoothing the Laplacian, providing another reason to use the Epanechnikov kernel in practice.

Theorem 2.

Let μ\mu be as in (6). Then as nn\to\infty we get the asymptotic equivalence

sup0f2()Δ(Enf)2()f2()3μππn2.\sup_{0\neq f\in\ell^{2}(\mathbb{Z})}\frac{\|\Delta(E_{n}\ast f)\|_{\ell^{2}(\mathbb{Z})}}{\|f\|_{\ell^{2}(\mathbb{Z})}}\sim\frac{3\mu}{\pi}\cdot\frac{\pi}{n^{2}}.

The constant 3μ/π3\mu/\pi in the above theorem is a universal constant defined by

(6) 3μπ3πmaxα[0,16]|sinααcosα|1.015.\frac{3\mu}{\pi}\coloneqq\frac{3}{\pi}\max_{\alpha\in[0,16]}\left|\frac{\sin\alpha}{\alpha}-\cos{\alpha}\right|\approx 1.015.

Comparing Theorem 1, whose asymptotics are given by (1), and the statement of Theorem 2 reveals the asymptotics of C2(un)C_{2}(u_{n}) and C2(En)C_{2}(E_{n}) differ only by a factor of 3μ/π1.0153\mu/\pi\approx 1.015 as nn\to\infty. Hence the Epanechnikov kernel performs less than 2%2\% worse than optimal asymptotically.

2.3. A Sharp Polynomial Inequality

After taking a Fourier transform (see Section 3.3), Theorem 1 reduces to the following claim about polynomials, which is interesting in its own right.

Theorem 3.

Let p(x)p(x) be a polynomial of degree at most n1n-1 with p(1)=1p(1)=1. Then

maxx[1,1]|(1x)p(x)|2sin(π2n)n(1+cos(π2n))\max_{x\in[-1,1]}|(1-x)p(x)|\geq\frac{2\sin\left(\frac{\pi}{2n}\right)}{n\left(1+\cos\left(\frac{\pi}{2n}\right)\right)}

with equality if and only if p(x)=Sn1(x)p(x)=S_{n-1}(x) where Sn1(x)S_{n-1}(x) is given in (4).

Refer to caption
Optimal polynomial S5(x)S_{5}(x)
Refer to caption
Polynomial (1x)S5(x)(1-x)S_{5}(x)
Figure 3. Optimal Polynomial and Product

3. Proofs

3.1. A Sharp Polynomial Inequality

This section proves Theorem 3, which is the key to the proof of Theorem 1 and the formula for the optimal kernel given in (3). This proof makes heavy use Chebyshev polynomials, so we first recall some facts about Chebyshev polynomials for the convenience of the reader.

Chebyshev polynomials are a family {Tn(x)}\{T_{n}(x)\} of polynomials defined by declaring T0(x)=1T_{0}(x)=1 and T1(x)=xT_{1}(x)=x, then defining the rest with the recurrence

Tn+1(x)=2xTn(x)Tn1(x).T_{n+1}(x)=2xT_{n}(x)-T_{n-1}(x).

A quick induction argument reveals Chebyshev polynomials also satisfy

(7) Tn(cosθ)=cos(nθ).T_{n}(\cos\theta)=\cos(n\theta).

The above relation is responsible for many nice properties of Chebyshev polynomials and so we typically consider Chebyshev polynomials Tn(x)T_{n}(x) over the domain [1,1][-1,1] where this formula can apply. This relation also reveals the zeroes of the Chebyshev polynomial Tn(x)T_{n}(x) are located at cos(πn(k+12))\cos\left(\frac{\pi}{n}(k+\frac{1}{2})\right) for kk\in\mathbb{Z}. By taking the derivative of both sides of (7), we find that the derivative Tn(x)T_{n}^{\prime}(x) of Chebyshev polynomials over [1,1][-1,1] can be written Tn(x)=nUn1(x)T_{n}^{\prime}(x)=nU_{n-1}(x) for some polynomial Un1(x)U_{n-1}(x) that satisfies Un1(cos(θ))sinθ=sin(nθ)U_{n-1}(\cos(\theta))\sin\theta=\sin(n\theta). These polynomials {Un(x)}\{U_{n}(x)\} are called Chebyshev polynomials of the second kind. Finally, due to (7), each Chebyshev polynomial Tn(x)T_{n}(x) satisfies the equioscillation property, meaning there exists n+1n+1 extrema 1=x0>x1>>xn=11=x_{0}>x_{1}>\cdots>x_{n}=-1 so that Tn(xi)=(1)iT_{n}(x_{i})=(-1)^{i}.

We are now equipped to prove Theorem 3. The equation defining the optimal polynomial Sn1(x)S_{n-1}(x) is long, but the idea behind its construction is a simple modification of Chebyshev polynomials. To construct a degree nn function (1x)p(x)(1-x)p(x) that stays minimal over [1,1][-1,1], we slightly stretch the Chebyshev polynomial Tn(x)T_{n}(x) to the function qn(x)q_{n}(x) so that qn(1)=0q_{n}(1)=0 and qn(1)=1q_{n}^{\prime}(1)=-1. Then defining Sn(x)=qn(x)/(1x)S_{n}(x)=q_{n}(x)/(1-x) provides the minimal degree n1n-1 polynomial that satisfies the necessary conditions.

Proof of Theorem 3.

We start by verifying the claimed optimal polynomial Sn1(x)S_{n-1}(x) fulfills the restrictions and satisfies the claimed inequality for any positive integer nn. We denote (1x)Sn1(x)(1-x)S_{n-1}(x) by qn(x)q_{n}(x) and observe qn(x)=αTn(L(x))q_{n}(x)=-\alpha T_{n}(L(x)) is a scaled and stretched Chebyshev polynomial for carefully chosen constant α\alpha and linear change of variables L(x)L(x):

α=2sin(π2n)n(1+cos(π2n))andL(x)=1+cos(π2n)2(x+1)1.\displaystyle\alpha=\frac{2\sin\left(\frac{\pi}{2n}\right)}{n\left(1+\cos\left(\frac{\pi}{2n}\right)\right)}\quad\text{and}\quad L(x)=\frac{1+\cos\left(\frac{\pi}{2n}\right)}{2}(x+1)-1.

Writing qn(x)=αTn(L(x))q_{n}(x)=-\alpha T_{n}(L(x)) immediately reveals qn(x)q_{n}(x) is a polynomial of order nn. The change of variables L(x)L(x) is designed so we have

qn(1)=αTn(L(1))=αTn(cos(π2n))=0,q_{n}(1)=-\alpha T_{n}(L(1))=-\alpha T_{n}\left(\cos\left(\frac{\pi}{2n}\right)\right)=0,

using that cos(π2n)\cos\left(\frac{\pi}{2n}\right) is a zero of the Chebyshev polynomial Tn(x)T_{n}(x). Thus we have Sn1(x)=qn(x)/(1x)S_{n-1}(x)=q_{n}(x)/(1-x) is a well-defined polynomial of order n1n-1 as claimed. Next we show Sn1(1)=1S_{n-1}(1)=1. First observe Sn1(1)=qn(1)S_{n-1}(1)=-q_{n}^{\prime}(1) and now compute

Sn1(1)\displaystyle S_{n-1}(1) =qn(1)=ddx|x=1(αTn(L(x)))\displaystyle=-q_{n}^{\prime}(1)=-\left.\frac{d}{dx}\right|_{x=1}(-\alpha T_{n}(L(x)))
=αL(1)Tn(L(1))=αn2(1+cos(π2n))Un1(cos(π2n))\displaystyle=\alpha L^{\prime}(1)T_{n}^{\prime}(L(1))=\alpha\frac{n}{2}\left(1+\cos\left(\frac{\pi}{2n}\right)\right)U_{n-1}\left(\cos\left(\frac{\pi}{2n}\right)\right)

where we used Tn(x)=nUn1(x)T_{n}^{\prime}(x)=nU_{n-1}(x) for Un1(x)U_{n-1}(x) the Chebyshev polynomial of the second kind with order n1n-1. Recall Un1(cosθ)=sin(nθ)/sin(θ)U_{n-1}(\cos\theta)=\sin(n\theta)/\sin(\theta) and therefore, continuing our computation, we see the constant α\alpha is chosen so that we get

Sn1(1)=αn21+cos(π2n)sin(π2n)=αα1=1.\displaystyle S_{n-1}(1)=\alpha\cdot\frac{n}{2}\cdot\frac{1+\cos\left(\frac{\pi}{2n}\right)}{\sin\left(\frac{\pi}{2n}\right)}=\alpha\cdot\alpha^{-1}=1.

Therefore Sn1(x)S_{n-1}(x) fulfills the necessary conditions. To verify Sn1(x)S_{n-1}(x) satisfies the equality, use L([1,1])[1,1]L([-1,1])\subset[-1,1] and that |Tn(y)|1|T_{n}(y)|\leq 1 on [1,1][-1,1] to see

maxx[1,1]|(1x)Sn1(x)|\displaystyle\max_{x\in[-1,1]}|(1-x)S_{n-1}(x)| =maxx[1,1]|qn(x)|\displaystyle=\max_{x\in[-1,1]}|q_{n}(x)|
=maxx[1,1]|αTn(L(x))|αmaxy[1,1]|Tn(y)|α.\displaystyle=\max_{x\in[-1,1]}|\alpha T_{n}(L(x))|\leq\alpha\max_{y\in[-1,1]}|T_{n}(y)|\leq\alpha.

This is indeed an equality because

maxx[1,1]|(1x)Sn1(x)||αTn(L(1))|=α|Tn(1)|=α.\displaystyle\max_{x\in[-1,1]}|(1-x)S_{n-1}(x)|\geq|\alpha T_{n}(L(-1))|=\alpha\cdot|T_{n}(-1)|=\alpha.

We now show Sn1(x)S_{n-1}(x) is the unique polynomial of degree at most n1n-1 that achieves this equality by deriving an equioscillation property for qn(x)q_{n}(x) and modifying the standard argument for the minimizing property of Chebyshev polynomials. Recall the Chebyshev polynomial Tn(x)T_{n}(x) has n+1n+1 extrema 1=x0>x1>>xn=11=x_{0}>x_{1}>\cdots>x_{n}=-1 so that the equioscillation property Tn(xi)=(1)iT_{n}(x_{i})=(-1)^{i} is satisfied. To see qn(x)q_{n}(x) has a similar equioscillation property, define inputs yi=L1(xi)y_{i}=L^{-1}(x_{i}) for 0in0\leq i\leq n, and observe y0yny_{0}\geq\cdots\geq y_{n} by L1(x)L^{-1}(x) strictly increasing. Next, note the second extrema of Chebyshev polynomials is given by x1=cos(πn)x_{1}=\cos\left(\frac{\pi}{n}\right) and so x1<cos(π2n)x_{1}<\cos\left(\frac{\pi}{2n}\right); thus by L1(x)L^{-1}(x) strictly increasing, y1=L1(x1)<L1(cos(π2n))=1y_{1}=L^{-1}(x_{1})<L^{-1}(\cos\left(\frac{\pi}{2n}\right))=1. Now compute L(1)=1L(-1)=-1, which implies L1(1)=1L^{-1}(-1)=-1. Therefore we find the inputs yiy_{i} satisfy 1>y1yn=11>y_{1}\geq\cdots\geq y_{n}=-1, and qn(yi)=αTn(L(yi))=αTn(xi)=α(1)iq_{n}(y_{i})=-\alpha T_{n}(L(y_{i}))=-\alpha T_{n}(x_{i})=-\alpha(-1)^{i}. That is, qn(x)q_{n}(x) has nn maxima satisfying the equioscillation property in [1,1][-1,1]. Next suppose p(x)p(x) is any polynomial of degree n1n-1 so that p(1)=0p(1)=0 and

maxx[1,1]|(1x)p(x)|α.\max_{x\in[-1,1]}|(1-x)p(x)|\leq\alpha.

We argue p(x)=Sn1(x)p(x)=S_{n-1}(x) by using our equioscillation property to show the polynomials must intersect sufficiently many times. Formally, we count the zeros of the polynomial z(x)=qn(x)(1x)p(x)z(x)=q_{n}(x)-(1-x)p(x). By our equioscillation property qn(yi)=α(1)iq_{n}(y_{i})=-\alpha(-1)^{i} and the assumption |(1x)p(x)|α|(1-x)p(x)|\leq\alpha over [1,1][-1,1], we require z(yi)0z(y_{i})\geq 0 for ii odd and z(yi)0z(y_{i})\leq 0 for ii even. Therefore z(x)z(x) must have a zero in each interval [yi,yi+1][y_{i},y_{i+1}] by the intermediate value theorem. That is, the n1n-1 intervals [y1,y2],,[yn1,yn][y_{1},y_{2}],\cdots,[y_{n-1},y_{n}] all contain a zero. Furthermore if any two intervals [yi1,yi][y_{i-1},y_{i}] and [yi,yi+1][y_{i},y_{i+1}] share a zero at yiy_{i}, then we can show z(x)z(x) will have a zero of multiplicity at least two at yiy_{i} and therefore z(x)z(x) still has at least n1n-1 zeros counted with multiplicity on [y0,yn1][y_{0},y_{n-1}] as follows. By qn(yi)=±αq_{n}(y_{i})=\pm\alpha at yiy_{i}, we know qn(x)q_{n}(x) has a minimum or maximum at yiy_{i}, implying qn(yi)=0q_{n}^{\prime}(y_{i})=0. Similarly, if z(yi)=0z(y_{i})=0, then (1yi)p(yi)=±α(1-y_{i})p(y_{i})=\pm\alpha and so we also find that (1x)p(x)(1-x)p(x) has a minimum or maximum at yiy_{i}, so ddx|yi(1x)p(x)=0\left.\frac{d}{dx}\right|_{y_{i}}(1-x)p(x)=0. Therefore z(yi)=qn(yi)ddx|yi(1x)p(x)=0z^{\prime}(y_{i})=q_{n}^{\prime}(y_{i})-\left.\frac{d}{dx}\right|_{y_{i}}(1-x)p(x)=0 and so z(x)z(x) indeed has a zero of multiplicity at least two at yiy_{i} and so zz has n1n-1 zeros on [1,1)[-1,1).

Refer to caption
qn(x)q_{n}(x) and (1x)p(x)(1-x)p(x) must intersect
Refer to caption
Sn1(x)S_{n-1}(x) and p(x)p(x) must intersect

Now note z(1)=qn(1)(11)p(1)=00=0z(1)=q_{n}(1)-(1-1)p(1)=0-0=0 and so z(x)/(1x)z(x)/(1-x) is a polynomial of degree n1n-1 with the same roots on [1,1)[-1,1). Additionally observe

z(x)1x=qn(x)(1x)p(x)1x=Sn1(x)p(x)\frac{z(x)}{1-x}=\frac{q_{n}(x)-(1-x)p(x)}{1-x}=S_{n-1}(x)-p(x)

and therefore, evaluating z(x)/(1x)z(x)/(1-x) at 11 reduces to S(1)p(1)=11=0S(1)-p(1)=1-1=0 and so z(x)/(1x)z(x)/(1-x) has nn roots. However because z(x)/(1x)z(x)/(1-x) is of degree n1n-1 this implies z(x)/(1x)z(x)/(1-x) is the zero polynomial. That is, p(x)=Sn1(x)p(x)=S_{n-1}(x). ∎

3.2. Reduction to symmetric kernels

Our Fourier analysis argument for Theorem 1 given in Section 3.3 only holds for symmetric kernels, satisfying u(k)=u(k)u(k)=u(-k). Luckily, the following lemma demonstrates that it is sufficient to only prove Theorem 1 for the class of symmetric normalized kernels.

Lemma 4.

Suppose for all symmetric and normalized kernels u:{n,,n}u:\{-n,\cdots,n\}\to\mathbb{R},

(8) supf0Δ(uf)2()f2()βn\sup_{f\neq 0}\frac{\|\Delta(u\ast f)\|_{\ell^{2}(\mathbb{Z})}}{\|f\|_{\ell^{2}(\mathbb{Z})}}\geq\beta_{n}

for some βn>0\beta_{n}>0. Then (8) holds for all normalized kernels u:{n,,n}u:\{-n,\cdots,n\}\to\mathbb{R}.

Proof.

Let u:{n,,n}u:\{-n,\cdots,n\}\to\mathbb{R} be any kernel normalized by k=nnu(k)=1\sum_{k=-n}^{n}u(k)=1. For any function g(k)g(k), define its reflection g(k)=g(k)g^{-}(k)=g(-k). Next consider the symmetrization kernel u~(k)=12(u(k)+u(k))\widetilde{u}(k)=\frac{1}{2}(u(k)+u^{-}(k)), which will also satisfy k=nnu~(k)=1\sum_{k=-n}^{n}\widetilde{u}(k)=1. Now take any f2()f\in\ell^{2}(\mathbb{Z}) and compute

Δ(u~f)2()f2()\displaystyle\frac{\|\Delta(\widetilde{u}\ast f)\|_{\ell^{2}(\mathbb{Z})}}{\|f\|_{\ell^{2}(\mathbb{Z})}} =Δ(12(u+u)f)2()f2()\displaystyle=\frac{\|\Delta(\frac{1}{2}(u+u^{-})\ast f)\|_{\ell^{2}(\mathbb{Z})}}{\|f\|_{\ell^{2}(\mathbb{Z})}}
12(Δ(uf)2()f2()+Δ(uf)2()f2())\displaystyle\leq\frac{1}{2}\left(\frac{\|\Delta(u\ast f)\|_{\ell^{2}(\mathbb{Z})}}{\|f\|_{\ell^{2}(\mathbb{Z})}}+\frac{\|\Delta(u^{-}\ast f)\|_{\ell^{2}(\mathbb{Z})}}{\|f\|_{\ell^{2}(\mathbb{Z})}}\right)
=12(Δ(uf)2()f2()+Δ(uf)2()f2()).\displaystyle=\frac{1}{2}\left(\frac{\|\Delta(u\ast f)\|_{\ell^{2}(\mathbb{Z})}}{\|f\|_{\ell^{2}(\mathbb{Z})}}+\frac{\|\Delta(u\ast f^{-})\|_{\ell^{2}(\mathbb{Z})}}{\|f^{-}\|_{\ell^{2}(\mathbb{Z})}}\right).

Therefore we find that for all f2()f\in\ell^{2}(\mathbb{Z}), either

Δ(uf)2()f2()Δ(u~f)2()f2()orΔ(uf)2()f2()Δ(u~f)2()f2().\frac{\|\Delta(u\ast f)\|_{\ell^{2}(\mathbb{Z})}}{\|f\|_{\ell^{2}(\mathbb{Z})}}\geq\frac{\|\Delta(\widetilde{u}\ast f)\|_{\ell^{2}(\mathbb{Z})}}{\|f\|_{\ell^{2}(\mathbb{Z})}}\quad\text{or}\quad\frac{\|\Delta(u\ast f^{-})\|_{\ell^{2}(\mathbb{Z})}}{\|f^{-}\|_{\ell^{2}(\mathbb{Z})}}\geq\frac{\|\Delta(\widetilde{u}\ast f)\|_{\ell^{2}(\mathbb{Z})}}{\|f\|_{\ell^{2}(\mathbb{Z})}}.

Hence we can conclude

supf0Δ(uf)2()f2()supf0Δ(u~f)2()f2()βn.\displaystyle\sup_{f\neq 0}\frac{\|\Delta(u\ast f)\|_{\ell^{2}(\mathbb{Z})}}{\|f\|_{\ell^{2}(\mathbb{Z})}}\geq\sup_{f\neq 0}\frac{\|\Delta(\widetilde{u}\ast f)\|_{\ell^{2}(\mathbb{Z})}}{\|f\|_{\ell^{2}(\mathbb{Z})}}\geq\beta_{n}.

3.3. From discrete kernels to polynomial extremizers

Lemma 5 (Kravitz and Steinerberger [15]).

Given a symmetric and normalized kernel u:{n,,n}u:\{-n,\cdots,n\}\to\mathbb{R}, define the polynomial

pu(x)=u(0)+k=1n2u(k)Tk(x)\displaystyle p_{u}(x)=u(0)+\sum_{k=1}^{n}2u(k)T_{k}(x)

where Tk(x)T_{k}(x) is the kkth Chebyshev polynomial. Then,

sup0f2()Δ(uf)2()f2()=2max1x1|(1x)pu(x)|.\displaystyle\sup_{0\neq f\in\ell^{2}(\mathbb{Z})}\frac{\|\Delta(u\ast f)\|_{\ell^{2}(\mathbb{Z})}}{\|f\|_{\ell^{2}(\mathbb{Z})}}=2\max_{-1\leq x\leq 1}|(1-x)p_{u}(x)|.

For the proof of Lemma 5 we follow the argument given by Kravitz and Steinerberger [15], which uses the Fourier transform and Plancherel’s theorem. Before the proof, first recall that a function f:f:\mathbb{Z}\to\mathbb{R} on the integers has a continuous Fourier transform f^:𝕋\widehat{f}:\mathbb{T}\to\mathbb{C} defined on the 1-torus given by

f^(ξ)=kf(k)eiξk.\displaystyle\widehat{f}(\xi)=\sum_{k\in\mathbb{Z}}f(k)e^{-i\xi k}.

This is called the discrete-time Fourier transform, which we will also denote by (f)(ξ)=f^(ξ)\mathcal{F}(f)(\xi)=\widehat{f}(\xi). This Fourier transform takes convolution to multiplication by

fg^=f^g^.\displaystyle\widehat{f\ast g}=\widehat{f}\cdot\widehat{g}.

Another useful property is the Plancherel identity, which relates the inner product of functions with that of their Fourier transform by

kf(k)g(k)¯=12π𝕋f^(ξ)g^(ξ)¯𝑑ξ.\displaystyle\sum_{k\in\mathbb{Z}}f(k)\overline{g(k)}=\frac{1}{2\pi}\int_{\mathbb{T}}\widehat{f}(\xi)\overline{\widehat{g}(\xi)}d\xi.

Note we can express the Fourier transform of a shifted function f(m)f(\cdot-m) by

f(m)^(ξ)\displaystyle\widehat{f(\cdot-m)}(\xi) =kf(km)eiξk=kf(k)eiξ(k+m)\displaystyle=\sum_{k\in\mathbb{Z}}f(k-m)e^{-i\xi k}=\sum_{k\in\mathbb{Z}}f(k)e^{-i\xi(k+m)}
=eiξmkf(k)eiξk=eiξmf^(ξ).\displaystyle=e^{-i\xi m}\sum_{k\in\mathbb{Z}}f(k)e^{-i\xi k}=e^{-i\xi m}\widehat{f}(\xi).

Using the above, we can compute the Fourier transform of the discrete Laplacian.

(Δf)(ξ)\displaystyle\mathcal{F}(\Delta f)(\xi) =(f(k+2)2f(k+1)+f(k))(ξ)\displaystyle=\mathcal{F}(f(k+2)-2f(k+1)+f(k))(\xi)
=(e2iξf^(ξ)2eiξf^(ξ)+f^(ξ)=(eiξ1)2f^(ξ).\displaystyle=(e^{-2i\xi}\widehat{f}(\xi)-2e^{-i\xi}\widehat{f}(\xi)+\widehat{f}(\xi)=(e^{-i\xi}-1)^{2}\widehat{f}(\xi).

We are equipped to prove Lemma 5, but first we follow up with a claim made in the Introduction. Indeed, observe that by the first computation in the following proof, which does not yet use the symmetry of uu, we can conclude quickly that Δ(uf)2()C2(u)f2()\|\Delta(u\ast f)\|_{\ell^{2}(\mathbb{Z})}\leq C_{2}(u)\|f\|_{\ell^{2}(\mathbb{Z})} for some constant C2(u)C_{2}(u) depending continuously on uu. Furthermore, because the space of normalized kernels u:{n,,n}{u:\{-n,\cdots,n\}\to\mathbb{R}} is compact, we conclude that there exists some optimal kernel unu_{n} that minimizes C2(u)C_{2}(u). This argument can be easily generalized to higher derivatives.

Proof of Lemma 5.

Let uu be a fixed symmetric, normalized kernel and let f:f:\mathbb{Z}\to\mathbb{R} be any function in 2()\ell^{2}(\mathbb{Z}). Plancherel’s identity allows us to equate Δ(uf)2()\|\Delta(u\ast f)\|_{\ell^{2}(\mathbb{Z})} to an easier expression in terms of the Fourier transform by computing

k|(Δ(uf))(k)|2\displaystyle\sum_{k\in\mathbb{Z}}|(\Delta(u\ast f))(k)|^{2} =12π𝕋|eiξ1|4|u^(ξ)|2|f^(ξ)|2𝑑ξ\displaystyle=\frac{1}{2\pi}\int_{\mathbb{T}}|e^{-i\xi}-1|^{4}|\widehat{u}(\xi)|^{2}|\widehat{f}(\xi)|^{2}d\xi
(9) (eiξ1)4u^(ξ)2L(𝕋)12π𝕋|f^(ξ)|2𝑑ξ\displaystyle\leq\|(e^{-i\xi}-1)^{4}\widehat{u}(\xi)^{2}\|_{L^{\infty}(\mathbb{T})}\cdot\frac{1}{2\pi}\int_{\mathbb{T}}|\widehat{f}(\xi)|^{2}d\xi
=(eiξ1)4u^(ξ)2L(𝕋)k|f(k)|2.\displaystyle=\|(e^{-i\xi}-1)^{4}\widehat{u}(\xi)^{2}\|_{L^{\infty}(\mathbb{T})}\cdot\sum_{k\in\mathbb{Z}}|f(k)|^{2}.

After taking a square root we get

Δ(uf)2()(eiξ1)2u^(ξ)L(𝕋)f2()\displaystyle\|\Delta(u\ast f)\|_{\ell^{2}(\mathbb{Z})}\leq\|(e^{-i\xi}-1)^{2}\widehat{u}(\xi)\|_{L^{\infty}(\mathbb{T})}\cdot\|f\|_{\ell^{2}(\mathbb{Z})}

Note that the only inequality in the derivation of the above is in (9). Furthermore, by choosing f(k)f(k) so that f^(ξ)\widehat{f}(\xi) has L2L^{2} mass concentrated at the ξ\xi in which the function |eiξ1|4|u^(ξ)|2|e^{-i\xi}-1|^{4}|\widehat{u}(\xi)|^{2} achieves it’s maximum, we can make the above inequality arbitrary close to an equality. Thus we have shown our expression measuring the smoothing of the second derivative is equivalent to the following simpler expression:

sup0f2()Δ(uf)2()f2()=(eiξ1)2u^(ξ)L(𝕋).\displaystyle\sup_{0\neq f\in\ell^{2}(\mathbb{Z})}\frac{\|\Delta(u\ast f)\|_{\ell^{2}(\mathbb{Z})}}{\|f\|_{\ell^{2}(\mathbb{Z})}}=\|(e^{-i\xi}-1)^{2}\widehat{u}(\xi)\|_{L^{\infty}(\mathbb{T})}.

Denote this simpler expression by

L(u)(eiξ1)2u^(ξ)L(𝕋)=maxξ[0,2π)|eiξ1|2|u^(ξ)|.\displaystyle L(u)\coloneqq\|(e^{i\xi}-1)^{2}\widehat{u}(\xi)\|_{L^{\infty}(\mathbb{T})}=\max_{\xi\in[0,2\pi)}|e^{i\xi}-1|^{2}|\widehat{u}(\xi)|.

Because u(k)u(k) is symmetric, real-valued, and only supported on {n,,n}\{-n,\cdots,n\}, we can rewrite u^(ξ)\widehat{u}(\xi) by

u^(ξ)=ku(k)eiξk=u(0)+2k=1nu(k)cos(ξk).\displaystyle\widehat{u}(\xi)=\sum_{k\in\mathbb{Z}}u(k)e^{-i\xi k}=u(0)+2\sum_{k=1}^{n}u(k)\cos(\xi k).

Therefore we find

L(u)=maxξ[0,2π)|eiξ1|2|u^(ξ)|=2maxξ[0,2π)|(1cos(ξ))(u(0)+2k=1nu(k)cos(ξk))|.\displaystyle L(u)=\max_{\xi\in[0,2\pi)}|e^{i\xi}-1|^{2}|\widehat{u}(\xi)|=2\max_{\xi\in[0,2\pi)}\left|(1-\cos(\xi))\left(u(0)+2\sum_{k=1}^{n}u(k)\cos(\xi k)\right)\right|.

The substitution x=cos(ξ)x=\cos(\xi) gives rise to

L(u)=2max1x1|(1x)(u(0)+2k=1nu(k)Tk(x))|\displaystyle L(u)=2\max_{-1\leq x\leq 1}\left|(1-x)\left(u(0)+2\sum_{k=1}^{n}u(k)T_{k}(x)\right)\right|

where Tk(x)T_{k}(x) denotes the Chebyshev polynomial of degree kk. Therefore, defining the polynomial pu(x)p_{u}(x) by

pu(x)=u(0)+k=1n2u(k)Tk(x),\displaystyle p_{u}(x)=u(0)+\sum_{k=1}^{n}2u(k)T_{k}(x),

we have our desired equality

sup0f2()Δ(uf)2()f2()=L(u)=2max1x1|(1x)pu(x)|.\displaystyle\sup_{0\neq f\in\ell^{2}(\mathbb{Z})}\frac{\|\Delta(u\ast f)\|_{\ell^{2}(\mathbb{Z})}}{\|f\|_{\ell^{2}(\mathbb{Z})}}=L(u)=2\max_{-1\leq x\leq 1}|(1-x)p_{u}(x)|.

3.4. The Sharp Fourier Inequality and Optimal Kernel

Proof of Theorem 1.

For any symmetric and normalized u:{n,,n}u:\{-n,\cdots,n\}\to\mathbb{R}, define the degree nn polynomial

pu(x)=u(0)+k=1n2u(k)Tk(x)\displaystyle p_{u}(x)=u(0)+\sum_{k=1}^{n}2u(k)T_{k}(x)

as given in Lemma 5. Then note pu(1)=u(0)+k=1n2u(k)Tk(1)=1p_{u}(1)=u(0)+\sum_{k=1}^{n}2u(k)T_{k}(1)=1 by the normalization of uu. Therefore combining Lemma 5 and Theorem 3 yields

sup0f2()Δ(uf)2()f2()=2max1x1|(1x)pu(x)|4sin(π2n+2)(n+1)(1+cos(π2n+2)).\displaystyle\sup_{0\neq f\in\ell^{2}(\mathbb{Z})}\frac{\|\Delta(u\ast f)\|_{\ell^{2}(\mathbb{Z})}}{\|f\|_{\ell^{2}(\mathbb{Z})}}=2\max_{-1\leq x\leq 1}|(1-x)p_{u}(x)|\geq\frac{4\sin\left(\frac{\pi}{2n+2}\right)}{(n+1)\left(1+\cos\left(\frac{\pi}{2n+2}\right)\right)}.

Because the above inequality holds for symmetric normalized kernels, Lemma 4 implies this in fact holds for all normalized kernels. To see this inequality is sharp, let Sn(x)S_{n}(x) be the optimal degree nn polynomial as given in Theorem 3. Because the Chebyshev polynomials {T0(x),,Tn(x)}\{T_{0}(x),\dots,T_{n}(x)\} form a basis for the space of all degree nn polynomials, there exists unique coefficients αk\alpha_{k} so that

Sn(x)=α0T0(x)+k=1n2αkTk(x).\displaystyle S_{n}(x)=\alpha_{0}T_{0}(x)+\sum_{k=1}^{n}2\alpha_{k}T_{k}(x).

These coefficients define a corresponding kernel un:{n,,n}u_{n}:\{-n,\cdots,n\}\to\mathbb{R} by setting un(k)=un(k)=αku_{n}(k)=u_{n}(-k)=\alpha_{k} for k0k\geq 0. Using Tk(1)=1T_{k}(1)=1, we find this kernel is properly normalized by computing

k=nnun(k)=α0+2k=1nαk=α0T0(1)+k=1n2αkTk(1)=S(1)=1.\displaystyle\sum_{k=-n}^{n}u_{n}(k)=\alpha_{0}+2\sum_{k=1}^{n}\alpha_{k}=\alpha_{0}T_{0}(1)+\sum_{k=1}^{n}2\alpha_{k}T_{k}(1)=S(1)=1.

Noting T0(x)1T_{0}(x)\equiv 1, we can rewrite write Sn(x)S_{n}(x) as

Sn(x)=un(0)+k=1n2un(k)Tk(x).\displaystyle S_{n}(x)=u_{n}(0)+\sum_{k=1}^{n}2u_{n}(k)T_{k}(x).

Using the equality condition in Theorem 3, we find un(k)u_{n}(k) indeed satisfies

sup0f2()Δ(unf)2()f2()=2max1x1|(1x)Sn(x)|=4sin(π2n+2)(n+1)(1+cos(π2n+2)).\displaystyle\sup_{0\neq f\in\ell^{2}(\mathbb{Z})}\frac{\|\Delta(u_{n}\ast f)\|_{\ell^{2}(\mathbb{Z})}}{\|f\|_{\ell^{2}(\mathbb{Z})}}=2\max_{-1\leq x\leq 1}|(1-x)S_{n}(x)|=\frac{4\sin\left(\frac{\pi}{2n+2}\right)}{(n+1)\left(1+\cos\left(\frac{\pi}{2n+2}\right)\right)}.

To find an explicit expression for the minimizing kernel un(k)u_{n}(k), recall that Chebyshev polynomials are orthogonal and in particular

11Ti(x)Tj(x)dx1x2={0ifij,πifi=j=0,π2ifi=j0.\displaystyle\int_{-1}^{1}T_{i}(x)T_{j}(x)\frac{dx}{\sqrt{1-x^{2}}}=\begin{cases}0\quad\text{if}\quad i\neq j,\\ \pi\quad\text{if}\quad i=j=0,\\ \frac{\pi}{2}\quad\text{if}\quad i=j\neq 0.\\ \end{cases}

Therefore for any jj we have

11Sn(x)Tj(x)dx1x2=11(un(0)+k=1n2un(k)Tk(x))Tj(x)dx1x2=πun(j).\displaystyle\int_{-1}^{1}S_{n}(x)T_{j}(x)\frac{dx}{\sqrt{1-x^{2}}}=\int_{-1}^{1}\left(u_{n}(0)+\sum_{k=1}^{n}2u_{n}(k)T_{k}(x)\right)\frac{T_{j}(x)dx}{\sqrt{1-x^{2}}}=\pi\cdot u_{n}(j).

Thus we can write un(k)u_{n}(k) as

un(k)=1π11Sn(x)Tk(x)dx1x2\displaystyle u_{n}(k)=\frac{1}{\pi}\int_{-1}^{1}S_{n}(x)T_{k}(x)\frac{dx}{\sqrt{1-x^{2}}}

where Tk(x)T_{k}(x) is the kkth Chebyshev polynomial, and

Sn(x)=1x12sin(π2n+2)(n+1)(1+cos(π2n+2))Tn+1(1+cos(π2n+2)2(x+1)1).\displaystyle S_{n}(x)=\frac{1}{x-1}\frac{2\sin\left(\frac{\pi}{2n+2}\right)}{(n+1)\left(1+\cos\left(\frac{\pi}{2n+2}\right)\right)}T_{n+1}\left(\frac{1+\cos\left(\frac{\pi}{2n+2}\right)}{2}(x+1)-1\right).

3.5. The Epanechnikov Kernel

This section is dedicated to proving Theorem 2, which offers a kernel that is nearly optimal and easy to implement in practice. First we verify the Epanechnikov kernel En:{n,,n}E_{n}:\{-n,\cdots,n\}\to\mathbb{R} satisfies the normalization requirement. Indeed, an induction argument gives the relation

k=nnk2=13n(n+1)(2n+1).\displaystyle\sum_{k=-n}^{n}k^{2}=\frac{1}{3}n(n+1)(2n+1).

Therefore we can compute

k=nnEn(k)\displaystyle\sum_{k=-n}^{n}E_{n}(k) =k=nn3n(4n21)(n2k2)\displaystyle=\sum_{k=-n}^{n}\frac{3}{n(4n^{2}-1)}\left(n^{2}-k^{2}\right)
=3n(4n21)(n2(2n+1)13n(n+1)(2n+1))=1.\displaystyle=\frac{3}{n(4n^{2}-1)}\left(n^{2}(2n+1)-\frac{1}{3}n(n+1)(2n+1)\right)=1.

Next, note Lemma 5 allows us to reduce the asymptotic relation of Theorem 2 to a claim about the polynomials

pn(x)=En(0)+2k=1nEn(k)Tk(x).\displaystyle p_{n}(x)=E_{n}(0)+2\sum_{k=1}^{n}E_{n}(k)T_{k}(x).

In particular, Lemma 5 promises the equivalence

sup0f2()Δ(Enf)2()f2()=2max1x1|(1x)pn(x)|.\displaystyle\sup_{0\neq f\in\ell^{2}(\mathbb{Z})}\frac{\|\Delta(E_{n}\ast f)\|_{\ell^{2}(\mathbb{Z})}}{\|f\|_{\ell^{2}(\mathbb{Z})}}=2\max_{-1\leq x\leq 1}|(1-x)p_{n}(x)|.

Therefore it suffices to show that as nn\to\infty we have the asymptotic equivalence

max1x1(1x)pn(x)3μ2n2.\displaystyle\max_{-1\leq x\leq 1}(1-x)p_{n}(x)\sim\frac{3\mu}{2n^{2}}.

For simplicity, we remove the normalizing coefficients of pn(x)p_{n}(x) and instead consider

p~n(x)=n(4n21)3pn(x)=n2+2k=1n1(n2k2)Tk(x).\displaystyle\widetilde{p}_{n}(x)=\frac{n(4n^{2}-1)}{3}p_{n}(x)=n^{2}+2\sum_{k=1}^{n-1}\left(n^{2}-k^{2}\right)T_{k}(x).

Therefore it suffices to show the following asymptotic equivalence as nn\to\infty:

(10) max1x1(1x)p~n(x)2μn.\displaystyle\max_{-1\leq x\leq 1}(1-x)\widetilde{p}_{n}(x)\sim 2\mu n.

The first step in proving this asymptotic relation is to rewrite (1x)p~(x)(1-x)\widetilde{p}(x) as follows.

Lemma 6.

Define the polynomial

p~n(x)=n2+2k=1n1(n2k2)Tk(x).\widetilde{p}_{n}(x)=n^{2}+2\sum_{k=1}^{n-1}\left(n^{2}-k^{2}\right)T_{k}(x).

Then,

(11) (1x)p~n(x)=Tn(x)Tn1(x)x1+(12n)Tn(x).(1-x)\widetilde{p}_{n}(x)=\frac{T_{n}(x)-T_{n-1}(x)}{x-1}+(1-2n)T_{n}(x).
Proof.

We first show the equality

(12) (1x)(1+2k=1n1Tk(x))=Tn1(x)Tn(x).(1-x)\left(1+2\sum_{k=1}^{n-1}T_{k}(x)\right)=T_{n-1}(x)-T_{n}(x).

This follows quickly by induction and the recurrence Tn+1(x)=2xTn(x)Tn1(x)T_{n+1}(x)=2xT_{n}(x)-T_{n-1}(x). Indeed, for n=1n=1 we find (1x)(1+20)=1x=T0(x)T1(x).(1-x)(1+2\cdot 0)=1-x=T_{0}(x)-T_{1}(x). Furthermore, if (12) holds for nn, then we find

(1x)(1+2k=1nTk(x))\displaystyle(1-x)\left(1+2\sum_{k=1}^{n}T_{k}(x)\right) =(Tn1(x)Tn(x))+2(1x)Tn(x)\displaystyle=(T_{n-1}(x)-T_{n}(x))+2(1-x)T_{n}(x)
=Tn(x)+(Tn1(x)2xTn(x))\displaystyle=T_{n}(x)+(T_{n-1}(x)-2xT_{n}(x))
=Tn(x)Tn+1(x).\displaystyle=T_{n}(x)-T_{n+1}(x).

Due to (12), the claim follows so long as we can prove

(13) (1x)p~n(x)=(1+2k=1n1Tk(x))+(12n)Tn(x).(1-x)\widetilde{p}_{n}(x)=\left(1+2\sum_{k=1}^{n-1}T_{k}(x)\right)+(1-2n)T_{n}(x).

We prove (13) by induction. Indeed, for n=1n=1 we can immediately compute the necessary relation (1x)p~1=1x=1+(121)T1(x).(1-x)\widetilde{p}_{1}=1-x=1+(1-2\cdot 1)T_{1}(x). Now suppose (13) holds for nn. First derive the following recurrence for p~n\widetilde{p}_{n}.

p~n+1(x)\displaystyle\widetilde{p}_{n+1}(x) =(n+1)2+2k=1n((n+1)2k2)Tk(x)\displaystyle=(n+1)^{2}+2\sum_{k=1}^{n}((n+1)^{2}-k^{2})T_{k}(x)
=(n2+2k=1n1(n2k2)Tk(x))+((2n+1)+2k=1n((n+1)2n2)Tk(x))\displaystyle=\left(n^{2}+2\sum_{k=1}^{n-1}(n^{2}-k^{2})T_{k}(x)\right)+\left((2n+1)+2\sum_{k=1}^{n}((n+1)^{2}-n^{2})T_{k}(x)\right)
=p~n(x)+(2n+1)(1+2k=1nTk(x)).\displaystyle=\widetilde{p}_{n}(x)+(2n+1)\left(1+2\sum_{k=1}^{n}T_{k}(x)\right).

Therefore we find (13) also holds for n+1n+1 by the following computation, which uses our hypothesis and (12).

(1x)p~n+1(x)\displaystyle(1-x)\widetilde{p}_{n+1}(x) =(1x)(p~n(x)+(2n+1)(1+k=1nTk(x)))\displaystyle=(1-x)\left(\widetilde{p}_{n}(x)+(2n+1)\left(1+\sum_{k=1}^{n}T_{k}(x)\right)\right)
=(1+2k=1n1Tk(x)+(12n)Tn(x))+(2n+1)(Tn(x)Tn+1(x))\displaystyle=\left(1+2\sum_{k=1}^{n-1}T_{k}(x)+(1-2n)T_{n}(x)\right)+(2n+1)(T_{n}(x)-T_{n+1}(x))
=1+2k=1nTk(x)+(12(n+1))Tn+1(x)\displaystyle=1+2\sum_{k=1}^{n}T_{k}(x)+(1-2(n+1))T_{n+1}(x)

With this new form for (1x)p~n(x)(1-x)\widetilde{p}_{n}(x) in hand, we turn back to our objective of showing (10). For each nn, we will bound the intervals [1,cos(16n)][-1,\cos\left(\frac{16}{n}\right)] and [cos(16n),1][\cos\left(\frac{16}{n}\right),1] separately. Notice (11) is useful for bounding [1,cos(16n)][-1,\cos\left(\frac{16}{n}\right)] because the polynomial (Tn(x)Tn1(x))/(x1)(T_{n}(x)-T_{n-1}(x))/(x-1) stays fairly small away from 11, which the following claim quantifies.

Lemma 7.

Over [1,1)[-1,1) we have

|Tn(x)Tn1(x)x1|21x.\displaystyle\left|\frac{T_{n}(x)-T_{n-1}(x)}{x-1}\right|\leq\frac{\sqrt{2}}{\sqrt{1-x}}.
Proof.

Write Tn(x)Tn1(x)=cos(nθ)cos((n1)θ)T_{n}(x)-T_{n-1}(x)=\cos(n\theta)-\cos((n-1)\theta) by making the substitution x=cosθx=\cos\theta. Then apply trigonometric identities to get the bound

|cos(nθ)cos((n1)θ)|\displaystyle|\cos(n\theta)-\cos((n-1)\theta)| =|2sin(2n12θ)sin(θ2)|\displaystyle=\left|2\sin\left(\frac{2n-1}{2}\theta\right)\sin\left(\frac{\theta}{2}\right)\right|
2|sin(θ2)|=21cosθ.\displaystyle\leq 2\left|\sin\left(\frac{\theta}{2}\right)\right|=\sqrt{2}\cdot\sqrt{1-\cos\theta}.

Therefore, substituting back to x=cos(θ)x=\cos(\theta) we find

|Tn(x)Tn1(x)x1|21x1x=21x.\displaystyle\left|\frac{T_{n}(x)-T_{n-1}(x)}{x-1}\right|\leq\frac{\sqrt{2}\sqrt{1-x}}{1-x}=\frac{\sqrt{2}}{\sqrt{1-x}}.

Now let x[1,cos(16n)]x\in[-1,\cos\left(\frac{16}{n}\right)] and use (11) to bound

(14) |(1x)p~n(x)|\displaystyle|(1-x)\widetilde{p}_{n}(x)| (2n1)|Tn(x)|+|Tn(x)Tn1(x)x1|\displaystyle\leq(2n-1)|T_{n}(x)|+\left|\frac{T_{n}(x)-T_{n-1}(x)}{x-1}\right|
(2n1)+21x(2n1)+21cos(16n).\displaystyle\leq(2n-1)+\frac{\sqrt{2}}{\sqrt{1-x}}\leq(2n-1)+\frac{\sqrt{2}}{\sqrt{1-\cos\left(\frac{16}{n}\right)}}.

We can simplify this further by taking the Taylor expansion of cosine about 0 and computing the following limit.

limn(2n1)1cos(16n)2=limn(2n1)22(12(16n)2+O(1n4))=16.\displaystyle\lim_{n\to\infty}(2n-1)\frac{\sqrt{1-\cos\left(\frac{16}{n}\right)}}{\sqrt{2}}=\lim_{n\to\infty}\sqrt{\frac{(2n-1)^{2}}{2}\left(\frac{1}{2}\left(\frac{16}{n}\right)^{2}+O\left(\frac{1}{n^{4}}\right)\right)}=16.

Therefore, returning to the inequality chain (14), we have for x[1,cos(16n)]x\in[-1,\cos\left(\frac{16}{n}\right)] the following asymptotic equivalence as nn\to\infty:

|(1x)p~n(x)|(2n1)(1+116)2μn.\displaystyle|(1-x)\widetilde{p}_{n}(x)|\sim(2n-1)\left(1+\frac{1}{16}\right)\leq 2\mu n.

Where the last inequality follows from 1+116μ1.0631+\frac{1}{16}\leq\mu\approx 1.063 by (6). We have shown

lim supnmax1xcos(16n)12n(1x)p~n(x)μ.\limsup_{n\to\infty}\max_{-1\leq x\leq\cos\left(\frac{16}{n}\right)}\frac{1}{2n}(1-x)\widetilde{p}_{n}(x)\leq\mu.

Therefore if only we can show the asymptotic equivalence

(15) maxcos(16n)x1(1x)p~n(x)2μn\max_{\cos\left(\frac{16}{n}\right)\leq x\leq 1}(1-x)\widetilde{p}_{n}(x)\sim 2\mu n

as nn\to\infty, then (10) and hence Theorem 2 follows. Substitute x=cos(αn)x=\cos\left(\frac{\alpha}{n}\right) and we claim uniform convergence

12n(1cos(αn))p~n(cos(αn))sin(α)αcos(α).\frac{1}{2n}\left(1-\cos\left(\frac{\alpha}{n}\right)\right)\cdot\widetilde{p}_{n}\left(\cos\left(\frac{\alpha}{n}\right)\right)\to\frac{\sin(\alpha)}{\alpha}-\cos(\alpha).

If this is true, the result follows immediately because uniform convergence allows the following interchange of maximum and limit.

limn12nmaxcos(16n)x1|(1x)p~n(x)|\displaystyle\lim_{n\to\infty}\frac{1}{2n}\max_{\cos\left(\frac{16}{n}\right)\leq x\leq 1}|(1-x)\widetilde{p}_{n}(x)| =limnmaxα[0,16]12n|(1cos(αn))p~n(cos(αn))|\displaystyle=\lim_{n\to\infty}\max_{\alpha\in[0,16]}\frac{1}{2n}\left|\left(1-\cos\left(\frac{\alpha}{n}\right)\right)\widetilde{p}_{n}\left(\cos\left(\frac{\alpha}{n}\right)\right)\right|
=maxα[0,16]limn12n|(1cos(αn))p~n(cos(αn))|\displaystyle=\max_{\alpha\in[0,16]}\lim_{n\to\infty}\frac{1}{2n}\left|\left(1-\cos\left(\frac{\alpha}{n}\right)\right)\widetilde{p}_{n}\left(\cos\left(\frac{\alpha}{n}\right)\right)\right|
=maxα[0,16]|sin(α)αcos(α)|=μ.\displaystyle=\max_{\alpha\in[0,16]}\left|\frac{\sin(\alpha)}{\alpha}-\cos(\alpha)\right|=\mu.

Thus the following claim is the only remaining barrier to the proof of Theorem 2.

Lemma 8.

We have uniform convergence

12n(1cos(αn))p~n(cos(αn))sin(α)αcos(α)\frac{1}{2n}\left(1-\cos\left(\frac{\alpha}{n}\right)\right)\widetilde{p}_{n}\left(\cos\left(\frac{\alpha}{n}\right)\right)\to\frac{\sin(\alpha)}{\alpha}-\cos(\alpha)

as nn\to\infty over α[0,16]\alpha\in[0,16].

Proof.

Use (11) and trigonometric identities to rewrite the sequence of functions as

12n(1cos(αn))p~n(cos(αn))\displaystyle\frac{1}{2n}\left(1-\cos\left(\frac{\alpha}{n}\right)\right)\widetilde{p}_{n}\left(\cos\left(\frac{\alpha}{n}\right)\right) =12ncos(α)cos(n1nα)cos(αn)1+12n2ncos(α)\displaystyle=\frac{1}{2n}\frac{\cos(\alpha)-\cos\left(\frac{n-1}{n}\alpha\right)}{\cos(\frac{\alpha}{n})-1}+\frac{1-2n}{2n}\cos(\alpha)
=1nsin(2n12nα)sin(α2n)1cos(αn)2n12ncos(α)\displaystyle=\frac{1}{n}\frac{\sin\left(\frac{2n-1}{2n}\alpha\right)\sin\left(\frac{\alpha}{2n}\right)}{1-\cos\left(\frac{\alpha}{n}\right)}-\frac{2n-1}{2n}\cos(\alpha)
=1nsin(2n12nα)2sin(α2n)2n12ncos(α)\displaystyle=\frac{1}{n}\frac{\sin\left(\frac{2n-1}{2n}\alpha\right)}{2\sin\left(\frac{\alpha}{2n}\right)}-\frac{2n-1}{2n}\cos(\alpha)

Because 2n12ncos(α)cos(α)\frac{2n-1}{2n}\cos(\alpha)\to\cos(\alpha) uniformly, we only must show that

1nsin(2n12nα)2sin(α2n)sin(α)α\displaystyle\frac{1}{n}\frac{\sin\left(\frac{2n-1}{2n}\alpha\right)}{2\sin\left(\frac{\alpha}{2n}\right)}\to\frac{\sin(\alpha)}{\alpha}

uniformly. We break this into two parts, first using that sine is Lipschitz with constant 11 to compute

|sin(2n12nα)αsin(α)α||2n12nααα|=|2n12n1|.\displaystyle\left|\frac{\sin\left(\frac{2n-1}{2n}\alpha\right)}{\alpha}-\frac{\sin(\alpha)}{\alpha}\right|\leq\left|\frac{\frac{2n-1}{2n}\alpha-\alpha}{\alpha}\right|=\left|\frac{2n-1}{2n}-1\right|.

Notice the right hand side of the above inequality converges uniformly to 0 as nn\to\infty, giving the uniform convergence 1αsin(2n12nα)1αsin(α)\frac{1}{\alpha}\sin(\frac{2n-1}{2n}\alpha)\to\frac{1}{\alpha}\sin(\alpha). Secondly, use Taylor’s theorem to get the following uniform convergence over α[0,16]\alpha\in[0,16]:

α2nsin(α2n)=αα+O((αn)3)=11+O(α2n3)1.\displaystyle\frac{\alpha}{2n\sin\left(\frac{\alpha}{2n}\right)}=\frac{\alpha}{\alpha+O((\frac{\alpha}{n})^{3})}=\frac{1}{1+O(\frac{\alpha^{2}}{n^{3}})}\to 1.

Because 11 and 1αsin(α)\frac{1}{\alpha}\sin(\alpha) are bounded over [0,16][0,16], the product of the sequences converges uniformly to the product of the limits and so we get uniform convergence

1nsin(2n12nα)2sin(α2n)=sin(2n12nα)αα2nsin(α2n)sinαα.\displaystyle\frac{1}{n}\frac{\sin\left(\frac{2n-1}{2n}\alpha\right)}{2\sin\left(\frac{\alpha}{2n}\right)}=\frac{\sin\left(\frac{2n-1}{2n}\alpha\right)}{\alpha}\cdot\frac{\alpha}{2n\sin\left(\frac{\alpha}{2n}\right)}\to\frac{\sin\alpha}{\alpha}.

Acknowledgements

This material is based upon work supported by the National Science Foundation Graduate Research Fellowship under Grant No. DGE-2140004. Additional thanks to Stefan Steinerberger for bringing this problem to the author’s attention and for helpful conversations.

References

  • [1] Jean Babaud, Andrew P. Witkin, Michel Baudin, and Richard O. Duda. Uniqueness of the Gaussian kernel for scale-space filtering. IEEE Transactions on Pattern Analysis and Machine Intelligence, PAMI-8(1):26–33, 1986.
  • [2] William Beckner. Inequalities in Fourier Analysis. Annals of Mathematics, 102(1):159–182, 1975.
  • [3] William Beckner. Pitt’s inequality and the uncertainty principle. Proceedings of the American Mathematical Society, 123(6):1897–1905, 1995.
  • [4] Emanuel Carneiro, Damiano Foschi, Diogo Oliveira e Silva, and Christoph Thiele. A sharp trilinear inequality related to Fourier restriction on the circle. Revista Matematica Iberoamericana, 33(4):1463–1486, 2017.
  • [5] Emanuel Carneiro, Diogo Oliveira e Silva, and Mateus Sousa. Extremizers for Fourier restriction on hyperboloids. Annales de l’Institut Henri Poincaré C, Analyse non linéaire, 36(2):389–415, 2019.
  • [6] Michael G. Cowling and John F. Price. Bandwidth versus time concentration: The Heisenberg–Pauli–Weyl inequality. SIAM Journal on Mathematical Analysis, 15(1):151–165, 1984.
  • [7] V. A. Epanechnikov. Non-parametric estimation of a multivariate probability density. Theory of Probability & Its Applications, 14(1):153–158, 1969.
  • [8] Damiano Foschi. Maximizers for the Strichartz inequality. Journal of the European Mathematical Society, 9(4):739–774, 2007.
  • [9] Felipe Gonçalves, Diogo Oliveira e Silva, and Stefan Steinerberger. Hermite polynomials, linear flows on the torus, and an uncertainty principle for roots. Journal of Mathematical Analysis and Applications, 451(2):678–711, 2017.
  • [10] Peter Hall, Michael C. Minnotte, and Chunming Zhang. Bump hunting with non-Gaussian kernels. The Annals of Statistics, 32(5):2124 – 2141, 2004.
  • [11] Dirk Hundertmark and Vadim Zharnitsky. On sharp Strichartz inequalities in low dimensions. International Mathematics Research Notices, 2006:34080, 2006.
  • [12] Jan J Koenderink. The structure of images. Biological Cybernetics, 50(5):363–370, 1984.
  • [13] Vjekoslav Kovač, Diogo Oliveira e Silva, and Jelena Rupčić. A sharp nonlinear Hausdorff–Young inequality for small potentials. Proceedings of the American Mathematical Society, 147(1):239–253, 2019.
  • [14] Vjekoslav Kovač, Diogo Oliveira E Silva, and Jelena Rupčić. Asymptotically sharp discrete nonlinear Hausdorff–Young inequalities for the SU(1,1)\operatorname{SU}(1,1)-valued Fourier products. The Quarterly Journal of Mathematics, 73(3):1179–1188, 2022.
  • [15] Noah Kravitz and Stefan Steinerberger. The smoothest average: Dirichlet, Fejér and Chebyshev. Bulletin of the London Mathematical Society, 53(6):1801–1815, 2021.
  • [16] Tony Lindeberg. Scale-space for discrete signals. IEEE Transactions on Pattern Analysis and Machine Intelligence, 12(3):234–254, 1990.
  • [17] Diogo Oliveira e Silva, Christoph Thiele, and Pavel Zorin-Kranich. Band-limited maximizers for a Fourier extension inequality on the circle. Experimental Mathematics, 31(1):192–198, 2022.
  • [18] M Samiuddin and GM El-Sayyad. On nonparametric kernel density estimates. Biometrika, 77(4):865–874, 1990.
  • [19] Betsy Stovall. Uniform estimates for Fourier restriction to polynomial curves in d\mathbb{R}^{d}. American Journal of Mathematics, 138(2):449–471, 2016.
  • [20] Betsy Stovall. Extremizability of Fourier restriction to the paraboloid. Advances in Mathematics, 360:106898, 2020.
  • [21] United States Geological Survey. Lake Chelan elevation of resevoir water surface. https://waterdata.usgs.gov/monitoring-location/12452000. Data from 2011-07-16 to 2011-07-30. Accessed 2023-05-24.
  • [22] Ryoya Yamasaki and Toshiyuki Tanaka. Kernel selection for modal linear regression: Optimal kernel and irls algorithm. In 2019 18th IEEE International Conference On Machine Learning And Applications (ICMLA), pages 595–601. IEEE, 2019.
  • [23] Alan L. Yuille and Tomaso A. Poggio. Scaling theorems for zero crossings. IEEE Transactions on Pattern Analysis and Machine Intelligence, PAMI-8(1):15–25, 1986.