This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

Approximating monomials using Chebyshev polynomials

Arvind K. Saibaba111Department of Mathematics, North Carolina State University, Raleigh, NC. Email: [email protected].
Abstract

This paper considers the approximation of a monomial xnx^{n} over the interval [1,1][-1,1] by a lower-degree polynomial. This polynomial approximation can be easily computed analytically and is obtained by truncating the analytical Chebyshev series expansion of xnx^{n}. The error in the polynomial approximation in the supremum norm has an exact expression with an interesting probabilistic interpretation. We use this interpretation along with concentration inequalities to develop a useful upper bound for the error.

Keywords— Chebyshev polynomials, Polynomial Approximation, Binomial Coefficients, Concentration inequalities

1 Motivation and Introduction

We are interested in approximating the monomial xnx^{n} by a polynomial of degree 0k<n0\leq k<n over the interval [1,1][-1,1]. The monomials 1,x,x2,1,x,x^{2},\dots form a basis for C[1,1]C[-1,1], so it seems unlikely that we can represent a monomial in terms of lower degree polynomials. In Figure 1, we plot a few functions from the monomial basis over [0,1][0,1]; the basis function look increasingly alike as we take higher and higher powers, i.e., they appear to “lose independence.” Numerical analysts often avoid the monomial basis in polynomial interpolation since they result in ill-conditioned Vandermonde matrices, leading to poor numerical performance in finite precision arithmetic. This loss of independence means that it is reasonable to approximate the monomial xnx^{n} as a linear combination of lower order monomials, i.e., a lower order polynomial approximation. The natural question to ask, therefore, is: how small can kk be so that a well-chosen polynomial of degree kk can accurately approximate xnx^{n}?

Refer to caption
Figure 1: Visualization of a few monomials in the interval [0,1][0,1].

The surprising answer to this question is that we can approximate the monomial xnx^{n} over [1,1][-1,1] by a polynomial of small degree, which we will make precise. Let f=maxx[1,1]|f(x)|\|f\|_{\infty}=\max_{x\in[-1,1]}|f(x)| denote the supremum norm on C[1,1]C[-1,1] and let πk()\pi_{k}^{*}(\cdot) be the best polynomial approximation to xnx^{n} in this norm; that is

En,k:=minπ𝒫kxnπ(x)=xnπk(x),E_{n,k}:=\min_{\pi\in\mathcal{P}_{k}}\|x^{n}-\pi(x)\|_{\infty}=\|x^{n}-\pi_{k}^{*}(x)\|_{\infty},

where 𝒫k\mathcal{P}_{k} is a vector space of polynomials with real coefficients of degree at most kk. The minimizer πk()\pi_{k}^{*}(\cdot) exists and is unique [9, Chapter 10], but does not have a closed form expression. Newman and Rivlin [7, Theorem 2] showed that222We briefly mention that the notation our manuscript differs from [7] in that we reverse the roles of nn and kk.

pn,k4exnπk(x)pn,k,\frac{p_{n,k}}{4e}\leq\|x^{n}-\pi_{k}^{*}(x)\|_{\infty}\leq p_{n,k}, (1)

where the term pn,kp_{n,k} is given by the formula

pn,k=12n1j=(n+k)/2+1n(nj).p_{n,k}=\frac{1}{2^{n-1}}\sum_{j=\lfloor(n+k)/2\rfloor+1}^{n}\binom{n}{j}.

Since pn,kp_{n,k} involves the sum of binomial coefficients, it has a probabilistic interpretation which we explore in Section 4.

To see why a small kk is sufficient, consider the upper bound pn,kp_{n,k}. In Section 4 we use the probabilistic interpretation to obtain the following bound pn,k2exp(k2/2n)p_{n,k}\leq 2\exp\left(-{k^{2}}/{2n}\right). Suppose we are given a user-defined tolerance ϵ>0\epsilon>0. To ensure

xnπk(x)ϵ,\|x^{n}-\pi_{k}^{*}(x)\|_{\infty}\leq\epsilon,

we need to choose k2nlog(2/ϵ)k\geq\sqrt{2n\log(2/\epsilon)}. The accuracy of the polynomial approximation is visualized in Figure 2, where in the left panel we plot the monomial xnx^{n} for n=75n=75 and the best polynomial approximation πk\pi_{k}^{*} for k=5,15,25k=5,15,25. The polynomial πk\pi_{k}^{*} is computed using the Remez algorithm, implemented in chebfun [3]. We see that for k=25k=25, the polynomial approximation looks very accurate. In the right panel, we display pn,kp_{n,k}, which is the upper bound of the best polynomial approximation, as well as the upper bound for pn,kp_{n,k}. We see that pn,kp_{n,k} and its upper bound both have sharp decay with increasing kk. Numerical evidence in [6] further confirms this analysis; the authors show that the error En,kE_{n,k} behaves approximately like 12erfc(k/n)\frac{1}{2}\text{erfc}(k/\sqrt{n}), where erfc is the complementary error function. .

Refer to caption
Figure 2: (left) Approximation of the monomial xnx^{n} for n=75n=75 by πk\pi_{k}^{*}, (right) pn,kp_{n,k} and its upper bound 2exp(k2/2n)2\exp(-k^{2}/2n). The visualization is restricted to the interval [0,1][0,1].

Polynomial and rational approximations to the monomial has received considerable attention in the approximation theory community, and surveys of various results can be found in [8, 6]. Polynomial approximations to high order monomials have many applications in numerical analysis. This key insight was exploited by Cornelius Lanczos [5] in his “τ\tau-method” for the numerical solution of differential equations. For a simulating discussion on this topic, please see [6]. In numerical linear algebra, this has been exploited to efficiently compute matrix powers and the Schatten p-norm of a matrix [1, 4].

In this short note, we show to construct a polynomial approximation xnϕk(x)x^{n}\approx\phi_{k}(x) using a truncated Chebyshev polynomial expansion. The error in the truncated representation equals the sum of the discarded coefficients and is precisely pn,kp_{n,k}. The polynomial ϕk\phi_{k} and the resulting error can both be computed analytically and, therefore, is of great practical use. We briefly review Chebyshev polynomials in Section 2 and state and prove the main result in Section 3. In Section 4, we explore probabilistic interpretations of pn,kp_{n,k} and obtain bounds for partial sums of binomial coefficients.

2 Chebyshev polynomials

The Chebyshev polynomials of the first kind Tn(x)T_{n}(x) for n=0,1,2,n=0,1,2,\dots can be represented as

Tn(x)=cos(narccosx)x[1,1].T_{n}(x)=\cos(n\arccos{x})\qquad x\in[-1,1].

Starting with T0(x)=1T_{0}(x)=1, T1(x)=xT_{1}(x)=x, the Chebyshev polynomials satisfy a recurrence relationship of the form Tn+1(x)=2xTn(x)Tn1(x)T_{n+1}(x)=2xT_{n}(x)-T_{n-1}(x) for n1n\geq 1. The Chebyshev polynomials are orthogonal with respect to the weighted inner product

u,v=11w(x)u(x)v(x)𝑑x\langle u,v\rangle=\int_{-1}^{1}w(x)u(x)v(x)dx

where the weight function takes the form w(x)=(1x2)1/2w(x)=(1-x^{2})^{-1/2}. Any function fC[1,1]f\in C[-1,1] that is Lipschitz continuous can be represented in terms of a Chebyshev polynomial expansion of the form

f(x)=12c0+j=1cjTj(x)x[1,1],f(x)=\frac{1}{2}c_{0}+\sum_{j=1}^{\infty}c_{j}T_{j}(x)\qquad x\in[-1,1],

where the coefficients cjc_{j} are obtained as cj=2πf(x),Tj(x)c_{j}=\frac{2}{{\pi}}\langle f(x),T_{j}(x)\rangle and the series is uniformly convergent. The monomial xnx^{n} is rather special since it has the following exact representation in terms of the Chebyshev polynomials [2, Section 4]

xn=j=0ncjTj(x),x^{n}=\sum_{j=0}^{n}{}^{{}^{\prime}}c_{j}T_{j}(x), (2)

where {}^{{}^{\prime}} means the summand corresponding to j=0j=0 is halved (if it appears) and the coefficients cjc_{j} for j=0,,nj=0,\dots,n are

cj={21n(n(nj)/2)nj even0otherwise.c_{j}=\left\{\begin{array}[]{ll}2^{1-n}\binom{n}{(n-j)/2}&n-j\text{ even}\\ 0&\text{otherwise}.\end{array}\right. (3)

Equation (2) takes a more familiar form, when we consider the trigonometric perspective of Chebyshev polynomials. For example, the well-known trigonometric identity cos(3θ)=4cos3θ3cosθ\cos(3\theta)=4\cos^{3}\theta-3\cos\theta, can be arranged as

cos3θ=34cosθ+14cos(3θ)=122((31)cosθ+(30)cos(3θ)).\cos^{3}\theta=\frac{3}{4}\cos\theta+\frac{1}{4}\cos(3\theta)=\frac{1}{2^{2}}\left(\binom{3}{1}\cos\theta+\binom{3}{0}\cos(3\theta)\right).

With x=cosθx=\cos\theta, we get x3=22(31)T1(x)+22(30)T3(x)x^{3}=2^{-2}\binom{3}{1}T_{1}(x)+2^{-2}\binom{3}{0}T_{3}(x). For completeness, we provide a derivation of (2) in Appendix A. It is important to note here that the series in (2) is finite, but can be truncated to obtain an accurate approximation; see Section 3.

Chebyshev polynomials have many applications in approximation theory and numerical analysis [9] but we limit ourselves to two such examples here. First, if the function is differentiable rr times or analytic, the Chebyshev coefficients exhibit decay (algebraic or geometric respectively). Therefore, the Chebyshev series can be truncated to obtain an polynomial approximation of the function and the accuracy of the approximation depends on the rate of decay of the coefficients. Another application of Chebyshev polynomials is in the theory and practice of polynomial interpolation. The polynomial qn1(x)=xn21nTn(x){q}_{n-1}^{*}(x)=x^{n}-2^{1-n}T_{n}(x) solves the minimax problem

minq𝒫n1xnq(x)=21n.\min_{q\in\mathcal{P}_{n-1}}\|x^{n}-q(x)\|_{\infty}=2^{1-n}. (4)

Based on the minimax characterization, to interpolate a function over [1,1][-1,1] by a polynomial of degree n1n-1, the function to be interpolated should be evaluated at the roots of the Chebyshev polynomial TnT_{n} given by the points xj=cos(2j+12nπ)x_{j}=\cos\left(\frac{2j+1}{2n}\pi\right) for j=0,,n1j=0,\dots,n-1.

3 Main result

We construct the polynomial approximation xnϕk(x)x^{n}\approx\phi_{k}(x) by truncating the Chebyshev polynomial expansion in (2) beyond the term j=kj=k. That is

ϕk(x):=j=0kcjTj(x).\phi_{k}(x):=\sum_{j=0}^{k}{}^{{}^{\prime}}c_{j}T_{j}(x).

Our main result is the following theorem, which quantifies the error in the polynomial approximation. The proof of this theorem is based on the expression in (2). We believe this result is new.

Theorem 1.

The error in the polynomial approximation ϕk(x)\phi_{k}(x) satisfies

xnϕk(x)=pn,k.\|x^{n}-\phi_{k}(x)\|_{\infty}=p_{n,k}.
Proof.

From (2), xnϕk(x)=j=k+1ncjTj(x)x^{n}-\phi_{k}(x)=\sum_{j=k+1}^{n}c_{j}T_{j}(x). Using triangle inequality, we find that xnϕk(x)j=k+1ncj\|x^{n}-\phi_{k}(x)\|_{\infty}\leq\sum_{j=k+1}^{n}c_{j} since the coefficients are nonnegative and the Chebyshev polynomials are bounded as |Tj(x)|1|T_{j}(x)|\leq 1. Substituting the coefficients cjc_{j} from (3), to get

xnϕk(x)12n1j=k+1nj evenn(n(nj)/2).\|x^{n}-\phi_{k}(x)\|_{\infty}\leq\frac{1}{2^{n-1}}\sum_{\begin{subarray}{c}j=k+1\\ n-j\text{ even}\end{subarray}}^{n}\binom{n}{(n-j)/2}. (5)

Using the properties of the binomial coefficients, the summation simplifies as

j=k+1nj evenn(n(nj)/2)=j=k+1n+j evenn(n(n+j)/2)=j=(n+k)/2+1n(nj).\sum_{\begin{subarray}{c}j=k+1\\ n-j\text{ even}\end{subarray}}^{n}\binom{n}{(n-j)/2}=\sum_{\begin{subarray}{c}j=k+1\\ n+j\text{ even}\end{subarray}}^{n}\binom{n}{(n+j)/2}=\sum_{j=\lfloor(n+k)/2\rfloor+1}^{n}\binom{n}{j}.

Plug this identity into (5) to get xnϕk(x)pn,k\|x^{n}-\phi_{k}(x)\|_{\infty}\leq p_{n,k}. The bound is clearly achieved at x=1x=1, where all the Chebyshev polynomials take the value 11. ∎

This theorem shows that the polynomial approximation ϕk\phi_{k} is nearly optimal, and the error due to this approximation is pn,kp_{n,k}. However, it is the optimal polynomial for the special case k=n1k=n-1. It is easy to see that xnϕn1(x)=21nTn(x)x^{n}-\phi_{n-1}(x)=2^{1-n}T_{n}(x) and so ϕn1\phi_{n-1} is the same as the best polynomial approximation qn1q^{*}_{n-1} in (4). For k<n1k<n-1, from (1) and Theorem 1

xnϕk(x)4exnπk(x)xnϕk(x),\frac{\|x^{n}-\phi_{k}(x)\|_{\infty}}{4e}\leq\|x^{n}-\pi_{k}^{*}(x)\|_{\infty}\leq\|x^{n}-\phi_{k}(x)\|_{\infty},

so that the error in the Chebyshev polynomial approximation is suboptimal by at most the factor 4e10.874e\approx 10.87. Therefore, by using ϕk\phi_{k} we lose only one significant digit of accuracy compared to πk\pi^{*}_{k}.

4 A probabilistic digression

In Section 1, we saw that the error in the monomial approximation depends on pn,kp_{n,k}. Since pn,kp_{n,k} depends on the sum of binomial coefficients, it has a probabilistic interpretation. Newman and Rivlin [7] observed that if a fair coin is tossed nn times, pn,kp_{n,k} is the probability that the magnitude of the difference between the number of heads and the number of tails exceeds kk. They used this insight along with the de Moivre-Laplace theorem [10, Section 1.3] (which is a special case of the Central Limit Theorem) to obtain the approximation pn,k2erfc(k/n)p_{n,k}\approx 2\text{erfc}(k/\sqrt{n}).

To convert this into a rigorous inequality for pn,kp_{n,k} we use a different tool from probability, namely, concentration inequalities. The inequalities are useful in quantifying how much a random variable deviates from its mean. We start with the following alternative interpretation for pn,kp_{n,k}: it is twice the probability that greater than (n+k)/2\lfloor(n+k)/2\rfloor coin tosses result in heads (or equivalently tails). We associate each coin toss with an independent Bernoulli random variable XiX_{i} with parameter p=1/2p=1/2 since the coin is fair. The random variable X=i=1nXiX=\sum_{i=1}^{n}X_{i} has the Binomial distribution with parameters nn and pp. Then,

pn,k=2((n+k)/2+1Xn)2(X(n+k)/2).p_{n,k}=2\mathbb{P}\left(\lfloor(n+k)/2\rfloor+1\leq X\leq n\right)\leq 2\mathbb{P}\left(X\geq(n+k)/2\right).

Since XX has the Binomial distribution, we can once again use the de Moivre-Laplace theorem, to say that as nn\rightarrow\infty,

Xnpnp(1p)𝒩(0,1),in distribution.\frac{X-np}{\sqrt{np(1-p)}}\longrightarrow\mathcal{N}(0,1),\qquad\text{in distribution}.

Roughly speaking, this theorem says that XX behaves as a normal random variable with mean n/2n/2 and variance n/4n/4. Since the tails of normal distributions decay exponentially, we expect that XX lies in the range n2±1.96n4\frac{n}{2}\pm 1.96\sqrt{\frac{n}{4}} with nearly 95%95\% probability; alternatively, the probability that it is outside this range is very small. To make this more precise, we apply Hoeffding’s concentration inequality [10, Theorem 2.2.6], to obtain

(X(n+k)/2)=(X𝔼[X]k/2)exp(k22n).\mathbb{P}\left(X\geq(n+k)/2\right)=\mathbb{P}\left(X-\mathbb{E}[X]\geq k/2\right)\leq\exp\left(-\frac{k^{2}}{2n}\right).

This gives our desired bound pn,k2exp(k2/2n)p_{n,k}\leq 2\exp(-{k^{2}}/{2n}).

We can use a similar technique to prove the following result which may be of independent interest. If 0kn/20\leq k\leq n/2, then

j=0k(nj)2nexp((n2k)22n).\sum_{j=0}^{k}\binom{n}{j}\leq 2^{n}\exp\left(-\frac{(n-2k)^{2}}{2n}\right).

Other concentration inequalities such as Chernoff and Bernstein (see [10, Chapter 2]) also give equally interesting bounds. We invite the reader to explore such results.

5 Acknowledgements

The author would like to thank Alen Alexanderian, Ethan Dudley, Ivy Huang, Ilse Ipsen, and Nathan Reading for comments and feedback. The work was supported by the National Science Foundation through the grants DMS-1745654 and DMS-1845406.

6 Declaration of Interest

The author has no relevant financial or non-financial competing interests to report.

Appendix A Derivation of the monomial expansion

In this appendix, we provide a short derivation of (2). We take x=cosθx=\cos\theta and write

cosnθ=(eiθ+eiθ2)=12nj=0n(nj)eijθei(nj)θ=12nj=0n(nj)ei(2jn)θ.\cos^{n}\theta=\left(\frac{e^{i\theta}+e^{-i\theta}}{2}\right)=\frac{1}{2^{n}}\sum_{j=0}^{n}\binom{n}{j}e^{ij\theta}e^{-i(n-j)\theta}=\frac{1}{2^{n}}\sum_{j=0}^{n}\binom{n}{j}e^{i(2j-n)\theta}.

We have used Euler’s formula and the binomial theorem. At this point, the derivation splits into two different paths:

1. nn is odd
cosnθ=\displaystyle\cos^{n}\theta= 12n(j=0(n1)/2(nj)ei(2jn)θ+j=(n+1)/2n(nj)ei(2jn)θ)\displaystyle\>\frac{1}{2^{n}}\left(\sum_{j=0}^{(n-1)/2}\binom{n}{j}e^{i(2j-n)\theta}+\sum_{j=(n+1)/2}^{n}\binom{n}{j}e^{i(2j-n)\theta}\right)
=\displaystyle= 12n1j=0(n1)/2(nj)cos((n2j)θ)\displaystyle\>\frac{1}{2^{n-1}}\sum_{j=0}^{(n-1)/2}\binom{n}{j}\cos((n-2j)\theta)
=\displaystyle= 12n1j=0nj evenn(n(nj)/2)cos(jθ).\displaystyle\>\frac{1}{2^{n-1}}\sum_{\begin{subarray}{c}j=0\\ n-j\text{ even}\end{subarray}}^{n}\binom{n}{(n-j)/2}\cos(j\theta).
2. nn is even
cosnθ=\displaystyle\cos^{n}\theta= 12n(j=0n/21(nj)ei(2jn)θ+(nn/2)+j=n/2+1n(nj)ei(2jn)θ)\displaystyle\>\frac{1}{2^{n}}\left(\sum_{j=0}^{n/2-1}\binom{n}{j}e^{i(2j-n)\theta}+\binom{n}{n/2}+\sum_{j=n/2+1}^{n}\binom{n}{j}e^{i(2j-n)\theta}\right)
=\displaystyle= 12n(nn/2)+12n1j=0n/21(nj)cos((n2j)θ)\displaystyle\>\frac{1}{2^{n}}\binom{n}{n/2}+\frac{1}{2^{n-1}}\sum_{j=0}^{n/2-1}\binom{n}{j}\cos((n-2j)\theta)
=\displaystyle= 12n1j=0nj evenn(n(nj)/2)cos(jθ).\displaystyle\>\frac{1}{2^{n-1}}\sum_{\begin{subarray}{c}j=0\\ n-j\text{ even}\end{subarray}}^{n}{}^{{}^{\prime}}\binom{n}{(n-j)/2}\cos(j\theta).

In either case, substitute x=cosθx=\cos\theta and Tj(x)=cos(jθ)T_{j}(x)=\cos(j\theta) to complete the derivation.

References

  • [1] H. Avron and S. Toledo. Randomized algorithms for estimating the trace of an implicit symmetric positive semi-definite matrix. Journal of the ACM (JACM), 58(2):8, 2011.
  • [2] W. J. Cody. A survey of practical rational and polynomial approximation of functions. SIAM Review, 12(3):400–423, 1970.
  • [3] T. A. Driscoll, N. Hale, and L. N. Trefethen. Chebfun Guide. Pafnuty Publications, 2014.
  • [4] E. Dudley, A. K. Saibaba, and A. Alexanderian. Monte Carlo estimators for the Schatten p-norm of symmetric positive semidefinite matrices. arXiv preprint arXiv:2005.10174, 2020.
  • [5] C. Lanczos. Applied analysis. Dover Books on Advanced Mathematics. Dover Publications, Inc., New York, 1988. Reprint of the 1956 original.
  • [6] Y. Nakatsukasa and L. Trefethen. Rational approximation of xnx^{n}. Proceedings of the American Mathematical Society, 146(12):5219–5224, 2018.
  • [7] D. J. Newman and T. J. Rivlin. Approximation of monomials by lower degree polynomials. Aequationes Mathematicae, 14(3):451–455, 1976.
  • [8] A. R. Reddy. Approximations to xnx^{n} and |x||x|—a survey. J. Approx. Theory, 51(2):127–137, 1987.
  • [9] L. N. Trefethen. Approximation theory and approximation practice. Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA, 2013.
  • [10] R. Vershynin. High-dimensional probability, volume 47 of Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press, Cambridge, 2018.