Approximating monomials using Chebyshev polynomials
Abstract
This paper considers the approximation of a monomial $x^n$ over the interval $[-1,1]$ by a lower-degree polynomial. This polynomial approximation can be easily computed analytically and is obtained by truncating the analytical Chebyshev series expansion of $x^n$. The error in the polynomial approximation in the supremum norm has an exact expression with an interesting probabilistic interpretation. We use this interpretation along with concentration inequalities to develop a useful upper bound for the error.
Keywords— Chebyshev polynomials, Polynomial Approximation, Binomial Coefficients, Concentration inequalities
1 Motivation and Introduction
We are interested in approximating the monomial $x^n$ by a polynomial of degree $m < n$ over the interval $[-1,1]$. The monomials $1, x, \ldots, x^n$ form a basis for $\mathcal{P}_n$, the space of polynomials of degree at most $n$, so it seems unlikely that we can represent a monomial in terms of lower-degree polynomials. In Figure 1, we plot a few functions from the monomial basis; the basis functions look increasingly alike as we take higher and higher powers, i.e., they appear to “lose independence.” Numerical analysts often avoid the monomial basis in polynomial interpolation since it results in ill-conditioned Vandermonde matrices, leading to poor numerical performance in finite precision arithmetic. This loss of independence means that it is reasonable to approximate the monomial as a linear combination of lower-order monomials, i.e., by a lower-order polynomial approximation. The natural question to ask, therefore, is: how small can $m$ be so that a well-chosen polynomial of degree $m$ can accurately approximate $x^n$?

The surprising answer to this question is that we can approximate the monomial over $[-1,1]$ by a polynomial of small degree, which we will make precise. Let $\|\cdot\|_\infty$ denote the supremum norm on $[-1,1]$ and let $p_m^*$ be the best polynomial approximation to $x^n$ in this norm; that is,

$$\|x^n - p_m^*\|_\infty = \min_{p \in \mathcal{P}_m} \|x^n - p\|_\infty,$$

where $\mathcal{P}_m$ is a vector space of polynomials with real coefficients of degree at most $m$. The minimizer $p_m^*$ exists and is unique [9, Chapter 10], but does not have a closed-form expression. Newman and Rivlin [7, Theorem 2] showed that

$$\frac{1}{2}\,\Phi(m,n) \le \|x^n - p_m^*\|_\infty \le \Phi(m,n), \qquad (1)$$

where the term $\Phi(m,n)$ is given by the formula

$$\Phi(m,n) = 2^{1-n} \sum_{(n+m)/2 < j \le n} \binom{n}{j}.$$

(We briefly mention that the notation in our manuscript differs from [7] in that we reverse the roles of $m$ and $n$.) Since $\Phi(m,n)$ involves a sum of binomial coefficients, it has a probabilistic interpretation, which we explore in Section 4.
To see why a small $m$ is sufficient, consider the upper bound $\|x^n - p_m^*\|_\infty \le \Phi(m,n)$. In Section 4 we use the probabilistic interpretation to obtain the bound $\Phi(m,n) \le 2e^{-m^2/(2n)}$. Suppose we are given a user-defined tolerance $\epsilon > 0$. To ensure

$$\|x^n - p_m^*\|_\infty \le 2e^{-m^2/(2n)} \le \epsilon,$$

we need to choose $m \ge \sqrt{2n\log(2/\epsilon)}$. The accuracy of the polynomial approximation is visualized in Figure 2, where in the left panel we plot the monomial $x^n$ and the best polynomial approximation $p_m^*$. The polynomial is computed using the Remez algorithm, implemented in chebfun [3]. We see that the polynomial approximation looks very accurate. In the right panel, we display $\Phi(m,n)$, the upper bound in (1) on the error of the best polynomial approximation, as well as the upper bound $2e^{-m^2/(2n)}$. We see that $\Phi(m,n)$ and its upper bound both decay sharply with increasing $m$. Numerical evidence in [6] further confirms this analysis; the error behaves approximately like $\operatorname{erfc}(m/\sqrt{2n})$, where $\operatorname{erfc}$ is the complementary error function.
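The choice of $m$ described above is easy to check numerically. The following is a minimal sketch in Python; the helper name `phi` and the parameter values $n = 100$, $\epsilon = 10^{-3}$ are our own choices for illustration:

```python
import math

def phi(m, n):
    """Phi(m, n) = 2^(1-n) * sum of C(n, j) over (n + m)/2 < j <= n."""
    lo = math.floor((n + m) / 2) + 1
    return 2.0 ** (1 - n) * sum(math.comb(n, j) for j in range(lo, n + 1))

n, eps = 100, 1e-3
# Choosing m >= sqrt(2 n log(2 / eps)) guarantees 2 exp(-m^2 / (2 n)) <= eps.
m = math.ceil(math.sqrt(2 * n * math.log(2 / eps)))
assert phi(m, n) <= 2 * math.exp(-m ** 2 / (2 * n)) <= eps
```

For $n = 100$ and $\epsilon = 10^{-3}$, this selects $m = 39$: a degree-39 polynomial suffices to approximate $x^{100}$ to three digits.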

Polynomial and rational approximations to the monomial $x^n$ have received considerable attention in the approximation theory community, and surveys of various results can be found in [8, 6]. Polynomial approximations to high-order monomials have many applications in numerical analysis. This key insight was exploited by Cornelius Lanczos [5] in his “$\tau$-method” for the numerical solution of differential equations. For a stimulating discussion on this topic, please see [6]. In numerical linear algebra, this has been exploited to efficiently compute matrix powers and the Schatten $p$-norm of a matrix [1, 4].
In this short note, we show how to construct a polynomial approximation using a truncated Chebyshev polynomial expansion. The error in the truncated representation equals the sum of the discarded coefficients and is precisely $\Phi(m,n)$. The polynomial and the resulting error can both be computed analytically and are, therefore, of great practical use. We briefly review Chebyshev polynomials in Section 2 and state and prove the main result in Section 3. In Section 4, we explore probabilistic interpretations of $\Phi(m,n)$ and obtain bounds for partial sums of binomial coefficients.
2 Chebyshev polynomials
The Chebyshev polynomials of the first kind $T_k$ for $x \in [-1,1]$ can be represented as

$$T_k(x) = \cos(k \arccos x), \qquad k = 0, 1, 2, \ldots.$$

Starting with $T_0(x) = 1$, $T_1(x) = x$, the Chebyshev polynomials satisfy a recurrence relationship of the form $T_{k+1}(x) = 2x\,T_k(x) - T_{k-1}(x)$ for $k \ge 1$. The Chebyshev polynomials are orthogonal with respect to the weighted inner product

$$\langle f, g \rangle = \int_{-1}^{1} f(x)\, g(x)\, w(x)\, \mathrm{d}x,$$

where the weight function takes the form $w(x) = (1 - x^2)^{-1/2}$. Any function $f$ that is Lipschitz continuous on $[-1,1]$ can be represented in terms of a Chebyshev polynomial expansion of the form

$$f(x) = \sum_{j=0}^{\infty} a_j T_j(x),$$

where the coefficients are obtained as $a_j = \frac{2}{\pi}\langle f, T_j\rangle$ for $j \ge 1$ (with $a_0 = \frac{1}{\pi}\langle f, T_0\rangle$) and the series is uniformly convergent. The monomial $x^n$ is rather special since it has the following exact representation in terms of the Chebyshev polynomials [2, Section 4]

$$x^n = 2^{1-n} \,{\sum_{j=0}^{\lfloor n/2 \rfloor}}' \binom{n}{j}\, T_{n-2j}(x), \qquad (2)$$

where the prime means the summand corresponding to $T_0$ is halved (if it appears) and the coefficients $a_{n-2j}$ for $0 \le j \le \lfloor n/2 \rfloor$ are

$$a_{n-2j} = 2^{1-n} \binom{n}{j}. \qquad (3)$$
Equation (2) takes a more familiar form when we consider the trigonometric perspective of Chebyshev polynomials. For example, the well-known trigonometric identity $\cos 2\theta = 2\cos^2\theta - 1$ can be rearranged as

$$\cos^2\theta = \frac{1 + \cos 2\theta}{2}.$$

With $x = \cos\theta$, we get $x^2 = \frac{1}{2}\left(T_0(x) + T_2(x)\right)$. For completeness, we provide a derivation of (2) in Appendix A. It is important to note here that the series in (2) is finite, but can be truncated to obtain an accurate approximation; see Section 3.
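The coefficients in (2)–(3) can be checked numerically by converting the Chebyshev series back to the power basis, which should recover $x^n$ exactly. A small sketch using NumPy's Chebyshev utilities (the helper name `monomial_cheb_coeffs` is ours):

```python
import math
import numpy as np
from numpy.polynomial import chebyshev as C

def monomial_cheb_coeffs(n):
    """Chebyshev coefficients of x^n from (2)-(3): a_{n-2j} = 2^(1-n) C(n, j),
    with the coefficient of T_0 halved when it appears (n even)."""
    a = np.zeros(n + 1)
    for j in range(n // 2 + 1):
        a[n - 2 * j] = 2.0 ** (1 - n) * math.comb(n, j) * (0.5 if n == 2 * j else 1.0)
    return a

for n in range(1, 9):
    # Converting the Chebyshev series to the power basis should recover x^n.
    expected = np.zeros(n + 1)
    expected[n] = 1.0
    assert np.allclose(C.cheb2poly(monomial_cheb_coeffs(n)), expected)
```

For example, $n = 3$ gives the coefficients $(a_1, a_3) = (3/4, 1/4)$, matching $x^3 = \tfrac34 T_1(x) + \tfrac14 T_3(x)$.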
Chebyshev polynomials have many applications in approximation theory and numerical analysis [9], but we limit ourselves to two such examples here. First, if the function is differentiable $\nu$ times or analytic, the Chebyshev coefficients exhibit decay (algebraic or geometric, respectively). Therefore, the Chebyshev series can be truncated to obtain a polynomial approximation of the function, and the accuracy of the approximation depends on the rate of decay of the coefficients. Another application of Chebyshev polynomials is in the theory and practice of polynomial interpolation. The monic polynomial $2^{1-n}T_n$ solves the minimax problem

$$2^{1-n}T_n = \operatorname*{argmin}_{p \in \mathcal{P}_n,\ p \text{ monic}} \|p\|_\infty. \qquad (4)$$

Based on the minimax characterization, to interpolate a function over $[-1,1]$ by a polynomial of degree $n$, the function to be interpolated should be evaluated at the roots of the Chebyshev polynomial $T_{n+1}$, given by the points $x_j = \cos\left(\frac{(2j+1)\pi}{2(n+1)}\right)$ for $j = 0, 1, \ldots, n$.
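To illustrate interpolation at these points, the following sketch fits a degree-$n$ polynomial to $e^x$ at the roots of $T_{n+1}$ and measures the uniform error on a fine grid (the choice of function, degree, and tolerance are ours):

```python
import numpy as np
from numpy.polynomial import chebyshev as C

# Interpolate f(x) = exp(x) at the n + 1 roots of T_{n+1}, i.e. the points
# x_j = cos((2 j + 1) pi / (2 (n + 1))), and measure the uniform error.
n = 10
j = np.arange(n + 1)
nodes = np.cos((2 * j + 1) * np.pi / (2 * (n + 1)))
series = C.chebfit(nodes, np.exp(nodes), deg=n)
grid = np.linspace(-1.0, 1.0, 2001)
err = np.max(np.abs(C.chebval(grid, series) - np.exp(grid)))
assert err < 1e-9  # exp is analytic, so the coefficients decay geometrically
```

With $n + 1$ nodes and degree $n$, the least-squares fit is an exact interpolant; the tiny uniform error reflects the geometric decay of the Chebyshev coefficients of an analytic function.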
3 Main result
We construct the polynomial approximation $p_m$ by truncating the Chebyshev polynomial expansion in (2) beyond the term $T_m$; that is, we retain only the terms of degree at most $m$:

$$p_m(x) = 2^{1-n} {\sum_{\substack{0 \le j \le \lfloor n/2 \rfloor \\ n - 2j \le m}}}' \binom{n}{j}\, T_{n-2j}(x).$$
Our main result is the following theorem, which quantifies the error in the polynomial approximation. The proof of this theorem is based on the expression in (2). We believe this result is new.
Theorem 1.

The error in the polynomial approximation satisfies

$$\|x^n - p_m\|_\infty = \Phi(m,n).$$
Proof.

From (2),

$$x^n - p_m(x) = 2^{1-n} \sum_{0 \le j < (n-m)/2} \binom{n}{j}\, T_{n-2j}(x).$$

Using the triangle inequality, we find that $\|x^n - p_m\|_\infty \le \sum_{0 \le j < (n-m)/2} a_{n-2j}$, since the coefficients are nonnegative and the Chebyshev polynomials are bounded as $\|T_k\|_\infty \le 1$. Substituting the coefficients from (3), we get

$$\|x^n - p_m\|_\infty \le 2^{1-n} \sum_{0 \le j < (n-m)/2} \binom{n}{j}. \qquad (5)$$

Using the symmetry of the binomial coefficients, $\binom{n}{j} = \binom{n}{n-j}$, the summation simplifies as

$$\sum_{0 \le j < (n-m)/2} \binom{n}{j} = \sum_{(n+m)/2 < j \le n} \binom{n}{j}.$$

Plug this identity into (5) to get $\|x^n - p_m\|_\infty \le \Phi(m,n)$. The bound is clearly achieved at $x = 1$, where all the Chebyshev polynomials take the value $T_k(1) = 1$. ∎
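Theorem 1 is easy to verify numerically: truncate the series (2) after degree $m$ and compare the observed sup-norm error on a fine grid against $\Phi(m,n)$. A sketch (the parameter values $n = 12$, $m = 6$ are arbitrary):

```python
import math
import numpy as np
from numpy.polynomial import chebyshev as C

def phi(m, n):
    """Phi(m, n) = 2^(1-n) * sum of C(n, j) over (n + m)/2 < j <= n."""
    lo = math.floor((n + m) / 2) + 1
    return 2.0 ** (1 - n) * sum(math.comb(n, j) for j in range(lo, n + 1))

n, m = 12, 6
# Chebyshev series of x^n from (2), with terms of degree > m zeroed out.
coeffs = np.zeros(n + 1)
for j in range(n // 2 + 1):
    coeffs[n - 2 * j] = 2.0 ** (1 - n) * math.comb(n, j) * (0.5 if n == 2 * j else 1.0)
coeffs[m + 1:] = 0.0  # truncation: this is p_m
grid = np.linspace(-1.0, 1.0, 100001)
err = np.abs(grid ** n - C.chebval(grid, coeffs))
assert np.isclose(err.max(), phi(m, n))  # sup-norm error equals Phi(m, n)
assert np.isclose(err[-1], err.max())    # ...and it is attained at x = 1
```

For this example the discarded coefficients sum to $2^{-11}\left(\binom{12}{0} + \binom{12}{1} + \binom{12}{2}\right) = 79/2048$, which is exactly the observed maximum error.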
This theorem shows that the polynomial approximation $p_m$ is nearly optimal, and the error due to this approximation is $\Phi(m,n)$. Moreover, it is the optimal polynomial for the special case $m = n - 1$. It is easy to see that $p_{n-1} = x^n - 2^{1-n}T_n$, so $\|x^n - p_{n-1}\|_\infty = 2^{1-n}$ and $p_{n-1}$ is the same as the best polynomial approximation characterized by (4). For $m < n - 1$, from (1) and Theorem 1,

$$\|x^n - p_m\|_\infty = \Phi(m,n) \le 2\,\|x^n - p_m^*\|_\infty,$$

so that the error in the Chebyshev polynomial approximation is suboptimal by at most the factor 2. Therefore, by using $p_m$ we lose only one significant digit of accuracy compared to $p_m^*$.
4 A probabilistic digression
In Section 1, we saw that the error in the monomial approximation depends on $\Phi(m,n)$. Since $\Phi(m,n)$ is (up to scaling) a sum of binomial coefficients, it has a probabilistic interpretation. Newman and Rivlin [7] observed that if a fair coin is tossed $n$ times, $\Phi(m,n)$ is the probability that the magnitude of the difference between the number of heads and the number of tails exceeds $m$. They used this insight along with the de Moivre-Laplace theorem [10, Section 1.3] (which is a special case of the Central Limit Theorem) to obtain the approximation $\Phi(m,n) \approx \operatorname{erfc}\!\left(m/\sqrt{2n}\right)$.
To convert this into a rigorous inequality for $\Phi(m,n)$ we use a different tool from probability, namely, concentration inequalities. These inequalities quantify how much a random variable deviates from its mean. We start with the following alternative interpretation of $\Phi(m,n)$: it is twice the probability that more than $(n+m)/2$ of the coin tosses result in heads (or, equivalently, tails). We associate each coin toss with an independent Bernoulli random variable $X_i$ with parameter $1/2$, since the coin is fair. The random variable $X = \sum_{i=1}^{n} X_i$ has the Binomial distribution with parameters $n$ and $1/2$. Then,

$$\Phi(m,n) = 2\,\mathbb{P}\!\left( X > \frac{n+m}{2} \right).$$
Since $X$ has the Binomial distribution, we can once again use the de Moivre-Laplace theorem to say that, as $n \to \infty$,

$$\frac{X - n/2}{\sqrt{n/4}} \;\longrightarrow\; \mathcal{N}(0,1) \quad \text{in distribution}.$$

Roughly speaking, this theorem says that $X$ behaves as a normal random variable with mean $n/2$ and variance $n/4$. Since the tails of normal distributions decay exponentially, we expect that $X$ lies in the range $n/2 \pm \mathcal{O}(\sqrt{n})$ with nearly full probability; alternatively, the probability that it is outside this range is very small. To make this more precise, we apply Hoeffding's concentration inequality [10, Theorem 2.2.6] to obtain

$$\mathbb{P}\!\left( X > \frac{n+m}{2} \right) = \mathbb{P}\!\left( X - \frac{n}{2} > \frac{m}{2} \right) \le e^{-m^2/(2n)}.$$

This gives our desired bound $\Phi(m,n) \le 2e^{-m^2/(2n)}$.
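The Hoeffding bound can be compared directly against the exact binomial tail; a small sketch (the helper `phi` computes $\Phi(m,n)$ from its definition, and the $(m, n)$ grid is our own choice):

```python
import math

def phi(m, n):
    """Probability that |#heads - #tails| exceeds m in n fair coin tosses."""
    lo = math.floor((n + m) / 2) + 1
    return 2.0 ** (1 - n) * sum(math.comb(n, j) for j in range(lo, n + 1))

# Hoeffding's inequality gives the rigorous bound Phi(m, n) <= 2 exp(-m^2/(2n));
# the de Moivre-Laplace approximation suggests Phi(m, n) ~ erfc(m / sqrt(2 n)).
for n in (50, 200, 1000):
    for m in (5, 10, 20, 40):
        assert phi(m, n) <= 2.0 * math.exp(-m ** 2 / (2 * n))
```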
We can use a similar technique to prove the following result, which may be of independent interest: if $0 \le k < n/2$, then

$$\sum_{j=0}^{k} \binom{n}{j} \le 2^{n}\, e^{-(n-2k)^2/(2n)}.$$
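A bound of this type, obtained by applying Hoeffding's inequality to $\mathbb{P}(X \le k)$, is easy to test exhaustively for modest $n$; a sketch (the values of $n$ are our own choices):

```python
import math

# Exhaustive check of the partial-sum bound
#   sum_{j=0}^{k} C(n, j) <= 2^n * exp(-(n - 2k)^2 / (2n))   for 0 <= k < n/2,
# which follows from Hoeffding's inequality applied to P(X <= k).
for n in (10, 37, 120):
    for k in range(math.ceil(n / 2)):
        lhs = sum(math.comb(n, j) for j in range(k + 1))
        rhs = 2.0 ** n * math.exp(-((n - 2 * k) ** 2) / (2 * n))
        assert lhs <= rhs
```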
Other concentration inequalities, such as the Chernoff and Bernstein inequalities (see [10, Chapter 2]), also give equally interesting bounds. We invite the reader to explore such results.
5 Acknowledgements
The author would like to thank Alen Alexanderian, Ethan Dudley, Ivy Huang, Ilse Ipsen, and Nathan Reading for comments and feedback. The work was supported by the National Science Foundation through the grants DMS-1745654 and DMS-1845406.
6 Declaration of Interest
The author has no relevant financial or non-financial competing interests to report.
Appendix A Derivation of the monomial expansion
In this appendix, we provide a short derivation of (2). We take $x = \cos\theta$ and write

$$x^n = \cos^n\theta = \left( \frac{e^{i\theta} + e^{-i\theta}}{2} \right)^{n} = 2^{-n} \sum_{j=0}^{n} \binom{n}{j} e^{i(n-2j)\theta}.$$

We have used Euler's formula and the binomial theorem. At this point, the derivation splits into two different paths:

- 1. $n$ is odd: pairing the summands $j$ and $n - j$, which have equal binomial coefficients and opposite exponents, gives
$$x^n = 2^{1-n} \sum_{j=0}^{(n-1)/2} \binom{n}{j} \cos\left((n-2j)\theta\right).$$
- 2. $n$ is even: the same pairing applies for $j < n/2$, while the middle term $j = n/2$ contributes a constant, so
$$x^n = 2^{-n} \binom{n}{n/2} + 2^{1-n} \sum_{j=0}^{n/2-1} \binom{n}{j} \cos\left((n-2j)\theta\right).$$

In either case, substitute $\cos\left((n-2j)\theta\right) = T_{n-2j}(x)$ and $\theta = \arccos x$ to complete the derivation.
References
- [1] H. Avron and S. Toledo. Randomized algorithms for estimating the trace of an implicit symmetric positive semi-definite matrix. Journal of the ACM (JACM), 58(2):8, 2011.
- [2] W. J. Cody. A survey of practical rational and polynomial approximation of functions. SIAM Review, 12(3):400–423, 1970.
- [3] T. A. Driscoll, N. Hale, and L. N. Trefethen. Chebfun Guide. Pafnuty Publications, 2014.
- [4] E. Dudley, A. K. Saibaba, and A. Alexanderian. Monte Carlo estimators for the Schatten p-norm of symmetric positive semidefinite matrices. arXiv preprint arXiv:2005.10174, 2020.
- [5] C. Lanczos. Applied analysis. Dover Books on Advanced Mathematics. Dover Publications, Inc., New York, 1988. Reprint of the 1956 original.
- [6] Y. Nakatsukasa and L. Trefethen. Rational approximation of $x^n$. Proceedings of the American Mathematical Society, 146(12):5219–5224, 2018.
- [7] D. J. Newman and T. J. Rivlin. Approximation of monomials by lower degree polynomials. Aequationes Mathematicae, 14(3):451–455, 1976.
- [8] A. R. Reddy. Approximations to $x^n$ and $|x|$—a survey. J. Approx. Theory, 51(2):127–137, 1987.
- [9] L. N. Trefethen. Approximation theory and approximation practice. Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA, 2013.
- [10] R. Vershynin. High-dimensional probability, volume 47 of Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press, Cambridge, 2018.