Explicit Improvements to the Burgess Bound via Pólya–Vinogradov
Abstract.
We make explicit a theorem of Fromm and Goldmakher [Fromm], which states that one can improve Burgess’ bound for short character sums simply by improving the leading constant in the Pólya–Vinogradov inequality. Towards achieving this, we establish explicit versions of several estimates related to the mean values of real multiplicative functions and the Dickman function.
1. Introduction
Given a Dirichlet character , it is often the case that we need to consider the size of the corresponding character sum,
(1) |
Owing to the orthogonality relation on residues modulo , one only ever needs to consider the case that the character sum is short, i.e., . In this case, we have the trivial estimate,
There are two standard non-trivial estimates for the size of (1). First, the Pólya–Vinogradov inequality, (henceforth referred to as the “P–V inequality”). Second, Burgess’ bound, , for and an integer . If the modulus is a prime, then both of these estimates can be used to show that
(2) |
for large enough .
One might consider the Burgess bound to be a better result, however, unless the character sum is particularly long. Specifically, P–V implies that (2) holds for , while Burgess’ bound implies that (2) holds for . The proof of the Burgess bound also relies on advanced results due to Weil [weil19481], while the standard proof of the P–V inequality is substantially easier. Finally, one of the best-known P–V inequalities is proved using the effective range of Burgess’ bound, see [Hildebrand1988].
Conversely, when working with explicit versions of these estimates, any improvement to the leading constant in the P–V inequality will immediately yield improvements in the leading constant for Burgess’ bound (see, for example, [trevino2015] and [Forrest]). Fromm and Goldmakher [Fromm] have recently established that, in fact, improvements to the P–V inequality can be used to extract improvements to the effective range (with respect to ) in Burgess’ bound. Precisely, they establish the following relationship.
Theorem 1.1.
[Fromm]*Theorem A Suppose the P–V inequality can be improved to for all even primitive quadratic . Then for all for all odd primitive quadratic .
Based on a suggestion Fromm and Goldmakher made in their paper, we will prove the following explicit version of Theorem 1.1. The interested reader may also consider the work of Mangerel [Mangerel], for a different approach to the relationship between P–V and Burgess.
Theorem 1.2.
Suppose the P–V inequality can be improved to
for all even primitive quadratic . Then for all odd primitive quadratic characters we have for , with and as in Lemma 1.3, such that .
The above result is particularly interesting, as it the first to shows that a Burgess-like result depends in a meaningful way on the leading constant in the P–V inequality. In Table 1, we compare for various using the best known P–V constant and several powers of .
0.5 | 0.25 | 0.05 | 0.025 | ||
---|---|---|---|---|---|
1 | |||||
From Table 1, one sees that roughly decays in magnitude as does. However, even to obtain an improvement over the trivial bound would require significant improvements over the best available choices of . One should expect this behaviour, since one also expects to be able to take tending to 0. Additionally, since the best in the P–V inequality is obtained via Burgess’ bound, one does not expect to have for all while is fixed. While there is room for improvement in , we believe that our result has significance as the first of its kind. This is also part of the reason, together with the heavy analytic machinery employed, why is not yet optimal. We hope this result will increase the interest in the explicit correlation between P–V and Burgess’ bound.
As an aside, note that in Theorem 1.2, we have still included some terms. This is because many of the best known P–V results appear in this form. This choice also makes the exposition more concise. Further attempts in line with this article, in particular those using completely explicit P–V results like [frolenkov2013] or [Bordignon], should be able to make the result completely explicit.
In order to obtain Theorem 1.2, we must establish some notation. Let
The result that allowed Fromm and Goldmakher to obtain Lemma A in [Fromm] is a correlation between the two functions defined above. This correlation, Lemma B in [Fromm], assures us that if is bounded away from zero, then will be as well (for certain ). The proof of Theorem 1.2 relies on establishing an explicit version of Lemma B in [Fromm].
Lemma 1.3.
Given and such that
with
for all completely multiplicative functions , , .
This result allows us to prove Theorem 1.2.
Proof of Theorem 1.2.
Here we follow the proof of Theorem A [Fromm]. Using Lemma 2.1 [Fromm] and assuming , we obtain infinitely many characters such that
with the least prime larger than which satisfies . We therefore have a contradiction if , i.e. when . We can further simplify this by observing that we trivially have , that results optimal for large . Using the version of Bertrand’s postulate for primes in arithmetic progressions in [Breusch], with the assumption , we have that . Note that assuming a smaller upper bound for , together with Corollary 6 in [Bennett], is possible to reduce the constant to , we decided not to do so to keep the result as concise as possible. Thus, we obtain
∎
The proof of Lemma 1.3 will require two results, which will make up the bulk of this article. The easier of these is the following explicit version of Theorem 2 in [Hildebrand] applied to (another non-explicit version of this result can be found in [Granville2007.1]). First, for a given multiplicative function , let us define
Theorem 1.4.
Let be a completely multiplicative function as defined in (1.1) of [Hildebrand]. Then, we have
The second result, which is the harder to prove, is an explicit version of Theorem III.4.14 in Hall and Tenenbaum [tenenbaum2015]. In our current application, we focus on functions which are quadratic Dirichlet characters, but there are variants of this theorem which cover a much larger class of functions (for example, see the main theorem of [hall1991]).
Theorem 1.5.
Let be the unique solution to
Note that . If is a real, completely multiplicative function, we have, uniformly for ,
(3) |
We can now easily prove Lemma 1.3.
Proof of Lemma 1.3.
In Section 2 we will prove Theorem 1.4. In Section 3 we will prove a partially explicit version of an upper bound for the mean value of multiplicative functions, that works as an intermediate result for Theorem 1.5. In Section 4, we introduce some explicit bounds related to prime numbers; applying these results to those obtained in the previous sections, we conclude with a proof of Theorem 1.5. To ease the understanding of the relationships between the results we introduce the following scheme.
2. Lower bound for the mean value theorem for a non-negative multiplicative function
The aim of this section is to prove Theorem 1.4. We start by giving an explicit lower bound for the Dickman function, , defined by
with initial conditions for . Note that we will follow Buchstab’s approach from [buchstab] for large , alongside computations for small .
Lemma 2.1.
Assuming , we have
(6) |
Proof.
Using the the built-in Dickman function in Sage, we determine that for we can take as an exponent . Note that we are limited to this interval due to the computational complexity. We can thus use the following result due to Buchstab [buchstab], that tells that for and , we have
(7) |
and the result follows taking . ∎
It worth noting that, by [buchstab], the right size for the constant in the exponent of (6) is . Since we want a uniform result, a lower bound for the term appears, by computation, to be . Obtaining this result appears difficult as (7) does not give a good estimate for small values of . One might get around this by making explicit other asymptotic results for , such as the one in [deBruijn], but we have not pursued this here. We can now prove Theorem 1.4.
3. A partially explicit upper bound for the mean value of multiplicative functions
In this section, we aim to prove an explicit version of a theorem of Montgomery [montgomery1978], regarding the mean value of multiplicative functions. He restricted his interest, as will we, to completely multiplicative functions. The more general case involves technical changes, see [tenenbaum2015], which make the leading constant increase significantly.
We start by introducing a well-known, but useful, result.
Lemma 3.1.
Assuming , with and we have
Proof.
By Euler–Maclaurin, we have
Thus, taking ,
Now it follows from
that
Proceeding in the same way, we obtain
The result easily follows remembering that . ∎
Everything is in place to prove an explicit version of the inequality in [montgomery1978]. Note that our result appears slightly different with compared with the cited one, as we have tailored the optimization of the constant for the current application.
Theorem 3.2.
Let be a completely multiplicative function such that . Set
We define
Then, for large enough,
(8) |
Proof.
We now establish, for , the following result
(9) |
By the Cauchy–Schwarz inequality, with ,
We can observe that, with ,
Defining , then
Thus, taking , the proof of (9) reduces to that of
The equation
allows us to write Plancherel’s formula as
We assume arbitrary large. For we have, by (4.46) in [tenenbaum2015]
We now need to estimate the contribution in the complementary range . We write
The right hand side integral does not exceed
We can observe that
and, choosing to have small enough, by Lemma 3.1 we obtain
Thus (9) is obtained taking . We now introduce (4.39) from [tenenbaum2015]
(10) |
With the above result and using (9) we can now finish the proof as follows
∎
4. Explicit mean value estimates for real multiplicative functions
In this section we aim to prove Theorem 1.5. We will first, in subsection 4.1 and 4.2, introduce some useful explicit results and then tackle Theorem 1.5 in subsection 4.3.
4.1. Prime counting estimates
Take to be the prime counting function. We provide two versions of the Prime Number Theorem (PNT), the first good for small and the second for big. Assuming , by [rosser1962] we have
(11) |
Defining
(12) |
and taking , by Corollary 2 [Trudgian], we have
(13) |
Note that there is a better version of the PNT due to Platt and Trudgian [Platt2019]. However, we will turn the above result into a uniform one and the improvement obtained using Platt and Trudgian’s result is not clear and would make the following exposition longer and more complicated. We also note that another way to improve the result could be using the improved zero-free region for the Riemann zeta function given in [Mossinghoff]. We now provide some useful bounds on .
Lemma 4.1.
For we have
Proof.
By repeatedly integrating (12) by parts, we have
and the result follows by observing that the last integral is positive for . ∎
From Lemma 5.9 [Bennett] we have, for
(14) |
We can now prove the main lemma.
Lemma 4.2.
For all we have
(15) |
(16) |
and
(17) |
Proof.
Let be a -periodic function of bounded variation on , writing , , we can now prove the following results. Assuming , by (15) we obtain
(18) |
and, by (16),
(19) |
By (17) we obtain
(20) |
We focus on the two following cases. For , , with , and
we obtain
(21) |
where the will be computed later, optimizing on and . For ,
(22) |
with , we obtain
(23) |
where the will be computed later optimizing on and .
The above upper bounds (21) and (23) will be used in the next section to prove an explicit version of Lemma III.4.13 of [tenenbaum2015]. It is interesting to note that within this non-explicit result, a stronger version of (13) was used, to assure that (21) and (23) would converge for any . As there is no explicit version of this stronger PNT, we have that the two series converge only for certain values of . This will come with a loss in a term in Lemma 4.7, and therefore balancing it with the above sums will be fundamental.
4.2. Some useful lemmas
The bulk of the proof of Theorem 1.5 can be contained in the following lemmas, which encapsulate explicit versions of Lemma III.4.13 of [tenenbaum2015].
Lemma 4.3.
Proof.
It is sufficient to prove this for . Define . By partial summation, we have
(27) | ||||
For the second term in (27), we have from Equation 3.6 of [hall1988] that, for any real and ,
(28) |
By the second mean value theorem for integrals, there exists a so that
(29) |
Recall Mertens’ second theorem in the following forms.
Proposition 4.4.
Let . We have
where and
Proof.
This is the Corollary to Theorem 5 in [rosser1962]. ∎
Proposition 4.5.
Let . We have
(31) |
Proof.
The bounds follow from Theorem 5 in [rosser1962] and some simple computations. Note that the upper bound is optimal, with equality occurring at . ∎
We also introduce a helpful estimate.
Proposition 4.6.
Let . We have
Proof.
For , one may verify that
When , we begin by applying partial summation to the sum in question
(32) | ||||
(33) |
One may compute the first integral exactly and find that it is bounded by 65.204. For the other instances of , it is suitable to use [dusart1998]*Theorem 1.10.7, which states that, for ,
(34) |
Taking (34) in (32) and simplifying, one arrives at
where
We observe that until , and then takes a maximum at . At this maximum, , establishing the result. ∎
We can now obtain an important explicit estimate.
Lemma 4.7.
Proof.
We may assume . Start by considering . We observe that the Taylor expansion of yields
(36) |
Hence,
(37) |
Applying Propositions 4.4 and 4.6 to (37), we obtain
(38) |
Let be a constant that will be chosen later. When , we have
(39) |
Now, we consider . If , then (37) yields
(40) | ||||
Noting that , , and , we can now take in Lemma 4.3. This yields
(41) |
where is taken from (25). Combining (40) and (41) gives
(42) |
For our choice of , we focus on minimizing . Taking , we find that the best choice of is 2.67 and this leads to (42) becoming
(43) |
If , we first consider the case that . Taking as in (22) and in Lemma 4.3, we obtain
(44) |
where is bounded in (26) (and therefore depends on choices of and ). It follows trivially from Proposition 4.4 and that
(45) |
Taking (44) and (45) together, with our choice for , yields
(46) | ||||
We need to optimize the last three terms of (46) with respect to and . For fixed and , it appears that as defined in (23) is decreasing in , but the savings are slight for large enough . Therefore, in the interest of simpler computations, we choose . Some rough optimization over the terms involving in (46) shows that gives a relatively small maximum over these terms as a function of . Making this choice of and bounding the terms by their maximum in , we determine that
(47) |
The final case to consider is and , but in this case the sum in question is bounded by the sum estimated in (45). Given our choice of , this implies
(48) |
Taking (43) as the worst case between (39), (43), (47) and (48), completes the proof.
∎
Here is interesting to note that, as it will be clear from the following results, the constant is the main contributor to the size of the constant in Theorem 1.5, and thus of . Thus reducing would be a good starting point to improve Theorem 1.2.
Lemma 4.8.
For , define to be the real number satisfying
Then,
for any .
Proof.
Let be the Dirichlet series corresponding to . We have the following estimate.
Lemma 4.9.
For , we have
(50) |
where .
Proof.
Since is completely multiplicative, we have that
Therefore,
Applying the Taylor expansion of to the inside of the above sum, we obtain
(51) |
The sum over primes in the right-most term can be bounded above by the “prime” zeta function
which converges for . Therefore, we have
(52) |
The equality in (52) follows from the definition of , since
The following result will be used in bounding the sum over the primes in Lemma 4.9.
Lemma 4.10.
Uniformly for , we have
Proof.
By partial summation we have
Using (3.6) from [rosser1962], we then obtain
the result now follows taking the maximum over . ∎
Note that using a better explicit version of the PNT could improve the above result, as this improvement appears to be minor we decided, for the sake of simplicity, for the above version.
4.3. Proof of Theorem 1.5
Consider , where and . By Lemma 4.9, we have
(53) |
We break the sum over primes in (53) at , yielding the bound
(54) | ||||||
Simply ignoring in the first sum on the right of (54) and applying Lemma 4.10 to the second sum, we obtain
(55) |
Now, we may apply Lemma 4.8 to the remaining sum in (55) and place this estimate in (53) to establish
(56) |
Write . Recalling Theorem 3.2, we see that (56) implies
The integer sum above is a computable constant. Calling its square root , we have
(57) |
and note that . Now, if is defined by
(58) |
then, for ,
(59) | ||||
Recalling the definition of in Lemma 4.8 and using Proposition 4.5 we easily obtain,
which, when applied to (57) implies
(60) |
Taking this estimate for in Theorem 3.2, we find that
(61) | ||||
Here, is the collected constant term up to this point. It follows from (58) that
and that for , we can take . For , we have . Furthermore, one may verify that for and for . Therefore, we may write
(62) | ||||
Acknowledgments
We would like to thank our supervisor, Tim Trudgian, for suggesting this topic to us and for his insightful comments. We are also grateful to Leo Goldmakher for his comments on an earlier version of this article and for bringing [Mangerel] to our attention.