This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

Composite values of shifted exponentials

Olli Järviniemi Department of Mathematics and Statistics, University of Turku, 20014 Turku, Finland [email protected]  and  Joni Teräväinen Department of Mathematics and Statistics, University of Turku, 20014 Turku, Finland [email protected]
Abstract.

A well-known open problem asks to show that 2n+52^{n}+5 is composite for almost all values of nn. This was proposed by Gil Kalai as a possible Polymath project, and was first posed by Christopher Hooley. We settle this problem assuming GRH and a form of the pair correlation conjecture. We in fact do not need the full power of the pair correlation conjecture, and it suffices to assume an inequality of Brun–Titchmarsh type in number fields that is implied by the pair correlation conjecture. Our methods apply in fact to any shifted exponential sequence of the form anba^{n}-b and show that, under the same assumptions, such numbers are kk-almost primes for a density 0 of natural numbers nn. Furthermore, under the same assumptions we show that apba^{p}-b is composite for almost all primes pp whenever (a,b)(2,1)(a,b)\neq(2,1).

2010 Mathematics Subject Classification:
11A41, 11R44, 11R42

1   Introduction

{NoHyper}Keywords: Exponential sequences, Chebotarev density theorem, sieve methods.

It is a notorious open question to determine whether a shifted exponential sequence anba^{n}-b for given a>1a>1 and b{0}b\in\mathbb{Z}\setminus\{0\} produces infinitely many primes as nn ranges over the positive integers \mathbb{N}. One expects that this should be the case whenever there is no obvious reason for anba^{n}-b to be composite for all large nn, the obvious reasons being that anba^{n}-b either has a fixed prime divisor or that it factors as the result of a polynomial identity. This leads to the following question.

Question 1.1.

Let a>1a>1 and b0,1b\neq 0,-1 be integers. Assume that for every q1q\geq 1 there exists 1rq1\leq r\leq q such that:

  • (i)

    The sequence aqn+rba^{qn+r}-b has no fixed prime divisor;

  • (ii)

    There is no m2m\geq 2, mqm\mid q such that barba^{-r} is an mmth power of a rational number, and further if 4q4\mid q there is no cc\in\mathbb{Q} such that bar=4c4ba^{-r}=-4c^{4});

Then does the sequence anba^{n}-b contain infinitely many primes?

Here case (ii) corresponds to the well-known fact that the binomial xnd[x]x^{n}-d\in\mathbb{Q}[x] is reducible if and only if dd is an mmth power of a rational number for some m2m\geq 2, mnm\mid n, or 4n4\mid n and d=4c4d=-4c^{4} for some rational cc. The exclusion of b=1b=-1 stems from the fact that for b=1b=-1 one easily sees that the only possible primes in the sequence are of the form a2m+1a^{2^{m}}+1, and as is discussed below, probabilistic heuristics suggest that there are only finitely many such primes (even though (i) and (ii) hold for (a,b)=(2,1)(a,b)=(2,-1), say). See Section 11 for further discussion on the necessity of the conditions.

Question 1.1 is closely connected with two conjectures that are among the oldest in number theory: the existence of Mersenne primes and Fermat primes. Mersenne primes are primes of the form 2p12^{p}-1 with pp a prime; one easily sees that any prime of the form 2n12^{n}-1 must be of this form. Many of the largest known primes are Mersenne primes. Fermat primes are primes in the sequence 22m+12^{2^{m}}+1; again, all the primes of the form 2n+12^{n}+1 are easily seen to have this shape. The first five Fermat numbers (m=0,1,2,3,4m=0,1,2,3,4) are all prime, but extensive numerical searches have produced no further Fermat primes. A widely-believed conjecture (supported by probabilistic arguments; see [11, Problem A3]) is that there are infinitely many Mersenne primes but only finitely many Fermat primes. These two assertions seem to be well beyond reach of all known methods.

Although little is known about the primality of 2n±12^{n}\pm 1, it is noteworthy that 2n±12^{n}\pm 1 can be prime only for a set of integers nn of natural density 0; this follows immediately from the form that the exponents of the Mersenne and Fermat primes take. In light of this, one conjectures more generally that anba^{n}-b is prime for a natural density 0 of natural numbers nn. Even this is still an open problem. The problem, in the concrete special case of the sequence 2n+52^{n}+5, was suggested by Gil Kalai in [16] as a possible Polymath project (with the comment that it “might be too hard”). The problem was originally studied by Christopher Hooley in his book [13], and was popularized by Peter Sarnak111Sarnak remarks in [30] that “Even a problem like 2n+52^{n}+5 being composite for almost all nn is very problematic (Hooley).” [30].

In the discussion in [16], it was suggested that in order to make any progress on the compositeness of 2n+52^{n}+5 one should assume GRH. Our first main result confirms that 2n+52^{n}+5 is indeed composite for almost all nn, as well as the analogous result for general shifted exponential sequences anba^{n}-b, assuming GRH and an additional Brun-Titchmarsh type inequality for primes satisfying conditions of form “p1(modm)p\equiv 1\pmod{m}, cc is a perfect mm^{\prime}th power modulo pp”. We postpone detailed discussion of the hypotheses to Subsection 1.1.

Hypothesis 1.2 (A Brun–Titchmarsh estimate on average).

Let aa and bb be fixed coprime integers with |a|,|b|>1|a|,|b|>1. The following holds for any small enough 0<ϵ<ϵ0<\epsilon^{\prime}<\epsilon:

Let mm be a positive integer and let mm^{\prime} be a prime with mmm^{\prime}\mid m and m[(logm)ϵ,mϵ]m^{\prime}\in[(\log m)^{\epsilon^{\prime}},m^{\epsilon}]. Then, uniformly for ymexp((logm)1/2)y\geq m\exp((\log m)^{1/2}), we have

(1.1) r=0m1|{py:p1(modm),bar is perfect mth power mod p}|2ϵ,ϵy2ϕ(m)2(logy)2max{1(logy)2,1(m)ϵ}.\displaystyle\begin{split}&\sum_{r=0}^{m^{\prime}-1}|\{p\leq y:p\equiv 1\pmod{m},ba^{-r}\text{ is perfect }m^{\prime}\text{th power mod }p\}|^{2}\\ &\ll_{\epsilon,\epsilon^{\prime}}\frac{y^{2}}{\phi(m)^{2}(\log y)^{2}}\max\left\{\frac{1}{(\log y)^{2}},\frac{1}{(m^{\prime})^{\epsilon}}\right\}.\end{split}

We then state our main results.

Theorem 1.3.

Assume GRH and Hypothesis 1.2. Let a>1a>1 and bb be integers. The natural density of positive integers nn such that anba^{n}-b is composite is 11.

Remark 1.4.

Hooley had shown in [13] that 2nb2^{n}-b is composite for almost all nn assuming GRH and an essentially self-serving hypothesis. Our task in this paper is to remove this self-serving hypothesis and replace it with a more natural one.

Remark 1.5.

One motivation for Sarnak to popularize Hooley’s problem was that Bourgain, Gamburd and Sarnak [3, Theorem 3] managed to settle the analogous problem for the Markoff numbers. Markoff triples are defined as the solutions (x1,x2,x3)3(x_{1},x_{2},x_{3})\in\mathbb{N}^{3} to the Diophantine equation x12+x22+x32=3x1x2x3x_{1}^{2}+x_{2}^{2}+x_{3}^{2}=3x_{1}x_{2}x_{3}, and the Markoff numbers are the increasing sequence formed by the largest coordinates of Markoff triples (x1,x2,x3)(x_{1},x_{2},x_{3}) (with multiplicities). By the result of [3] almost all Markoff numbers are composite, and their number up to XX is (logX)2\asymp(\log X)^{2}. Therefore, the sequences anba^{n}-b are even sparser.

Sarnak [30] also connected Hooley’s problem to the affine sieve developed in [29][2]; the affine sieve theorem of Salehi Golsefidy and Sarnak [29] applies to counting almost primes lying in an orbit of a group of affine linear transformations under the assumption that the Zariski closure of the group is Levi-semisimple. The authors of [29] present a heuristic argument for the necessity of the Levi-semisimplicity condition, which boils down to understanding almost primality questions for shifted exponential functions, such as the one above.

In addition to showing compositeness of anba^{n}-b for almost all nn, we are more generally able to show that anba^{n}-b is not a kk-almost prime (a number with at most kk prime factors) for a density 11 set of natural numbers nn. We prove this in the following strong sense.

Theorem 1.6.

Assume GRH and Hypothesis 1.2. Let a>1a>1 and b0b\neq 0 be coprime integers. Then there exists a constant c=ca,b>0c=c_{a,b}>0 such that the natural density of positive integers nn such that ω(anb)cloglogn\omega(a^{n}-b)\geq c\log\log n is 11.

Here, as usual, ω(n)\omega(n) denotes the number of distinct prime factors of nn.

We will in fact prove that the number of prime divisors of anba^{n}-b which are less than n\sqrt{n} is almost always loglogn\gg\log\log n. On the other hand, anba^{n}-b should of course have a lot of prime factors p>np>\sqrt{n} as well, but our method does not apply to detecting these large factors.

One naturally wonders whether one can say something about the compositeness of anba^{n}-b even when nn is restricted to an interesting subset of natural numbers, particularly the primes. In the case of Mersenne primes, which are of the form 2p12^{p}-1 with pp prime, it appears unknown that there are even infinitely many composite numbers in this sequence. Nevertheless, we are able to apply our methods also to the sequence apba^{p}-b with prime exponents, as long as we are not in the case a=2a=2, b=1b=1 that corresponds to Mersenne primes.

Theorem 1.7.

Assume GRH and Hypothesis 1.2. Let a>1a>1 and bb be integers with (a,b)(2,1)(a,b)\neq(2,1). The relative density of primes pp for which apba^{p}-b is composite is 11.

1.1   The hypotheses

In this subsection we discuss the hypotheses appearing in our results, namely the generalized Riemann hypothesis (GRH) and Hypothesis 1.2. We also connect Hypothesis 1.2 to the pair correlation conjecture (PCC) stated below.

The precise form of the generalized Riemann hypothesis (GRH) that we need is as follows. It only involves certain special field extensions, which we call Kummer-type extensions.

Definition 1.8.

We say that a field extension K/K/\mathbb{Q} is a Kummer-type extension if KK is of the form K=(ζ,a11/m1,,ak1/mk)K=\mathbb{Q}(\zeta,a_{1}^{1/m_{1}},\ldots,a_{k}^{1/m_{k}}), where ζ\zeta is a primitive root of unity of some order mm, ai>1a_{i}>1 are integers and mimm_{i}\mid m for all iki\leq k.

GRH: For any Artin LL-function222See [24] for an introduction to the theory of Artin LL-functions. associated with a Kummer-type extension, its zeros in the critical strip Re(s)(0,1)\textnormal{Re}(s)\in(0,1) all lie on the critical line Re(s)=1/2\textnormal{Re}(s)=1/2.

We then give a couple of remarks on Hypothesis 1.2.

  • The condition “p1(modm)p\equiv 1\pmod{m}, barba^{-r} is perfect mm^{\prime}th power mod pp” is naturally viewed as “pp splits completely in (ζm,(bar)1/m)\mathbb{Q}(\zeta_{m},(ba^{-r})^{1/m^{\prime}})”. Hence Hypothesis 1.2 asks for a bound on the number of totally split primes in a number field.

  • One may view the hypothesis as a generalization of the classical Brun–Titchmarsh inequality (with extra averaging), which states that, uniformly in the region y>my>m, we have

    |{py:psplits completely in(ζm)}|yϕ(m)log(y/m).\displaystyle|\{p\leq y:p\,\,\textnormal{splits completely in}\,\,\mathbb{Q}(\zeta_{m})\}|\ll\frac{y}{\phi(m)\log(y/m)}.

    This, combined with the fact that pp can only split completely in Oa,b(1)O_{a,b}(1) extensions of the form (ζm,(bar)1/m)\mathbb{Q}(\zeta_{m},(ba^{-r})^{1/m^{\prime}}), gives an upper bound for the left-hand side of (1.1) of the form y2/(ϕ(m)2(log(y/m))2)\ll y^{2}/(\phi(m)^{2}(\log(y/m))^{2}), and so this trivial bound falls short of Hypothesis 1.2 by a few logarithms at most.

  • One would more strongly expect for the left-hand side of (1.1) a bound of

    (1.2) ϵ,Ay2ϕ(m)2(logy)2max{1(logy)A,1m}\displaystyle\ll_{\epsilon^{\prime},A}\frac{y^{2}}{\phi(m)^{2}(\log y)^{2}}\max\left\{\frac{1}{(\log y)^{A}},\frac{1}{m^{\prime}}\right\}

    (and also without the primality assumption on mm^{\prime}), since one can prove under our assumptions that [(ζm,(bar)1/m:]ϕ(m)m[\mathbb{Q}(\zeta_{m},(ba^{-r})^{1/m^{\prime}}:\mathbb{Q}]\asymp\phi(m)m^{\prime}, and the probability of a prime splitting completely in K/K/\mathbb{Q} should be comparable to 1/[K:]1/[K:\mathbb{Q}] with a wide range of uniformity, as is known in the case of cyclic extensions.

    In fact, under GRH (which we are assuming in any case) the Chebotarev density theorem gives an asymptotic estimate of the form

    (1.3) y[(ζm,(bar)1/m):]+Oη((ym)1/2+η)\displaystyle\frac{y}{[\mathbb{Q}(\zeta_{m},(ba^{-r})^{1/m^{\prime}}):\mathbb{Q}]}+O_{\eta}((ym)^{1/2+\eta})

    for the number of primes splitting completely in (ζm,(bar)1/m)\mathbb{Q}(\zeta_{m},(ba^{-r})^{1/m^{\prime}}), and this gives (1.2) for ym3+O(η)y\geq m^{3+O(\eta)}. Therefore, the essence of Hypothesis 1.2 is that we can also deal with the range where ym1+δy\approx m^{1+\delta}.

  • Hypothesis 1.2 should be true in the full range m[1,m]m^{\prime}\in[1,m],. However, as is clear from the bound given by the Chebotarev density theorem, (1.1) becomes much more challenging to prove as the quantity ϕ(m)m\phi(m)m^{\prime} increases. Therefore, in our arguments we exercise great care to minimize the range of mm^{\prime} in which we assume (1.1), in the hope that the case of small mm^{\prime} would turn out to lie less deep than the case of large mm^{\prime}.

  • The bound (1.1) resembles a Brun–Titchmarsh estimate in number fields, as our naming of the hypothesis suggests. Unfortunately, the current Brun–Titchmarsh analogues of the Chebotarev density theorem are too weak to imply the Hypothesis. For example, the results of [32] or [5] are not applicable when mlogmm^{\prime}\gg\log m.

Next we connect Hypothesis 1.2 to the pair correlation conjecture (PCC), which in fact implies our hypothesis.

PCC: Let K/K/\mathbb{Q} be a Kummer-type extension, and let L(s,χ,K/)L(s,\chi,K/\mathbb{Q}) be the Artin LL-function associated with an irreducible character χ\chi of Gal(K/)\textnormal{Gal}(K/\mathbb{Q}). Define the pair correlation function

F(X,T;χ):=Tγ1,γ2Tw(γ1γ2)X(γ1γ2),\displaystyle F(X,T;\chi):=\sum_{-T\leq\gamma_{1},\gamma_{2}\leq T}w(\gamma_{1}-\gamma_{2})X^{(\gamma_{1}-\gamma_{2})},

where γ1,γ2\gamma_{1},\gamma_{2} run through the imaginary parts of zeros of L(s,χ,K/)L(s,\chi,K/\mathbb{Q}) on the critical line Re(s)=1/2\textnormal{Re}(s)=1/2 and w(u):=44+u2w(u):=\frac{4}{4+u^{2}} is a weight function (note that F(X,T;χ)F(X,T;\chi) is always real-valued). Further define the conductor 𝒜χ(T):=AχTχ(1)\mathcal{A}_{\chi}(T):=A_{\chi}T^{\chi(1)}, where Aχ:=dKχ(1)Norm(𝔣(χ))A_{\chi}:=d_{K}^{\chi(1)}\textnormal{Norm}(\mathfrak{f}(\chi)), where dKd_{K} is the degree of the field extension K/K/\mathbb{Q} and 𝔣(χ)\mathfrak{f}(\chi) is the Artin conductor of χ\chi. With this notation, for any C>0C>0 and 1XTχ(1)C1\leq X\leq T^{\chi(1)C}, we have

F(X,T;χ)Cχ(1)1Tlog𝒜χ(T).\displaystyle F(X,T;\chi)\ll_{C}\chi(1)^{-1}T\log\mathcal{A}_{\chi}(T).

Remarks.

  • This PCC conjecture is denoted PCC(χ,χ(1),χ(1)1,1)\textnormal{PCC}(\chi,\chi(1),\chi(1)^{-1},1) by M. R. Murty, V. K. Murty and Wong in [21, Conjecture 3.2] and is a special case of a more general conjecture PCC(χ,mχ,cχ,r)\textnormal{PCC}(\chi,m_{\chi},c_{\chi},r) stated there. They state that this conjecture first arose in unpublished work of M. R. Murty and V. K. Murty 20 years earlier.

  • Evidence for PCC includes the result [21, Proposition 3.1], which unconditionally gives

    F(X,T;χ)AT(log𝒜χ(T))2,\displaystyle F(X,T;\chi)\ll_{A}T(\log\mathcal{A}_{\chi}(T))^{2},

    so the content of the conjecture is in reducing the power of logarithm here. As noted in [21], the form of PCC stated here is weaker than some other forms of the pair correlation conjecture in the aspect that they further require an asymptotic formula for F(X,T;χ)F(X,T;\chi). In particular, Montgomery’s  [19] classical pair correlation conjecture for the Riemann zeta function (which corresponds to K=K=\mathbb{Q} and χ1\chi\equiv 1) predicts not only an upper bound of TlogT\ll T\log T for the pair correlation function of ζ(s)\zeta(s) but more strongly an asymptotic formula of the same order of magnitude. Montgomery in fact proved this asymptotic formula unconditionally when C1C\leq 1 in the notation of the PCC conjecture above. See also [25] for numerical evidence for the pair correlation conjecture in the case of the Riemann zeta function, [28] for an asymptotic formulation for all automorphic LL-functions, and [22] for the formulation for any Dirichlet series in the Selberg class.

Theorem 1.9.

PCC for all Kummer-type extensions implies Hypothesis 1.2.

In fact, as will be clear from the proof, PCC implies a much stronger version of Hypothesis 1.2, where we can take ϵ=1\epsilon=1.

Let us briefly comment on how our two assumptions, GRH and Hypothesis 1.2, enter the proof. The GRH assumption is clearly necessary, as will be seen from formula (1.4) below (which is currently provable under GRH but not unconditionally). Also, Chebotarev’s density theorem is a key tool in our proofs, and in order for its error bounds to be strong enough we need to assume GRH.

Hypothesis 1.2 is put into use in only one part of the proof. The idea is roughly as follows. For pp a prime, let (p)\ell(p) denote the least positive integer ll such that alb(modp)a^{l}\equiv b\pmod{p} (if such an integer ll exists). In the course of the proof, we need upper bounds for the number of primes pyp\leq y that satisfy p1(modm)p\equiv 1\pmod{m} and (p)r(modm)\ell(p)\equiv r\pmod{m}, for yy just a bit larger than mm (say ym1+ϵy\approx m^{1+\epsilon} or ymexp((logm)1/2)y\approx m\exp((\log m)^{1/2})). This condition can be naturally interpreted in terms of splitting of primes in certain Kummer-type extensions, and the Chebotarev density theorem tells us that such pp are roughly333One does not necessarily have exact equidistribution (in sense of the natural density) even for mm fixed. This technical detail is discussed in more detail later on. equidistributed among the different values of rr, at least for yy large in terms of mm. The small values of yy (compared to mm) are more difficult. However, as we only need upper bounds rather than actual equidistribution, we can afford to weaken the (p)r(modm)\ell(p)\equiv r\pmod{m} condition to (p)r(modm)\ell(p)\equiv r\pmod{m^{\prime}} for mmm^{\prime}\mid m being a relatively small divisor of the modulus. Now, Hypothesis 1.2 essentially states that the values (p)(modm)\ell(p)\pmod{m^{\prime}} for the primes pyp\leq y with p1(modm)p\equiv 1\pmod{m} are not clustered into a too small a set of residues, which is what we need in the proof.

In the case yy is very close to mm (ymexp((logm)1/2)y\leq m\exp((\log m)^{1/2})), we may neglect the condition (p)r(modm)\ell(p)\equiv r\pmod{m} and merely work with the congruence conditions p1(modm)p\equiv 1\pmod{m}. Such a crude estimate does not provide estimates good enough for our purposes in regions larger than ymexp((logm))1/2)y\leq m\exp((\log m))^{1/2}), so for larger values of yy one has to somehow account for the condition (p)r(modm)\ell(p)\equiv r\pmod{m}. As Hypothesis 1.2 is essentially our only tool for such considerations for yy relatively close to mm, we require it to be applicable in the region ymexp((logm)1/2)y\geq m\exp((\log m)^{1/2}).

1.2   Unconditional work

Not much is known about the composite values of anba^{n}-b unconditionally (except in the trivial case where conditions (i) and (ii) in Question 1.1 are not satisfied). Note that if the congruence anb0(modp)a^{n}-b\equiv 0\pmod{p} has a solution (p)\ell(p) (which we assume to be the minimal positive solution), the general solution to the congruence is given by n(p)(modordp(a))n\equiv\ell(p)\pmod{\textup{ord}_{p}(a)}. The set of such nn has natural density 1/ordp(a)1/\textup{ord}_{p}(a). Thus, by the Borel–Cantelli lemma, a necessary (but not sufficient) condition for almost all numbers of the form anba^{n}-b to be have a prime factor K\geq K for every given KK is that

(1.4) ppanbfor somen1ordp(a)=.\displaystyle\sum_{\begin{subarray}{c}p\\ p\mid a^{n}-b\,\,\textnormal{for some}\,\,n\end{subarray}}\frac{1}{\textup{ord}_{p}(a)}=\infty.

This estimate has not been proved unconditionally, meaning that if one wants to make progress on the composite values of anba^{n}-b, one must assume a conjecture that implies (1.4). It seems that the best known unconditional estimate for the number of primes pxp\leq x dividing some element of the sequence anba^{n}-b is logx\gg\log x (in contrast with the conjectured order of magnitude of π(x)\asymp\pi(x)), proved by M. R. Murty, Séguin and Stewart [23]; this is a bound that is unfortunately far too weak to imply (1.4).

It is also worth noting that Hooley showed unconditionally in [13] that the sequence n2n+1n\cdot 2^{n}+1 produces primes (the Cullen primes) for a density zero of natural numbers nn; as pointed out by Elsholtz in [6], this is a lot easier than the corresponding question for the sequence anba^{n}-b, since it is easier to control the distribution of Cullen primes in residue classes (regarding the problem of the sequence anba^{n}-b, he states that “current methods do not work” for it). Hooley’s method was further refined by Rieger [26] to yield that p2p+1p\cdot 2^{p}+1 is prime for a relative density 0 of primes pp.

1.3   Connection with the Artin primitive root conjecture

The most natural conjecture known to imply (1.4) is GRH; this implication follows from work of Moree and Stevenhagen [20]. Their work is related to Hooley’s [12] work on Artin’s primitive root conjecture, which is the statement that ordp(a)=p1\textup{ord}_{p}(a)=p-1 for infinitely many primes, whenever a>1a>1 is a fixed integer that is not a perfect power. A wide generalization of Artin’s conjecture was proved by Lenstra [18], again under GRH, and in fact our proof of Theorem 1.3 involves (among other things) adapting his work to sequences of the form anba^{n}-b. This produces the following intermediate result containing (1.4) (by Mertens’ theorem), which may be of independent interest.

Corollary 1.10.

Assume GRH. Let a>1a>1 be an integer that is not a perfect power, and let b0b\neq 0. Let k1k\geq 1 be the largest integer for which bb is a perfect kkth power (if |b|=1|b|=1, we define k=1k=1). Then the set of primes pp satisfying panbp\mid a^{n}-b for some nn and ordp(a)=p12k\textup{ord}_{p}(a)=\frac{p-1}{2k} possesses a natural density, which is positive and can be computed explicitly.

1.4   Acknowledgments

The authors thank the anonymous referees for numerous helpful corrections. The authors thank Peter Sarnak and Jesse Thorner for helpful discussions. The first author was supported by the Emil Aaltonen foundation and worked in the Finnish Centre of Excellence in Randomness and Structures (Academy of Finland grant no. 346307). The second author was supported by a Titchmarsh Fellowship and by European Union’s Horizon Europe research and innovation programme under Marie Skłodowska-Curie grant agreement No 101058904.

2   Notation, conventions, and some preliminaries

Without further mention, we assume GRH in the lemmas and theorems that follow. We will assume Hypothesis 1.2 only in the proof of Lemma 7.1 below.

We may assume in the proofs of Theorems 1.3 and 1.6 that aa is not a perfect power, as the case of aa being a perfect power immediately reduces to the generic case. If |b|1|b|\leq 1, the conclusion of Theorems 1.3 and 1.7 is trivial, since am1an1a^{m}-1\mid a^{n}-1 for mnm\mid n and am+1an+1a^{m}+1\mid a^{n}+1 for mnm\mid n and mm odd. Thus we may assume |b|>1|b|>1 when proving these theorems. We may also assume that (a,b)=1(a,b)=1, since otherwise anba^{n}-b is composite.

If pp is a prime not dividing aa, we denote by ordp(a)\textup{ord}_{p}(a) the least positive integer ee such that ae1(modp)a^{e}\equiv 1\pmod{p}.

We denote by ζk\zeta_{k} any primitive kkth root of unity.

We let τ(n),ϕ(n)\tau(n),\phi(n) denote the divisor and Euler phi functions, respectively. We let (a,b)(a,b) stand for the greatest common divisor of aa and bb and lcm(a,b)\textnormal{lcm}(a,b) for their least common multiple. By rad(n)\textnormal{rad}(n) we denote the product of the prime factors of nn.

The natural density of a set SS\subset\mathbb{N} is defined as the limit

d(S):=limx|S[1,x]|x,d(S):=\lim_{x\to\infty}\frac{|S\cap[1,x]|}{x},

provided that the limit exists. We similarly define the relative density of a set AA\subset\mathbb{N} in the set BB\subset\mathbb{N} as

limx|A[1,x]||B[1,x]|,\lim_{x\to\infty}\frac{|A\cap[1,x]|}{|B\cap[1,x]|},

assuming that ABA\subset B and that the limit exists. The density d(𝒫)d_{\mathbb{P}}(\mathcal{P}) of a subset 𝒫\mathcal{P} of the primes \mathbb{P} is to be considered as its relative density in the set of primes.

For any subset SS of the primes, we define πS(x):=|{px:pS}|\pi_{S}(x):=|\{p\leq x:\,\,p\in S\}|.

Given an extension L/KL/K of number fields, a prime 𝔭\mathfrak{p} of 𝒪K\mathcal{O}_{K}, and a prime 𝔓\mathfrak{P} of 𝒪L\mathcal{O}_{L} lying above 𝔭\mathfrak{p}, we define the Artin symbol (L/K𝔓)\left(\frac{L/K}{\mathfrak{P}}\right) as the unique element σGal(L/K)\sigma\in\textnormal{Gal}(L/K) satisfying

σ(α)αNorm(𝔭)(mod𝔓)\displaystyle\sigma(\alpha)\equiv\alpha^{\textnormal{Norm}(\mathfrak{p})}\pmod{\mathfrak{P}}

for all αL\alpha\in L. Further, if K=K=\mathbb{Q} and pp is a rational prime unramified in LL, we define (L/p)\left(\frac{L/\mathbb{Q}}{p}\right) as the conjugacy class of possible values of (L/𝔓)\left(\frac{L/\mathbb{Q}}{\mathfrak{P}}\right) with 𝔓\mathfrak{P} lying above pp; it can be checked that such values do indeed form a conjugacy class. (This definition could be extended to KK\neq\mathbb{Q} as well, but we will only need the case K=K=\mathbb{Q}.)

The prime pp is said to split completely in LL if (L/p)\left(\frac{L/\mathbb{Q}}{p}\right) is the conjugacy class consisting of the identity of Gal(L/)\textup{Gal}(L/\mathbb{Q}).

We use the following version of the Chebotarev density theorem, due to Serre [31] and conditional on GRH (this improves on work of Lagarias and Odlyzko [17]).

Lemma 2.1 (Chebotarev density theorem).

Assume GRH. Let KK be a finite Galois extension of \mathbb{Q} with Galois group GG. Let CC be a conjugacy class of GG. The number of (rational) unramified primes pp with pxp\leq x and Artin symbol (K/p)=C\left(\frac{K/\mathbb{Q}}{p}\right)=C is

πC(x)=|C||G|Li(x)+O(|C||G|x(logdisc(K/)+[K:]logx)),\pi_{C}(x)=\frac{|C|}{|G|}\textup{Li}(x)+O\left(\frac{|C|}{|G|}\sqrt{x}(\log\textnormal{disc}(K/\mathbb{Q})+[K:\mathbb{Q}]\log x)\right),

where disc(K/)\textnormal{disc}(K/\mathbb{Q}) is the discriminant of K/K/\mathbb{Q} and Li(x)=2xdtlogt\textup{Li}(x)=\int_{2}^{x}\frac{dt}{\log t}.

Proof.

This is [31, Théorème 4]. ∎

3   Overview of the method

In this section, we give a sketch of the proof method of Theorem 1.3; the actual details in subsequent sections are slightly different and more complicated, but the purpose of this section is just to illustrate the approach.

As already mentioned, we may assume that aa is not a perfect power. Let

(3.1) Pa={p:axb(modp)for somex};\displaystyle P_{a}=\{p\in\mathbb{P}:\,\,a^{x}\equiv b\pmod{p}\,\,\textnormal{for some}\,\,x\};

since PaP_{a} contains those primes pp for which aa is a primitive root, under GRH the set PaP_{a} contains a positive proportion of the primes. For each pPap\in P_{a}, there exists a unique integer (p)[1,ordp(a)]\ell(p)\in[1,\textup{ord}_{p}(a)] such that a(p)b(modp)a^{\ell(p)}\equiv b\pmod{p}. Now, for any integer n(p)(modordp(a))n\equiv\ell(p)\pmod{\textup{ord}_{p}(a)}, we have anb(modp)a^{n}\equiv b\pmod{p}, implying that anba^{n}-b is not prime (except possibly for n=(p)n=\ell(p)). The goal is to prove that the density of integers covered by such arithmetic progressions (p)(modordp(a))\ell(p)\pmod{\textup{ord}_{p}(a)} is 11. Thus the statement is that the residue classes (p)(modordp(a))\ell(p)\pmod{\textup{ord}_{p}(a)} for pPp\leq P\to\infty are nearly (up to a vanishing proportion of numbers) a covering system of the integers. This is generally speaking a rather delicate condition; see [8] and [1] for work on finite sets of congruences covering all but an ε\varepsilon-proportion of the integers444Note that whether or not residue classes ap(modp1)a_{p}\pmod{p-1} cover almost all numbers depends heavily on the values of the apa_{p}; for example, it is known that the proportion of integers covered by 0(modp1)0\pmod{p-1} for p>Pp>P approaches 0 as PP\to\infty. Also if 𝒫\mathcal{P} is any subset of the primes with sum of reciprocals less than 12\frac{1}{2}, say, then ap(modp1)a_{p}\pmod{p-1} for p𝒫p\in\mathcal{P} leaves uncovered a positive proportion of all numbers..

If the conditions x(p)(modordp(a))x\equiv\ell(p)\pmod{\textup{ord}_{p}(a)} were independent of each other (which would be the case by the Chinese reminder theorem if the moduli ordp(a)\textup{ord}_{p}(a) and ordq(a)\textup{ord}_{q}(a) were coprime for all pqp\neq q), the density of integers not covered by such congruence conditions would be

pPa(11ordp(a)).\prod_{p\in P_{a}}\left(1-\frac{1}{\textup{ord}_{p}(a)}\right).

Under GRH the relative density of PaP_{a} inside the primes is positive, so by Mertens’ theorem the above product evaluates to 0.

However, since ordp(a)\textup{ord}_{p}(a) and ordq(a)\textup{ord}_{q}(a) are typically not coprime555Note that it does not help to restrict to a subset of primes for which ordp(a)/2\textup{ord}_{p}(a)/2 are pairwise coprime, since such a set always has finite sum of reciprocals. If for example we restrict to primes pp such that (p1)/2(p-1)/2 is also prime, then it is known that such primes have bounded sum of reciprocals (even though their infinitude is not known)., we have to take into account the dependencies between the conditions imposed by different primes. For this we utilize the second moment method.

Lemma 3.1.

Let (Ω,Pr)(\Omega,\textnormal{Pr}) be any finitely additive probability space, and let A1,A2,,AnΩA_{1},A_{2},\ldots,A_{n}\in\Omega be events. Denote

(3.2) μ:=i=1nPr(Ai).\displaystyle\mu:=\sum_{i=1}^{n}\Pr(A_{i}).

Then, for any ε>0\varepsilon>0, we have

(3.3) Pr(xΩ:||{i[1,n]:xAi}|μ|εμ)ε2(i=1nj=1nPr(AiAj)/μ21).\displaystyle\Pr\left(x\in\Omega:\,\,||\{i\in[1,n]:\,\,x\in A_{i}\}|-\mu|\geq\varepsilon\mu\right)\leq\varepsilon^{-2}\left(\sum_{i=1}^{n}\sum_{j=1}^{n}\Pr(A_{i}\cap A_{j})/\mu^{2}-1\right).
Proof.

This follows by writing the left-hand side as

Pr(xΩ:|i=1n1Ai(x)μ|εμ)\displaystyle\Pr\left(x\in\Omega:\,\,\left|\sum_{i=1}^{n}1_{A_{i}}(x)-\mu\right|\geq\varepsilon\mu\right)

and using Chebychev’s inequality to upper-bound this as

ε2μ2𝐄|i=1n1Ai(x)μ|2,\displaystyle\varepsilon^{-2}\mu^{-2}\mathbf{E}\left|\sum_{i=1}^{n}1_{A_{i}}(x)-\mu\right|^{2},

and then expanding out the square. ∎

Remark 3.2.

If the AiA_{i} were pairwise independent, the right-hand side of (3.3) would be 0. More generally, if we show that the AiA_{i} are approximately independent, then the right-hand side of (3.3) is small.

In our case, the events ApA_{p} correspond to a “random” integer being congruent to (p)(modordp(a))\ell(p)\pmod{\textup{ord}_{p}(a)}, where pp ranges over Pa[1,x]P_{a}\cap[1,x], from which we get Pr(Ap)=1/ordp(a)\Pr(A_{p})=1/\textup{ord}_{p}(a). (See Section 5 for a precise formulation of this idea.) Then the expected value μ\mu appearing in (3.2) is, more or less, equal to

(3.4) pxpPa1ordp(a).\displaystyle\sum_{\begin{subarray}{c}p\leq x\\ p\in P_{a}\end{subarray}}\frac{1}{\textup{ord}_{p}(a)}.

For reasons explained below, we will in fact restrict to a subset of PaP_{a} where ordp(a)p1\textup{ord}_{p}(a)\gg p-1, which moreover has a positive relative density inside the primes. This then allows for an asymptotic evaluation of (3.4). We now turn to the sum of the pairwise intersections in (3.3), which are much more difficult to analyze.

Let pp and qq be primes, and write (ordp(a),ordq(a))=m(\textup{ord}_{p}(a),\textup{ord}_{q}(a))=m. The system y(p)(modordp(a)),y(q)(modordq(a))y\equiv\ell(p)\pmod{\textup{ord}_{p}(a)},y\equiv\ell(q)\pmod{\textup{ord}_{q}(a)} has a solution if and only if m(p)(q)m\mid\ell(p)-\ell(q). If a solution exists, then Pr(ApAq)=mordp(a)ordq(a)\Pr(A_{p}\cap A_{q})=\frac{m}{\textup{ord}_{p}(a)\textup{ord}_{q}(a)}, and otherwise Pr(ApAq)=0\Pr(A_{p}\cap A_{q})=0. Thus

(3.5) pxpPaqxqPaPr(ApAq)=mxp,qxp,qPa(p1,q1)=mm(p)(q)mordp(a)ordq(a).\displaystyle\sum_{\begin{subarray}{c}p\leq x\\ p\in P_{a}\end{subarray}}\sum_{\begin{subarray}{c}q\leq x\\ q\in P_{a}\end{subarray}}\Pr(A_{p}\cap A_{q})=\sum_{m\leq x}\sum_{\begin{subarray}{c}p,q\leq x\\ p,q\in P_{a}\\ (p-1,q-1)=m\\ m\mid\ell(p)-\ell(q)\end{subarray}}\frac{m}{\textup{ord}_{p}(a)\textup{ord}_{q}(a)}.

If the pairs ((p),(q))(\ell(p),\ell(q)) were equidistributed modulo all mxm\leq x, even when conditioned on the event (ordp(a),ordq(a))=m(\textup{ord}_{p}(a),\textup{ord}_{q}(a))=m, we would see that ApA_{p} and AqA_{q} are independent “on average”, resulting in the same asymptotics for (3.5) as for (3.4).

Unfortunately, the set PaP_{a} does not have the required equidistribution property for all moduli.666For example, if p±3(mod8)p\equiv\pm 3\pmod{8} and 2x9(modp)2^{x}\equiv 9\pmod{p}, then xx must be even. More generally, congruence conditions for pp give information on quadratic residues modulo pp, leading to bias even in power-free cases. For example, if 2x3(modp)2^{x}\equiv-3\pmod{p} and p1(mod3)p\equiv 1\pmod{3}, p±3(mod8)p\equiv\pm 3\pmod{8}, then 3-3 is a quadratic residue (modp)\pmod{p} and 22 is not, so xx must be even. In Section 4, we construct a positive density set SS of primes pp (depending on a,ba,b) for which axb(modp)a^{x}\equiv b\pmod{p} is solvable and for which the smallest solution (p)\ell(p) of the congruence does enjoy the required equidistribution property for all fixed mm by adapting the method of Lenstra [18]. We then apply Lemma 3.1 to this subset SS of primes in place of PaP_{a} (in which case showing the existence and positivity of d(S)d_{\mathbb{P}}(S) requires some work).

The main difficulty then is that we need to estimate (3.5) in all ranges of mm, not only for mm fixed (even though the case of bounded mm should give the main term in (3.5) and larger mm should only contribute an error term). This will be achieved in four steps, carried out in different sections.

  • (i)

    We handle the case of fixed mm (say mNm\leq N) in Section 6, making use of the equidistribution of the set SS in residue classes, which in turn is proved in Section 4.

  • (ii)

    The contribution of medium-large mm (say Nmmax{p,q}cN\leq m\leq\max\{p,q\}^{c} for some small constant c>0c>0) is handled in Subsection 7.2 using the effective version of the Chebotarev density theorem stated in Lemma 2.1.

  • (iii)

    The case of very large mm (say max{p,q}/exp(logmax{p,q})1/2)mmax{p,q}\max\{p,q\}/\exp(\log\max\{p,q\})^{1/2})\leq m\leq\max\{p,q\}) is handled (unconditionally) by applying Selberg’s sieve in Subsection 7.1. The range here is optimal in the sense that this argument (which starts by dropping the condition m(p)(q)m\mid\ell(p)-\ell(q)) would not work in any larger regime.

  • (iv)

    The remaining case of large mm (max{p,q}cmmax{p,q}/exp(logmax{p,q})1/2)\max\{p,q\}^{c}\leq m\leq\max\{p,q\}/\exp(\log\max\{p,q\})^{1/2})) is where we utilize Hypothesis 1.2. This case is dealt with in Subsection 7.3.

Combining the estimates for these four different regimes will yield Theorem 1.3.

We comment on some difficulties arising with controlling the contribution of large values of mm. To bound the sum in (3.5) one has to bound the number of primes x/2pxx/2\leq p\leq x (or more generally x1px2x_{1}\leq p\leq x_{2}) satisfying (p)r(modm)\ell(p)\equiv r\pmod{m} summed over all r(modm)r\pmod{m}. (The contribution of large mm would be too large if the values (p)(modm)\ell(p)\pmod{m} attained only a few values, so we cannot just drop the condition on (p)\ell(p) and work only with the congruence conditions p1(modm)p\equiv 1\pmod{m}.) This condition can be naturally interpreted as pp splitting completely in a certain field extension, at least for pSp\in S, where SS is the set constructed in Section 4.

If mm is small (say mx1/4ϵm\leq x^{1/4-\epsilon}), one can apply the GRH-conditional Chebotarev density theorem for all rr separately (see case (ii) above). If mm is of magnitude x\sqrt{x}, one can barely apply the Chebotarev density theorem for a single value of rr directly, and to control all mm values of rr one would need a uniform error term. If mm is much larger than x\sqrt{x}, say x3/4x^{3/4}, the GRH-conditional square root error term is inapplicable. It is here that we use Hypothesis 1.2.

Note that while for fixed mm (see case (i)) we need exact (asymptotic) equidistribution, for large values of mm we only need that the residues (p)(modm)\ell(p)\pmod{m^{\prime}} do not obtain a small set of values very often. Heuristically each value (modm)\pmod{m^{\prime}} occurs roughly 1/m1/m^{\prime} of the time (even with mm^{\prime} quite large), but we only need that each value occurs at most max{(1/m)ϵ,1/(logy)2}\max\{(1/m^{\prime})^{\epsilon},1/(\log y)^{2}\} of the time. This is indeed what Hypothesis 1.2 tells us.

4   Equidistribution of the discrete logarithm

Let hh be the largest integer such that |b||b| is a perfect power of order hh. Recall from Section 2 that we can assume aa is not a perfect power. Let SS\subset\mathbb{P} (which depends on aa and bb) be a set of primes defined as

S:={p:p1(mod|4abh|),a,bperfect 2hth powers(modp),ordp(a)=p12h}.\displaystyle S:=\{p\in\mathbb{P}:\,\,p\equiv 1\pmod{|4abh|},\,\,a,b\,\,\textnormal{perfect $2h$th powers}\pmod{p},\,\,\textup{ord}_{p}(a)=\frac{p-1}{2h}\}.

Note that for any pSp\in S we have ordp(b)p12h=ordp(a)\textup{ord}_{p}(b)\mid\frac{p-1}{2h}=\textup{ord}_{p}(a), so the congruence axb(modp)a^{x}\equiv b\pmod{p} has a solution.

Let (p)\ell(p) be the smallest nonnegative solution to axb(modp)a^{x}\equiv b\pmod{p} for pSp\in S. For each m,d,rm,d,r\in\mathbb{N}, let

(4.1) Sm,d,r:=S{p:mdordp(a),(p)r(modm)}.\displaystyle S_{m,d,r}:=S\cap\{p\in\mathbb{P}:\,\,md\mid\textup{ord}_{p}(a),\,\,\ell(p)\equiv r\pmod{m}\}.

Note then that for pSm,d,rp\in S_{m,d,r} any solution to axb(modp)a^{x}\equiv b\pmod{p} satisfies xr(modm)x\equiv r\pmod{m}. Note also that

pSm,d,rpSandmdordp(a)andx:barx2hm(modp).\displaystyle p\in S_{m,d,r}\Longleftrightarrow p\in S\quad\textnormal{and}\quad md\mid\textup{ord}_{p}(a)\quad\textnormal{and}\quad\exists x:\,ba^{-r}\equiv x^{2hm}\pmod{p}.

Indeed, for the forward implication note that if pSp\in S and (p)=r+jm\ell(p)=r+jm, then a(p)b(modp)a^{\ell(p)}\equiv b\pmod{p} is equivalent to barajm(modp)ba^{-r}\equiv a^{jm}\pmod{p}, and here aa is a 2h2hth power (modp)\pmod{p}. Conversely, suppose that pSp\in S, mdordp(a)md\mid\textup{ord}_{p}(a) and barx2hm(modp)ba^{-r}\equiv x^{2hm}\pmod{p}. Then by the assumption on ordp(a)\textup{ord}_{p}(a) we have x2hajx^{2h}\equiv a^{j} for some jj, so barajm(modp)ba^{-r}\equiv a^{jm}\pmod{p} for some jj, and this implies r+jm(p)(modm)r+jm\equiv\ell(p)\pmod{m}.

Our goal in this section is to prove the following two lemmas.

Lemma 4.1.

The relative density d(S)d_{\mathbb{P}}(S) exists and is positive.

Lemma 4.2.

For dd\in\mathbb{N} and m0(mod|2ab|)m\equiv 0\pmod{|2ab|} fixed, the relative density d(Sm,d,r)d_{\mathbb{P}}(S_{m,d,r}) exists and does not depend on rr.

Before proving these lemmas, we make some preliminary considerations.

We build on the arguments of Lenstra in [18]. Let

Fm,d,r:=(ζ|4abh|,ζ2hmd,a1/2h,b1/2h,(bar)1/2hm).\displaystyle F_{m,d,r}:=\mathbb{Q}(\zeta_{|4abh|},\zeta_{2hmd},a^{1/2h},b^{1/2h},(ba^{-r})^{1/2hm}).

Clearly if m0(mod|2ab|)m\equiv 0\pmod{|2ab|}, then ζ|4abh|(ζ2hmd)\zeta_{|4abh|}\in\mathbb{Q}(\zeta_{2hmd}). Since |b||b| is a perfect hhth power, |b|1/2h(ζ|4b|)|b|^{1/2h}\in\mathbb{Q}(\zeta_{|4b|}) (indeed, if c=|b|1/hc=|b|^{1/h}, then c(ζ|4c|)(ζ|4b|)\sqrt{c}\in\mathbb{Q}(\zeta_{|4c|})\subset\mathbb{Q}(\zeta_{|4b|})). Furthermore, b1/2hb^{1/2h} is either |b|1/2h|b|^{1/2h} or ζ4hj|b|1/2h\zeta_{4h}^{j}|b|^{1/2h} for some jj. In either case,

Fm,d,r=(ζ2hmd,a1/2h,(bar)1/2hm).F_{m,d,r}=\mathbb{Q}(\zeta_{2hmd},a^{1/2h},(ba^{-r})^{1/2hm}).

For a prime \ell, define

q():=min{α1:α2h}\displaystyle q(\ell):=\min\{\alpha\geq 1:\ell^{\alpha}\nmid 2h\}

to be the smallest power of \ell which does not divide 2h2h, and let

L:=(ζq(),a1/q()).L_{\ell}:=\mathbb{Q}(\zeta_{q(\ell)},a^{1/q(\ell)}).

We can now characterize the set Sm,d,rS_{m,d,r} as follows.

Lemma 4.3.

A large enough prime pp satisfies pSm,d,rp\in S_{m,d,r} if and only if pp splits completely in Fm,d,rF_{m,d,r} but not in any of the LL_{\ell}.

Proof.

We recall a few simple facts from Galois theory. Firstly, by Dedekind’s factorization theorem, an unramified prime pp splits completely in a field extension (α)\mathbb{Q}(\alpha) if and only if the minimal polynomial fα(x)[x]f_{\alpha}(x)\in\mathbb{Q}[x] of α\alpha factorizes into distinct linear factors (modp)\pmod{p} for pp large enough. Secondly, pp splits completely in (α,β)\mathbb{Q}(\alpha,\beta) if and only if pp splits completely in both (α)\mathbb{Q}(\alpha) and (β)\mathbb{Q}(\beta). Thirdly, the cyclotomic polynomial Φn(x)\Phi_{n}(x) factorizes into distinct linear factors (modp)\pmod{p} if and only if p1(modn)p\equiv 1\pmod{n}.

From these three facts and the well-known fact that the compositum of Galois extensions is Galois, we see that a large prime pp splits completely in Fm,d,rF_{m,d,r} but not in any of the LL_{\ell} if and only if aa is a 2h2hth power (modp)\pmod{p}, barba^{-r} is a 2hm2hmth power (modp)\pmod{p}, p1(mod2hmd)p\equiv 1\pmod{2hmd} and aa is not a q()q(\ell)th power (modp)\pmod{p} for any p1\ell\mid p-1. The first and last condition are equivalent to pSp\in S, and together with the second and third condition this is equivalent to pSm,d,rp\in S_{m,d,r}, so the claim follows (note that the special case m=d=1,r=0m=d=1,r=0 gives a characterization of the set SS). ∎

For nn a squarefree integer, let LnL_{n} be the compositum of LL_{\ell} for primes n\ell\mid n and let

q(n):=nq(),q(1):=1.\displaystyle q(n):=\prod_{\begin{subarray}{c}\ell\mid n\\ \ell\in\mathbb{P}\end{subarray}}q(\ell),\quad q(1):=1.

Define

an:=|{σGal(Fm,d,rLn/):σfixesFm,d,r,n:σdoes not fixL}|[Fm,d,rLnm:].\displaystyle a_{n}:=\frac{|\{\sigma\in\textnormal{Gal}(F_{m,d,r}L_{n}/\mathbb{Q}):\,\,\sigma\,\,\textnormal{fixes}\,\,F_{m,d,r},\,\,\forall\ell\mid n:\,\,\sigma\,\,\textnormal{does not fix}\,\,L_{\ell}\}|}{[F_{m,d,r}L_{n}m:\mathbb{Q}]}.

One would now expect that

(4.2) d(Sm,d,r)=limkank,\displaystyle d_{\mathbb{P}}(S_{m,d,r})=\lim_{k\to\infty}a_{n_{k}},

where nkn_{k} is the product of the first kk primes. This is because anka_{n_{k}} is by the Chebotarev density theorem equal to the density of primes pp which split completely in Fm,d,rF_{m,d,r} but in none of L,nkL_{\ell},\ell\mid n_{k}. Formula (4.2) was indeed proved by Lenstra [18, Theorem 3.1] under GRH. Thus, in particular, the density d(Sm,d,r)d_{\mathbb{P}}(S_{m,d,r}) exists.

Another relevant result of Lenstra concerns whether d(Sm,d,r)d_{\mathbb{P}}(S_{m,d,r}) equals to 0 or not. By (4.2), this corresponds to understanding the behavior of the local densities ana_{n}. Clearly anama_{n}\leq a_{m} for mnm\mid n, so if an=0a_{n}=0 for some nn, then d(Sm,d,r)=0d_{\mathbb{P}}(S_{m,d,r})=0. In fact, the converse is true as well, as shown by Lenstra [18, Theorem 4.1]: if an>0a_{n}>0 for all nn, then d(Sm,d,r)d_{\mathbb{P}}(S_{m,d,r}) is strictly positive. The idea of Lenstra’s proof is that the conditions of pp not splitting completely in LL_{\ell} are independent of each other and of the condition of pp splitting completely in Fm,d,rF_{m,d,r}, except for some finite set of primes \ell. In other words, LnFm,d,rL_{n}F_{m,d,r} and LL_{\ell} are linearly disjoint extensions for n\ell\nmid n and \ell large enough. This allows one to write the density in (4.2) as an infinite product

(4.3) d(Sm,d,r)=ann(11[L:]),\displaystyle d_{\mathbb{P}}(S_{m,d,r})=a_{n}\prod_{\begin{subarray}{c}\ell\nmid n\\ \ell\in\mathbb{P}\end{subarray}}\left(1-\frac{1}{[L_{\ell}:\mathbb{Q}]}\right),

where nn is the product of exceptional primes and the terms 11[L:]1-\frac{1}{[L_{\ell}:\mathbb{Q}]} correspond to the density of primes not splitting completely in LL_{\ell} (see [18, (5.11)]). Note that the infinite product in equation (4.3) converges to a strictly positive value, therefore giving both a method for determining whether d(Sm,d,r)=0d_{\mathbb{P}}(S_{m,d,r})=0 or not and a method for calculating this density.

We are now ready to prove our lemmas.

Proof of Lemma 4.1.

Consider the case m=d=1,r=0m=d=1,r=0 above. By (4.3), it is sufficient to check that an0a_{n}\neq 0 for all nn. Fix nn and consider the field

F1,1,0Ln=(ζ|4abh|,ζq(n),a1/2h,a1/q(n)).F_{1,1,0}L_{n}=\mathbb{Q}(\zeta_{|4abh|},\zeta_{q(n)},a^{1/2h},a^{1/q(n)}).

Let σ:F1,1,0LnF1,1,0Ln\sigma:F_{1,1,0}L_{n}\to F_{1,1,0}L_{n} be the automorphism which fixes ζ|4abh|\zeta_{|4abh|}, ζq(n)\zeta_{q(n)} and a1/2ha^{1/2h}, and which maps a1/q()a^{1/q(\ell)} to ζq()a1/q()\zeta_{q(\ell)}a^{1/q(\ell)} for all n\ell\mid n. Assuming that this is well-defined, the result follows from the Chebotarev density theorem (since then σ\sigma fixes F1,1,0F_{1,1,0} but none of LL_{\ell}). To prove that there exists such a map σ\sigma, let K:=(ζ|4abh|,ζq(n),a1/2h)K:=\mathbb{Q}(\zeta_{|4abh|},\zeta_{q(n)},a^{1/2h}). It suffices to prove that extending KK by the elements a1/q(),na^{1/q(\ell)},\ell\mid n one by one always increases the degree of KK by a factor of \ell (note that the degree increases by at most a factor of \ell). In other words, it suffices to prove that the degree of a1/q()a^{1/q(\ell)} over K:=(ζ|4abh|,ζq(n),a1/2h,a1/q(n/))K^{\prime}:=\mathbb{Q}(\zeta_{|4abh|},\zeta_{q(n)},a^{1/2h},a^{1/q(n/\ell)}) is \ell.

To prove this we use the following simple observation (see [15, Proposition 3.5]): Let aa be a positive integer which is not a perfect power. For positive integers ss and tt with sts\mid t, the degree of (ζt,a1/s)/(ζt)\mathbb{Q}(\zeta_{t},a^{1/s})/\mathbb{Q}(\zeta_{t}) is either ss or s/2s/2, where the latter case occurs if and only if a(ζt)\sqrt{a}\in\mathbb{Q}(\zeta_{t}).777The proof is by basic Galois theory. See (4.5) below for the proof of a closely related statement. This result will be used subsequently without further mention.

We now have

[K(a1/q()):K]\displaystyle[K^{\prime}(a^{1/q(\ell)}):K^{\prime}] =[K(a1/q()):(ζ|4abh|,ζq(n))][K:(ζ|4abh|,ζq(n))]\displaystyle=\frac{[K^{\prime}(a^{1/q(\ell)}):\mathbb{Q}(\zeta_{|4abh|},\zeta_{q(n)})]}{[K^{\prime}:\mathbb{Q}(\zeta_{|4abh|},\zeta_{q(n)})]}
=[(ζ|4abh|,ζq(n),a1/2h,a1/q(n)):(ζ|4abh|,ζq(n))][(ζ|4abh|,ζq(n),a1/2h,a1/q(n/)):(ζ|4abh|,ζq(n))],\displaystyle=\frac{[\mathbb{Q}(\zeta_{|4abh|},\zeta_{q(n)},a^{1/2h},a^{1/q(n)}):\mathbb{Q}(\zeta_{|4abh|},\zeta_{q(n)})]}{[\mathbb{Q}(\zeta_{|4abh|},\zeta_{q(n)},a^{1/2h},a^{1/q(n/\ell)}):\mathbb{Q}(\zeta_{|4abh|},\zeta_{q(n)})]},

where by the above observation the degrees in the numerator and denominator are “what one would expect”, namely lcm(2h,q(n))/2\textup{lcm}(2h,q(n))/2 and lcm(2h,q(n/))/2\textup{lcm}(2h,q(n/\ell))/2, respectively. Here dividing by 22 accounts for the fact that a(ζ|4abh|)\sqrt{a}\in\mathbb{Q}(\zeta_{|4abh|}). This concludes the proof of Lemma 4.1. ∎

Proof of Lemma 4.2.

We then prove Lemma 4.2. Fix m0(mod|2ab|)m\equiv 0\pmod{|2ab|} and dd, and let nn be the product of exceptional primes over all 1rm1\leq r\leq m so that (4.3) is valid for all rr. Let an=an(r)a_{n}=a_{n}(r) be as in (4.3). Recall that an(r)a_{n}(r) is equal to the density of elements σGal(Fm,d,rLn/)\sigma\in\textup{Gal}(F_{m,d,r}L_{n}/\mathbb{Q}) fixing Fm,d,rF_{m,d,r} but not LL_{\ell} for any n\ell\mid n. One may thus write, by inclusion-exclusion,

(4.4) an(r)=1[Fm,d,rLn:]knμ(k)[Fm,d,rLk:].\displaystyle a_{n}(r)=\frac{1}{[F_{m,d,r}L_{n}:\mathbb{Q}]}\sum_{k\mid n}\mu(k)[F_{m,d,r}L_{k}:\mathbb{Q}].

We now turn to proving that, for any fixed kk, [Fm,d,rLk:][F_{m,d,r}L_{k}:\mathbb{Q}] does not depend on rr. This follows from the following more general claim: Let m,r,sm,r,s and tt be positive integers satisfying 4abht,mt,st4abh\mid t,m\mid t,s\mid t and 2hs2h\mid s. We have

(4.5) [(ζt,a1/s,(bar)1/2hm):]=ϕ(t)ms2.[\mathbb{Q}(\zeta_{t},a^{1/s},(ba^{-r})^{1/2hm}):\mathbb{Q}]=\phi(t)\frac{ms}{2}.

In particular the degrees of the extensions Fm,d,rLk/F_{m,d,r}L_{k}/\mathbb{Q} do not depend on rr.

Note that the degree in (4.5) is at most ϕ(t)ms2\phi(t)\frac{ms}{2}. Indeed,

[(ζt,a1/s,(bar)1/2hm):]\displaystyle[\mathbb{Q}(\zeta_{t},a^{1/s},(ba^{-r})^{1/2hm}):\mathbb{Q}]
=[(ζt):][(ζt,a1/s):(ζt)][(ζt,a1/s,(bar)1/2hm):(ζt,a1/s)].\displaystyle=[\mathbb{Q}(\zeta_{t}):\mathbb{Q}][\mathbb{Q}(\zeta_{t},a^{1/s}):\mathbb{Q}(\zeta_{t})][\mathbb{Q}(\zeta_{t},a^{1/s},(ba^{-r})^{1/2hm}):\mathbb{Q}(\zeta_{t},a^{1/s})].

The first term equals ϕ(t)\phi(t). The second one is at most s2\frac{s}{2}, as a(ζt)\sqrt{a}\in\mathbb{Q}(\zeta_{t}). The third term is similarly at most mm, as (bar)1/2h(ζt,a1/s)(ba^{-r})^{1/2h}\in\mathbb{Q}(\zeta_{t},a^{1/s}), since |b|1/2h(ζ|4abh|)|b|^{1/2h}\in\mathbb{Q}(\zeta_{|4abh|}).

We then prove that this bound is tight. We claim that the numbers

ae/s(bar)f/2mh,a^{e/s}(ba^{-r})^{f/2mh},

where 0e<s/20\leq e<s/2 and 0f<m0\leq f<m, are linearly independent over (ζt)\mathbb{Q}(\zeta_{t}), which gives the lower bound for the degree implicit in (4.5). By [10] it suffices to check that

(4.6) ae/s(bar)f/2mh(ζt)\displaystyle a^{e/s}(ba^{-r})^{f/2mh}\not\in\mathbb{Q}(\zeta_{t})

for any such ee and ff not both 0. Suppose that (4.6) fails. Let

C:=a2hemrfsbsf.C:=a^{2hem-rfs}b^{sf}.

Now CC is an integer for which C1/2hms(ζt)C^{1/2hms}\in\mathbb{Q}(\zeta_{t}). Thus, arguing as before, |C||C| must be a perfect hmshmsth power of a rational number. Since aa and bb are coprime, this means that

hms2hemrfshms\mid 2hem-rfs

and

hmshsf,hms\mid hsf,

where we used the fact that aa is not a perfect power and |b||b| is an hhth power. The latter condition gives mfm\mid f, and since 0f<m0\leq f<m, we have f=0f=0. Now the first condition gives s2es\mid 2e, so we must have e=0e=0. This concludes the proof of Lemma 4.2. ∎

5   Applying the second moment method

As before, let (p)\ell(p) be the smallest positive integer ll for which alb(modp)a^{l}\equiv b\pmod{p} (whenever it exists).

Let the set SS be as constructed in Section 4. For pSp\in S, let

Ap:={nx:n(p)(modordp(a)),n(p)},\displaystyle A_{p}:=\{n\leq x:\,\,n\equiv\ell(p)\pmod{\textup{ord}_{p}(a)},\,\,n\neq\ell(p)\},

and let Pr\Pr denote the uniform probability measure on [1,x][1,x]\cap\mathbb{N}. Also denote Sy:=S[1,y]S_{\leq y}:=S\cap[1,y]. By Lemma 3.1, in order to prove Theorem 1.3 it suffices to prove the estimate

(5.1) p,qSxPr(ApAq)(pSxPr(Ap))2=1+o(1).\displaystyle\frac{\sum_{p,q\in S_{\leq\sqrt{x}}}\Pr(A_{p}\cap A_{q})}{\left(\sum_{p\in S_{\leq\sqrt{x}}}\Pr(A_{p})\right)^{2}}=1+o(1).

By Lemma 3.1, the quotient above is always 1\geq 1, so it suffices to establish an upper bound of 1+o(1)\leq 1+o(1). Recalling that ordp(a)=p12h\textup{ord}_{p}(a)=\frac{p-1}{2h} for pSp\in S, the denominator is

(5.2) (pSx1ordp(a)+O(1x))2=(1+o(1))4h2d(S)2(loglogx)2\displaystyle\left(\sum_{p\in S_{\leq\sqrt{x}}}\frac{1}{\textup{ord}_{p}(a)}+O\left(\frac{1}{x}\right)\right)^{2}=(1+o(1))4h^{2}d_{\mathbb{P}}(S)^{2}(\log\log x)^{2}

by Lemma 4.1 and partial summation.

The numerator can be written as

mxp,qSx(ordp(a),ordq(a))=mm(p)(q)(mordp(a)ordq(a)+O(1x))\displaystyle\sum_{m\leq\sqrt{x}}\sum_{\begin{subarray}{c}p,q\in S_{\leq\sqrt{x}}\\ (\textup{ord}_{p}(a),\textup{ord}_{q}(a))=m\\ m\mid\ell(p)-\ell(q)\end{subarray}}\left(\frac{m}{\textup{ord}_{p}(a)\textup{ord}_{q}(a)}+O\left(\frac{1}{x}\right)\right)
=mxp,qSx(ordp(a),ordq(a))=mm(p)(q)mordp(a)ordq(a)+o(1).\displaystyle=\sum_{m\leq\sqrt{x}}\sum_{\begin{subarray}{c}p,q\in S_{\leq\sqrt{x}}\\ (\textup{ord}_{p}(a),\textup{ord}_{q}(a))=m\\ m\mid\ell(p)-\ell(q)\end{subarray}}\frac{m}{\textup{ord}_{p}(a)\textup{ord}_{q}(a)}+o(1).

Note that 2abm2ab\mid m. Let NN be a large, fixed integer. By Möbius inversion, we write

mxp,qSx(ordp(a),ordq(a))=mm(p)(q)mordp(a)ordq(a)\displaystyle\sum_{m\leq\sqrt{x}}\sum_{\begin{subarray}{c}p,q\in S_{\leq\sqrt{x}}\\ (\textup{ord}_{p}(a),\textup{ord}_{q}(a))=m\\ m\mid\ell(p)-\ell(q)\end{subarray}}\frac{m}{\textup{ord}_{p}(a)\textup{ord}_{q}(a)}
mNp,qSxmordp(a),ordq(a)m(p)(q)mordp(a)ordq(a)d(ordp(a),ordq(a))/mμ(d)\displaystyle\leq\sum_{m\leq N}\sum_{\begin{subarray}{c}p,q\in S_{\leq\sqrt{x}}\\ m\mid\textup{ord}_{p}(a),\textup{ord}_{q}(a)\\ m\mid\ell(p)-\ell(q)\end{subarray}}\frac{m}{\textup{ord}_{p}(a)\textup{ord}_{q}(a)}\sum_{d\mid(\textup{ord}_{p}(a),\textup{ord}_{q}(a))/m}\mu(d)
+m>Np,qSxmordp(a),ordq(a)m(p)(q)mordp(a)ordq(a)\displaystyle+\sum_{m>N}\sum_{\begin{subarray}{c}p,q\in S_{\leq\sqrt{x}}\\ m\mid\textup{ord}_{p}(a),\textup{ord}_{q}(a)\\ m\mid\ell(p)-\ell(q)\end{subarray}}\frac{m}{\textup{ord}_{p}(a)\textup{ord}_{q}(a)}
=mNdxp,qSxmdordp(a),ordq(a)m(p)(q)mμ(d)ordp(a)ordq(a)+m>Np,qSxmordp(a),ordq(a)m(p)(q)mordp(a)ordq(a).\displaystyle=\sum_{m\leq N}\sum_{d\leq\sqrt{x}}\sum_{\begin{subarray}{c}p,q\in S_{\leq\sqrt{x}}\\ md\mid\textup{ord}_{p}(a),\textup{ord}_{q}(a)\\ m\mid\ell(p)-\ell(q)\end{subarray}}\frac{m\mu(d)}{\textup{ord}_{p}(a)\textup{ord}_{q}(a)}\ +\sum_{m>N}\sum_{\begin{subarray}{c}p,q\in S_{\leq\sqrt{x}}\\ m\mid\textup{ord}_{p}(a),\textup{ord}_{q}(a)\\ m\mid\ell(p)-\ell(q)\end{subarray}}\frac{m}{\textup{ord}_{p}(a)\textup{ord}_{q}(a)}.

The problem thus reduces to considering sums of the form

p,qSxmdordp(a),ordq(a)m(p)(q)1ordp(a)ordq(a)\displaystyle\sum_{\begin{subarray}{c}p,q\in S_{\leq\sqrt{x}}\\ md\mid\textup{ord}_{p}(a),\textup{ord}_{q}(a)\\ m\mid\ell(p)-\ell(q)\end{subarray}}\frac{1}{\textup{ord}_{p}(a)\textup{ord}_{q}(a)} =r=0m1(pSxmdordp(a)(p)r(modm)2hp1)2\displaystyle=\sum_{r=0}^{m-1}\Big{(}\sum_{\begin{subarray}{c}p\in S_{\leq\sqrt{x}}\\ md\mid\textup{ord}_{p}(a)\\ \ell(p)\equiv r\pmod{m}\end{subarray}}\frac{2h}{p-1}\Big{)}^{2}
=4h2r=0m1(pxpSm,d,r1p1)2\displaystyle=4h^{2}\sum_{r=0}^{m-1}\Big{(}\sum_{\begin{subarray}{c}p\leq\sqrt{x}\\ p\in S_{m,d,r}\end{subarray}}\frac{1}{p-1}\Big{)}^{2}

for various d,mxd,m\leq\sqrt{x}.

Define

Σm,d,r:=(pxpSm,d,r1p1)2\Sigma_{m,d,r}:=\Big{(}\sum_{\begin{subarray}{c}p\leq\sqrt{x}\\ p\in S_{m,d,r}\end{subarray}}\frac{1}{p-1}\Big{)}^{2}

and

Σm,d:=r=0m1Σm,d,r.\Sigma_{m,d}:=\sum_{r=0}^{m-1}\Sigma_{m,d,r}.

Now we have

p,qxPr(ApAq)4h2(mNdxmμ(d)Σm,d+m>NmΣm,1).\sum_{p,q\leq x}\Pr(A_{p}\cap A_{q})\leq 4h^{2}\left(\sum_{m\leq N}\sum_{d\leq\sqrt{x}}m\mu(d)\Sigma_{m,d}+\sum_{m>N}m\Sigma_{m,1}\right).

In Sections 6 and 7, we will prove that for any N1N\geq 1 and for xx0(N)x\geq x_{0}(N) (where x0(N)x_{0}(N) is a fast enough growing function of NN) we have

(5.3) mNdxmμ(d)Σm,d=(1+O(1/N))d(S)2(loglogx)2\displaystyle\sum_{m\leq N}\sum_{d\leq\sqrt{x}}m\mu(d)\Sigma_{m,d}=(1+O(1/N))d_{\mathbb{P}}(S)^{2}(\log\log x)^{2}

and

(5.4) m>NmΣm,1(loglogx)2/N\displaystyle\sum_{m>N}m\Sigma_{m,1}\ll(\log\log x)^{2}/N

Combining (5.2), (5.3) and (5.4) and letting xx\to\infty gives a bound of 1+O(1/N)\leq 1+O(1/N) for the right-hand side of (5.1), and letting NN\to\infty slowly in terms of xx gives Theorem 1.3. Thus, our remaining task in the proof of Theorem 1.3 is proving (5.3) and (5.4).

6   Contribution of small gcd

In this section, we prove the desired estimate (5.3). We handle the cases mdN3md\leq N^{3} and md>N3md>N^{3} separately.

6.1   Case 1. mdN3md\leq N^{3}

Let πm,d,r(x):=px,pSm,d,r1\pi_{m,d,r}(x):=\sum_{p\leq x,p\in S_{m,d,r}}1. By partial summation, we have

Σm,d,r1/2\displaystyle\Sigma_{m,d,r}^{1/2} =pxpSm,d,r1p1\displaystyle=\sum_{\begin{subarray}{c}p\leq\sqrt{x}\\ p\in S_{m,d,r}\end{subarray}}\frac{1}{p-1}
=πm,d,r(x)x1+2xπm,d,r(t)(t1)2𝑑t.\displaystyle=\frac{\pi_{m,d,r}(\sqrt{x})}{\sqrt{x}-1}+\int_{2}^{\sqrt{x}}\frac{\pi_{m,d,r}(t)}{(t-1)^{2}}\,dt.

The first term here is O(1/logx)O(1/\log x). The second term can be bounded by splitting the integration interval into parts [2,y0(N)][2,y_{0}(N)] and [y0(N),x][y_{0}(N),\sqrt{x}], where y0(N)y_{0}(N) is a fast growing function of NN such that

πm,d,r(t)=(1+O(N5))d(Sm,d,r)tlogtforty0(N),\displaystyle\pi_{m,d,r}(t)=(1+O(N^{-5}))d_{\mathbb{P}}(S_{m,d,r})\frac{t}{\log t}\quad\textnormal{for}\quad t\geq y_{0}(N),

uniformly for m,d,rNm,d,r\leq N; such a function exists by Lemma 4.2. We then have

2y0(N)πm,d,r(t)(t1)2𝑑tloglogy0(N)\displaystyle\int_{2}^{y_{0}(N)}\frac{\pi_{m,d,r}(t)}{(t-1)^{2}}\,dt\ll\log\log y_{0}(N)

and

y0(N)xπm,d,r(t)(t1)2𝑑t\displaystyle\int_{y_{0}(N)}^{\sqrt{x}}\frac{\pi_{m,d,r}(t)}{(t-1)^{2}}\,dt =y0(N)x(1+O(N5))d(Sm,d,r)t/logt(t1)2𝑑t\displaystyle=\int_{y_{0}(N)}^{\sqrt{x}}\frac{(1+O(N^{-5}))d_{\mathbb{P}}(S_{m,d,r})t/\log t}{(t-1)^{2}}\,dt
=(1+O(N5))d(Sm,d,r)loglogx+O(1).\displaystyle=(1+O(N^{-5}))d_{\mathbb{P}}(S_{m,d,r})\log\log x+O(1).

All in all, we have

Σm,d,r1/2=(1+O(N5))d(Sm,d,r)loglogx+O(loglogy0(N)).\Sigma_{m,d,r}^{1/2}=(1+O(N^{-5}))d_{\mathbb{P}}(S_{m,d,r})\log\log x+O(\log\log y_{0}(N)).

Squaring gives

Σm,d,r=(1+O(N5))d(Sm,d,r)2(loglogx)2+O((loglogx)(loglogy0(N))).\Sigma_{m,d,r}=(1+O(N^{-5}))d_{\mathbb{P}}(S_{m,d,r})^{2}(\log\log x)^{2}+O((\log\log x)(\log\log y_{0}(N))).

Summing over rr and using Lemma 4.2 (which says that d(Sm,d,r)d_{\mathbb{P}}(S_{m,d,r}) is independent of rr), we get

(6.1) Σm,d=r=0m1Σm,d,r=(1+O(N5))d(S1,md,0)2m(loglogx)2+O(m(loglogx)1.1).\displaystyle\Sigma_{m,d}=\sum_{r=0}^{m-1}\Sigma_{m,d,r}=(1+O(N^{-5}))\frac{d_{\mathbb{P}}(S_{1,md,0})^{2}}{m}(\log\log x)^{2}+O\left(m(\log\log x)^{1.1}\right).

We multiply this by mμ(d)m\mu(d) and sum over mm and dd with mdN3md\leq N^{3}. For xx large enough in terms of NN, this gives

mNdN3mmμ(d)Σm,d\displaystyle\sum_{m\leq N}\sum_{d\leq\frac{N^{3}}{m}}m\mu(d)\Sigma_{m,d}
=mNdN3mμ(d)(1+O(N5))d(S1,md,0)2(loglogx)2+O(N3.1(loglogx)1.1)\displaystyle=\sum_{m\leq N}\sum_{d\leq\frac{N^{3}}{m}}\mu(d)(1+O(N^{-5}))d_{\mathbb{P}}(S_{1,md,0})^{2}(\log\log x)^{2}+O(N^{3.1}(\log\log x)^{1.1})
=(loglogx)2mNdN3mμ(d)d(S1,md,0)2+O(d(S)2N1.9)+O(N3.1(loglogx)1.1)\displaystyle=(\log\log x)^{2}\sum_{m\leq N}\sum_{d\leq\frac{N^{3}}{m}}\mu(d)d_{\mathbb{P}}(S_{1,md,0})^{2}+O\left(d_{\mathbb{P}}(S)^{2}N^{-1.9}\right)+O\left(N^{3.1}(\log\log x)^{1.1}\right)
=(1+O(N1))d(S1)2(loglogx)2,\displaystyle=(1+O(N^{-1}))d_{\mathbb{P}}(S_{1})^{2}(\log\log x)^{2},

where we used

dmN3μ(d)adm=kN3akdkμ(d)=a1.\sum_{dm\leq N^{3}}\mu(d)a_{dm}=\sum_{k\leq N^{3}}a_{k}\sum_{d\mid k}\mu(d)=a_{1}.

6.2   Case 2. md>N3md>N^{3}

Now we bound the contribution to (5.3) from the terms md>N3md>N^{3}. Crude estimation gives

r=0m1Σm,d,r\displaystyle\sum_{r=0}^{m-1}\Sigma_{m,d,r} =r=0m1(pxpSm,d,r1p1)2\displaystyle=\sum_{r=0}^{m-1}\Big{(}\sum_{\begin{subarray}{c}p\leq\sqrt{x}\\ p\in S_{m,d,r}\end{subarray}}\frac{1}{p-1}\Big{)}^{2}
(px,p1(modmd)1p1)2.\displaystyle\leq\Big{(}\sum_{p\leq x,p\equiv 1\pmod{md}}\frac{1}{p-1}\Big{)}^{2}.

We split the summation over pp into the intervals [md,(md)2][md,(md)^{2}] and ((md)2,x]((md)^{2},x]. Applying the Brun–Titchmarsh inequality for p[(md)2,x]p\in[(md)^{2},x] and a trivial estimate for p[md,(md)2]p\in[md,(md)^{2}] gives

(pxp1(modmd)1p1)2(loglogx)2ϕ(md)2+(log(md)md)2.\displaystyle\Big{(}\sum_{\begin{subarray}{c}p\leq x\\ p\equiv 1\pmod{md}\end{subarray}}\frac{1}{p-1}\Big{)}^{2}\ll\frac{(\log\log x)^{2}}{\phi(md)^{2}}+\Big{(}\frac{\log(md)}{md}\Big{)}^{2}.

Since mN,md>N3m\leq N,md>N^{3} implies m(md)1/3m\leq(md)^{1/3}, we can bound the contribution to (5.3) from the terms md>N3md>N^{3} by

md>N3mNm((loglogx)2ϕ(md)2+log2(md)(md)2)\displaystyle\sum_{\begin{subarray}{c}md>N^{3}\\ m\leq N\end{subarray}}m\left(\frac{(\log\log x)^{2}}{\phi(md)^{2}}+\frac{\log^{2}(md)}{(md)^{2}}\right)
(loglogx)2k>N3k1/3(τ(k)ϕ(k)2+τ(k)log2kk2)\displaystyle\ll(\log\log x)^{2}\sum_{k>N^{3}}k^{1/3}\left(\frac{\tau(k)}{\phi(k)^{2}}+\frac{\tau(k)\log^{2}k}{k^{2}}\right)
(loglogx)2/N,\displaystyle\ll(\log\log x)^{2}/N,

which is the desired bound.

7   Contribution of large gcd

We are now left with the task of proving (5.4). Define

Sm,d,r:={p:p1(mod2hmd),barz2hm(modp)for somez}.\displaystyle S_{m,d,r}^{\prime}:=\{p\in\mathbb{P}:\,\,p\equiv 1\pmod{2hmd},\,\,ba^{-r}\equiv z^{2hm}\pmod{p}\,\,\textnormal{for some}\,\,\,z\}.

Furthermore, let TmT_{m} be the set of primes pp for which p1(mod2hm)p\equiv 1\pmod{2hm} and for which aa is not a 2hq2hqth power (modp)\pmod{p} for any prime qmq\mid m. We now see that Sm,d,rSm,d,rTmS_{m,d,r}\subset S_{m,d,r}^{\prime}\cap T_{m}, where these larger sets Sm,d,rS_{m,d,r}^{\prime} and TmT_{m} are easier to control.

We estimate

r=0m1Σm,1,r\displaystyle\sum_{r=0}^{m-1}\Sigma_{m,1,r} r=0m1(pxpSm,1,rTm1p)2.\displaystyle\ll\sum_{r=0}^{m-1}\Big{(}\sum_{\begin{subarray}{c}p\leq\sqrt{x}\\ p\in S_{m,1,r}^{\prime}\cap T_{m}\end{subarray}}\frac{1}{p}\Big{)}^{2}.

Call the expression inside the rr sum Σm,r\Sigma_{m,r}^{\prime}. We divide the sum Σm,r\Sigma_{m,r}^{\prime} into three parts depending on the relative size of pp and mm as follows. Set B=10B=10. We have

Σm,r=(pxpSm,1,rTm1p)2(mpmF(m)pSm,1,rTm1p)2+(mF(m)pmBpSm,1,rTm1p)2+(mBpxpSm,1,r1p)2,\displaystyle\Sigma_{m,r}^{\prime}=\Big{(}\sum_{\begin{subarray}{c}p\leq\sqrt{x}\\ p\in S_{m,1,r}^{\prime}\cap T_{m}\end{subarray}}\frac{1}{p}\Big{)}^{2}\ll\Big{(}\sum_{\begin{subarray}{c}m\leq p\leq mF(m)\\ p\in S_{m,1,r}^{\prime}\cap T_{m}\end{subarray}}\frac{1}{p}\Big{)}^{2}+\Big{(}\sum_{\begin{subarray}{c}mF(m)\leq p\leq m^{B}\\ p\in S_{m,1,r}^{\prime}\cap T_{m}\end{subarray}}\frac{1}{p}\Big{)}^{2}+\Big{(}\sum_{\begin{subarray}{c}m^{B}\leq p\leq\sqrt{x}\\ p\in S_{m,1,r}^{\prime}\end{subarray}}\frac{1}{p}\Big{)}^{2},

where

F(m):=exp((logm)1/2).\displaystyle F(m):=\exp((\log m)^{1/2}).

Denote the sums on the right by Σm,r,1\Sigma_{m,r,1}^{\prime}, Σm,r,2\Sigma_{m,r,2}^{\prime}, Σm,r,3\Sigma_{m,r,3}^{\prime}, respectively. In the following subsections we prove that the sum of Σm,r,i\Sigma_{m,r,i}^{\prime} over all mm and rr is small enough for each i{1,2,3}i\in\{1,2,3\}.

7.1   Bounds for very large gcd

We begin by estimating Σm,r,1\Sigma_{m,r,1}^{\prime} on average over rr and mm. This sum will be dealt with using Selberg’s sieve, and in particular this part of the proof is unconditional. We have

(7.1) r=0m1(mpmF(m)pSm,1,rTm1p)2(mpmF(m)p1(modm)1p)2=mp,qmF(m)pq1(modm)1pq.\displaystyle\begin{split}\sum_{r=0}^{m-1}\Big{(}\sum_{\begin{subarray}{c}m\leq p\leq mF(m)\\ p\in S_{m,1,r}^{\prime}\cap T_{m}\end{subarray}}\frac{1}{p}\Big{)}^{2}&\ll\Big{(}\sum_{\begin{subarray}{c}m\leq p\leq mF(m)\\ p\equiv 1\pmod{m}\end{subarray}}\frac{1}{p}\Big{)}^{2}\\ &=\sum_{\begin{subarray}{c}m\leq p,q\leq mF(m)\\ p\equiv q\equiv 1\pmod{m}\end{subarray}}\frac{1}{pq}.\end{split}

We claim that for any M1M\gg 1, for m[M,2M]m\in[M,2M] and MPQM2M\leq P\leq Q\leq M^{2}, we have

(7.2) Pp2PQq2Qpq1(modm)1pq1M(logM)2+1P/2Q2PQlogQ.\displaystyle\sum_{\begin{subarray}{c}P\leq p\leq 2P\\ Q\leq q\leq 2Q\\ p\equiv q\equiv 1\pmod{m}\end{subarray}}\frac{1}{pq}\ll\frac{1}{M(\log M)^{2}}+\frac{1_{P/2\leq Q\leq 2P}}{Q\log Q}.

Once this has been shown, multiplying by mm and summing over PP and QQ dyadically we can upper bound the contribution of M>NM>N by

m>Nmr=0m1Σm,r,1N<MxM=2j((logF(2M))2(logM)2+1logM)N<MxM=2j1logMloglogx\displaystyle\sum_{m>N}m\sum_{r=0}^{m-1}\Sigma_{m,r,1}^{\prime}\ll\sum_{\begin{subarray}{c}N<M\leq\sqrt{x}\\ M=2^{j}\end{subarray}}\left(\frac{(\log F(2M))^{2}}{(\log M)^{2}}+\frac{1}{\log M}\right)\ll\sum_{\begin{subarray}{c}N<M\leq\sqrt{x}\\ M=2^{j}\end{subarray}}\frac{1}{\log M}\ll\log\log x

by our choice F(m)=exp((logm)1/2)F(m)=\exp((\log m)^{1/2}). This is certainly o((loglogx)2)o((\log\log x)^{2}).

To prove (7.2), we first note that the contribution of the terms p=qp=q to that sum is 1/(QlogQ)\ll 1/(Q\log Q), so it suffices to prove a bound of 1/(M(logM)2)\ll 1/(M(\log M)^{2}) for the off-diagonal contribution. Also observe that if p1(modm)p\equiv 1\pmod{m}, then p=am+1p=am+1 for some P/Ma2P/MP/M\leq a\leq 2P/M. Thus, the contribution of pqp\neq q to (7.2) is

1PQP/Ma2P/MQ/Mb2Q/M,ba|{m2M:am+1,bm+1}|.\displaystyle\ll\frac{1}{PQ}\sum_{\begin{subarray}{c}P/M\leq a\leq 2P/M\\ Q/M\leq b\leq 2Q/M,\,b\neq a\end{subarray}}|\{m\leq 2M:\,\,am+1,bm+1\in\mathbb{P}\}|.

By Selberg’s sieve, in the form of [14, Theorem 6.4] (see also formulas (6.83)–(6.86) there), this is further bounded by

(7.3) 1PQP/Ma2P/MQ/Mb2Q/M,baM(logM)2pab(ab)(1+1p).\displaystyle\ll\frac{1}{PQ}\sum_{\begin{subarray}{c}P/M\leq a\leq 2P/M\\ Q/M\leq b\leq 2Q/M,\,b\neq a\end{subarray}}\frac{M}{(\log M)^{2}}\prod_{p\mid ab(a-b)}\left(1+\frac{1}{p}\right).

Note that by Cauchy–Schwarz for any 0hy0\leq h\leq y we have

(7.4) nynϕ(n)n+hφ(n+h)(ny(nϕ(n))2)1/2(ny(n+hϕ(n+h))2)1/2y.\displaystyle\sum_{n\leq y}\frac{n}{\phi(n)}\cdot\frac{n+h}{\varphi(n+h)}\leq\left(\sum_{n\leq y}\left(\frac{n}{\phi(n)}\right)^{2}\right)^{1/2}\left(\sum_{n\leq y}\left(\frac{n+h}{\phi(n+h)}\right)^{2}\right)^{1/2}\ll y.

Therefore, (7.3) is at most

1PQPMQMM(logM)21M(logM)2,\displaystyle\ll\frac{1}{PQ}\cdot\frac{P}{M}\cdot\frac{Q}{M}\cdot\frac{M}{(\log M)^{2}}\ll\frac{1}{M(\log M)^{2}},

as wanted.

7.2   Bounds for medium gcd

We then deal with the sums Σm,r,3\Sigma_{m,r,3}^{\prime} for which we assume GRH (but not Hypothesis 1.2). We divide [mB,x][m^{B},\sqrt{x}] into dyadic intervals [y/2,y][y/2,y] to obtain

(7.5) Σm,r,3\displaystyle\Sigma_{m,r,3}^{\prime} (mByxy=2j1yp[y/2,y]pSm,1,r1)2.\displaystyle\ll\Big{(}\sum_{\begin{subarray}{c}m^{B}\leq y\leq\sqrt{x}\\ y=2^{j}\end{subarray}}\frac{1}{y}\sum_{\begin{subarray}{c}p\in[y/2,y]\\ p\in S_{m,1,r}^{\prime}\end{subarray}}1\Big{)}^{2}.

Note then that pSm,1,rp\in S_{m,1,r}^{\prime} implies p1(modm)p\equiv 1\pmod{m} and that barba^{-r} is an mmth power (modp)\pmod{p}, so by Galois theory pp splits completely in L:=(ζm,(bar)1/m)L:=\mathbb{Q}(\zeta_{m},(ba^{-r})^{1/m}). Thus the Artin symbol (L/p)\left(\frac{L/\mathbb{Q}}{p}\right) is equal to 11.

Now we are in a position to apply the Chebotarev density theorem (Lemma 2.1). Under GRH, it gives

πC(y)=|C|[L:]Li(y)+O(|C|[L:]y(logdisc(L/)+[L:]logy)),\displaystyle\pi_{C}(y)=\frac{|C|}{[L:\mathbb{Q}]}\textnormal{Li}(y)+O\left(\frac{|C|}{[L:\mathbb{Q}]}\sqrt{y}(\log\textnormal{disc}(L/\mathbb{Q})+[L:\mathbb{Q}]\log y)\right),

where CC is any conjugacy class of Gal(L/)\textup{Gal}(L/\mathbb{Q}). In the case of the specific extension LL above, we easily see that [L:]mϕ(m)[L:\mathbb{Q}]\gg m\phi(m) (see Lemma 8.1), and also by [4, Lemma 1.2] we have logdisc(L/)m2logm\log\textnormal{disc}(L/\mathbb{Q})\ll m^{2}\log m, so if CC is the identity of Gal(L/)\textup{Gal}(L/\mathbb{Q}), we get

πC(y)1mϕ(m)Li(y)forym4.1.\displaystyle\pi_{C}(y)\ll\frac{1}{m\phi(m)}\textnormal{Li}(y)\quad\textnormal{for}\quad y\geq m^{4.1}.

Therefore, (7.5) is

(mByxy=2j1yy(logy)mϕ(m))2(loglogx)2(mϕ(m))2.\displaystyle\ll\Big{(}\sum_{\begin{subarray}{c}m^{B}\leq y\leq\sqrt{x}\\ y=2^{j}\end{subarray}}\frac{1}{y}\cdot\frac{y}{(\log y)m\phi(m)}\Big{)}^{2}\ll\frac{(\log\log x)^{2}}{(m\phi(m))^{2}}.

Summing this over all 0rm10\leq r\leq m-1, multiplying by mm and summing over m>Nm>N gives

m>Nmr=0m1Σm,r,3m>N(loglogx)2ϕ(m)2(loglogx)2N.\sum_{m>N}m\sum_{r=0}^{m-1}\Sigma_{m,r,3}^{\prime}\ll\sum_{m>N}\frac{(\log\log x)^{2}}{\phi(m)^{2}}\ll\frac{(\log\log x)^{2}}{N}.

This is a good enough estimate.

7.3   Bounds for large gcd

We then turn to estimating Σm,r,2\Sigma_{m,r,2}^{\prime} on average over rr and mm. This is the only part of our proof where Hypothesis 1.2 is used.

Let

(7.6) :={Nmx:mm:m[(logm)ϵ,mϵ]},\displaystyle\mathcal{M}:=\{N\leq m\leq\sqrt{x}:\,\,\exists\,\,m^{\prime}\mid m:m^{\prime}\in[(\log m)^{\epsilon^{\prime}},m^{\epsilon}]\cap\mathbb{P}\},

where ϵ>0\epsilon>0 is small but fixed and ϵ=ϵ(N)\epsilon^{\prime}=\epsilon^{\prime}(N) is chosen to be small enough in terms of NN. For each mm\in\mathcal{M} we let mm^{\prime} denote such a divisor of mm.

We wish to estimate the quantity

(7.7) Σm,r,2\displaystyle\Sigma_{m,r,2}^{\prime} =(mF(m)pmBpSm,1,rTm1p)2.\displaystyle=\Big{(}\sum_{\begin{subarray}{c}mF(m)\leq p\leq m^{B}\\ p\in S^{\prime}_{m,1,r}\cap T_{m}\end{subarray}}\frac{1}{p}\Big{)}^{2}.

We split

Mm2Mr=0m1Σm,r,2\displaystyle\sum_{M\leq m\leq 2M}\sum_{r=0}^{m-1}\Sigma_{m,r,2}^{\prime}
(7.8) =Mm2Mmr=0m1(mF(m)pmBpSm,1,rTm1p)2+Mm2Mmr=0m1(mF(m)pmBpSm,1,rTm1p)2,\displaystyle=\sum_{\begin{subarray}{c}M\leq m\leq 2M\\ m\in\mathcal{M}\end{subarray}}\sum_{r=0}^{m-1}\Big{(}\sum_{\begin{subarray}{c}mF(m)\leq p\leq m^{B}\\ p\in S_{m,1,r}^{\prime}\cap T_{m}\end{subarray}}\frac{1}{p}\Big{)}^{2}+\sum_{\begin{subarray}{c}M\leq m\leq 2M\\ m\not\in\mathcal{M}\end{subarray}}\sum_{r=0}^{m-1}\Big{(}\sum_{\begin{subarray}{c}mF(m)\leq p\leq m^{B}\\ p\in S_{m,1,r}^{\prime}\cap T_{m}\end{subarray}}\frac{1}{p}\Big{)}^{2},

We now present a general bound for the contribution of a single mm to the first sum in (7.3). This is the only part of the paper where Hypothesis 1.2 is put into use.

Lemma 7.1.

Assume Hypothesis 1.2. Let P2P\geq 2, and let m[M,2M]m\in[M,2M] and mm^{\prime}\in\mathbb{P} with mmm^{\prime}\mid m and m[(logm)ϵ,mϵ]m^{\prime}\in[(\log m)^{\epsilon^{\prime}},m^{\epsilon}]. Then, provided that PmF(m)P\geq mF(m), we have

r=0m1(Pp2PpSm,1,rTm1)2ϵ,ϵP2ϕ(m)2(logP)2max{1(m)ϵ,1(logP)2}.\displaystyle\sum_{r=0}^{m-1}\Big{(}\sum_{\begin{subarray}{c}P\leq p\leq 2P\\ p\in S^{\prime}_{m,1,r}\cap T_{m}\end{subarray}}1\Big{)}^{2}\ll_{\epsilon^{\prime},\epsilon}\frac{P^{2}}{\phi(m)^{2}(\log P)^{2}}\max\left\{\frac{1}{(m^{\prime})^{\epsilon}},\frac{1}{(\log P)^{2}}\right\}.
Proof.

Note first that if (p)r(modm)\ell(p)\equiv r\pmod{m}, then (p)r(modm)\ell(p)\equiv r\pmod{m^{\prime}}. Observe also that

(7.9) Sm,1,rSm,1,rTm=\displaystyle S_{m,1,r}^{\prime}\cap S_{m,1,r^{\prime}}^{\prime}\cap T_{m}=\emptyset

for any rrr\neq r^{\prime}: if pSm,1,rSm,1,rp\in S_{m,1,r}^{\prime}\cap S_{m,1,r^{\prime}}^{\prime}, then arra^{r-r^{\prime}} is a perfect 2hm2hmth power and thus aa must be a 2hq2hqth power for some prime qmq\mid m, implying pTmp\not\in T_{m}. Also note the trivial inequality

(7.10) x12++xk2(x1++xk)2,xi0.\displaystyle x_{1}^{2}+\cdots+x_{k}^{2}\leq(x_{1}+\cdots+x_{k})^{2},\quad x_{i}\geq 0.

Using (7.9) and (7.10), we obtain

r=0m1(Pp2PpSm,1,rTm1)2r=0m1(Pp2PpTm,m,r1)2,\displaystyle\sum_{r=0}^{m-1}\Big{(}\sum_{\begin{subarray}{c}P\leq p\leq 2P\\ p\in S^{\prime}_{m,1,r}\cap T_{m}\end{subarray}}1\Big{)}^{2}\ll\sum_{r=0}^{m^{\prime}-1}\Big{(}\sum_{\begin{subarray}{c}P\leq p\leq 2P\\ p\in T_{m,m^{\prime},r}\end{subarray}}1\Big{)}^{2},

where Tm,m,rT_{m,m^{\prime},r} is the set of primes pp which split completely in (ζ2hm,a1/2h,(bar)1/2hm)\mathbb{Q}(\zeta_{2hm},a^{1/2h},(ba^{-r})^{1/2hm^{\prime}}) but which do not split completely in (ζ2hq,a1/2hq)\mathbb{Q}(\zeta_{2hq},a^{1/2hq}) for any qmq\mid m. In particular,

Tm,m,rTm,m,r:={p:psplits completely in(ζm,(bar)1/m)}.\displaystyle T_{m,m^{\prime},r}\subset T^{\prime}_{m,m^{\prime},r}:=\{p:\,\,p\,\,\textnormal{splits completely in}\,\,\mathbb{Q}(\zeta_{m},(ba^{-r})^{1/m^{\prime}})\}.

Hence, under Hypothesis 1.2, we have

r=0m1(Pp2PpTm,m,r1)2\displaystyle\sum_{r=0}^{m^{\prime}-1}\Big{(}\sum_{\begin{subarray}{c}P\leq p\leq 2P\\ p\in T_{m,m^{\prime},r}\end{subarray}}1\Big{)}^{2} ϵ,ϵP2ϕ(m)2(logP)2max{1(m)ϵ,1(logP)2},\displaystyle\ll_{\epsilon^{\prime},\epsilon}\frac{P^{2}}{\phi(m)^{2}(\log P)^{2}}\max\left\{\frac{1}{(m^{\prime})^{\epsilon}},\frac{1}{(\log P)^{2}}\right\},

which produces the assertion of the lemma. ∎

We apply Lemma 7.1 to the sums on the right of (7.3). Denote

Dy(m):=maxd[1,y]dmd.\displaystyle D_{\leq y}(m):=\max_{\begin{subarray}{c}d\in[1,y]\cap\mathbb{P}\\ d\mid m\end{subarray}}d.

Splitting dyadically over pp and mm, we see that

Mm2Mmr=0m1(mF(m)pmBpSm,1,rTm1p)2\displaystyle\sum_{\begin{subarray}{c}M\leq m\leq 2M\\ m\in\mathcal{M}\end{subarray}}\sum_{r=0}^{m-1}\Big{(}\sum_{\begin{subarray}{c}mF(m)\leq p\leq m^{B}\\ p\in S^{\prime}_{m,1,r}\cap T_{m}\end{subarray}}\frac{1}{p}\Big{)}^{2}
Mm2MmMF(M)P,Q(2M)BP=2j1,Q=2j2r=0m1(Pp2Pp𝒯m,m,r1P)(Qq2Qq𝒯m,m,r1Q)\displaystyle\ll\sum_{\begin{subarray}{c}M\leq m\leq 2M\\ m\in\mathcal{M}\end{subarray}}\sum_{\begin{subarray}{c}MF(M)\leq P,Q\leq(2M)^{B}\\ P=2^{j_{1}},Q=2^{j_{2}}\end{subarray}}\sum_{r=0}^{m^{\prime}-1}\Big{(}\sum_{\begin{subarray}{c}P\leq p\leq 2P\\ p\in\mathcal{T}_{m,m^{\prime},r}\end{subarray}}\frac{1}{P}\Big{)}\Big{(}\sum_{\begin{subarray}{c}Q\leq q\leq 2Q\\ q\in\mathcal{T}_{m,m^{\prime},r}\end{subarray}}\frac{1}{Q}\Big{)}
Mm2MmMF(M)P,Q(2M)BP=2j1,Q=2j2r=0m1(Pp2Pp𝒯m,m,r1P)2\displaystyle\ll\sum_{\begin{subarray}{c}M\leq m\leq 2M\\ m\in\mathcal{M}\end{subarray}}\sum_{\begin{subarray}{c}MF(M)\leq P,Q\leq(2M)^{B}\\ P=2^{j_{1}},Q=2^{j_{2}}\end{subarray}}\sum_{r=0}^{m^{\prime}-1}\Big{(}\sum_{\begin{subarray}{c}P\leq p\leq 2P\\ p\in\mathcal{T}_{m,m^{\prime},r}\end{subarray}}\frac{1}{P}\Big{)}^{2}
Mm2Mm(1ϕ(m)2(logM)2+1ϕ(m)2Dmϵ(m)ϵ)\displaystyle\ll\sum_{\begin{subarray}{c}M\leq m\leq 2M\\ m\in\mathcal{M}\end{subarray}}\left(\frac{1}{\phi(m)^{2}(\log M)^{2}}+\frac{1}{\phi(m)^{2}D_{\leq m^{\epsilon}}(m)^{\epsilon}}\right)
(7.11) ϵ,ϵ1M(logM)2+Mm2Mm1ϕ(m)2Dmϵ(m)ϵ.\displaystyle\ll_{\epsilon^{\prime},\epsilon}\frac{1}{M(\log M)^{2}}+\sum_{\begin{subarray}{c}M\leq m\leq 2M\\ m\in\mathcal{M}\end{subarray}}\frac{1}{\phi(m)^{2}D_{\leq m^{\epsilon}}(m)^{\epsilon}}.

Regarding the first term in (7.3), if we multiply it by MM and summing over MxM\leq\sqrt{x} that are powers of two, we get a bound of

MxM=2j1(logM)21,\displaystyle\ll\sum_{\begin{subarray}{c}M\leq\sqrt{x}\\ M=2^{j}\end{subarray}}\frac{1}{(\log M)^{2}}\ll 1,

which is certainly small enough.

Now, to conclude the proof of Theorem 1.3, it suffices to prove the following two claims.

Lemma 7.2.

We have

(7.12) mxmmr=0m1(mF(m)pmBpSm,1,rTm1p)2=o((loglogx)2).\displaystyle\sum_{\begin{subarray}{c}m\leq\sqrt{x}\\ m\not\in\mathcal{M}\end{subarray}}m\sum_{r=0}^{m-1}\Big{(}\sum_{\begin{subarray}{c}mF(m)\leq p\leq m^{B}\\ p\in S_{m,1,r}\cap T_{m}\end{subarray}}\frac{1}{p}\Big{)}^{2}=o((\log\log x)^{2}).
Lemma 7.3.

We have

(7.13) Mm2Mm1ϕ(m)2Dmϵ(m)ϵϵ1M(logM)1+ϵϵ/2.\displaystyle\sum_{\begin{subarray}{c}M\leq m\leq 2M\\ m\in\mathcal{M}\end{subarray}}\frac{1}{\phi(m)^{2}D_{\leq m^{\epsilon}}(m)^{\epsilon}}\ll_{\epsilon}\frac{1}{M(\log M)^{1+\epsilon\cdot\epsilon^{\prime}/2}}.

Note that we have

M=2j1(logM)1+ϵϵ/2ϵ,ϵ1,\displaystyle\sum_{M=2^{j}}\frac{1}{(\log M)^{1+\epsilon\cdot\epsilon^{\prime}/2}}\ll_{\epsilon^{\prime},\epsilon}1,

so the sum over MM of the contributions given by Lemma 7.3 is small.

Proof of Lemma 7.2.

Analogously to (7.1), we first make for m[M,2M]m\in[M,2M] the crude estimate

(mF(m)pmBpSm,1,rTm1p)2(mF(m)pmBp1(modm)1p)2MF(M)p,q(2M)B1pq1(modm)pq.\displaystyle\Big{(}\sum_{\begin{subarray}{c}mF(m)\leq p\leq m^{B}\\ p\in S^{\prime}_{m,1,r}\cap T_{m}\end{subarray}}\frac{1}{p}\Big{)}^{2}\leq\Big{(}\sum_{\begin{subarray}{c}mF(m)\leq p\leq m^{B}\\ p\equiv 1\pmod{m}\end{subarray}}\frac{1}{p}\Big{)}^{2}\leq\sum_{MF(M)\leq p,q\leq(2M)^{B}}\frac{1_{p\equiv q\equiv 1\pmod{m}}}{pq}.

Note that applying the Brun–Titchmarsh inequality to this would cost us some nontrivial factors, since that inequality involves savings of 1/(log(P/m))1/(\log(P/m)) rather than 1/logP1/\log P if p[P,2P]p\in[P,2P]. Therefore, we instead exploit the summation over mm. Let us restrict to m[M,2M]m\in[M,2M], p[P,2P]p\in[P,2P] and q[Q,2Q]q\in[Q,2Q]; we will later sum dyadically over different M,P,QM,P,Q. Let

=M:={m[M,2M]:pmfor all(log2M)ϵpMϵ},\displaystyle\mathcal{R}=\mathcal{R}_{M}:=\{m\in[M,2M]:\,\,p\nmid m\,\,\textnormal{for all}\,\,(\log 2M)^{\epsilon^{\prime}}\leq p\leq M^{\epsilon}\},

and observe that [M,2M][M,2M]\setminus\mathcal{M}\subset\mathcal{R}. Our task is thus to upper bound

(7.14) MPQP/Mn2P/MQ/Mn2Q/MMm2Mm1nm+11nm+1.\displaystyle\frac{M}{PQ}\sum_{\begin{subarray}{c}P/M\leq n\leq 2P/M\\ Q/M\leq n^{\prime}\leq 2Q/M\end{subarray}}\sum_{\begin{subarray}{c}M\leq m\leq 2M\\ m\in\mathcal{R}\end{subarray}}1_{nm+1\in\mathbb{P}}1_{n^{\prime}m+1\in\mathbb{P}}.

Let us first estimate |||\mathcal{R}|. Let 𝒫(z):=pzp\mathcal{P}(z):=\prod_{p\leq z}p. Note that

||rad(d)𝒫((log(2M))ϵ)r2M/d(r,𝒫(Mϵ))=11:=S1+S2,\displaystyle|\mathcal{R}|\leq\sum_{\textnormal{rad}(d)\mid\mathcal{P}((\log(2M))^{\epsilon^{\prime}})}\sum_{\begin{subarray}{c}r\leq 2M/d\\ (r,\mathcal{P}(M^{\epsilon}))=1\end{subarray}}1:=S_{1}+S_{2},

where S1S_{1} is the part of the sum with dM1/2d\leq M^{1/2} and S2S_{2} is the complementary part. By Mertens’ theorem, we have

(7.15) S1ϵrad(d)𝒫((log(2M))ϵ)MdlogMMlogMp(log(2M))ϵ(1+1p+1p2+)ϵMloglogMlogM.\displaystyle\begin{split}S_{1}&\ll_{\epsilon}\sum_{\textnormal{rad}(d)\mid\mathcal{P}((\log(2M))^{\epsilon^{\prime}})}\frac{M}{d\log M}\\ &\ll\frac{M}{\log M}\prod_{p\leq(\log(2M))^{\epsilon^{\prime}}}\left(1+\frac{1}{p}+\frac{1}{p^{2}}+\cdots\right)\\ &\ll\epsilon^{\prime}\frac{M\log\log M}{\log M}.\end{split}

When it comes to S2S_{2}, we crudely estimate

S2\displaystyle S_{2} rad(d)𝒫((log(2M))ϵ)d>M1/2MdM0.95rad(d)𝒫((log(2M))ϵ)1d0.9\displaystyle\ll\sum_{\begin{subarray}{c}\textnormal{rad}(d)\mid\mathcal{P}((\log(2M))^{\epsilon^{\prime}})\\ d>M^{1/2}\end{subarray}}\frac{M}{d}\ll M^{0.95}\sum_{\textnormal{rad}(d)\mid\mathcal{P}((\log(2M))^{\epsilon^{\prime}})}\frac{1}{d^{0.9}}
M0.95p(log(2M))ϵ(1+1p0.9+1p1.8+)\displaystyle\ll M^{0.95}\prod_{p\leq(\log(2M))^{\epsilon^{\prime}}}\left(1+\frac{1}{p^{0.9}}+\frac{1}{p^{1.8}}+\cdots\right)
M0.96.\displaystyle\ll M^{0.96}.

Hence,

(7.16) ||ϵϵMloglogMlogM.\displaystyle|\mathcal{R}|\ll_{\epsilon}\epsilon^{\prime}\frac{M\log\log M}{\log M}.

Next, note that by (7.16) the diagonal contribution n=nn=n^{\prime} to (7.14) (crudely dropping the condition that nm+1nm+1\in\mathbb{P}) is

MPQPM|M|ϵϵloglogMQlogM\displaystyle\ll\frac{M}{PQ}\cdot\frac{P}{M}|\mathcal{R}_{M}|\ll_{\epsilon}\epsilon^{\prime}\frac{\log\log M}{Q\log M}

and it only appears if PQP\asymp Q. Multiplying by MM and summing dyadically over QQ and MM, we get a contribution of ϵ(loglogx)2\ll\epsilon^{\prime}(\log\log x)^{2} for the diagonal terms. Since ϵ=ϵ(N)\epsilon^{\prime}=\epsilon^{\prime}(N) for sufficiently fast decaying ϵ(N)\epsilon^{\prime}(N), this is (loglogx)2/N\ll(\log\log x)^{2}/N. We may henceforth restrict to nnn\neq n^{\prime} in (7.14).

We are then left with upper bounding the non-diagonal contribution nnn\neq n^{\prime} in (7.14). To this end, we will apply Selberg’s sieve to the set

𝒜=𝒜(n,n):={(nm+1)(nm+1):m[M,2M]}.\displaystyle\mathcal{A}=\mathcal{A}^{(n,n^{\prime})}:=\{(nm+1)(n^{\prime}m+1):\,\,m\in[M,2M]\cap\mathcal{R}\}.

We claim that if dd is squarefree, (d,nn(nn))=1(d,nn^{\prime}(n-n^{\prime}))=1 and 𝒜d:={m𝒜:m0(modd)}\mathcal{A}_{d}:=\{m\in\mathcal{A}:\,\,m\equiv 0\pmod{d}\}, then

(7.17) |𝒜d|=g(d)d|𝒜|+Rd,\displaystyle|\mathcal{A}_{d}|=\frac{g(d)}{d}|\mathcal{A}|+R_{d},

where RdR_{d} satisfies a level of distribution estimate

(7.18) dM0.1μ2(d)=1(d,nn(nn))=1|Rd|AM/(logM)A,\displaystyle\sum_{\begin{subarray}{c}d\leq M^{0.1}\\ \mu^{2}(d)=1\\ (d,nn^{\prime}(n-n^{\prime}))=1\end{subarray}}|R_{d}|\ll_{A}M/(\log M)^{A},

and g:0g:\mathbb{N}\to\mathbb{R}_{\geq 0} is the completely multiplicative function given at primes by

g(p):={2,p(log(2M))ϵ,2pp1,p>(log(2M))ϵ.\displaystyle g(p):=\begin{cases}2,\quad p\leq(\log(2M))^{\epsilon^{\prime}},\\ \frac{2p}{p-1},\quad p>(\log(2M))^{\epsilon^{\prime}}.\end{cases}

We note that for mm\in\mathcal{R} the condition (nm+1)(nm+1)𝒜p(nm+1)(n^{\prime}m+1)\in\mathcal{A}_{p} is equivalent to m1/n(modp)m\equiv-1/n\pmod{p} or m1/n(modp)m\equiv-1/n^{\prime}\pmod{p}, and the residue classes 1/n(modp)-1/n\pmod{p} and 1/n(modp)-1/n^{\prime}\pmod{p} are distinct from each other and from 0(modp)0\pmod{p} whenever (p,nn(nn))=1(p,nn^{\prime}(n-n^{\prime}))=1. Therefore, in order to prove (7.18) it suffices to prove that if d,a:={m:ma(modd)}\mathcal{R}_{d,a}:=\{m\in\mathcal{R}:m\equiv a\pmod{d}\}, then for aa coprime to dd we have

|d,a|=g(d)d||+Rd,a,\displaystyle|\mathcal{R}_{d,a}|=\frac{g^{\prime}(d)}{d}|\mathcal{R}|+R_{d,a}^{\prime},

where

(7.19) dM0.1μ2(d)=1(d,nn(nn))=1τ(d)max(a,d)=1|Rd,a|AM/(logM)A,\displaystyle\sum_{\begin{subarray}{c}d\leq M^{0.1}\\ \mu^{2}(d)=1\\ (d,nn^{\prime}(n-n^{\prime}))=1\end{subarray}}\tau(d)\max_{(a,d)=1}|R_{d,a}^{\prime}|\ll_{A}M/(\log M)^{A},

and g:0g^{\prime}:\mathbb{N}\to\mathbb{R}_{\geq 0} is the completely multiplicative function given by

g(p):={1,p(log(2M))ϵ,pp1,p>(log(2M))ϵ.\displaystyle g^{\prime}(p):=\begin{cases}1,\quad p\leq(\log(2M))^{\epsilon^{\prime}},\\ \frac{p}{p-1},\quad p>(\log(2M))^{\epsilon^{\prime}}.\end{cases}

For proving (7.19), we wish to split Rd,aR^{\prime}_{d,a} as a bilinear sum and then apply a general Bombieri–Vinogradov type result for such objects.

Observe the decomposition

1(n)=αβ(n)=α1β(n)+α2β(n)+α3β(n),\displaystyle 1_{\mathcal{R}}(n)=\alpha*\beta(n)=\alpha_{1}*\beta(n)+\alpha_{2}*\beta(n)+\alpha_{3}*\beta(n),

where

α(n)=1n𝒫((logM)ϵ),β(n)=1(n,𝒫(Mϵ))=1\displaystyle\alpha(n)=1_{n\mid\mathcal{P}((\log M)^{\epsilon^{\prime}})},\quad\quad\quad\quad\quad\quad\quad\beta(n)=1_{(n,\mathcal{P}(M^{\epsilon}))=1}
α1(n)=1n𝒫((logM)ϵ),n(logM)10A,α2(n)=1n𝒫((logM)ϵ),(logM)10A<nM0.1,\displaystyle\alpha_{1}(n)=1_{n\mid\mathcal{P}((\log M)^{\epsilon^{\prime}}),n\leq(\log M)^{10A}},\quad\alpha_{2}(n)=1_{n\mid\mathcal{P}((\log M)^{\epsilon^{\prime}}),\,\,(\log M)^{10A}<n\leq M^{0.1}},
α3(n)=1n𝒫((logM)ϵ),n>M0.1\displaystyle\alpha_{3}(n)=1_{n\mid\mathcal{P}((\log M)^{\epsilon^{\prime}}),\,\,n>M^{0.1}}

The contribution of α3β(n)\alpha_{3}*\beta(n) to (7.19) is negligible (similarly as in (7.15)), whereas the part α2β(n)\alpha_{2}*\beta(n) has level of distribution 1/21/2 by a variant of the Bombieri–Vinogradov theorem (see [14, Theorem 17.4], where one just needs to verify the Siegel–Walfisz condition for β\beta. This condition can be proved for 1(n,𝒫(Mϵ))=11_{(n,\mathcal{P}(M^{\epsilon}))=1} similarly as for the indicator of the primes). We claim that also the part α1β\alpha_{1}*\beta has level of distribution 1/21/2. Since α1\alpha_{1} is supported on [1,(logM)10A][1,(\log M)^{10A}], we see that if β\beta has level of distribution 1/21/2, so does α1β\alpha_{1}*\beta. For the sequence β\beta, we indeed obtain level of distribution 1/21/2 by noting that β(n)=1(n)+2j1/ϵ1n=p1pj,pi>Mϵ\beta(n)=1_{\mathbb{P}}(n)+\sum_{2\leq j\leq 1/\epsilon}1_{n=p_{1}\cdots p_{j},p_{i}>M^{\epsilon}}, and here 1(n)1_{\mathbb{P}}(n) has level of distribution 1/21/2 by the classical Bombieri–Vinogradov theorem, whereas each of the terms 1n=p1pj,pi>Mϵ1_{n=p_{1}\cdots p_{j},p_{i}>M^{\epsilon}} has level of distribution 1/21/2 by [14, Theorem 17.4].

Now we have

dM1/2ϵμ2(d)=1max(a,d)=1|Rd,a|A,ϵM/(logM)A,\displaystyle\sum_{\begin{subarray}{c}d\leq M^{1/2-\epsilon}\\ \mu^{2}(d)=1\end{subarray}}\max_{(a,d)=1}|R_{d,a}^{\prime}|\ll_{A,\epsilon}M/(\log M)^{A},

and (7.19) follows from this by Cauchy–Schwarz and the divisor sum estimate dMτ(d)2M(logM)3\sum_{d\leq M}\tau(d)^{2}\ll M(\log M)^{3} and the trivial bound |Rd,a|M/d|R_{d,a}^{\prime}|\ll M/d.

Taking (7.17),(7.18), (7.16) and (7.4) into account, by Selberg’s sieve we obtain for (7.14) the upper bound

MPQP/Mn2P/MQ/Mn2Q/Mnnpnn(nn)(1+2p)1(logP)(logQ)ϵMloglogMlogM\displaystyle\ll\frac{M}{PQ}\cdot\sum_{\begin{subarray}{c}P/M\leq n\leq 2P/M\\ Q/M\leq n^{\prime}\leq 2Q/M\\ n\neq n^{\prime}\end{subarray}}\prod_{p\mid nn^{\prime}(n-n^{\prime})}\left(1+\frac{2}{p}\right)\frac{1}{(\log P)(\log Q)}\cdot\epsilon^{\prime}\frac{M\log\log M}{\log M}
ϵloglogMlogM1(logP)(logQ).\displaystyle\ll\epsilon^{\prime}\frac{\log\log M}{\log M}\cdot\frac{1}{(\log P)(\log Q)}.

Summing dyadically over P,Q[MF(M),(2M)B]P,Q\in[MF(M),(2M)^{B}] and M[N,x]M\in[N,\sqrt{x}], this becomes

ϵNMxM=2jloglogMlogMϵ(loglogx)2(loglogx)2/N,\displaystyle\ll\epsilon^{\prime}\sum_{\begin{subarray}{c}N\leq M\leq\sqrt{x}\\ M=2^{j}\end{subarray}}\frac{\log\log M}{\log M}\ll\epsilon^{\prime}(\log\log x)^{2}\ll(\log\log x)^{2}/N,

since we can choose ϵ=ϵ(N)=1/N2\epsilon^{\prime}=\epsilon^{\prime}(N)=1/N^{2}, say, and now the desired estimate (7.12) follows. ∎

Proof of Lemma 7.3.

We split the sum in (7.13) into two parts: those mm with Dmϵ(m)>(logm)2/ϵD_{\leq m^{\epsilon^{\prime}}}(m)>(\log m)^{2/\epsilon} and the remaining mm\in\mathcal{M}. The contribution of the former part is

(7.20) Mm2M1ϕ(m)2(logm)21M(logM)2,\displaystyle\ll\sum_{M\leq m\leq 2M}\frac{1}{\phi(m)^{2}(\log m)^{2}}\ll\frac{1}{M(\log M)^{2}},

which is a strong enough bound.

For the contribution of the latter part, note that all mm\in\mathcal{M} counted by it belong to the set

:={m[M,2M]:pmp[(logM)2/ϵ,(2M)ϵ],butmmfor somem[(logm)ϵ,mϵ]}.\displaystyle\mathcal{R}^{\prime}:=\{m\in[M,2M]:\,p\nmid m\,\forall p\in[(\log M)^{2/\epsilon},(2M)^{\epsilon^{\prime}}],\,\,\textnormal{but}\,m^{\prime}\mid m\,\textnormal{for some}\,m^{\prime}\in[(\log m)^{\epsilon^{\prime}},m^{\epsilon}]\cap\mathbb{P}\}.

Therefore, their contribution is

1(logM)ϵϵMm2Mm1ϕ(m)2,\displaystyle\ll\frac{1}{(\log M)^{\epsilon\cdot\epsilon^{\prime}}}\sum_{\begin{subarray}{c}M\leq m\leq 2M\\ m\in\mathcal{R}^{\prime}\end{subarray}}\frac{1}{\phi(m)^{2}},

and the same argument as in the proof of Lemma 7.2 gives

Mm2Mm1ϕ(m)2ϵloglogMM(logM),\displaystyle\sum_{\begin{subarray}{c}M\leq m\leq 2M\\ m\in\mathcal{R}^{\prime}\end{subarray}}\frac{1}{\phi(m)^{2}}\ll_{\epsilon}\frac{\log\log M}{M(\log M)},

so the claim follows. ∎

Combining the estimates for the large and medium greatest common divisors, (5.4) follows, and this was enough to prove Theorem 1.3.

8   Proof of Hypothesis 1.2 under the pair correlation conjecture

In this section, we prove Theorem 1.9 to the effect that PCC implies Hypothesis 1.2. We begin with a few preliminary lemmas. Note that in Hypothesis 1.2 the parameter mm^{\prime} is prime, but that hypothesis will only be used in (8.5) and (8.6) below (and it would suffice to assume that mm^{\prime} does not have abnormally many prime factors), and the lemmas before the proof do not assume mm^{\prime} to be a prime.

8.1   Lemmas on Kummer-type extensions

We first present a standard lemma (see e.g. [15, Proposition 3.10]).

Lemma 8.1.

Let aa and bb be multiplicatively independent integers. We have

(m)2ϕ(m)a,b[(ζm,a1/m,b1/m):](m)2ϕ(m)(m^{\prime})^{2}\phi(m)\ll_{a,b}[\mathbb{Q}(\zeta_{m},a^{1/m^{\prime}},b^{1/m^{\prime}}):\mathbb{Q}]\leq(m^{\prime})^{2}\phi(m)

and

mϕ(m)a,b[(ζm,(bar)1/m):]mϕ(m)m^{\prime}\phi(m)\ll_{a,b}[\mathbb{Q}(\zeta_{m},(ba^{-r})^{1/m^{\prime}}):\mathbb{Q}]\leq m^{\prime}\phi(m)

for all positive integers rr, mm and mm^{\prime}.

In what follows, we denote

K:\displaystyle K: =(ζm,a1/m,b1/m)\displaystyle=\mathbb{Q}(\zeta_{m},a^{1/m^{\prime}},b^{1/m^{\prime}})
Kr:\displaystyle K_{r}: =(ζm,(bar)1/m),\displaystyle=\mathbb{Q}(\zeta_{m},(ba^{-r})^{1/m^{\prime}}),

where mmm^{\prime}\mid m. We bound the number of conjugacy classes in Gal(Kr/)\textup{Gal}(K_{r}/\mathbb{Q}).

Lemma 8.2.

For each rr, the number of conjugacy classes of Gal(K/)\textup{Gal}(K/\mathbb{Q}) which contain elements fixing KrK_{r} is at most τ(m)\tau(m^{\prime}). Furthermore, the size of each such conjugacy class is at most mm^{\prime}.

Proof.

To each element σG:=Gal(K/)\sigma\in G:=\textup{Gal}(K/\mathbb{Q}), we associate a triple (X,Y,Z)(/m)××(/m)2(X,Y,Z)\in(\mathbb{Z}/m\mathbb{Z})^{\times}\times(\mathbb{Z}/m^{\prime}\mathbb{Z})^{2} such that σ(ζm)=ζmX\sigma(\zeta_{m})=\zeta_{m}^{X}, σ(a1/m)=ζmYa1/m\sigma(a^{1/m^{\prime}})=\zeta_{m^{\prime}}^{Y}a^{1/m^{\prime}} and σ(b1/m)=ζmZb1/m\sigma(b^{1/m^{\prime}})=\zeta_{m^{\prime}}^{Z}b^{1/m^{\prime}}. We write σ(X,Y,Z)\sigma\sim(X,Y,Z). Consider the set of such tuples with a binary operation \ast given by

(X,Y,Z)(A,B,C)=(AX,Y+BX,Z+CX).(X,Y,Z)\ast(A,B,C)=(AX,Y+BX,Z+CX).

This operation is clearly associative. The identity of \ast is (1,0,0)(1,0,0). The inverse of (X,Y,Z)(X,Y,Z) under \ast is (X1,YX1,ZX1)(X^{-1},-YX^{-1},-ZX^{-1}), where x1x^{-1} is the inverse of xx modulo mm. Thus the set of these triples with operation \ast forms a group. The conjugacy class containing (A,B,C)(A,B,C) is given by the set of elements of the form

(X,Y,Z)(A,B,C)(X1,YX1,ZX1)=(A,BX(A1)Y,CX(A1)Z),(X,Y,Z)\ast(A,B,C)\ast(X^{-1},-YX^{-1},-ZX^{-1})=(A,BX-(A-1)Y,CX-(A-1)Z),

where (X,Y,Z)Gal(K/)(X,Y,Z)\in\textup{Gal}(K/\mathbb{Q}) varies.

Let σ(A,B,C)\sigma\sim(A,B,C) be an element of Gal(K/)\textup{Gal}(K/\mathbb{Q}) fixing KrK_{r}. Now A=1A=1 and CrB(modm)C\equiv rB\pmod{m^{\prime}}. Consider for variable (X,Y,Z)Gal(K/)(X,Y,Z)\in\textup{Gal}(K/\mathbb{Q}) the conjugation

(X,Y,Z)(1,B,rB)(X1,YX1,ZX1)=(1,XB,rXB).(X,Y,Z)\ast(1,B,rB)\ast(X^{-1},-YX^{-1},-ZX^{-1})=(1,XB,rXB).

Since X(/m)×X\in(\mathbb{Z}/m\mathbb{Z})^{\times} varies, we see that there are at most τ(m)\tau(m^{\prime}) conjugacy classes corresponding to elements fixing KrK_{r}, one for each integer dmd\mid m such that there exists a map σ(1,B,rB)\sigma\sim(1,B,rB) satisfying (m,B)=d(m,B)=d. The sum of the sizes of these conjugacy classes is at most mm^{\prime}, so each single class is of size at most mm^{\prime}. ∎

Note that the proof of Lemma 8.2 gives that if some element of a conjugacy class of Gal(K/)\textup{Gal}(K/\mathbb{Q}) fixes KrK_{r}, then all of the elements of it do.

We also need a bound for the number of conjugacy classes in Gal(K/)\textup{Gal}(K/\mathbb{Q}).

Lemma 8.3.

The number of conjugacy classes in Gal(K/)\textup{Gal}(K/\mathbb{Q}) is a,bmτ(m)2\ll_{a,b}m\tau(m^{\prime})^{2}.

Proof.

Recall the notation of the proof of Lemma 8.2. Let HH be the set of all triplets of the form (X,Y,Z)(/m)××(/m)2(X,Y,Z)\in(\mathbb{Z}/m\mathbb{Z})^{\times}\times(\mathbb{Z}/m^{\prime}\mathbb{Z})^{2} equipped with the operation \ast, so G:=Gal(K/)G:=\textup{Gal}(K/\mathbb{Q}) is a subgroup of HH. By Lemma 8.1 the index of GG in HH is bounded. This implies by [9] that the number of conjugacy classes of GG is at most a constant times that of HH. We therefore consider only the group HH from now on. In what follows, we identify the elements of /k\mathbb{Z}/k\mathbb{Z} with integers in {1,2,,k}\{1,2,\ldots,k\}.

For each A(/m)×A\in(\mathbb{Z}/m\mathbb{Z})^{\times} let g=(A1,m)g=(A-1,m) and let SAS_{A} be the set of elements (a,b,c)H(a,b,c)\in H with a=A,bga=A,b\mid g and 1cg1\leq c\leq g. Let SS be the union of SAS_{A} over all AA. We claim that each element in HH is conjugate to at least one element in SS.

Let (A,B,C)H(A,B,C)\in H be given. As in the proof of Lemma 8.2, the conjugacy class of (A,B,C)(A,B,C) consists of elements of the form

(A,BX(A1)Y,CX(A1)Z).(A,BX-(A-1)Y,CX-(A-1)Z).

There exists an element X(/m)×X\in(\mathbb{Z}/m\mathbb{Z})^{\times} such that BX(modg)BX\pmod{g} divides gg. Fix such an element XX. We may now choose YY and ZZ such that BX(A1)Y(modm)BX-(A-1)Y\pmod{m^{\prime}} divides gg and CX(A1)Z(modm)CX-(A-1)Z\pmod{m^{\prime}} belongs to {1,2,,g}\{1,2,\ldots,g\}. For this choice of X,YX,Y and ZZ we have (A,BX(A1)Y,CX(A1)Z)SA(A,BX-(A-1)Y,CX-(A-1)Z)\in S_{A}, proving the claim.

The result follows by

|S|=A(/m)×|SA|A=1mτ((A1,m))(A1,m)=gmmmϕ(m/g)τ(g)gmτ(m)2.|S|=\sum_{A\in(\mathbb{Z}/m\mathbb{Z})^{\times}}|S_{A}|\leq\sum_{A=1}^{m}\tau((A-1,m^{\prime}))(A-1,m^{\prime})=\sum_{g\mid m^{\prime}}\frac{m}{m^{\prime}}\phi(m^{\prime}/g)\tau(g)g\leq m\tau(m^{\prime})^{2}.

Lemma 8.4.

Let Cr,1,,Cr,mrC_{r,1},\ldots,C_{r,m_{r}} be the conjugacy classes of Gal(K/)\textup{Gal}(K/\mathbb{Q}) which fix KrK_{r} but which do not fix any of the fields (ζq,a1/q)K\mathbb{Q}(\zeta_{q},a^{1/q})\subset K, where qq ranges over the prime divisors of mm^{\prime}. Then the Cr,iC_{r,i} are pairwise disjoint.

Proof.

Assume that σCr1,i1Cr2,i2\sigma\in C_{r_{1},i_{1}}\cap C_{r_{2},i_{2}} for some 1r1<r2m1\leq r_{1}<r_{2}\leq m^{\prime}. Thus, σ\sigma fixes both

Kr1=(ζm,(bar1)1/m)K_{r_{1}}=\mathbb{Q}(\zeta_{m},(ba^{-r_{1}})^{1/m^{\prime}})

and

Kr2=(ζm,(bar2)1/m).K_{r_{2}}=\mathbb{Q}(\zeta_{m},(ba^{-r_{2}})^{1/m^{\prime}}).

In particular, σ\sigma fixes a(r1r2)/ma^{(r_{1}-r_{2})/m^{\prime}}. Let g=(r1r2,m)g=(r_{1}-r_{2},m^{\prime}) and let ee\in\mathbb{Z} be such that e(r1r2)g(modm)e(r_{1}-r_{2})\equiv g\pmod{m^{\prime}}. We see that σ\sigma fixes

(a(r1r2)/m)e=qag/m,(a^{(r_{1}-r_{2})/m^{\prime}})^{e}=q\cdot a^{g/m^{\prime}},

where q0q\neq 0 is some rational number, and thus σ\sigma fixes ag/ma^{g/m^{\prime}}.

Since σ\sigma fixes both am/m=1a^{m^{\prime}/m^{\prime}}=1 and ag/ma^{g/m^{\prime}}, we see similarly as above that σ\sigma also fixes a(m,g)/ma^{(m^{\prime},g)/m^{\prime}}. Noting that (m,g)(m^{\prime},g) is a proper divisor of mm^{\prime}, we deduce that σ\sigma fixes a1/qa^{1/q} for some prime qmq\mid m^{\prime}. Finally, note that σ\sigma fixes (ζq)(ζm)Kr1\mathbb{Q}(\zeta_{q})\subset\mathbb{Q}(\zeta_{m^{\prime}})\subset K_{r_{1}}. Thus, σ\sigma fixes (ζq,a1/q)\mathbb{Q}(\zeta_{q},a^{1/q}), which is a contradiction with the definition of Cr,iC_{r,i}. ∎

8.2   A Chebotarev-type estimate in the mean square

Having established these preliminary lemmas we proceed with deriving the implication of PCC needed. To begin we need a further conjecture, the Artin conjecture (AC) on LL-functions.

AC: The the Artin LL-functions associated with a field extension K/K/\mathbb{Q} are analytic in the whole complex plane.

Although Artin’s conjecture remains open in general, we will see in a moment that it can be proved for fields KK that occur in our setting, so it is not a restricting assumption.

Lemma 8.5 (Murty–Murty–Wong).

Let K/K/\mathbb{Q} be a field extension whose Artin LL-functions satisfy GRH, AC, and PCC. For a conjugacy class CC of G=Gal(K/)G=\textup{Gal}(K/\mathbb{Q}), define the Chebychev function

ψC(x):=pjx(K/p)=Clogp.\displaystyle\psi_{C}(x):=\sum_{\begin{subarray}{c}p^{j}\leq x\\ (\frac{K/\mathbb{Q}}{p})=C\end{subarray}}\log p.

Then we have

(8.1) C1|C|(ψC(x)|C||G|x)2x(logx)log2([K:]xpP(K/)p)|Irr(G)||G|,\displaystyle\sum_{C}\frac{1}{|C|}\left(\psi_{C}(x)-\frac{|C|}{|G|}x\right)^{2}\ll x(\log x)\log^{2}\left([K:\mathbb{Q}]x\prod_{p\in P(K/\mathbb{Q})}p\right)\frac{|\textnormal{Irr}(G)|}{|G|},

where P(K/)P(K/\mathbb{Q}) is the set of rational primes that are ramified in KK and Irr(G)\textnormal{Irr}(G) is the set of irreducible representations of GG.

Proof.

This is [21, Theorem 5.1] with the choices mχ=χ(1)m_{\chi}=\chi(1), cχ=χ(1)1c_{\chi}=\chi(1)^{-1}, r=1r=1 and k=k=\mathbb{Q} that correspond to our assumptions. ∎

Although Artin’s conjecture remains unsolved for general LL-functions, the next lemma says that AC holds for all Kummer-type extensions.

Lemma 8.6.

Hypothesis AC holds whenever K/K/\mathbb{Q} is a Kummer-type extension.

Proof.

It is known (see [34]) that Artin’s conjecture holds for the field extension K/K/\mathbb{Q} whenever its Galois group G:=Gal(K/)G:=\textup{Gal}(K/\mathbb{Q}) is metabelian, meaning that it has a normal subgroup AA such that both AA and G/AG/A are Abelian. Thus, for any given Kummer-type extension field K=(ζm,a1m1,,akmk)K=\mathbb{Q}(\zeta_{m},a_{1}^{m_{1}},\ldots,a_{k}^{m_{k}}), it suffices to find a subfield K0K_{0} such that K/K0K/K_{0} and K0/K_{0}/\mathbb{Q} are both Abelian extensions and Gal(K0/)\textup{Gal}(K_{0}/\mathbb{Q}) is normal in Gal(K/)\textup{Gal}(K/\mathbb{Q}). We choose K0=(ζm)K_{0}=\mathbb{Q}(\zeta_{m}). Then Gal(K0/)\textup{Gal}(K_{0}/\mathbb{Q}) is cyclic, so certainly Abelian. To show that K/K0K/K_{0} is also Abelian, it suffices to show that each Li:=(ζm,aimi)/(ζm)L_{i}:=\mathbb{Q}(\zeta_{m},a_{i}^{m_{i}})/\mathbb{Q}(\zeta_{m}) is Abelian, since the compositum of Abelian extensions is also Abelian. Again, LiL_{i} is a cyclic extension, and therefore Abelian. What remains to be shown then is that A:=Gal(K0/)A:=\textup{Gal}(K_{0}/\mathbb{Q}) is a normal subgroup of G=Gal(K/)G=\textup{Gal}(K/\mathbb{Q}). As in the proof of Lemma 8.2, for each σG\sigma\in G we write

σ(e,e1,,en)(/m)××in(/mi),\displaystyle\sigma\sim(e,e_{1},\ldots,e_{n})\in(\mathbb{Z}/m\mathbb{Z})^{\times}\times\prod_{i\leq n}(\mathbb{Z}/m_{i}\mathbb{Z}),

where this tuple is uniquely determined by the conditions σ(ζm)=ζme\sigma(\zeta_{m})=\zeta_{m}^{e} and σ(ai1/mi)=ζmeiai1/mi\sigma(a_{i}^{1/m_{i}})=\zeta_{m}^{e_{i}}a_{i}^{1/m_{i}} for all iki\leq k. The relation \sim is an equivalence relation on GG. If σA\sigma^{\prime}\in A is arbitrary, we may write σ(a,0,,0)\sigma^{\prime}\sim(a,0,\ldots,0) for some a(/m)×a\in(\mathbb{Z}/m\mathbb{Z})^{\times}. Now σ1(e1,e1e1,,ene1)\sigma^{-1}\sim(e^{-1},-e_{1}e^{-1},\ldots,-e_{n}e^{-1}), so σσσ1=(a,0,,0)=σ\sigma\sigma^{\prime}\sigma^{-1}=(a,0,\ldots,0)=\sigma^{\prime}. Since σG\sigma\in G, σA\sigma^{\prime}\in A are arbitrary, the claim follows. ∎

Proposition 8.7.

Assume GRH and PCC. For K=(ζm,a1/m,b1/m)K=\mathbb{Q}(\zeta_{m},a^{1/m^{\prime}},b^{1/m^{\prime}}), we have

C1|C|(πC(x)|C||G|Li(x))2a,b,hx(logx)3τ(m)2mϕ(m).\displaystyle\sum_{C}\frac{1}{|C|}\left(\pi_{C}(x)-\frac{|C|}{|G|}\textup{Li}(x)\right)^{2}\ll_{a,b,h}x(\log x)^{3}\frac{\tau(m^{\prime})^{2}}{m^{\prime}\phi(m)}.
Proof.

It is well known that |Irr(G)||\textnormal{Irr(G)}| equals to the number of conjugacy classes in GG, which by Lemma 8.3 is mτ(m)2\ll m\tau(m^{\prime})^{2}. Further, we have |G|=[K:](m)2ϕ(m)|G|=[K:\mathbb{Q}]\gg(m^{\prime})^{2}\phi(m) by Lemma 8.1. Lastly, note that if pP(K/)p\in P(K/\mathbb{Q}), then pdisc(K:)p\mid\textnormal{disc}(K:\mathbb{Q}) and that by [4, page 490] we have

disc(K:)(disc((ζm):)disc((a1/m):)disc((b1/m):))R\displaystyle\textnormal{disc}(K:\mathbb{Q})\mid(\textnormal{disc}(\mathbb{Q}(\zeta_{m}):\mathbb{Q})\cdot\textnormal{disc}(\mathbb{Q}(a^{1/m}):\mathbb{Q})\cdot\textnormal{disc}(\mathbb{Q}(b^{1/m}):\mathbb{Q}))^{R}

for some RR. A computation of the discriminants shows that

p\displaystyle p disc((ζm):)pm\displaystyle\mid\textnormal{disc}(\mathbb{Q}(\zeta_{m}):\mathbb{Q})\Longrightarrow p\mid m
p\displaystyle p disc((a1/m):)pa\displaystyle\mid\textnormal{disc}(\mathbb{Q}(a^{1/m}):\mathbb{Q})\Longrightarrow p\mid a
p\displaystyle p disc((b1/m):)pb,\displaystyle\mid\textnormal{disc}(\mathbb{Q}(b^{1/m}):\mathbb{Q})\Longrightarrow p\mid b,

so

pP(K/)pa,bmx.\displaystyle\prod_{p\in P(K/\mathbb{Q})}p\ll_{a,b}m\leq x.

In conclusion, we have the bound

(8.2) C1|C|(ψC(x)|C||G|x)2x(logx)3τ(m)2mϕ(m).\displaystyle\sum_{C}\frac{1}{|C|}\left(\psi_{C}(x)-\frac{|C|}{|G|}x\right)^{2}\ll x(\log x)^{3}\frac{\tau(m^{\prime})^{2}}{m^{\prime}\phi(m)}.

To pass from ψC(x)\psi_{C}(x) to πC(x)\pi_{C}(x), we use partial summation. Note that

ψC(x)=pjx(K/p)=Clogp=px(K/p)=Clogp+O(x),\displaystyle\psi_{C}(x)=\sum_{\begin{subarray}{c}p^{j}\leq x\\ (\frac{K/\mathbb{Q}}{p})=C\end{subarray}}\log p=\sum_{\begin{subarray}{c}p\leq x\\ (\frac{K/\mathbb{Q}}{p})=C\end{subarray}}\log p+O(\sqrt{x}),

so by partial summation

πC(x)=ψC(x)logx+2xψC(t)t(logt)2𝑑t+O(x).\displaystyle\pi_{C}(x)=\frac{\psi_{C}(x)}{\log x}+\int_{2}^{x}\frac{\psi_{C}(t)}{t(\log t)^{2}}\,dt+O(\sqrt{x}).

This gives

πC(x)|C||G|Li(x)=ψC(x)|C||G|xlogx+2xψC(t)|C||G|Li(t)t(logt)2𝑑t+O(x),\displaystyle\pi_{C}(x)-\frac{|C|}{|G|}\textup{Li}(x)=\frac{\psi_{C}(x)-\frac{|C|}{|G|}x}{\log x}+\int_{2}^{x}\frac{\psi_{C}(t)-\frac{|C|}{|G|}\textup{Li}(t)}{t(\log t)^{2}}\,dt+O(\sqrt{x}),

and further

(8.3) (πC(x)|C||G|Li(x))2(ψC(x)|C||G|xlogx)2+(2xψC(t)|C||G|Li(t)t(logt)2𝑑t)2+x.\displaystyle\left(\pi_{C}(x)-\frac{|C|}{|G|}\textup{Li}(x)\right)^{2}\ll\left(\frac{\psi_{C}(x)-\frac{|C|}{|G|}x}{\log x}\right)^{2}+\left(\int_{2}^{x}\frac{\psi_{C}(t)-\frac{|C|}{|G|}\textup{Li}(t)}{t(\log t)^{2}}\,dt\right)^{2}+x.

By Cauchy–Schwarz, we can bound

(8.4) (2xψC(t)|C||G|Li(t)t(logt)2𝑑t)2(2xdtt(logt)2)(2x(ψC(t)|C||G|Li(t))2t(logt)2𝑑t)2x(ψC(t)|C||G|Li(t))2t(logt)2𝑑t.\displaystyle\begin{split}\left(\int_{2}^{x}\frac{\psi_{C}(t)-\frac{|C|}{|G|}\textup{Li}(t)}{t(\log t)^{2}}\,dt\right)^{2}&\leq\left(\int_{2}^{x}\frac{dt}{t(\log t)^{2}}\right)\left(\int_{2}^{x}\frac{(\psi_{C}(t)-\frac{|C|}{|G|}\textup{Li}(t))^{2}}{t(\log t)^{2}}\,dt\right)\\ &\ll\int_{2}^{x}\frac{(\psi_{C}(t)-\frac{|C|}{|G|}\textup{Li}(t))^{2}}{t(\log t)^{2}}\,dt.\end{split}

Now, multiplying (8.3) by 1/|C|1/|C|, summing over CC and using (8.4), (8.2), we get

C1|C|(πC(x)|C||G|Li(x))2a,bx(logx)3τ(m)2mϕ(m).\displaystyle\sum_{C}\frac{1}{|C|}\left(\pi_{C}(x)-\frac{|C|}{|G|}\textup{Li}(x)\right)^{2}\ll_{a,b}x(\log x)^{3}\frac{\tau(m^{\prime})^{2}}{m^{\prime}\phi(m)}.

as desired. ∎

We are now ready to prove Hypothesis 1.2 under PCC.

Proof of Theorem 1.9 assuming PCC.

We may assume that we are in the regime ym10y\leq m^{10}, say, since otherwise the GRH-conditional estimate in (1.3) is good enough.

Recall that Cr,1,,Cr,mrC_{r,1},\ldots,C_{r,m_{r}} are the conjugacy classes of Gal(K/)\textup{Gal}(K/\mathbb{Q}) which fix KrK_{r} but which do not fix any of the fields (ζq,a1/q)K\mathbb{Q}(\zeta_{q},a^{1/q})\subset K, where qq ranges over the prime divisors of mm^{\prime}. By Lemma 8.2, we have |Cr,i|m|C_{r,i}|\leq m^{\prime} and mrτ(m)m_{r}\leq\tau(m^{\prime}) for all rr and ii. Let πCr,i(x)\pi_{C_{r,i}}(x) denote the number of primes pxp\leq x such that the Artin symbol of pp with respect to KK is Cr,iC_{r,i}. We now have

r=0m1|{py:psplits completely in(ζm,(bar)1/m)}|2\displaystyle\sum_{r=0}^{m^{\prime}-1}|\{p\leq y:p\,\,\textnormal{splits completely in}\,\,\mathbb{Q}(\zeta_{m},(ba^{-r})^{1/m^{\prime}})\}|^{2}
=r(modm)(i=1mrπCr,i(y))2\displaystyle=\sum_{r\pmod{m^{\prime}}}\left(\sum_{i=1}^{m_{r}}\pi_{C_{r,i}}(y)\right)^{2}
r(modm)(i=1mr(πCr,i(y)Li(y)|Cr,i|[K:]))2+r(modm)(Li(y)|Cr,i|[K:])2.\displaystyle\ll\sum_{r\pmod{m^{\prime}}}\left(\sum_{i=1}^{m_{r}}\left(\pi_{C_{r,i}}(y)-\textup{Li}(y)\frac{|C_{r,i}|}{[K:\mathbb{Q}]}\right)\right)^{2}+\sum_{r\pmod{m^{\prime}}}\left(\textup{Li}(y)\frac{|C_{r,i}|}{[K:\mathbb{Q}]}\right)^{2}.

By Lemmas 8.1 and 8.2, and the assumption that mm^{\prime} is prime, the latter sum is

(8.5) τ(m)2y2mϕ(m)2(logy)2ϵy2ϕ(m)2(logy)2m.\displaystyle\ll\frac{\tau(m^{\prime})^{2}y^{2}}{m^{\prime}\phi(m)^{2}(\log y)^{2}}\ll_{\epsilon}\frac{y^{2}}{\phi(m)^{2}(\log y)^{2}m^{\prime}}.

When it comes to the first sum, we estimate using Cauchy–Schwarz that

r(modm)(i=1mr(πCr,i(y)Li(y)|Cr,i|[K:]))2\displaystyle\sum_{r\pmod{m^{\prime}}}\left(\sum_{i=1}^{m_{r}}\left(\pi_{C_{r,i}}(y)-\textup{Li}(y)\frac{|C_{r,i}|}{[K:\mathbb{Q}]}\right)\right)^{2}
r(modm)mri=1mr(πCr,i(y)Li(y)|Cr,i|[K:])2\displaystyle\ll\sum_{r\pmod{m^{\prime}}}m_{r}\sum_{i=1}^{m_{r}}\Big{(}\pi_{C_{r,i}}(y)-\textup{Li}(y)\frac{|C_{r,i}|}{[K:\mathbb{Q}]}\Big{)}^{2}
τ(m)r(modm)i=1mr(πCr,i(y)Li(y)|Cr,i|[K:])2\displaystyle\ll\tau(m^{\prime})\sum_{r\pmod{m^{\prime}}}\sum_{i=1}^{m_{r}}\Big{(}\pi_{C_{r,i}}(y)-\textup{Li}(y)\frac{|C_{r,i}|}{[K:\mathbb{Q}]}\Big{)}^{2}
τ(m)mr(modm)i=1mr1|Cr,i|(πCr,i(y)Li(y)|Cr,i|[K:])2.\displaystyle\ll\tau(m^{\prime})m^{\prime}\sum_{r\pmod{m^{\prime}}}\sum_{i=1}^{m_{r}}\frac{1}{|C_{r,i}|}\Big{(}\pi_{C_{r,i}}(y)-\textup{Li}(y)\frac{|C_{r,i}|}{[K:\mathbb{Q}]}\Big{)}^{2}.

Note then that the conjugacy classes are disjoint by Lemma 8.4. Hence, Proposition 8.7 produces for the previous expression a bound of

(8.6) τ(m)3ϕ(m)y(logy)3y2ϕ(m)2(logy)2m(logy)5yy2ϕ(m)2(logy)2max{1(logy)2,1m},\displaystyle\begin{split}&\ll\frac{\tau(m^{\prime})^{3}}{\phi(m)}y(\log y)^{3}\ll\frac{y^{2}}{\phi(m)^{2}(\log y)^{2}}\cdot\frac{m(\log y)^{5}}{y}\\ &\ll\frac{y^{2}}{\phi(m)^{2}(\log y)^{2}}\max\left\{\frac{1}{(\log y)^{2}},\frac{1}{m^{\prime}}\right\},\end{split}

since ymexp((logm)1/2)y\geq m\exp((\log m)^{1/2}), ϕ(m)m/loglogm\phi(m)\gg m/\log\log m, τ(m)=2\tau(m^{\prime})=2 and mmϵm^{\prime}\leq m^{\epsilon}. Combining (8.5) and (8.6), the claim follows. ∎

9   Almost prime values of shifted exponentials

In this section, we prove Theorem 1.6. The proof follows along the same lines as the proof of Theorem 1.3.

Proof of Theorem 1.6.

Firstly, we deal with the simple cases |b|1|b|\leq 1. The case b=0b=0 is trivial. For the cases b=±1b=\pm 1, we observe that by the Hardy–Ramanujan theorem almost all nxn\leq x have (1+o(1))loglogx(1+o(1))\log\log x prime divisors, and if pnp\mid n is odd, then ap±1an±1a^{p}\pm 1\mid a^{n}\pm 1 for either choice of the sign ±\pm. To conclude, we apply a theorem of Zsigmondy [27], according to which each term in the sequence an1a^{n}-1 for n7n\geq 7 (and also for n6n\leq 6 with a few exceptions) has a primitive divisor, that is, a prime factor qan1q\mid a^{n}-1 that does not divide am1a^{m}-1 for any 1m<n1\leq m<n.

Now let |b|>1|b|>1. Let the sets ApA_{p} be defined as in Section 5. In the course of proving Theorem 1.3, we in fact proved more strongly that

Pr({nx:|pSx1Ap(n)μ|εμ})=oε((loglogx)2),\displaystyle\Pr\left(\left\{n\leq x:\Big{|}\sum_{p\in S_{\leq\sqrt{x}}}1_{A_{p}}(n)-\mu\big{|}\geq\varepsilon\mu\right\}\right)=o_{\varepsilon}((\log\log x)^{2}),

where

μ:=pSxPr(Ap)=(c+o(1))loglogx\displaystyle\mu:=\sum_{p\in S_{\leq\sqrt{x}}}\Pr(A_{p})=(c+o(1))\log\log x

for some constant c=ca,b>0c=c_{a,b}>0. Indeed, this follows from Lemma 3.1 and formulas (5.1) and (5.2). The claim is now proved. ∎

10   The case of prime exponents

We turn to the proof of Theorem 1.7. Most steps go through similarly as in the case of Theorem 1.3, but there are a few extra complications, namely the set SS from Section 4 needs to be defined differently, using the “WW-trick”, and therefore we will also need Lemma 10.1 below.

Proof of Theorem 1.7.

Let hah_{a} and hbh_{b} be the largest integers such that aa (respectively bb) is a perfect power888Note that here we can no longer assume that aa is not a perfect power. of order hah_{a} (respectively hbh_{b}). Define h=hahbh=h_{a}h_{b}. Let W=pw,p2abhpW=\prod_{p\leq w,p\nmid 2abh}p, where ww is a large parameter. We redefine the set SS in Section 4 as

S:\displaystyle S: ={p:p1(mod|4abh|),p2(modW),ordp(a)=ordp(b)=p12h},\displaystyle=\{p\in\mathbb{P}:\,\,p\equiv 1\pmod{|4abh|},\,\,p\equiv 2\pmod{W},\,\,\textup{ord}_{p}(a)=\textup{ord}_{p}(b)=\frac{p-1}{2h}\},

As before, define Sm,d,rS_{m,d,r} by (4.1). We observe then that for pSp\in S we have ((p),ordp(a))=1(\ell(p),\textup{ord}_{p}(a))=1. Indeed, if this did not hold, we would have

ordp(a(p))ordp(a)2=ordp(b)2<ordp(b),\displaystyle\textup{ord}_{p}(a^{\ell(p)})\leq\frac{\textup{ord}_{p}(a)}{2}=\frac{\textup{ord}_{p}(b)}{2}<\textup{ord}_{p}(b),

which is absurd as a(p)b(modp)a^{\ell(p)}\equiv b\pmod{p}.

We begin by proving that d(S)d_{\mathbb{P}}(S) exists and is positive, and furthermore that d(Sm,d,r)d_{\mathbb{P}}(S_{m,d,r}) is independent of rr as long as rr is coprime to mm. We remark that Lenstra’s results in [18] do not immediately apply to our situation. In [18], the results concern the order of the reduction of a single multiplicative subgroup of ×\mathbb{Q}^{\times} modulo primes, while in our case we wish to control simultaneously the order of both aa and bb. However, Lenstra’s methods may be adapted to this more general setting of two multiplicative subgroups, the details of the proofs being given in [15]. We thus have analogues of Lenstra’s results in this setting.

Similarly as in Section 4, define

Fm,d,r:=(ζ|4abh|,ζW,a1/2h,b1/2h,(bar)1/2hm),F_{m,d,r}:=\mathbb{Q}(\zeta_{|4abh|},\zeta_{W},a^{1/2h},b^{1/2h},(ba^{-r})^{1/2hm}),
L,a:=(ζq(),a1/q())L_{\ell,a}:=\mathbb{Q}(\zeta_{q(\ell)},a^{1/q(\ell)})

and

L,b:=(ζq(),b1/q()),L_{\ell,b}:=\mathbb{Q}(\zeta_{q(\ell)},b^{1/q(\ell)}),

where q()q(\ell) is the smallest power of \ell not dividing 2h2h, and let Ln,aL_{n,a} (respectively Ln,bL_{n,b}) denote the compositum of L,aL_{\ell,a} (respectively L,bL_{\ell,b}) over the primes n\ell\mid n. Now pSp\in S if and only if the Artin symbol of pp with respect to F1,0,0F_{1,0,0} belongs to a suitable conjugacy class CC of Gal(F1,0,0/)\textup{Gal}(F_{1,0,0}/\mathbb{Q}) (namely the one fixing

F1,1,0:=(ζ|4abh|,a1/2h,b1/2h)F_{1,1,0}^{\prime}:=\mathbb{Q}(\zeta_{|4abh|},a^{1/2h},b^{1/2h})

and mapping ζWζW2\zeta_{W}\to\zeta_{W}^{2}) and pp does not split completely in any of L,aL_{\ell,a} and L,bL_{\ell,b} for any prime \ell. If ana_{n} is defined as the local density of primes pp satisfying this condition for all n\ell\mid n, by the analogue of Lenstra’s result (4.3) it suffices to show that the local densities an0a_{n}\neq 0 for all nn in order to have d(S)0d_{\mathbb{P}}(S)\neq 0.

Proceeding as in Section 4, one has to check two conditions. The first condition is a10a_{1}\neq 0, i.e. that CC\neq\emptyset. To do this one notes that the largest Abelian subfield of F1,1,0F_{1,1,0}^{\prime} does not intersect (ζW)\mathbb{Q}(\zeta_{W}) nontrivially, and thus [F1,1,0(ζW):F1,1,0]=ϕ(W)[F_{1,1,0}^{\prime}(\zeta_{W}):F_{1,1,0}^{\prime}]=\phi(W), implying that

(10.1) Gal(F1,1,0/)Gal(F1,1,0/)×Gal((ζW)/).\displaystyle\textup{Gal}(F_{1,1,0}/\mathbb{Q})\simeq\textup{Gal}(F_{1,1,0}^{\prime}/\mathbb{Q})\times\textup{Gal}(\mathbb{Q}(\zeta_{W})/\mathbb{Q}).

It follows that |C|=1|C|=1.

The second condition to check is that for any prime \ell and squarefree nn we have

[K(a1/q(),b1/q()):K]=2,[K^{\prime}(a^{1/q(\ell)},b^{1/q(\ell)}):K^{\prime}]=\ell^{2},

where

K:=(ζ|4abh|,ζW,a1/2h,b1/2h,a1/q(n/),b1/q(n/)).K^{\prime}:=\mathbb{Q}(\zeta_{|4abh|},\zeta_{W},a^{1/2h},b^{1/2h},a^{1/q(n/\ell)},b^{1/q(n/\ell)}).

We do this by showing, more generally, that for any positive integers V,sV,s and tt satisfying 4abh,s,tV4abh,s,t\mid V and 2hs,t2h\mid s,t we have

(10.2) [(ζV,a1/s,b1/t):]=ϕ(V)s2hat2hb\displaystyle[\mathbb{Q}(\zeta_{V},a^{1/s},b^{1/t}):\mathbb{Q}]=\phi(V)\frac{s}{2h_{a}}\frac{t}{2h_{b}}

(cf. equation (4.5)). The upper bound implicit in the result is easy to establish. For the lower bound, apply the result in [10] to the numbers of the form ae/sbf/t,0e<s/2ha,0f<t/2hba^{e/s}b^{f/t},0\leq e<s/2h_{a},0\leq f<t/2h_{b} over (ζV)\mathbb{Q}(\zeta_{V}). We want to check that

ae/sbf/t(ζV)a^{e/s}b^{f/t}\in\mathbb{Q}(\zeta_{V})

for such ee and ff implies that e=f=0e=f=0. Similarly to the proof of (4.5), we obtain the divisibility relations

s2hae,t2hbf,s\mid 2h_{a}e,t\mid 2h_{b}f,

from which the result follows.

Having established d(S)0d_{\mathbb{P}}(S)\neq 0 we now proceed to proving that d(Sm,d,r)d_{\mathbb{P}}(S_{m,d,r}) is independent of rr (for (r,m)=1(r,m)=1). By a variant of Lenstra’s product formula in (4.3) for multiple variables one has

d(Sm,d,r)=ann(1C()[L,aL,b:]),d_{\mathbb{P}}(S_{m,d,r})=a_{n}\prod_{\begin{subarray}{c}\ell\nmid n\\ \ell\in\mathbb{P}\end{subarray}}\left(1-\frac{C(\ell)}{[L_{\ell,a}L_{\ell,b}:\mathbb{Q}]}\right),

where C()C(\ell) is the number of conjugacy classes of Gal(L,aL,b/)\textup{Gal}(L_{\ell,a}L_{\ell,b}/\mathbb{Q}) fixing at least one of L,aL_{\ell,a} and L,bL_{\ell,b}. Since the terms of the infinite product do not depend on rr, it suffices to check that the local densities an=an(r)a_{n}=a_{n}(r) do not depend on rr.

Clearly for any prime W\ell\mid W and nn not divisible by \ell we have an(r)=an/(r)a_{n}(r)=a_{n/\ell}(r), as the condition p2(modW)p\equiv 2\pmod{W} guarantees that pp does not split completely in (ζ)\mathbb{Q}(\zeta_{\ell}) and thus not in L,aL_{\ell,a} or L,bL_{\ell,b}. We may therefore assume that (n,W)=1(n,W)=1.

Similarly as one proves |C|=1|C|=1 via (10.1), one obtains

Gal(Fm,d,rLka,aLkb,b/)=Gal(Fm,d,rLka,aLkb,b/)×Gal((ζW)/)\displaystyle\textup{Gal}(F_{m,d,r}L_{k_{a},a}L_{k_{b},b}/\mathbb{Q})=\textup{Gal}(F_{m,d,r}^{\prime}L_{k_{a},a}L_{k_{b},b}/\mathbb{Q})\times\textup{Gal}(\mathbb{Q}(\zeta_{W})/\mathbb{Q})

for any ka,kbnk_{a},k_{b}\mid n, where

Fm,d,r:=(ζ|4abh|,a1/2h,b1/2h,(bar)1/2hm).F_{m,d,r}^{\prime}:=\mathbb{Q}(\zeta_{|4abh|},a^{1/2h},b^{1/2h},(ba^{-r})^{1/2hm}).

We thus see that there is exactly one element σGal(Fm,d,rLka,aLkb,b/)\sigma\in\textup{Gal}(F_{m,d,r}L_{k_{a},a}L_{k_{b},b}/\mathbb{Q}) which fixes Lka,aLkb,bL_{k_{a},a}L_{k_{b},b} and whose restriction to Fm,d,rF_{m,d,r} belongs to CC. This leads to the following analogue of (4.4):

an(r)=1[Fm,d,rLn,aLn,b:]kankbnμ(ka)μ(kb)[Fm,d,rLka,aLkb,b:].a_{n}(r)=\frac{1}{[F_{m,d,r}L_{n,a}L_{n,b}:\mathbb{Q}]}\sum_{k_{a}\mid n}\sum_{k_{b}\mid n}\mu(k_{a})\mu(k_{b})[F_{m,d,r}L_{k_{a},a}L_{k_{b},b}:\mathbb{Q}].

To conclude the proof one proves an analogue of (4.5) and (10.2). This time one claims that

[(ζV,a1/s,b1/t,(bar)1/u):]=ϕ(V)s2hat2hbu(s,t,u),[\mathbb{Q}(\zeta_{V},a^{1/s},b^{1/t},(ba^{-r})^{1/u}):\mathbb{Q}]=\phi(V)\frac{s}{2h_{a}}\frac{t}{2h_{b}}\frac{u}{(s,t,u)},

when 4abh,s,t,uV,2has,2hbt,2hahbu4abh,s,t,u\mid V,2h_{a}\mid s,2h_{b}\mid t,2h_{a}h_{b}\mid u and (r,u)=1(r,u)=1. As before, the implied upper bound is easy. For the lower bound, we once again apply [10] and consider the numbers of the form

ae/sbf/t(bar)g/u,a^{e/s}b^{f/t}(ba^{-r})^{g/u},

where 0e<s/2ha,0f<t/2hb0\leq e<s/2h_{a},0\leq f<t/2h_{b} and 0g<u/(s,t,u)0\leq g<u/(s,t,u). Assume that such a number belongs to a cyclotomic field. Define

C:=(ae/sbf/t(bar)g/u)stu=aeturgstbsfu+stg.C:=\left(a^{e/s}b^{f/t}(ba^{-r})^{g/u}\right)^{stu}=a^{etu-rgst}b^{sfu+stg}.

The integer CC must be a perfect power of order stu/2stu/2, and so

(10.3) stu2ha(eturgst)\displaystyle stu\mid 2h_{a}(etu-rgst)

and

(10.4) stu2hb(sfu+stg).\displaystyle stu\mid 2h_{b}(sfu+stg).

The rest is elementary. As 2has2h_{a}\mid s, the first condition gives us

2hatu2hargst,2h_{a}tu\mid 2h_{a}rgst,

and similarly from the second condition

2hbsu2hbstg.2h_{b}su\mid 2h_{b}stg.

We now have, using (r,u)=1(r,u)=1, the relation ug(s,t)u\mid g(s,t) and thus u/(s,t,u)gu/(s,t,u)\mid g. This implies g=0g=0. Plugging this into (10.3) and (10.4) gives s2haes\mid 2h_{a}e and t2hbft\mid 2h_{b}f, and thus e=f=0e=f=0.

Thus all of the results of Section 4 work with this new definition of SS. When it comes to Section 5, we redefine Pr as the uniform probability measure on [1,x][1,x]\cap\mathbb{P}. We also redefine

Ap:={p[1,x]:p(p)(modordp(a)),p(p)}.\displaystyle A_{p}:=\{p^{\prime}\in[1,x]\cap\mathbb{P}:\,\,p^{\prime}\equiv\ell(p)\pmod{\textup{ord}_{p}(a)},\,p^{\prime}\neq\ell(p)\}.

The second moment method (Lemma 3.1) gives an upper bound

p,qSlogxPr(ApAq)(pSlogxPr(Ap))21\displaystyle\ll\frac{\sum_{p,q\in S_{\leq\log x}}\Pr(A_{p}\cap A_{q})}{\left(\sum_{p\in S_{\leq\log x}}\Pr(A_{p})\right)^{2}}-1

for the probability that apba^{p}-b is prime with pxp\leq x. By the Siegel–Walfisz theorem and the fact that pSp\in S implies ((p),ordp(a))=1(\ell(p),\textup{ord}_{p}(a))=1, we see that the previous expression is

(10.5) =p,qSlogx1/(ϕ(lcm(ordp(a)ordq(a))))+O((logx)100))(pSlogx(1/(ϕ(ordp(a))+O((logx)100))))21.\displaystyle=\frac{\sum_{p,q\in S_{\leq\log x}}1/(\phi(\textup{lcm}(\textup{ord}_{p}(a)\textup{ord}_{q}(a))))+O((\log x)^{-100}))}{\left(\sum_{p\in S_{\leq\log x}}(1/(\phi(\textup{ord}_{p}(a))+O((\log x)^{-100})))\right)^{2}}-1.

The denominator here is

(10.6) (pSlogx1ϕ((p1)/2h))2+O(1)\displaystyle\left(\sum_{p\in S_{\leq\log x}}\frac{1}{\phi((p-1)/2h)}\right)^{2}+O(1)

Since ϕ((p1)/2h)/p\phi((p-1)/2h)/p fluctuates on the primes, we cannot a priori compute this sum just based on the value of d(S)d_{\mathbb{P}}(S). However, using the following lemma we can compute (10.6) in terms of d(S)d_{\mathbb{P}}(S).

Lemma 10.1.

Let x2x\geq 2, kk\in\mathbb{N} w2w\geq 2, and let W=pw,pkpW=\prod_{p\leq w,p\nmid k}p. Then for λ1\lambda\geq 1 and WkxWk\leq\sqrt{x}, kx0.1k\leq x^{0.1} we have

(10.7) |{px:p1(modk),p2(modW),p1ϕ(p1)>λkϕ(k)}|λwπ(x)ϕ(kW).\displaystyle|\{p\leq x:\,\,p\equiv 1\pmod{k},\,\,p\equiv 2\pmod{W},\,\,\frac{p-1}{\phi(p-1)}>\lambda\frac{k}{\phi(k)}\}|\ll\lambda^{-w}\frac{\pi(x)}{\phi(kW)}.

We remark that much stronger tail bounds are known in the case k=W=1k=W=1; see [33] for instance. However, for us the WW-aspect is crucial, since in particular taking λ=1+w0.9\lambda=1+w^{-0.9} we can conclude that all but exp(cw0.1)\ll\exp(-cw^{0.1})-proportion of primes px,p1(modk),p2(modW)p\leq x,p\equiv 1\pmod{k},p\equiv 2\pmod{W} satisfy ϕ(p1)=(1+O(w0.9))ϕ(k)kp\phi(p-1)=(1+O(w^{-0.9}))\frac{\phi(k)}{k}p, say.

Proof.

This is a standard application of the method of moments. For any 1\ell\geq 1, the left-hand side of (10.7) is

(10.8) λpxp1(modk)p2(modW)(p1ϕ(p1)ϕ(k)k).\displaystyle\leq\lambda^{-\ell}\sum_{\begin{subarray}{c}p\leq x\\ p\equiv 1\pmod{k}\\ p\equiv 2\pmod{W}\end{subarray}}\left(\frac{p-1}{\phi(p-1)}\frac{\phi(k)}{k}\right)^{\ell}.

For n1(modk),n2(modW)n\equiv 1\pmod{k},n\equiv 2\pmod{W}, we have

(nϕ(n)ϕ(k)k)=pnpkW(11p):=g(n),\displaystyle\left(\frac{n}{\phi(n)}\frac{\phi(k)}{k}\right)^{\ell}=\prod_{\begin{subarray}{c}p\mid n\\ p\nmid kW\end{subarray}}\left(1-\frac{1}{p}\right)^{-\ell}:=g(n),

where g(n)g(n) is a multiplicative function defined by g(pj)=(11p)g(p^{j})=(1-\frac{1}{p})^{-\ell} for pkWp\nmid kW and g(pj)=1g(p^{j})=1 for pkWp\mid kW. By Möbius inversion, we have g(n)=dnh(d)g(n)=\sum_{d\mid n}h(d), where hh is a multiplicative function defined by h(p)=g(p)1h(p)=g(p)-1 for pkWp\nmid kW and h(pj)=0h(p^{j})=0 when j2j\geq 2 or j=1j=1, pkWp\mid kW.

Now the sum over pp in (10.8) becomes

(10.9) dxh(d)pxp1(modlcm(k,d))p2(modW)1.\displaystyle\sum_{d\leq x}h(d)\sum_{\begin{subarray}{c}p\leq x\\ p\equiv 1\pmod{\textup{lcm}(k,d)}\\ p\equiv 2\pmod{W}\end{subarray}}1.

By the Brun–Titchmarsh inequality, the contribution of the terms 1dx0.41\leq d\leq x^{0.4} to the sum is

π(x)ϕ(W)dx0.4h(d)ϕ(lcm(k,d))\displaystyle\ll\frac{\pi(x)}{\phi(W)}\sum_{d\leq x^{0.4}}\frac{h(d)}{\phi(\textup{lcm}(k,d))}
π(x)ϕ(kW)pkW(1+h(p)ϕ(p/(k,p)))\displaystyle\leq\frac{\pi(x)}{\phi(kW)}\prod_{p\nmid kW}\left(1+\frac{h(p)}{\phi(p/(k,p))}\right)
π(x)ϕ(kW)pkW(1+(1+1/(p1))1p1).\displaystyle\ll\frac{\pi(x)}{\phi(kW)}\prod_{p\nmid kW}\left(1+\frac{(1+1/(p-1))^{\ell}-1}{p-1}\right).

By the mean value theorem for derivatives, we have (1+x)1+ex(1+x)^{\ell}\leq 1+e\ell x for 0x1/(1)0\leq x\leq 1/(\ell-1). This together with the inequality 1+xex1+x\leq e^{x} lets us bound the previous expression by

π(x)ϕ(kW)exp(ep>w1(p1)2)π(x)ϕ(kW),\displaystyle\frac{\pi(x)}{\phi(kW)}\exp\left(e\ell\sum_{p>w}\frac{1}{(p-1)^{2}}\right)\ll\frac{\pi(x)}{\phi(kW)},

by the prime number theorem and the choice =w\ell=w.

The contribution to (10.9) from x0.4dxx^{0.4}\leq d\leq x, in turn, is trivially

xx0.4dxh(d)lcm(k,d)Wx0.97kWx0.4dxh(d)(d/(k,d))0.9,\displaystyle\ll x\sum_{x^{0.4}\leq d\leq x}\frac{h(d)}{\textup{lcm}(k,d)W}\leq\frac{x^{0.97}}{kW}\sum_{x^{0.4}\leq d\leq x}\frac{h(d)}{(d/(k,d))^{0.9}},

and by essentially the same computation as above this is

x0.97kWexp(Cw0.1)x0.98kW,\displaystyle\ll\frac{x^{0.97}}{kW}\exp(Cw^{0.1})\ll\frac{x^{0.98}}{kW},

since wlogxw\ll\log x. Recalling that we have a factor of λ\lambda^{-\ell} in (10.8), the proof is complete. ∎

By Lemma 10.1, for all but exp(cw0.1)\ll\exp(-cw^{0.1})-proportion of pSp\in S we have ϕ(p1)=(1+O(w0.9))ϕ(|4ab|)|4ab|p\phi(p-1)=(1+O(w^{-0.9}))\frac{\phi(|4ab|)}{|4ab|}p, and for any k1k\geq 1, for all but exp(cw0.1)210k\ll\exp(-cw^{0.1})2^{-10k}-proportion of pSp\in S we have ϕ(p1)2k(1w0.9)ϕ(|4ab|)|4ab|p\phi(p-1)\geq 2^{-k}(1-w^{-0.9})\frac{\phi(|4ab|)}{|4ab|}p. Thus, the expression (10.6) takes the form

(pSlogx1(p1)/2hϕ(|4ab|)|4ab|)2+ow((loglogx)2)\displaystyle\left(\sum_{p\in S_{\leq\log x}}\frac{1}{(p-1)/2h\cdot\frac{\phi(|4ab|)}{|4ab|}}\right)^{2}+o_{w\to\infty}((\log\log x)^{2})
=(4h2(|4ab|ϕ(|4ab|))2d(S)2+ow(1))(loglogx)2,\displaystyle=\left(4h^{2}\left(\frac{|4ab|}{\phi(|4ab|)}\right)^{2}d_{\mathbb{P}}(S)^{2}+o_{w\to\infty}(1)\right)(\log\log x)^{2},

which is the same quantity as in Section 5.

To deal with the numerator in (10.5), we again apply Lemma 10.1 to write it as

p,qSlogx(|4ab|/ϕ(|4ab|))2lcm((p1)/2h,(q1)/2h)+ow((loglogx)2).\displaystyle\sum_{p,q\in S_{\leq\log x}}\frac{(|4ab|/\phi(|4ab|))^{2}}{\textup{lcm}((p-1)/2h,(q-1)/2h)}+o_{w\to\infty}((\log\log x)^{2}).

Then we split the sum according to the value of m:=((p1)/2h,(q1)/2h)m:=((p-1)/2h,(q-1)/2h). We reduce to proving (5.3) and (5.4) (with the difference that the pp sum in the definition of Σm,d,r\Sigma_{m,d,r} only goes up to logx\log x and the rr sum in Σm,d\Sigma_{m,d} only goes over (r,m)=1(r,m)=1). Estimate (5.4) is proved precisely as in the case of Theorem 1.3, since the set SS in Section 4 contains the set SS defined in this section. Estimate (5.3) is proved similarly as in Section 6, since the argument there is just based on Lemmas 4.1 and 4.2 for which we have analogues in this case, with the modification of considering only rr with (r,m)=1(r,m)=1 in (6.1). ∎

11   Necessary conditions for primality of shifted exponentials

One might be tempted to conjecture that for Question 1.1 to have a positive answer it would suffice to look at the q=1q=1 case there, that is, to show that anba^{n}-b has no fixed prime divisor and that anba^{n}-b does not factor as a result of a polynomial identity. These conditions are however not sufficient, as demonstrated by the sequence 29n429^{n}-4. For nn even, 29n429^{n}-4 factors as the difference of two squares, whereas for nn odd we have 529n45\mid 29^{n}-4 (even though 29n429^{n}-4 has no fixed prime divisor and xn4x^{n}-4 is irreducible for nn odd). More generally, we have the following construction that shows the necessity of the “for every q1q\geq 1 there exist 1rq1\leq r\leq q” part of Question 1.1.

Proposition 11.1.

Let pp be a prime. Then there exist integers a,b>1a,b>1 such that

  • (i)

    The sequence anba^{n}-b has no fixed prime divisor;

  • (ii)

    The polynomial xhb[x]x^{h}-b\in\mathbb{Z}[x] is irreducible, where hh is the largest integer such that aa is an hhth power;

but for every 1rp1\leq r\leq p either of the following holds:

  • (iii)

    The sequence apn+rba^{pn+r}-b has a fixed prime divisor;

  • (iv)

    The polynomial arxhpb[x]a^{r}x^{hp}-b\in\mathbb{Z}[x] is reducible.

Proof.

By Dirichlet’s theorem, we can pick odd, distinct primes q1,,qp11(modp2)q_{1},\ldots,q_{p-1}\equiv 1\pmod{p^{2}}. Now the congruences xp1(modqi)x^{p}\equiv 1\pmod{q_{i}} all have a solution with x1(modqi)x\not\equiv 1\pmod{q_{i}}, so by the Chinese remainder theorem we can find an even a>max{q1,,qp1}a>\max\{q_{1},\ldots,q_{p-1}\} which is not a perfect power and which satisfies ordqi(a)=p\textup{ord}_{q_{i}}(a)=p for all ii. Then pick c>1c>1 such that cpai(modqi)c^{p}\equiv a^{i}\pmod{q_{i}} for all 1ip11\leq i\leq p-1 (this can be done, since xpg(qi1)/px^{p}\equiv g^{(q_{i}-1)/p} is solvable for gg any primitive root (modqi)\pmod{q_{i}}). Since the set of suitable cc forms an arithmetic progression (modq1qp1)\pmod{q_{1}\cdots q_{p-1}} and since (qi,a(a1))=1(q_{i},a(a-1))=1, we may additionally require that (c,a)=1(c,a)=1 and (cp1,a1)=1(c^{p}-1,a-1)=1. Now let b:=cpb:=c^{p}.

Conditions (i) and (ii) are now obvious, as h=1h=1 and (a,b)=(a1,b1)=1(a,b)=(a-1,b-1)=1. Moreover, the polynomial apxhpba^{p}x^{hp}-b factorizes as the difference of two ppth powers, and for each 1ip11\leq i\leq p-1 we have apn+ib0(modqi)a^{pn+i}-b\equiv 0\pmod{q_{i}} for all n0n\geq 0. This means that also conditions (iii) and (iv) hold. ∎

We remark that it seems difficult to tell whether the conditions of Question 1.1 hold for a given sequence. One case where this is difficult is the case of Sierpinski numbers: it has been conjectured by Erdős that if kk is a Sierpinski number (that is, k2n+1k\cdot 2^{n}+1 is composite for all natural numbers nn), then the smallest prime divisor of k2n+1k\cdot 2^{n}+1 is bounded (see [7, Conjecture 2]). This corresponds to condition (i) in Question 1.1 for a=2a=2, b=kb=-k (since for large primes pp the congruence k2n+10(modp)k\cdot 2^{n}+1\equiv 0\pmod{p} is solvable if and only if 2n+k0(modp)2^{n}+k\equiv 0\pmod{p} is solvable). In [7], Filaseta, Finch and Kozek state that it is “highly likely” that Erdős’s conjecture is false, as the conjecture does not take into account polynomial identities, corresponding to condition (ii) in Question 1.1. They conjecture that Erdős’s intuition is correct for all power-free kk. They also conjecture that the smallest prime divisor of 52n+15\cdot 2^{n}+1 is unbounded as nn varies. Our proof of Theorem 1.6 gives that ω(52n+1)\omega(5\cdot 2^{n}+1)\to\infty along almost all natural numbers nn, but does not rule out the smallest prime factor being bounded.

References

  • [1] P. Balister, B. Bollobás, R. Morris, J. Sahasrabudhe, and M. Tiba. On the Erdős covering problem: the density of the uncovered set. Invent. Math., 228(1):377–414, 2022.
  • [2] J. Bourgain, A. Gamburd, and P. Sarnak. Affine linear sieve, expanders, and sum-product. Invent. Math., 179(3):559–644, 2010.
  • [3] J. Bourgain, A. Gamburd, and P. Sarnak. Markoff triples and strong approximation. C. R. Math. Acad. Sci. Paris, 354(2):131–135, 2016.
  • [4] G. Cooke and P. J. Weinberger. On the construction of division chains in algebraic number rings, with applications to SL2{\rm SL}_{2}. Comm. Algebra, 3:481–524, 1975.
  • [5] K. Debaene. Explicit counting of ideals and a Brun-Titchmarsh inequality for the Chebotarev density theorem. Int. J. Number Theory, 15(5):883–905, 2019.
  • [6] C. Elsholtz. Density of primes in sequences of the form an+ba^{n}+b. MathOverflow answer. https://mathoverflow.net/q/268957.
  • [7] M. Filaseta, C. Finch, and M. Kozek. On powers associated with Sierpiński numbers, Riesel numbers and Polignac’s conjecture. Journal of Number Theory, 128(7):1916 – 1940, 2008.
  • [8] M. Filaseta, K. Ford, S. Konyagin, C. Pomerance, and G. Yu. Sieving by large integers and covering systems of congruences. J. Amer. Math. Soc., 20(2):495–517, 2007.
  • [9] P. X. Gallagher. The number of conjugacy classes in a finite group. In Representation theory of finite groups and related topics (Proc. Sympos. Pure Math., Vol. XXI, Univ. Wisconsin, Madison, Wis., 1970), pages 51–52, 1971.
  • [10] P. Garrett. Linear independence of roots. http://www-users.math.umn.edu/~garrett/m/v/linear_indep_roots.pdf.
  • [11] R. K. Guy. Unsolved problems in number theory. Problem Books in Mathematics. Springer-Verlag, New York, second edition, 1994. Unsolved Problems in Intuitive Mathematics, I.
  • [12] C. Hooley. On Artin’s conjecture. J. Reine Angew. Math., 225:209–220, 1967.
  • [13] C. Hooley. Applications of sieve methods to the theory of numbers. Cambridge University Press, Cambridge-New York-Melbourne, 1976. Cambridge Tracts in Mathematics, No. 70.
  • [14] H. Iwaniec and E. Kowalski. Analytic number theory, volume 53 of American Mathematical Society Colloquium Publications. American Mathematical Society, Providence, RI, 2004.
  • [15] O. Järviniemi. Equality of orders of a set of integers modulo a prime. Proc. Amer. Math. Soc., 149(9):3651–3668, 2021.
  • [16] G. Kalai. Proposals for polymath projects. MathOverflow answer. https://mathoverflow.net/q/229580.
  • [17] J. C. Lagarias and A. M. Odlyzko. Effective versions of the Chebotarev density theorem. In Algebraic number fields: LL-functions and Galois properties (Proc. Sympos., Univ. Durham, Durham, 1975), pages 409–464, 1977.
  • [18] H. W. Lenstra, Jr. On Artin’s conjecture and Euclid’s algorithm in global fields. Invent. Math., 42:201–224, 1977.
  • [19] H. L. Montgomery. The pair correlation of zeros of the zeta function. In Analytic number theory (Proc. Sympos. Pure Math., Vol. XXIV, St. Louis Univ., St. Louis, Mo., 1972), pages 181–193, 1973.
  • [20] P. Moree and P. Stevenhagen. A two-variable Artin conjecture. J. Number Theory, 85(2):291–304, 2000.
  • [21] M. R. Murty, V. K. Murty, and P.-J. Wong. The Chebotarev density theorem and the pair correlation conjecture. J. Ramanujan Math. Soc., 33(4):399–426, 2018.
  • [22] M. R. Murty and A. Perelli. The pair correlation of zeros of functions in the Selberg class. International Mathematics Research Notices, 1999(10):531–545, 01 1999.
  • [23] M. R. Murty, F. Séguin, and C. L. Stewart. A lower bound for the two-variable Artin conjecture and prime divisors of recurrence sequences. J. Number Theory, 194:8–29, 2019.
  • [24] J. Neukirch. Algebraic number theory, volume 322 of Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences]. Springer-Verlag, Berlin, 1999. Translated from the 1992 German original and with a note by Norbert Schappacher, With a foreword by G. Harder.
  • [25] A. M. Odlyzko. On the distribution of spacings between zeros of the zeta function. Math. Comp., 48(177):273–308, 1987.
  • [26] G. J. Rieger. Über Primzahlen und dünne Folgen. Arch. Math. (Basel), 28(6):600–602, 1977.
  • [27] M. Roitman. On Zsigmondy primes. Proc. Amer. Math. Soc., 125(7):1913–1919, 1997.
  • [28] Z. Rudnick and P. Sarnak. Zeros of principal LL-functions and random matrix theory. Duke Math. J., 81(2):269–322, 1996. A celebration of John F. Nash, Jr.
  • [29] A. Salehi Golsefidy and P. Sarnak. The affine sieve. J. Amer. Math. Soc., 26(4):1085–1105, 2013.
  • [30] P. Sarnak. Slide presentation. http://geometrie.math.cnrs.fr/Sarnak.pdf.
  • [31] J.-P. Serre. Quelques applications du théorème de densité de Chebotarev. Inst. Hautes Études Sci. Publ. Math., (54):323–401, 1981.
  • [32] J. Thorner and A. Zaman. A Chebotarev Variant of the Brun–Titchmarsh Theorem and Bounds for the Lang-Trotter conjectures. International Mathematics Research Notices, 2018(16):4991–5027, 03 2017.
  • [33] A. Weingartner. The distribution functions of σ(n)/n\sigma(n)/n and n/ϕ(n)n/\phi(n). Proc. Amer. Math. Soc., 135(9):2677–2681, 2007.
  • [34] P.-J. Wong. Applications of group theory to conjectures of Artin and Langlands. Int. J. Number Theory, 14(3):881–898, 2018.