Maynard's Approach to Bounded Gaps Between Primes

Maynard introduces a refinement of the GPY method which, using only the unconditionally known information on the level of distribution $\theta$, establishes bounded gaps between primes.

Theorem 1. Let $m \in \mathbb{N}$. We have $$\liminf_n (p_{n+m} - p_n) \ll m^3 e^{4m}.$$

Theorem 2. We have $$\liminf_n (p_{n+1} - p_n) \leq 600.$$

Here, we consider the sum

$$S(N, \rho) = \sum_{N \leq n < 2N} \left( \sum_{i=1}^k \chi(n+h_i) - \rho \right) w_n$$

where $\chi(n) = 1$ if $n$ is prime and $0$ otherwise. Maynard discovered a more general form of the sieve weights

$$w_n = \left( \sum_{d_i | n+h_i \forall i} \lambda_{d_1, \dots, d_k} \right)^2.$$

This gives us extra flexibility by allowing the weights to depend on the divisors of each factor $n+h_i$ separately. Now fix an admissible $k$-tuple $\mathcal{H} = \{h_1, \dots, h_k\}$. To detect whether the $n+h_i$ are prime, Maynard first removes the effect of small prime factors by restricting to $n$ with $(n+h_i, W) = 1$ for every $i$, where $W = \prod_{p \leq D_0} p$ and $D_0 = \log\log\log N$. Since $\mathcal{H}$ is admissible, the Chinese Remainder Theorem lets us choose a residue class $v_0 \pmod W$ with $(v_0 + h_i, W) = 1$ for each $i$, and we restrict to $n \equiv v_0 \pmod W$. Therefore, we focus on the sums

$$S_1 = \sum_{\substack{N \leq n < 2N \\ n \equiv v_0 \pmod W}} w_n$$ $$S_2 = \sum_{\substack{N \leq n < 2N \\ n \equiv v_0 \pmod W}} \left( \sum_{i=1}^k \chi(n+h_i) \right) w_n$$
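The choice of $v_0$ can be illustrated on a toy example. The sketch below is only an illustration: the helper names `primes_up_to`, `is_admissible` and `find_v0`, the tuple $\{0, 2, 6\}$ and the cutoff $D_0 = 7$ are assumptions made here, and a brute-force search stands in for an explicit Chinese Remainder Theorem construction.

```python
from math import gcd

def primes_up_to(x):
    """All primes p <= x by trial division (adequate for tiny x)."""
    return [p for p in range(2, x + 1)
            if all(p % q for q in range(2, int(p ** 0.5) + 1))]

def is_admissible(H):
    """H is admissible if it misses a residue class modulo every prime p.
    Only primes p <= len(H) can be fully covered, so those suffice."""
    return all(len({h % p for h in H}) < p for p in primes_up_to(len(H)))

def find_v0(H, D0):
    """Search for v0 mod W with gcd(v0 + h, W) = 1 for every h in H,
    where W is the product of the primes up to D0."""
    W = 1
    for p in primes_up_to(D0):
        W *= p
    for v0 in range(W):
        if all(gcd(v0 + h, W) == 1 for h in H):
            return v0, W
    return None, W  # reached only if H is not admissible

H = [0, 2, 6]            # toy admissible 3-tuple (illustrative choice)
v0, W = find_v0(H, 7)    # W = 2 * 3 * 5 * 7 = 210
```

For a non-admissible tuple such as $\{0, 2, 4\}$, which covers every class modulo $3$, no such $v_0$ exists once $3 \mid W$.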

Assume that the primes have a fixed level of distribution $\theta$, and set $R = N^{\frac{\theta}{2} - \delta}$ for some small fixed $\delta > 0$. The role of $\lambda_{d_1, \dots, d_k}$ is to eliminate problematic terms: we restrict its support to tuples for which $d = \prod_{i=1}^k d_i$ satisfies $d < R$, $(d, W) = 1$ and $\mu(d)^2 = 1$. These conditions control the size of the error terms, filter out small prime factors, and exclude repeated prime factors, respectively. Note that $\mu(d)^2 = 1$ implies $(d_i, d_j) = 1$ for all $i \neq j$. Intuitively, when $n+h_m$ is prime its only divisor below $R$ is $1$, so only the terms of $w_n$ with $d_m = 1$ survive; these form the main term, and this is exactly what Maynard focuses on.

Now, we start to rewrite our sums $S_1$ and $S_2$ in a simpler form.

$$S_1 = \sum_{\substack{N \leq n < 2N \\ n \equiv v_0 \pmod W}} \left( \sum_{d_i | n+h_i \forall i} \lambda_{d_1, \dots, d_k} \right)^2 = \sum_{\substack{d_1, \dots, d_k \\ e_1, \dots, e_k}} \lambda_{d_1, \dots, d_k} \lambda_{e_1, \dots, e_k} \sum_{\substack{N \leq n < 2N \\ n \equiv v_0 \pmod W \\ [d_i, e_i] | n+h_i \forall i}} 1$$

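If $(W, [d_i, e_i]) \neq 1$ for some $i$, then $\lambda_{d_1, \dots, d_k} \lambda_{e_1, \dots, e_k} = 0$ by the support condition. Moreover, if a prime $p$ divides both $[d_i, e_i]$ and $[d_j, e_j]$ for some $i \neq j$, then $p \mid h_i - h_j$, which forces $p \leq D_0$ once $N$ is large and hence $p \mid W$; this is impossible when $(W, [d_i, e_i]) = 1$, so in that case the inner sum is empty. Therefore, we may assume that $W, [d_1, e_1], \dots, [d_k, e_k]$ are pairwise coprime. By the Chinese Remainder Theorem the inner sum is then a sum over a single residue class modulo $q = W \prod_{i=1}^k [d_i, e_i]$, and so equals $N/q + O(1)$. This gives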

$$S_1 = \frac{N}{W} \sideset{}{'}\sum_{\substack{d_1, \dots, d_k \\ e_1, \dots, e_k}} \frac{\lambda_{d_1, \dots, d_k} \lambda_{e_1, \dots, e_k}}{\prod_{i=1}^k [d_i, e_i]} + O\left( \sideset{}{'}\sum_{\substack{d_1, \dots, d_k \\ e_1, \dots, e_k}} |\lambda_{d_1, \dots, d_k} \lambda_{e_1, \dots, e_k}| \right)$$

where $\sideset{}{'}\sum$ denotes the restriction that $[d_1, e_1], \dots, [d_k, e_k]$ be pairwise coprime. Writing $\lambda_{\max} = \sup_{d_1, \dots, d_k} |\lambda_{d_1, \dots, d_k}|$, the error term is

$$\ll \lambda_{\max}^2 \left( \sum_{d < R} \tau_k(d) \right)^2 \ll \lambda_{\max}^2 R^2 (\log R)^{2k}$$

For the main term, we apply the identity

$$\frac{1}{[d_i, e_i]} = \frac{1}{d_i e_i} \sum_{u_i | d_i, e_i} \varphi(u_i).$$
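This identity is just $(d, e) = \sum_{u \mid (d, e)} \varphi(u)$ divided by $de$. As a sanity check, the following sketch verifies it in exact rational arithmetic for all $1 \leq d, e < 40$; the helper names `phi`, `lcm` and `rhs` and the range are assumptions made for this illustration.

```python
from fractions import Fraction
from math import gcd

def phi(n):
    """Euler's totient by direct count (adequate for small n)."""
    return sum(1 for a in range(1, n + 1) if gcd(a, n) == 1)

def lcm(a, b):
    return a * b // gcd(a, b)

def rhs(d, e):
    """(1 / (d e)) * sum of phi(u) over the common divisors u of d and e."""
    g = gcd(d, e)
    total = sum(phi(u) for u in range(1, g + 1) if g % u == 0)
    return Fraction(total, d * e)

# Compare both sides exactly for every pair in a small range.
identity_holds = all(
    Fraction(1, lcm(d, e)) == rhs(d, e)
    for d in range(1, 40)
    for e in range(1, 40)
)
```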

Substituting this into the main term gives

$$\frac{N}{W} \sum_{u_1, \dots, u_k} \left( \prod_{i=1}^k \varphi(u_i) \right) \sideset{}{'}\sum_{\substack{d_1, \dots, d_k \\ e_1, \dots, e_k \\ u_i | d_i, e_i \forall i}} \frac{\lambda_{d_1, \dots, d_k} \lambda_{e_1, \dots, e_k}}{(\prod d_i)(\prod e_i)}.$$

The requirement that $[d_1, e_1], \dots, [d_k, e_k]$ be pairwise coprime is equivalent to $(d_i, e_j) = 1$ for all $i \neq j$, since $\lambda_{d_1, \dots, d_k} = 0$ whenever $(d_i, d_j) \neq 1$ for some $i \neq j$. By applying the identity

$$\sum_{d|n} \mu(d) = \begin{cases} 1 & \text{if } n=1 \\ 0 & \text{otherwise} \end{cases},$$

with $n = (d_i, e_j)$ for each pair $i \neq j$ (writing $s_{i,j}$ for the summation variable), the main term becomes

$$\frac{N}{W} \sum_{u_1, \dots, u_k} \left( \prod_{i=1}^k \varphi(u_i) \right) \sum_{s_{1,2}, \dots, s_{k, k-1}} \left( \prod_{\substack{1 \leq i, j \leq k \\ i \neq j}} \mu(s_{i,j}) \right) \sum_{\substack{d_1, \dots, d_k \\ e_1, \dots, e_k \\ u_i | d_i, e_i \forall i \\ s_{i,j} | d_i, e_j \forall i \neq j}} \frac{\lambda_{d_1, \dots, d_k} \lambda_{e_1, \dots, e_k}}{(\prod d_i)(\prod e_i)}$$

We may restrict the sum over the $s_{i,j}$ to $(s_{i,j}, u_i) = (s_{i,j}, u_j) = 1$ and $(s_{i,j}, s_{i,a}) = (s_{i,j}, s_{b,j}) = 1$ for all $a \neq j$, $b \neq i$ without changing its value, and we denote the restricted sum by $\sideset{}{^*}\sum$. This is because $\lambda_{d_1, \dots, d_k} = 0$ unless $(d_i, d_j) = 1$ for all $i \neq j$, so the excluded terms contribute nothing. Since $(s_{i,j}, W) = 1$, each $s_{i,j}$ is either $1$ or larger than $D_0$, so we split the sum according to whether all $s_{i,j} = 1$ or some $s_{i,j} > D_0$. The contribution when $s_{i,j} > D_0$ is

$$\ll \frac{N}{W} \left( \sum_{\substack{u < R \\ (u, W) = 1}} \varphi(u) \right)^k \left( \sum_{s_{i,j} > D_0} \mu(s_{i,j})^2 \right)^{k^2-k}\dots$$

Writing $a_j = u_j \prod_{i \neq j} s_{j,i}$ and $b_j = u_j \prod_{i \neq j} s_{i,j}$, the conditions above give $\mu(a_j)^2 = 1$ and $(a_i, a_j) = 1$ for all $i \neq j$, and similarly for the $b_j$; this notation simplifies the sums over $u_i$ and $s_{i,j}$. Maynard then makes the change of variables

$$y_{r_1, \dots, r_k} = \left( \prod_{i=1}^k \mu(r_i) \varphi(r_i) \right) \sum_{\substack{d_1, \dots, d_k \\ r_i | d_i \forall i}} \frac{\lambda_{d_1, \dots, d_k}}{\prod_{i=1}^k d_i}$$ $$y_{\max} = \sup_{r_1, \dots, r_k} |y_{r_1, \dots, r_k}|.$$

By applying Möbius inversion we have

$$\sum_{\substack{r_1, \dots, r_k \\ d_i | r_i \forall i}} \frac{y_{r_1, \dots, r_k}}{\prod_{i=1}^k \varphi(r_i)} = \sum_{\substack{r_1, \dots, r_k \\ d_i | r_i \forall i}} \left( \prod_{i=1}^k \mu(r_i) \right) \sum_{\substack{e_1, \dots, e_k \\ r_i | e_i \forall i}} \frac{\lambda_{e_1, \dots, e_k}}{\prod_{i=1}^k e_i}$$ $$= \sum_{e_1, \dots, e_k} \frac{\lambda_{e_1, \dots, e_k}}{\prod_{i=1}^k e_i} \sum_{\substack{r_1, \dots, r_k \\ d_i | r_i \forall i \\ r_i | e_i \forall i}} \prod_{i=1}^k \mu(r_i) = \frac{\lambda_{d_1, \dots, d_k}}{\prod_{i=1}^k \mu(d_i) d_i}.$$
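The inversion can be checked numerically in the one-dimensional case $k = 1$, where it reads $\lambda_d = \mu(d)\, d \sum_{d \mid r} y_r / \varphi(r)$ for square-free $d$. The sketch below is a toy verification only: the parameters $R = 60$, $W = 6$, the random integer weights and all helper names are assumptions made for this illustration.

```python
from fractions import Fraction
from math import gcd
import random

def prime_factors(n):
    """Prime factors of n with multiplicity, by trial division."""
    out, p = [], 2
    while p * p <= n:
        while n % p == 0:
            out.append(p)
            n //= p
        p += 1
    if n > 1:
        out.append(n)
    return out

def mu(n):
    """Moebius function."""
    f = prime_factors(n)
    return 0 if len(f) != len(set(f)) else (-1) ** len(f)

def phi(n):
    """Euler's totient."""
    result = n
    for p in set(prime_factors(n)):
        result = result // p * (p - 1)
    return result

R, W = 60, 6  # toy parameters, chosen only for illustration
# Support: square-free d < R with (d, W) = 1.
support = [d for d in range(1, R) if mu(d) != 0 and gcd(d, W) == 1]

random.seed(0)
lam = {d: Fraction(random.randint(-5, 5)) for d in support}

# Forward change of variables: y_r = mu(r) phi(r) sum_{r | d} lambda_d / d.
y = {r: mu(r) * phi(r) * sum(lam[d] / d for d in support if d % r == 0)
     for r in support}

# Inversion: lambda_d = mu(d) d sum_{d | r} y_r / phi(r).
recovered = {d: mu(d) * d * sum(y[r] / phi(r) for r in support if r % d == 0)
             for d in support}
```

Exact fractions are used so that `recovered` can be compared with `lam` without rounding error.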

Thus, any choice of $y_{r_1, \dots, r_k}$ supported on tuples for which $r = \prod_{i=1}^k r_i$ is square-free with $r < R$ and $(r, W) = 1$ gives a suitable choice of $\lambda_{d_1, \dots, d_k}$. Now, since $d/\varphi(d) = \sum_{e|d} 1/\varphi(e)$ for square-free $d$, substituting $r' = \prod_{i=1}^k r_i / d_i$ in the inversion formula yields

$$\lambda_{\max} \ll y_{\max} (\log R)^k.$$

Hence, the error term $O(\lambda_{\max}^2 R^2 (\log R)^{2k})$ is of size

$$O(y_{\max}^2 R^2 (\log R)^{4k}).$$

Substituting the expression for $\lambda_{d_1, \dots, d_k}$ in terms of $y_{r_1, \dots, r_k}$, the main term becomes

$$\frac{N}{W} \sum_{u_1, \dots, u_k} \left( \prod_{i=1}^k \frac{\mu(u_i)^2}{\varphi(u_i)} \right) \sideset{}{^*}\sum_{s_{1,2}, \dots, s_{k, k-1}} \left( \prod_{\substack{1 \leq i, j \leq k \\ i \neq j}} \frac{\mu(s_{i,j})}{\varphi(s_{i,j})^2} \right) y_{a_1, \dots, a_k} y_{b_1, \dots, b_k}$$

In these variables, the contribution of the terms with some $s_{i,j} > D_0$ is

$$\ll \frac{y_{\max}^2 \varphi(W)^k N (\log R)^k}{W^{k+1} D_0}.$$

The remaining terms have $s_{i,j} = 1$ for all $i \neq j$, so that $a_i = b_i = u_i$, and we obtain

$$S_1 = \frac{N}{W} \sum_{u_1, \dots, u_k} \frac{y_{u_1, \dots, u_k}^2}{\prod_{i=1}^k \varphi(u_i)} + O\left( \frac{y_{\max}^2 \varphi(W)^k N (\log R)^k}{W^{k+1} D_0} \right)$$
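For $k = 1$ there are no variables $s_{i,j}$, so the diagonalisation is an exact identity: $\sum_{d, e} \lambda_d \lambda_e / [d, e] = \sum_u y_u^2 / \varphi(u)$. The sketch below checks this in exact arithmetic. The parameters $R = 50$, $W = 6$, the sample weights $\lambda_d = \mu(d)(R - d)$ and the helper names are assumptions made only for this illustration, not Maynard's choices.

```python
from fractions import Fraction
from math import gcd

def prime_factors(n):
    """Prime factors of n with multiplicity, by trial division."""
    out, p = [], 2
    while p * p <= n:
        while n % p == 0:
            out.append(p)
            n //= p
        p += 1
    if n > 1:
        out.append(n)
    return out

def mu(n):
    """Moebius function."""
    f = prime_factors(n)
    return 0 if len(f) != len(set(f)) else (-1) ** len(f)

def phi(n):
    """Euler's totient."""
    result = n
    for p in set(prime_factors(n)):
        result = result // p * (p - 1)
    return result

R, W = 50, 6  # toy parameters, chosen only for illustration
support = [d for d in range(1, R) if mu(d) != 0 and gcd(d, W) == 1]
lam = {d: Fraction(mu(d) * (R - d)) for d in support}  # sample weights

# Quadratic form in lambda; note 1 / lcm(d, e) = gcd(d, e) / (d e).
lhs = sum(lam[d] * lam[e] * Fraction(gcd(d, e), d * e)
          for d in support for e in support)

# The same quantity in the diagonalising variables y_u.
y = {u: mu(u) * phi(u) * sum(lam[d] / d for d in support if d % u == 0)
     for u in support}
rhs = sum(y[u] ** 2 / phi(u) for u in support)
```

For $k \geq 2$ the same computation produces the additional sum over $s_{i,j}$, which is where the error term involving $D_0$ comes from.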

For $S_2$, we write $S_2 = \sum_{m=1}^k S_2^{(m)}$, where

$$S_2^{(m)} = \sum_{\substack{N \leq n < 2N \\ n \equiv v_0 \pmod W}} \chi(n+h_m) w_n.$$

Arguing as above and as in the analysis of the GPY method, we put

$$y_{r_1, \dots, r_k}^{(m)} = \left( \prod_{i=1}^k \mu(r_i) g(r_i) \right) \sum_{\substack{d_1, \dots, d_k \\ r_i | d_i \forall i \\ d_m = 1}} \frac{\lambda_{d_1, \dots, d_k}}{\prod_{i=1}^k \varphi(d_i)}$$

where $g$ is the totally multiplicative function defined on primes by $g(p) = p - 2$, and $y_{\max}^{(m)} = \sup_{r_1, \dots, r_k} |y_{r_1, \dots, r_k}^{(m)}|$. Then, for any fixed $A > 0$, we have

$$S_2^{(m)} = \frac{N}{\varphi(W) \log N} \sum_{r_1, \dots, r_k} \frac{(y_{r_1, \dots, r_k}^{(m)})^2}{\prod_{i=1}^k g(r_i)} + O\left( \frac{(y_{\max}^{(m)})^2 \varphi(W)^{k-2} N (\log N)^{k-2}}{W^{k-1} D_0} + \frac{y_{\max}^2 N}{(\log N)^A} \right).$$

Lemma. If $r_m = 1$ then $$y_{r_1, \dots, r_k}^{(m)} = \sum_{a_m} \frac{y_{r_1, \dots, r_{m-1}, a_m, r_{m+1}, \dots, r_k}}{\varphi(a_m)} + O\left( \frac{y_{\max} \varphi(W) \log R}{W D_0} \right).$$

Our goal is to select $y_{r_1, \dots, r_k}$ to maximize the ratio of the main terms of $S_2$ and $S_1$. Using Lagrange multipliers, the maximizer must satisfy

$$\lambda \frac{y_{r_1, \dots, r_k}}{\prod_{i=1}^k \varphi(r_i)} = \sum_{m=1}^k \frac{y_{r_1, \dots, r_{m-1}, 1, r_{m+1}, \dots, r_k}^{(m)}}{\prod_{i \neq m} g(r_i) \varphi(r_m)}$$

for some constant $\lambda$, obtained by differentiating with respect to $y_{r_1, \dots, r_k}$. Indeed, by the lemma above and the chain rule, up to a constant factor,

$$\frac{\partial S_2}{\partial y_{r_1, \dots, r_k}} = \sum_{m=1}^k \frac{\partial S_2}{\partial y_{r_1, \dots, 1, r_{m+1}, \dots, r_k}^{(m)}} \frac{\partial y^{(m)}}{\partial y_{r_1, \dots, r_k}} = \sum_{m=1}^k \frac{2 y_{r_1, \dots, 1, \dots, r_k}^{(m)}}{\prod_{i \neq m} g(r_i)} \frac{1}{\varphi(r_m)}.$$

Hence,

$$\lambda y_{r_1, \dots, r_k} = \left( \prod_{i=1}^k \frac{\varphi(r_i)}{g(r_i)} \right) \sum_{m=1}^k \frac{g(r_m)}{\varphi(r_m)} y_{r_1, \dots, r_{m-1}, 1, r_{m+1}, \dots, r_k}^{(m)}.$$

Since the $y_{r_1, \dots, r_k}$ are supported on integers free of small prime factors, for most such $r$ we have $\varphi(r) \approx g(r) \approx r$, and so the condition above should look like

$$\lambda y_{r_1, \dots, r_k} \approx \sum_{m=1}^k y_{r_1, \dots, r_{m-1}, 1, r_{m+1}, \dots, r_k}^{(m)}.$$

This suggests that a good choice of $y_{r_1, \dots, r_k}$ is a smooth function of the $r_i$, and Maynard chooses

$$y_{r_1, \dots, r_k} = F\left( \frac{\log r_1}{\log R}, \dots, \frac{\log r_k}{\log R} \right),$$

for some smooth function $F : \mathbb{R}^k \rightarrow \mathbb{R}$ supported on $\mathcal{R}_k = \left\{ (x_1, \dots, x_k) \in [0, 1]^k : \sum_{i=1}^k x_i \leq 1 \right\}$, whenever $r = \prod_{i=1}^k r_i$ satisfies $(r, W) = 1$ and $\mu(r)^2 = 1$; we set $y_{r_1, \dots, r_k} = 0$ otherwise. Let

$$F_{\max} = \sup_{(t_1, \dots, t_k) \in [0, 1]^k} |F(t_1, \dots, t_k)| + \sum_{i=1}^k \left| \frac{\partial F}{\partial t_i}(t_1, \dots, t_k) \right|.$$

Now, we substitute this choice of $y$ into our expressions for $S_1$ and $S_2$. This gives

$$S_1 = \frac{N}{W} \sum_{\substack{u_1, \dots, u_k \\ (u_i, u_j) = 1 \forall i \neq j \\ (u_i, W) = 1}} \left( \prod_{i=1}^k \frac{\mu(u_i)^2}{\varphi(u_i)} \right) F\left( \frac{\log u_1}{\log R}, \dots, \frac{\log u_k}{\log R} \right)^2 + O\left( \frac{F_{\max}^2 \varphi(W)^k N (\log R)^k}{W^{k+1} D_0} \right).$$

We can drop the requirement that $(u_i, u_j) = 1$ at the cost of an error of size

$$\ll \frac{F_{\max}^2 N}{W} \sum_{p > D_0} \sum_{\substack{u_1, \dots, u_k \leq R \\ p | u_i, u_j \\ \text{for } i \neq j}} \prod_{i=1}^k \frac{\mu(u_i)^2}{\varphi(u_i)}$$ $$\ll \frac{F_{\max}^2 N}{W} \sum_{p > D_0} \frac{1}{(p-1)^2} \left( \sum_{\substack{u \leq R \\ (u, W) = 1}} \frac{\mu(u)^2}{\varphi(u)} \right)^k$$ $$\ll \frac{F_{\max}^2 \varphi(W)^k N (\log R)^k}{W^{k+1} D_0}$$

Thus we are left to evaluate the sum

$$\sum_{\substack{u_1, \dots, u_k \\ (u_i, W) = 1 \forall i}} \left( \prod_{i=1}^k \frac{\mu(u_i)^2}{\varphi(u_i)} \right) F\left( \frac{\log u_1}{\log R}, \dots, \frac{\log u_k}{\log R} \right)^2.$$

Lemma. Let $A_1, A_2, L > 0$. Let $\gamma$ be a multiplicative function satisfying $$0 \leq \frac{\gamma(p)}{p} \leq 1 - A_1,$$ and $$-L \leq \sum_{w \leq p < z} \frac{\gamma(p) \log p}{p} - \log \frac{z}{w} \leq A_2$$ for any $2 \leq w \leq z$. Let $g$ be the multiplicative function defined by $$g(d) = \prod_{p | d} \frac{\gamma(p)}{p - \gamma(p)}.$$ Finally, let $G : [0, 1] \rightarrow \mathbb{R}$ be smooth, and let $G_{\max} = \sup_{t \in [0, 1]} (|G(t)| + |G'(t)|)$. Then $$\sum_{d \leq z} \mu(d)^2 g(d) G\left( \frac{\log d}{\log z} \right) = \mathfrak{S} \log z \int_0^1 G(x) dx + O_{A_1, A_2} \left( \mathfrak{S} L G_{\max} \right),$$ where $$\mathfrak{S} = \prod_p \left( 1 - \frac{\gamma(p)}{p} \right)^{-1} \left( 1 - \frac{1}{p} \right).$$

We can now apply this lemma $k$ times, dealing with the sum over each $u_i$ in turn. For each application we take

$$\gamma(p) = \begin{cases} 1, & p \nmid W, \\ 0, & \text{otherwise}, \end{cases}$$ so that $g(d) = 1/\varphi(d)$ for square-free $d$ coprime to $W$ and $\mathfrak{S} = \varphi(W)/W$, and we may take $$L \ll 1 + \sum_{p | W} \frac{\log p}{p} \ll \log D_0.$$

This gives

$$\sum_{\substack{u_1, \dots, u_k \\ (u_i, W) = 1 \forall i}} \left( \prod_{i=1}^k \frac{\mu(u_i)^2}{\varphi(u_i)} \right) F\left( \frac{\log u_1}{\log R}, \dots, \frac{\log u_k}{\log R} \right)^2 = \frac{\varphi(W)^k (\log R)^k}{W^k} I_k(F)$$ $$+ O\left( \frac{F_{\max}^2 \varphi(W)^k (\log D_0) (\log R)^{k-1}}{W^k} \right)$$

where

$$I_k(F) = \int_0^1 \dots \int_0^1 F(t_1, \dots, t_k)^2 dt_1 \dots dt_k.$$
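The shape of the main term can be seen numerically in the simplest case $k = 1$, $G \equiv 1$, where the lemma predicts $\sum_{u \leq R,\ (u, W) = 1} \mu(u)^2 / \varphi(u) = \frac{\varphi(W)}{W} \log R + O\left( \frac{\varphi(W)}{W} L \right)$. The sketch below compares the two sides for the illustrative values $R = 10^5$ and $W = 30$; these values and the helper names `phi_and_mu` and `sieved_sum` are assumptions made here, and $W$ is of course far from the size $\prod_{p \leq D_0} p$ used in the argument.

```python
from math import gcd, log

def phi_and_mu(limit):
    """Sieve Euler's phi and the Moebius function on 0..limit."""
    phi = list(range(limit + 1))
    mu = [1] * (limit + 1)
    is_prime = [True] * (limit + 1)
    for p in range(2, limit + 1):
        if is_prime[p]:
            for m in range(p, limit + 1, p):
                if m > p:
                    is_prime[m] = False
                phi[m] -= phi[m] // p   # multiply phi[m] by (1 - 1/p)
                mu[m] = -mu[m]          # one sign flip per prime factor
            for m in range(p * p, limit + 1, p * p):
                mu[m] = 0               # not square-free
    return phi, mu

def sieved_sum(R, W):
    """Sum of mu(u)^2 / phi(u) over u <= R coprime to W."""
    phi, mu = phi_and_mu(R)
    return sum(1.0 / phi[u] for u in range(1, R + 1)
               if mu[u] != 0 and gcd(u, W) == 1)

R, W, phi_W = 10**5, 30, 8      # phi(30) = 8
total = sieved_sum(R, W)
main_term = phi_W / W * log(R)  # (phi(W) / W) * log R
```

The difference `total - main_term` stays bounded by a small multiple of $\varphi(W)/W$, in line with the error term $O(\mathfrak{S} L G_{\max})$.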

Similarly, applying this lemma first to $y_{r_1, \dots, r_k}^{(m)}$ and then to the main term of $S_2^{(m)}$ gives

$$S_2^{(m)} = \frac{\varphi(W)^k N (\log R)^{k+1}}{W^{k+1} \log N} J_k^{(m)} + O\left( \frac{F_{\max}^2 \varphi(W)^k N (\log N)^k}{W^{k+1} D_0} \right),$$

where

$$J_k^{(m)} = \int_0^1 \dots \int_0^1 \left( \int_0^1 F(t_1, \dots, t_k) dt_m \right)^2 dt_1 \dots dt_{m-1} dt_{m+1} \dots dt_k.$$

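We let $\mathcal{S}_k$ denote the set of Riemann integrable functions $F : [0, 1]^k \rightarrow \mathbb{R}$ supported on $\mathcal{R}_k$ with $I_k(F) \neq 0$ and $J_k^{(m)}(F) \neq 0$ for each $m$. Comparing the main terms above, and using $\log R / \log N = \frac{\theta}{2} - \delta$, the ratio $S_2 / S_1$ is asymptotically $\left( \frac{\theta}{2} - \delta \right) \sum_{m=1}^k J_k^{(m)}(F) / I_k(F)$, so $S_2 - \rho S_1 > 0$ for large $N$ whenever $\rho$ is smaller than this quantity. We would therefore like to obtain a lower bound for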

$$M_k = \sup_{F \in \mathcal{S}_k} \frac{\sum_{m=1}^k J_k^{(m)}(F)}{I_k(F)}$$
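To get a feel for this ratio, one can evaluate it numerically for $k = 2$ and a simple test function. The sketch below uses a midpoint rule on a grid; the function names `ratio_k2` and `F_indicator`, the grid size, and the choice of the indicator function of $\mathcal{R}_2$ are assumptions made for illustration, and this $F$ is not claimed to be optimal. For the indicator function the exact values are $I_2 = 1/2$ and $J_2^{(1)} = J_2^{(2)} = 1/3$, so the ratio is $4/3$.

```python
def ratio_k2(F, n=200):
    """Midpoint-rule estimate of (J^(1) + J^(2)) / I for k = 2."""
    h = 1.0 / n
    pts = [(i + 0.5) * h for i in range(n)]
    # vals[i][j] = F(t1_i, t2_j)
    vals = [[F(a, b) for b in pts] for a in pts]
    # I: integral of F^2 over the unit square.
    I = sum(v * v for row in vals for v in row) * h * h
    # J^(1): integrate out t1, square, then integrate over t2.
    J1 = sum((sum(vals[i][j] for i in range(n)) * h) ** 2
             for j in range(n)) * h
    # J^(2): integrate out t2, square, then integrate over t1.
    J2 = sum((sum(row) * h) ** 2 for row in vals) * h
    return (J1 + J2) / I

def F_indicator(t1, t2):
    """Indicator of the simplex R_2 (small tolerance for rounding)."""
    return 1.0 if t1 + t2 <= 1.0 + 1e-12 else 0.0

estimate = ratio_k2(F_indicator)
```

Any $F \in \mathcal{S}_2$ can be passed to `ratio_k2` in the same way; each value obtained is a lower bound for $M_2$ up to discretisation error.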

For large $k$, Maynard chooses $F$ of the form

$$F(t_1, \dots, t_k) = \begin{cases} \prod_{i=1}^k g(k t_i), & \text{if } \sum_{i=1}^k t_i \leq 1, \\ 0, & \text{otherwise}, \end{cases}$$

for some smooth function $g : [0, \infty) \rightarrow \mathbb{R}$ supported on $[0, T]$. This choice of $F$ is symmetric, so $J_k^{(m)}(F)$ is independent of $m$, and we write $J_k = J_k^{(1)}(F)$ and $I_k = I_k(F)$.

If the center of mass $\int_0^\infty u g(u)^2 du / \int_0^\infty g(u)^2 du$ of $g^2$ is strictly less than $1$, then for large $k$ we can drop the restriction $\sum_{i=1}^k t_i \leq 1$ at the cost of only a small error. Optimizing the ratio subject to this constraint on the center of mass, via the Euler-Lagrange equation, yields the form of $g$ for sufficiently large $k$.
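The center-of-mass condition is easy to test numerically for a candidate $g$. The sketch below does this with a midpoint rule for the purely hypothetical choice $g(u) = e^{-u}$ on $[0, 3]$, which is used here only to illustrate the computation and is not the function obtained from the optimization; the helper name `center_of_mass` is likewise an assumption.

```python
from math import exp

def center_of_mass(g, T, n=20000):
    """Midpoint-rule estimate of int u g(u)^2 du / int g(u)^2 du on [0, T]."""
    h = T / n
    num = den = 0.0
    for i in range(n):
        u = (i + 0.5) * h
        w = g(u) ** 2
        num += u * w * h
        den += w * h
    return num / den

def g_example(u):
    """Hypothetical test function, not the optimizer."""
    return exp(-u)

c = center_of_mass(g_example, T=3.0)
condition_holds = c < 1.0
```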