Chen's Theorem

Chen’s theorem states that every sufficiently large even integer can be written as the sum of an odd prime and a number that is a product of at most two primes.

Theorem Let \(r(N)\) denote the number of representations of \(N\) in the form \[N = p+n,\] where \(p\) is an odd prime and \(n\) is the product of at most two primes. Then \[r(N)\gg \mathfrak{S}(N)\,\frac{N}{(\log N)^2},\] where \[\mathfrak{S}(N)=\prod_{p>2}\!\left(1-\frac{1}{(p-1)^2}\right)\; \prod_{\substack{p\mid N\\ p>2}}\frac{p-1}{p-2}.\]

The idea is to apply the sieving method. Let \(z = N^{1/8}\) and we consider first consider \(n = N - p\) with \[n \in \{1, p, p_1 p_2 : p_1, p_2 \ge z\},\] we then obtain \[r(N) \ge \sum_{\substack{N = p + n \\ n \in \{1, p, p_1 p_2 : p_1, p_2 \ge z\}}} 1\] and in this case the difference is much smaller than \(N^{1/4}\).

Let \(\mathcal{P}\) be the set of primes that do not divide \(N\). Then, let \[\mathcal{A} = \{\, N - p : p \le N,\ p \in \mathcal{P} \,\}\] and if \(n \in \mathcal{A}\), then \((n,N) = 1\) and the difference of its cardinality between the previous set is \(\ll\) \(\omega(n)\). Let \[P(z) = \prod_{\substack{p \in \mathcal{P} \\ p < z}} p,\] If \(n \in \mathcal{A} \ \text{ and } \ n \in \{1, p, p_1 p_2 : p_1, p_2 \ge z\}, \ \text{ then } (n, P(z)) = 1,\ \text{ and}\) \[r(N) \ge \sum_{\substack{n \in \mathcal{A} \\ (n, P(z)) = 1 \\ n \in \{1, p, p_1 p_2 : p_1, p_2 \ge z\}}} 1\]

Then, we want to drop the last condition to replace \(1\) with a non-negative weight function that loses to \(1\). Let \(N\) be an even integer, \(N \ge 4^8\), and \(y = N^{1/8}\). We define \[f(n) = 1 \;-\; \frac{1}{2}\sum_{\substack{z \le q < y \\ q^k \parallel n}} k \;-\; \frac{1}{2}\sum_{\substack{p_1 p_2 p_3 \mid n \\ z \le p_1 < y \le p_2 \le p_3}} 1 ,\] then \(f(n)=1\) iff \(n\) is divisible by no prime in \([z,y)\). If \(n \in \mathcal{A}\) with \((n,P(z))=1\), then \(n=p_1\cdots p_r\cdots p_{r+s}\), where \[z \le p_1 \le \cdots \le p_r < y \le p_{r+1} \le \cdots \le p_{r+s},\] and if \(f(n)>0\), then either \(r=0\) and \(s=0,1,\) or \(2\), or \(r=1\), \(s=0,\) or \(1\). That is, it follows that \[r(N) \ge \sum_{\substack{n \in \mathcal{A}\\(n,P(z))=1}} f(n) = \left( \sum_{\substack{n \in \mathcal{A}\\(n,P(z))=1}} 1 \right) - \frac{1}{2} \sum_{z \le q < y} \left( \sum_{\substack{n \in \mathcal{A}\\(n,P(z))=1\\ q^k \parallel n}} k \right) - \frac{1}{2} \sum_{\substack{p_1 p_2 p_3 \mid n\\ z \le p_1 < y \le p_2 \le p_3}} 1\] To express these three terms as sieving functions, let \[A=\sum_{n=1}^\infty 1_{\mathcal{A}}(n)\] and \(A_q=\sum_{n=1}^\infty a(n)\), where \[a(n)= \begin{cases} 1, & \text{if } n\in\mathcal{A}\text{ and } q\mid n,\\ 0, & \text{otherwise}. \end{cases}\] Let \[\mathcal{B}=\{\, N-p_1p_2p_3:\ z\le p_1< y \le p_2\le p_3,\ (p_1p_2p_3,N)=1,\ p_1p_2p_3<N \,\},\] and \(B=\sum_{n=1}^\infty 1_B(n)\). Then \[r(N)\ge S(\mathcal{A},\mathcal{P},z) -\sum_{z\le q<y} S(\mathcal{A}_q,\mathcal{P},z) -\Big(\sum_{\substack{n\in\mathcal{A}\\(n,P(z))=1}}\sum_{\substack{z\le q<y\\q^k \parallel n\\k \ge 2}}(k-1)\Big) -\sum_{p\in\mathcal{B}}1\] \[\gg S(\mathcal{A},\mathcal{P},z)-\sum_{z\le q<y} S(\mathcal{A}_q,\mathcal{P},z)-S(B,\mathcal{P},y).\] To apply the Jurkat–Richert Sieve, we choose the multiplicative function \[g(d) = g_n(d) = \frac{1}{\varphi(d)}\] for all \(n \ge 1\).

Lemma For all \(\varepsilon > 0\), there exists \(u_1(\varepsilon)\) such that \[\prod_{u < p \le z} \left( 1 - \frac{1}{p} \right)^{-1} < \left( 1 + \frac{\varepsilon}{3} \right) \frac{\log z}{\log u}\] for any \(u_1(\varepsilon) \le u < z\).

This lemma is a corollary from Merten’s theorem. Also, by the convergence of infinite products, it yields that there exists \(u_2(\varepsilon)\) such that \[\prod_{\substack{p \ge u_2(\epsilon)}} \frac{(p-1)^2}{p(p-2)} < 1 + \frac{\varepsilon}{3}.\]

Therefore, for \(u \ge u_0(\varepsilon) \equiv \max(u_1(\varepsilon), u_2(\varepsilon))\), we have \[\prod_{\substack{u < p \le z}} \left( 1 - g(p) \right)^{-1} < \left( 1 + \varepsilon \right) \frac{\log z}{\log u}.\] Hence, let \(\mathcal{Q}(\varepsilon)\) be the set of all primes \(p < u_0(\varepsilon)\), and let \(\mathcal{Q}=\mathcal{P}\ \cap \ \mathcal{Q}\), and \[Q = \prod_{p \in \mathcal{Q}(\varepsilon)} p,\] since \(Q\) depends on \(\varepsilon\), then we can choose sufficiently large \(N\) so that \(Q < \log N\).

Recall that \[V(z) = \prod_{p \mid P(z)} \left( 1 - g(p) \right)^{-1} = \prod_{\substack{p < z \\ p \nmid N}} \left( 1 - \frac{1}{p-1} \right)\] \[= \mathfrak{S}(N) \frac{e^{-\gamma}}{\log z} \left( 1 + O\left( \frac{1}{\log N} \right) \right),\] where

\[\mathfrak{S}(N) = 2 \prod_{p>2} \left( 1 - \frac{1}{(p-1)^2} \right) \prod_{\substack{p \mid N \\ p > 2}} \frac{p-1}{p-2}.\] We then first estimate the first sieving function. For the main term, we have \[|\mathcal{A}| = \sum_{\substack{p < N \\ (p, N) = 1}} 1 = \pi(N) - \omega(N) = \frac{N}{\log N} \left( 1 + O\left( \frac{1}{\log N} \right) \right).\] As for the remainder term, \[R = \sum_{\substack{d < QD \\ d \mid P(z)}} |r(d)|\] where \[r(d) = |\mathcal{A}_d| - \sum_{n} a(n) g(d) = |\mathcal{A}_d| - \frac{|\mathcal{A}|}{\varphi(d)}.\] Note \[\begin{aligned} |\mathcal{A}_d| &= \sum_{\substack{n=1\\d\mid n}}^{\infty} a(n)\\ &= \sum_{\substack{p\in \mathcal{P}\\p \equiv N \ (\mathrm{mod}\ d) \\ p \le N}} 1\\ &= \sum_{\substack{p \in \mathcal{P} \\ p \equiv N \ (\mathrm{mod}\ d)}} 1 + O(\omega(N))\\ &= \pi(N; d, N) + O(\log N).\end{aligned}\]

It follows that \[r(d) = \pi(N; d, N) - \frac{\pi(N)}{\varphi(d)} + O(\log N).\] By applying the Bombieri–Vinogradov Theorem, we have for any \(A > 0\), \(N^{1/2} (\log N)^{-A} \le QD \le N^{1/2}\), \[\sum_{\substack{d < QD}} \max_{\substack{d\\(d,N)=1}} \left| \pi(N; d, N) - \frac{\pi(N)}{\varphi(d)} \right| \ll N^{1/2} Q (\log N)^6.\] It follows that \[R \ll \frac{N}{(\log N)^3}, \quad D = N^{\frac{1}{2} - \varepsilon}, \quad \text{and} \quad s = \frac{\log D}{\log z} \in [3,4].\]

Therefore, \[S(\mathcal{A}, \mathcal{P}, z) > \left( \frac{e^{-\gamma} \log 3}{2} + O(\varepsilon) \right) \frac{N V(z)}{\log N}.\] Similarly, we can obtain the estimation for \[\sum_{z \le q < y} S(\mathcal{A}_q, \mathcal{P}, z) < \left( \frac{e^{-\gamma} \log 6}{2} + O(\varepsilon) \right) \frac{N V(z)}{\log N}.\] However, for the last sieving function, we estimate the following sieve. Let \[\mathcal{B}^{(\ell)} = \{ N - p_1 p_2 p_3 : z \le p_1 < y \le p_2 \le p_3,\ l \le p_1 < (1+\varepsilon)l,\ p_1 p_2 p_3 < N, \ (p_1 p_2 p_3, N) = 1 \},\] where \(l\) is a number of the form \[l = z (1+\varepsilon)^k\] such that \(z \le l < y\), then \(k \ll \frac{\log N}{\varepsilon}\). Let \[\widetilde{\mathcal{B}} = \bigcup_{\ell} \mathcal{B}^{(\ell)}.\] Then, \[\mathcal{B} \subset \widetilde{\mathcal{B}} \subset \{ N - p_1 p_2 p_3 : z \le p_1 < y \le p_2 \le p_3, \ p_1 p_2 p_3 < (1+\varepsilon) N \}.\] Let \(b(n)\), \(b^{(\ell)}(n)\) and \(\widetilde{b}(n)\) be the characteristic functions of the sets \(\mathcal{B}\), \(\mathcal{B}^{(\ell)}\) and \(\widetilde{\mathcal{B}}\), respectively. Since the sets are pairwise disjoint, we have \[|\widetilde{\mathcal{B}}| = \sum_{\ell} |\mathcal{B}^{(\ell)}|\] and \[S(B, \mathcal{P}, y) \le S(\widetilde{{B}}, \mathcal{P}, y) = \sum_{\ell} S(B^{(\ell)}, \mathcal{P}, y).\] Again, with the similar process as before, we obtain \[S(B, \mathcal{P}, y) < \left( \frac{ce^{\gamma}}{2} + O(\varepsilon) \right) \frac{N V(z)}{\log N} + R^{(\ell)}\] where \(c=0.364 \dots\) As for \(R^{(\ell)}\), this would be a consequence of the following theorem, which is going to be proved in another chapter.

Theorem Let \(a(n)\) be an arithmetic function such that \(|a(n)| \le 1\) for all \(n\). Let \(A > 0\), let \(X > (\log Y)^{2A}\), and let \[D^* = \frac{(X Y)^{1/2}}{(\log Y)^A}.\] Then \[\sum_{\substack{d < D^* }} \max_{(a,d)=1} \left| \sum_{n<X} \sum_{\substack{z \le p < Y \\ np \equiv a \pmod{d}}} a(n) - \frac{1}{\varphi(d)} \sum_{n<X}\sum_{\substack{z \le p < Y \\ (np, d)=1}} a(m) \right| \ll \frac{X Y (\log X Y)^2}{(\log Y)^A},\] where the implied constant depends only on \(A\).

Note, \[r_d^{(\ell)} = |B_d^{(\ell)}| - \frac{|B^{(\ell)}|}{\varphi(d)} = \sum_{\substack{ z \le p_1 < y \le p_2 \le p_3\\ l \le p_1 < (1+\varepsilon)l\\ p_1p_2p_3 < N,\ (p_1p_2p_3,N)=1\\ p_1p_2p_3 \equiv N \pmod d }} 1 \;-\; \frac{1}{\varphi(d)} \sum_{\substack{ z \le p_1 < y \le p_2 \le p_3\\ l \le p_1 < (1+\varepsilon)l\\ p_1p_2p_3 < N,\ (p_1p_2p_3,N)=1 }} 1 .\] For the latter term, we add the condition that \((p_1 p_2 p_3, d) = 1\), same as \((p_1, d) = 1\), as \(d \mid P(z)\) refers to \((p_1 p_2 p_3, d) = 1\). Then the difference is at most of \[\frac{1}{\varphi(d)}\sum_{\substack{p_1p_2p_3<(1+\epsilon)N\\p_1\mid d , \, p_1\ge z}} 1<\frac{(1+\epsilon)N}{\varphi(d)} \sum_{\substack{p_1 \mid d \\ p_1 \ge z}} \frac{1}{p_1} \le \frac{(1+\varepsilon) N \omega(d)}{\varphi(d) z} \ll \frac{N \log d}{z \varphi(d)}.\] Let \(a(n)\) be the characteristic function of the set of numbers of the form \(n = p_2 p_3\), where \(y \le p_2 \le p_3\) and \((p_2 p_3, N) = 1\). Then \[r_d^{(\ell)} = \sum_{n \le X} \sum_{\substack{z \le p < Y \\ np\equiv a(\operatorname{mod} d)}} a(n) - \frac{1}{\varphi(d)} \sum_{n < X} \sum_{\substack{z \le p < Y \\ (np, d) = 1}} a(n) + O\left( \frac{N \log d}{z\varphi(d) \, } \right)\] By inserting the following to the previous identity \[X = \frac{N}{l}, \quad Y = \min \{\, y, (1+\varepsilon)l \,\}, \quad Z = \max \{\, z, l \,\}, \quad \text{and} \ \ a = N,\] we obtain \[D = \frac{N^{1/2}}{(\log N)^A} \le \frac{(X Y)^{1/2}}{(\log Y)^A} = D^* < (X Y)^{1/2} \le N.\] Therefore, if we choose \(A = 6\), we have \[R^{(\ell)} \le \sum_{\substack{d < D^* \\ d \mid P(y)}} |r_d^{(\ell)}| \ll \frac{N}{(\log N)^4}.\] And after combining the estimation of the three sieving functions, we complete the proof.