Hua's Vinogradov Mean Value Theorem

The following version of Vinogradov’s Mean Value Theorem is an
improvement edition by Loo-Keng Hua.

Theorem. Let \(P\) and \(T\) be integers and \(P \ge 2\), \[f(x) = \alpha_k x^k + \cdots + \alpha_1 x,\] and \[C_k = C_k(P) = \sum_{T < x \le T+P} e(f(x)).\] Then, when \(s = s(k) \geq \tfrac{1}{4}k(k+1)+k\), we have \[\int_0^1 \cdots \int_0^1 |C_k|^{2s}\, d\alpha_1 \cdots d\alpha_k \;\;\le\;\; (5s)^{5s\ell} (\log P)^{2\ell} P^{2s - \tfrac{1}{2}k(k+1) + \delta},\] where \[\delta = \delta(k) = \tfrac{1}{2}k(k+1)(1-a) + k, \qquad a = \tfrac{1}{k}.\] Note. The value of this integral is equal to the number of sets of solutions of the equation \[x_1^h + \cdots + x_s^h = y_1^h + \cdots + y_s^h, \qquad 1 \le h \le k,\] with \[T < x_i, \; y_i \le T+P.\] For the sake of proving, we let \(T=0\) from now on.

Proof. First we show that the theorem follows from the following lemma.

Lemma. Let \(b\) be an integer \(\ge \tfrac{1}{4}k(k+1)+k\) and let \(\eta = \left[ \tfrac{1}{k} \tfrac{\log Q}{\log 2} \right],\;\; a = \tfrac{1}{k}\). Then \[\begin{aligned} \int_0^1 \cdots \int_0^1 |C_k(Q)|^{2b}\, &d\alpha_1 \cdots d\alpha_k\le (5b)^{5b} \max(1,\eta^2)\, Q^{\,2k - \tfrac{1}{2}k(k+1) + 2(b-k)a} \\& \times \int_0^1 \cdots \int_0^1 |C_k(Q^{1-a})|^{2(b-k)} d\alpha_1 \cdots d\alpha_k .\end{aligned}\]

Then we show that this recursion formula can be obtained by separating \(C_k(Q)\) into so-called "well-spaced” sums, and since the construction of these sums gives us information over their sizes, it allows us to bound the integral.

We prove the theorem by proceeding by induction progress over \(\ell\). If \(\ell=0\), then the theorem is obviously true. If \(\ell \ge 1\), we assume that \(P^{1-a} > 3\), then \(P > e\), otherwise \(P \le 9\), which is obviously true. And by the induction hypothesis, we have \[\begin{aligned} \int_0^1 \cdots \int_0^1 &|C_k(P^{1-a})|^{2(s-k)} d\alpha_1 \cdots d\alpha_k \\ &\le (5s)^{5s(\ell-1)} (\log P)^{2(\ell-1)} P^{(1-a)(2s - 2k - \tfrac{1}{2}k(k+1) + \tfrac{1}{2}k(k+1)(1-a)^{\ell-1})}.\end{aligned}\] Therefore, the theorem follows by combining this bound with the lemma where \(b = s\).

Hence, we are left to prove the recursion formula.

We define a set of integers \((g_1,\ldots,g_b),\; 1 \le g_i \le H,\; i \in [1,b]\) to be a well–spaced set if there exists a \(k\)–set \(g_{j_1},\ldots,g_{j_k}\) with \[|g_{j_{i+1}} - g_{j_i}| > 1 \quad \text{for } 1 \le i \le k-1.\] Then, by a rough estimation, the number of sets which are not well–spaced are at most \[H^{\,k-1}(2(k-1))^b.\]

Suppose \(\eta \ge 2\), and let \(s\) denote an integer with \(1 \le s \le \eta-1\). By splitting \(C_k(Q)\) into \(2^s\) parts whose length of each part is \(R_s = Q 2^{-s}\), it follows \[C_k(Q) = \sum_{g=1}^{2^s} \sum_{(g-1)R_s < x \le gR_s} e(f(x)) = \sum_{g=1}^{2^s} Z_{sg},\] where \[Z_{sg} = \sum_{(g-1)R_s < x \le gR_s} e(f(x)).\]

Let \[Z = (C_k(Q))^b.\] Then \[Z = \sum_{g=1}^{2^{sb}} Z_{sg_1} \cdots Z_{sg_b} = \sum_{s_{g_1},\ldots,s_{g_b}} Z_{s;g_1,\ldots,g_b} = \sum_{}^{2^{sb}} Z_j,\] where \(\sum^M\) denotes a sum, the number of terms at most \(M\).

Now, we define \(Z_{s;g_1,\ldots,g_b}\) to be a well–spaced sum if \(g_1,\ldots,g_b\) is a well–spaced set, and we denote it as \(Z_s'\). For each not well–spaced \(Z_s\), we divide each of the factors of \(Z_s\), then we obtain \(2^b\) sums \(Z_{s+1}\), i.e. at most \((4(k-1))^b 2^{s(k-1)}\) of well–spaced \(Z_{s+1}'\). We repeat this process over \(s\) from \(1\) to \(\eta-1\), and use \(Z_\eta'\) to denote all \(Z_\eta\) obtained from the \(Z_{\eta-1}\) which are not well–ordered. Therefore, we can express \[Z = \sum_{s=1}^\eta M_s \sum_{}^{M_s} Z_s',\] where \(M_s = (4(k-1))^b \, 2^{s(k-1)}\). It follows that \[|C_k(Q)|^{2b} = |Z|^2 \le \eta \sum_{s=1}^\eta M_s \sum_{}^{M_s} |Z_s'|^2.\]

We then rearrange the indices of \(Z_s'\) (\(s \le \eta-1\)) where the first \(k\) terms of the factor form a well–spaced sum. For the rest of the terms, since \(4 \le 2^\eta \le Q^a\), we can split \[Z_{sg_i} \,(1 \le i \le b) \quad \text{into } \left\lfloor \frac{Q 2^{-s}}{Q^{1-a}} \right\rfloor +1 \le Q^a 2^{1-s} \;\;\text{parts},\] each of length \(Q' \le Q^{1-a}\).

Let \[C^* = \max_{T \in [0,Q]} |C_k(Q')| = \left| \sum_{\omega < x \le \omega + Q'} e(f(x)) \right|,\] then \[|Z_{sg_i}|^{2(b-k)} \le (Q^a 2^{1-s} C^*)^{2(b-k)},\] and \[|Z|^2 \le \eta \sum_{s=1}^\eta M_s \,(Q^a 2^{1-s} C^*)^{2(b-k)} \sum_{}^{M_s} |Z_{sg_1\ldots g_k}|^2.\]

Therefore, by integrating both sides over the \(k\)–dimensional unit cube (\(0 < \alpha_i \le 1,\; 0 \le \alpha_k \le 1\)), we obtain \[\begin{aligned} \int_0^1 \cdots \int_0^1 |Z|^2 d\alpha_1 \cdots &d\alpha_k \;\;\le\;\; \eta \sum_{s=1}^\eta M_s (Q^a 2^{1-s})^{2(b-k)} \\ &\times \sum_{}^{M_s} \int_0^1 \cdots \int_0^1 (C^*)^{2(b-k)} \big| Z_{sg_1} \cdots Z_{sg_k} \big|^2 d\alpha_1 \cdots d\alpha_k.\end{aligned}\]

The integral is equivalent to the number of solutions of the equations \[x_1^h + \cdots + x_k^h + y_1^h + \cdots + y_b^h = x_1'^h + \cdots + x_k'^h + y_1'^h + \cdots + y_b'^h, \, (1 \le h \le k),\] where \[(g_{j-1}-1) R_s < x_i, x_i' \le g_j R_s\] and \[\omega < y_i, y_i' \le \omega + Q'.\]

Note: the difference between \[x_1^h + \cdots + x_b^h, \quad x_1'^h + \cdots + x_b'^h\] must not exceed \(2(b-k) Q^{(1-a)h}\); and for fixed \(x_i, x_i'\) that equip with the above condition, the number of sets of \(y_i, y_i'\) does not exceed \[\int_0^1 \cdots \int_0^1 |C_k(Q^{1-a})|^{2(b-k)} d\alpha_1 \cdots d\alpha_k,\] since it is equivalent to \[\int_0^1 \cdots \int_0^1 C^{\,b-k}_k(Q')\,\overline{C^{\,b-k}_k(Q')}\, e(N_1\alpha_1 + \cdots + N_k\alpha_k)\, d\alpha_1 \cdots d\alpha_k.\]
Lemma. Let \(Q = RH\), \(R > 1\), \(H > 1\) and \[1 \le g_1 < \cdots < g_k \le H, \qquad g_\nu - g_{\nu-1} > 1,\] where \(g_1, \ldots, g_k\) are integers. Further, let \(x_\nu\) vary over the interval \[-\omega + (g_\nu - 1)R \;\le\; x_\nu \;<\; -\omega + g_\nu R\] for \(\omega\) any integer, \(0 \le \omega \le Q\). Then, the number of sets \(x_1, \ldots, x_k\) such that \[|x_1^h + \cdots + x_k^h - x_1'^h - \cdots - x_k'^h| \;\le\; Q^{h-1}, \qquad (1 \le h \le k),\] for some fixed \(x_1', \ldots, x_k'\), is \[\le 2^k (2kH)^{\tfrac{1}{2}k(k-1)}.\]

The proof of this lemma is based on inductions over the difference of elementary symmetric functions of the \(h\)th degree of \(x_1, \ldots, x_k\) and \(x_1', \ldots, x_k'\). Since we have information on the difference of their power functions, by the induction hypothesis of assuming the first \(h-1\) terms are correct, it is not difficult to show it is also correct for the \(h\)th term by considering Newton’s identity. Then, we define \[\varphi(x) = |(x - x_1)(x - x_2)\cdots(x - x_k) - (x - x_1')(x - x_2')\cdots(x - x_k')|\] and thus we can bound it for \(|x| \le Q\) by the difference of elementary symmetric functions.

Thus, let \(x = x_i\), it gives information of \(|x_i^k - x_i'^k|\) and the lemma follows from the induction hypothesis for \(k-1\).

Hence, for \(1 \le h \le k\) with \[|x_1^h + \cdots + x_k^h - x_1'^h - \cdots - x_k'^h| \;\le\; 2(b-k)\, Q^{(1-a)h},\] we simply split this interval into \(Q^{a h}\) sub-intervals, and get the number of sets of \(x_1,\ldots,x_k\) and \(x_1',\ldots,x_k'\) that do not exceed \[(8(b-k))^k (2k2^s)^{\tfrac{1}{2}k(k-1)} Q^{\tfrac{1}{2}(k-1)} R_s^k.\] Therefore, we obtain the upper bound for \(1 \le s \le \eta-1\).

For \(s = \eta\), we use the trivial bound \[|Z_{sg_1} \cdots Z_{sg_k}| \le R_\eta^{2k},\] and thus we can show the lemma is true for \(\eta \ge 2\).

If \(\eta < 2\), then \[\frac{1}{k} \frac{\log Q}{\log 2} < 2, \qquad \text{i.e. } Q < 4^k.\] Then we simply split \(C_k(Q)\) into \(4^k\) parts and each part sums over length \(Q' \le \tfrac{1}{4} Q \le Q^{1-a}\).

Hence, we obtain the lemma and complete the proof for the theorem.