1.1 The first moment method


The simplest instance of the probabilistic method is the first moment method, which seeks to control the distribution of a random variable $X$ in terms of its expectation (or first moment) $E(X)$. Firstly, we make the trivial observation (essentially the pigeonhole principle) that $X \geq E(X)$ with positive probability, and $X \leq E(X)$ with positive probability. A more quantitative variant of this is

Theorem 1.1 (Markov’s inequality) Let X be a non-negative random variable.

Then for any positive real $\lambda > 0$,

$$P(X \geq \lambda) \leq \frac{E(X)}{\lambda}. \qquad (1.2)$$

Proof Start with the trivial inequality $X \geq \lambda I(X \geq \lambda)$ and take expectations of both sides.

Informally, this inequality asserts that $X = O(E(X))$ with high probability; for instance, $X \leq 10 E(X)$ with probability at least $0.9$. Note that this is only an upper tail estimate; it gives an upper bound for how likely $X$ is to be much larger than $E(X)$, but does not control how likely $X$ is to be much smaller than $E(X)$. Indeed, if all one knows is the expectation $E(X)$, it is easy to see that $X$ could be as small as zero with probability arbitrarily close to $1$, so the first moment method cannot give any non-trivial lower tail estimate. Later on we shall introduce more refined methods, such as the second moment method, that give further upper and lower tail estimates.
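As a quick numerical illustration (a sketch of ours, not part of the text), one can check (1.2) empirically for a simple non-negative random variable; the particular distribution below is an arbitrary choice.

```python
import random

# Empirically verify Markov's inequality P(X >= lam) <= E(X)/lam,
# where X is (arbitrarily) the square of a uniform [0,1] variable.
samples = [random.uniform(0.0, 1.0) ** 2 for _ in range(100_000)]
mean = sum(samples) / len(samples)  # approximates E(X) = 1/3

for lam in (0.25, 0.5, 1.0):
    tail = sum(x >= lam for x in samples) / len(samples)  # approximates P(X >= lam)
    print(f"lambda={lam}: tail {tail:.4f} <= bound {mean / lam:.4f}")
```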

To apply the first moment method, we of course need to compute the expectations of random variables. A fundamental tool in doing so is linearity of expectation, which asserts that

$$E(c_1 X_1 + \cdots + c_n X_n) = c_1 E(X_1) + \cdots + c_n E(X_n) \qquad (1.3)$$

whenever $X_1, \ldots, X_n$ are random variables and $c_1, \ldots, c_n$ are real numbers. The power of this principle comes from there being no restriction on the independence or dependence between the $X_i$. A very typical application of (1.3) is in estimating the size $|B|$ of a subset $B$ of a given set $A$, where $B$ is generated in some random manner. From the obvious identity

$$|B| = \sum_{a \in A} I(a \in B)$$

and (1.3), (1.1) we see that

$$E(|B|) = \sum_{a \in A} P(a \in B). \qquad (1.4)$$

Again, we emphasize that the events $a \in B$ do not need to be independent in order for (1.4) to apply.
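To see (1.4) at work with events that are far from independent, here is a small numerical sketch of ours (the set $A$ and the group $\mathbf{Z}_{30}$ are arbitrary choices): take $B := A \cap (A + x)$ for a uniformly random shift $x$, so that each event $a \in B$ has probability $|A|/N$ and both sides of (1.4) equal $|A|^2/N$.

```python
from fractions import Fraction

N = 30
A = {1, 2, 3, 5, 8, 13, 21}  # an arbitrary subset of Z_N

# Left side of (1.4): E(|B|) for B := A ∩ (A + x), averaging over all shifts x.
lhs = Fraction(sum(len(A & {(a + x) % N for a in A}) for x in range(N)), N)

# Right side of (1.4): sum over a in A of P(a in B); each event "a in B"
# occurs iff x lies in a - A, an event of probability |A|/N.
rhs = sum(Fraction(len(A), N) for _ in A)

print(lhs, rhs, lhs == rhs)  # both equal |A|^2 / N = 49/30, despite heavy dependence
```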

A weaker version of the linearity of expectation principle is the union bound

$$P(E_1 \vee \cdots \vee E_n) \leq P(E_1) + \cdots + P(E_n) \qquad (1.5)$$

for arbitrary events $E_1, \ldots, E_n$ (compare this with (1.3) with $X_i := I(E_i)$ and $c_i := 1$). This trivial bound is still useful, especially in the case when the events $E_1, \ldots, E_n$ are rare and not too strongly correlated (see Exercise 1.1.3). A related estimate is as follows.

Lemma 1.2 (Borel–Cantelli lemma) Let $E_1, E_2, \ldots$ be a sequence of events (possibly infinite or dependent), such that $\sum_n P(E_n) < \infty$. Then for any integer $M$, we have

$$P(\text{fewer than } M \text{ of the events } E_1, E_2, \ldots \text{ hold}) \geq 1 - \frac{\sum_n P(E_n)}{M}.$$

In particular, with probability $1$ at most finitely many of the events $E_1, E_2, \ldots$ hold.

Another useful way of phrasing the Borel–Cantelli lemma is that if $F_1, F_2, \ldots$ are events such that $\sum_n (1 - P(F_n)) < \infty$, then with probability $1$, all but finitely many of the events $F_n$ hold.

Proof By monotone convergence it suffices to prove the claim when there are only finitely many events. From (1.3) we have $E(\sum_n I(E_n)) = \sum_n P(E_n)$. If one now applies Markov's inequality with $\lambda = M$, the claim follows.

1.1.1 Sum-free sets

We now apply the first moment method to the theory of sum-free sets. An additive set $A$ is called sum-free iff it does not contain three elements $x, y, z$ such that $x + y = z$; equivalently, $A$ is sum-free iff $A \cap 2A = \emptyset$.
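To make the definition concrete, here is a minimal sketch (ours; the helper name is_sum_free is not from the text) that tests the condition $A \cap 2A = \emptyset$ directly.

```python
def is_sum_free(A):
    """Return True iff the set A contains no x, y, z with x + y = z."""
    A = set(A)
    # A is sum-free iff the sumset 2A = A + A is disjoint from A.
    return all(x + y not in A for x in A for y in A)

print(is_sum_free({1, 3, 5, 7, 9}))  # True: odd + odd is even
print(is_sum_free({1, 2, 3}))        # False, since 1 + 2 = 3
```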

Theorem 1.3 Let $A$ be an additive set of non-zero integers. Then $A$ contains a sum-free subset $B$ of size $|B| > |A|/3$.

Proof Choose a prime number $p = 3k + 2$, where $k$ is sufficiently large so that $A \subset [-p/3, p/3] \setminus \{0\}$. We can thus view $A$ as a subset of the cyclic group $\mathbf{Z}_p$ rather than the integers $\mathbf{Z}$, and observe that a subset $B$ of $A$ will be sum-free in $\mathbf{Z}_p$ if and only if$^1$ it is sum-free in $\mathbf{Z}$.

Now choose a random number $x \in \mathbf{Z}_p \setminus \{0\}$ uniformly, and form the random set

$$B := A \cap (x \cdot [k+1, 2k+1]) = \{a \in A : x^{-1} a \in \{k+1, \ldots, 2k+1\}\}.$$

Since $[k+1, 2k+1]$ is sum-free in $\mathbf{Z}_p$, we see that $x \cdot [k+1, 2k+1]$ is too, and thus $B$ is a sum-free subset of $A$. We would like to show that $|B| > |A|/3$ with positive probability; by the first moment method it suffices to show that $E(|B|) > |A|/3$. From (1.4) we have

$$E(|B|) = \sum_{a \in A} P(a \in B) = \sum_{a \in A} P(x^{-1} a \in [k+1, 2k+1]).$$

If $a \in A$, then $a$ is an invertible element of $\mathbf{Z}_p$, and thus $x^{-1} a$ is uniformly distributed in $\mathbf{Z}_p \setminus \{0\}$. Since $|[k+1, 2k+1]| > \frac{p-1}{3}$, we conclude that $P(x^{-1} a \in [k+1, 2k+1]) > \frac{1}{3}$ for all $a \in A$. Thus we have $E(|B|) > \frac{|A|}{3}$ as desired.
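As an illustration of the random dilation in this proof (a sketch of ours, not from the text; the concrete set $A$, the trial count, and the helper names are arbitrary choices), the following code picks a suitable prime $p = 3k + 2$, dilates the middle third $[k+1, 2k+1]$ by a random unit, and keeps the best subset found.

```python
import random

def random_sum_free_subset(A, trials=200):
    """Extract a sum-free subset of A by the random dilation of Theorem 1.3."""
    # Choose a prime p = 3k + 2 large enough that A fits in [-p/3, p/3] \ {0}.
    p = next(q for q in range(3 * max(abs(a) for a in A) + 2, 10**6)
             if q % 3 == 2 and all(q % d for d in range(2, int(q**0.5) + 1)))
    k = (p - 2) // 3
    middle_third = set(range(k + 1, 2 * k + 2))  # [k+1, 2k+1] is sum-free in Z_p
    best = set()
    for _ in range(trials):
        x = random.randrange(1, p)      # random x in Z_p \ {0}
        x_inv = pow(x, -1, p)
        B = {a for a in A if (x_inv * a) % p in middle_third}
        if len(B) > len(best):
            best = B
    return best

A = set(range(1, 11))
B = random_sum_free_subset(A)
assert all(x + y not in B for x in B for y in B)  # B is sum-free
print(B, len(B) > len(A) / 3)                     # exceeds |A|/3 with high probability
```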

Theorem 1.3 was proved by Erdős in 1965 [86]. Several years later, Bourgain [37] used harmonic analysis arguments to improve the bound slightly. It is surprising that the following question is open.

Question 1.4 Can one replace $n/3$ by $(n/3) + 10$?

Alon and Kleitman [10] considered the case of more general additive sets (not necessarily in $\mathbf{Z}$). They showed that in this case $A$ always contains a sum-free subset of $2|A|/7$ elements, and that the constant $2/7$ is best possible.

Another classical problem concerning sum-free sets is the Erdős–Moser problem. Consider a finite additive set $A$. A subset $B$ of $A$ is sum-free with respect to $A$ if $(2 \ast B) \cap A = \emptyset$, where $2 \ast B = \{b_1 + b_2 : b_1, b_2 \in B, b_1 \neq b_2\}$. Erdős and Moser asked for an estimate of the size of the largest sum-free subset of any given set $A$ of cardinality $n$. We will discuss this problem in Section 6.2.1.
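For concreteness, the restricted sumset $2 \ast B$ can be computed directly; the following one-off snippet (ours, not from the text) spells out the definition.

```python
from itertools import combinations

def restricted_sumset(B):
    """Compute 2*B = {b1 + b2 : b1, b2 in B, b1 != b2}."""
    return {b1 + b2 for b1, b2 in combinations(B, 2)}

B, A = {1, 2, 4}, {1, 2, 4, 8}
# B is sum-free with respect to A iff (2*B) ∩ A is empty.
print(restricted_sumset(B), restricted_sumset(B).isdisjoint(A))  # {3, 5, 6} True
```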

$^1$ This trick can be placed in a more systematic context using the theory of Freiman homomorphisms: see Section 5.3.

Exercises

1.1.1 If $X$ is a non-negative random variable, establish the identity

$$E(X) = \int_0^\infty P(X > \lambda)\, d\lambda \qquad (1.6)$$

and more generally, for any $0 < p < \infty$,

$$E(X^p) = p \int_0^\infty \lambda^{p-1} P(X > \lambda)\, d\lambda. \qquad (1.7)$$

Thus the probability distribution function $P(X > \lambda)$ controls all the moments $E(X^p)$ of $X$.

1.1.2 When does equality hold in Markov’s inequality?

1.1.3 If $E_1, \ldots, E_n$ are arbitrary probabilistic events, establish the lower bound

$$P(E_1 \vee \cdots \vee E_n) \geq \sum_{i=1}^n P(E_i) - \sum_{1 \leq i < j \leq n} P(E_i \wedge E_j);$$

this bound should be compared with (1.5), and can be thought of as a variant of the second moment method which we discuss in the next section.

(Hint: consider the random variable $\sum_{i=1}^n I(E_i) - \sum_{1 \leq i < j \leq n} I(E_i) I(E_j)$.) More generally, establish the Bonferroni inequalities

$$P(E_1 \vee \cdots \vee E_n) \geq \sum_{A \subset [1,n]: 1 \leq |A| \leq k} (-1)^{|A|-1} P\Big(\bigwedge_{i \in A} E_i\Big)$$

when $k$ is even, and

$$P(E_1 \vee \cdots \vee E_n) \leq \sum_{A \subset [1,n]: 1 \leq |A| \leq k} (-1)^{|A|-1} P\Big(\bigwedge_{i \in A} E_i\Big)$$

when $k$ is odd.

1.1.4 Let $X$ be a non-negative random variable. Establish the popularity principle $E(X I(X > \frac{1}{2} E(X))) \geq \frac{1}{2} E(X)$. In particular, if $X$ is bounded by some constant $M$, then $P(X > \frac{1}{2} E(X)) \geq \frac{1}{2M} E(X)$. Thus while there is in general no lower tail estimate on the event $X \leq \frac{1}{2} E(X)$, we can say that the majority of the expectation of $X$ is generated outside of this tail event, which does lead to a lower tail estimate if $X$ is bounded.

1.1.5 Let $A, B$ be non-empty subsets of a finite additive group $Z$. Show that there exists an $x \in Z$ such that

$$1 - \frac{|A \cup (B + x)|}{|Z|} \leq \left(1 - \frac{|A|}{|Z|}\right)\left(1 - \frac{|B|}{|Z|}\right),$$

and a $y \in Z$ such that

$$1 - \frac{|A \cup (B + y)|}{|Z|} \geq \left(1 - \frac{|A|}{|Z|}\right)\left(1 - \frac{|B|}{|Z|}\right).$$

1.1.6 Consider a set $A$ as above. Show that there exists a subset $\{v_1, \ldots, v_d\}$ of $Z$ with $d = O(\log \frac{|Z|}{|A|})$ such that

$$|A + [0,1]^d \cdot (v_1, \ldots, v_d)| \geq |Z|/2.$$

1.1.7 Consider a set $A$ as above. Show that there exists a subset $\{v_1, \ldots, v_d\}$ of $Z$ with $d = O\big(\log \frac{|Z|}{|A|} + \log\log(10 + |Z|)\big)$ such that

$$A + [0,1]^d \cdot (v_1, \ldots, v_d) = Z.$$
