Boole's inequality

Summary

In probability theory, Boole's inequality, also known as the union bound, says that for any finite or countable set of events, the probability that at least one of the events happens is no greater than the sum of the probabilities of the individual events. This inequality provides an upper bound on the probability of occurrence of at least one of a countable number of events in terms of the individual probabilities of the events. Boole's inequality is named for its discoverer George Boole.[1]

Formally, for a countable set of events A1, A2, A3, ..., we have

${\displaystyle {\mathbb {P} }\left(\bigcup _{i=1}^{\infty }A_{i}\right)\leq \sum _{i=1}^{\infty }{\mathbb {P} }(A_{i}).}$

In measure-theoretic terms, Boole's inequality follows from the fact that a measure (and certainly any probability measure) is σ-sub-additive.
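The inequality can be checked numerically on a small example. The sketch below is illustrative only: the 12-point uniform sample space, the three events, and the helper names `prob` and `union_bound_sides` are assumptions, not part of the statement above. It uses exact fractions so that the comparison involves no rounding.

```python
from fractions import Fraction

# Hypothetical finite sample space: outcomes 0..11, all equally likely.
OMEGA = frozenset(range(12))

def prob(event):
    """Probability of an event (a subset of OMEGA) under the uniform measure."""
    return Fraction(len(event & OMEGA), len(OMEGA))

def union_bound_sides(events):
    """Return (P(union of events), sum of P(event)) as exact fractions."""
    union = frozenset().union(*events)
    return prob(union), sum((prob(e) for e in events), Fraction(0))

# Three overlapping illustrative events.
events = [frozenset({0, 1, 2, 3}), frozenset({2, 3, 4}), frozenset({3, 4, 5, 6})]
lhs, rhs = union_bound_sides(events)
print(lhs, "<=", rhs)  # prints: 7/12 <= 11/12
```

The gap between the two sides comes from outcomes that belong to more than one event and are therefore counted more than once in the sum.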

Proof

Proof using induction

Boole's inequality may be proved for finite collections of ${\displaystyle n}$  events using the method of induction.

For the base case ${\displaystyle n=1}$ , the inequality holds trivially:

${\displaystyle \mathbb {P} (A_{1})\leq \mathbb {P} (A_{1}).}$

Assume, as the induction hypothesis, that the inequality holds for some ${\displaystyle n}$ , that is,

${\displaystyle {\mathbb {P} }\left(\bigcup _{i=1}^{n}A_{i}\right)\leq \sum _{i=1}^{n}{\mathbb {P} }(A_{i}).}$

Since ${\displaystyle \mathbb {P} (A\cup B)=\mathbb {P} (A)+\mathbb {P} (B)-\mathbb {P} (A\cap B),}$  and because the union operation is associative, we have

${\displaystyle \mathbb {P} \left(\bigcup _{i=1}^{n+1}A_{i}\right)=\mathbb {P} \left(\bigcup _{i=1}^{n}A_{i}\right)+\mathbb {P} (A_{n+1})-\mathbb {P} \left(\left(\bigcup _{i=1}^{n}A_{i}\right)\cap A_{n+1}\right).}$

Since

${\displaystyle {\mathbb {P} }\left(\left(\bigcup _{i=1}^{n}A_{i}\right)\cap A_{n+1}\right)\geq 0,}$

by the first axiom of probability, we have

${\displaystyle \mathbb {P} \left(\bigcup _{i=1}^{n+1}A_{i}\right)\leq \mathbb {P} \left(\bigcup _{i=1}^{n}A_{i}\right)+\mathbb {P} (A_{n+1}),}$

and therefore

${\displaystyle \mathbb {P} \left(\bigcup _{i=1}^{n+1}A_{i}\right)\leq \sum _{i=1}^{n}\mathbb {P} (A_{i})+\mathbb {P} (A_{n+1})=\sum _{i=1}^{n+1}\mathbb {P} (A_{i}).}$

Proof without using induction

For any events ${\displaystyle A_{1},A_{2},A_{3},\dots }$ in our probability space we have

${\displaystyle \mathbb {P} \left(\bigcup _{i}A_{i}\right)\leq \sum _{i}\mathbb {P} (A_{i}).}$

One of the axioms of a probability space is countable additivity: if ${\displaystyle B_{1},B_{2},B_{3},\dots }$  are pairwise disjoint events, then

${\displaystyle \mathbb {P} \left(\bigcup _{i}B_{i}\right)=\sum _{i}\mathbb {P} (B_{i});}$

If ${\displaystyle B\subset A,}$  then ${\displaystyle \mathbb {P} (B)\leq \mathbb {P} (A).}$

Indeed, from the axioms of a probability distribution,

${\displaystyle \mathbb {P} (A)=\mathbb {P} (B)+\mathbb {P} (A-B).}$

Note that both terms on the right are nonnegative.

Now we modify the sets ${\displaystyle A_{i}}$ so that they become pairwise disjoint. Define

${\displaystyle B_{i}=A_{i}-\bigcup _{j=1}^{i-1}A_{j}.}$

By construction, the sets ${\displaystyle B_{i}}$ are pairwise disjoint, ${\displaystyle B_{i}\subset A_{i}}$ for every ${\displaystyle i}$ , and

${\displaystyle \bigcup _{i=1}^{\infty }B_{i}=\bigcup _{i=1}^{\infty }A_{i}.}$

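Therefore, using additivity over the disjoint sets ${\displaystyle B_{i}}$ and then ${\displaystyle \mathbb {P} (B_{i})\leq \mathbb {P} (A_{i})}$ , we deduce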

${\displaystyle \mathbb {P} \left(\bigcup _{i}A_{i}\right)=\mathbb {P} \left(\bigcup _{i}B_{i}\right)=\sum _{i}\mathbb {P} (B_{i})\leq \sum _{i}\mathbb {P} (A_{i}).}$
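The disjointification step can be illustrated on a finite example. The following is a minimal sketch, assuming a uniform measure on 10 outcomes; the events in `A` and the helper names `disjointify` and `prob` are hypothetical and chosen only for illustration.

```python
from fractions import Fraction

def disjointify(events):
    """Return the pieces B_i = A_i minus (A_1 ∪ ... ∪ A_{i-1}), in order."""
    seen = set()      # union of all events processed so far
    pieces = []
    for a in events:
        pieces.append(set(a) - seen)
        seen |= set(a)
    return pieces

def prob(event, n_outcomes):
    """Uniform probability of an event in a space with n_outcomes outcomes."""
    return Fraction(len(event), n_outcomes)

N = 10  # assumed size of the sample space
A = [{0, 1, 2}, {2, 3}, {1, 3, 4, 5}]
B = disjointify(A)

p_union = prob(set().union(*A), N)
sum_B = sum(prob(b, N) for b in B)
sum_A = sum(prob(a, N) for a in A)
print(B, p_union, sum_B, sum_A)
```

Here `B` is `[{0, 1, 2}, {3}, {4, 5}]`, so the pieces are pairwise disjoint and have the same union as the original events. The probabilities of the pieces add up to exactly 3/5, the probability of the union, while the probabilities of the original events add up to 9/10.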

Bonferroni inequalities

Boole's inequality may be generalized to find upper and lower bounds on the probability of finite unions of events.[2] These bounds are known as Bonferroni inequalities, after Carlo Emilio Bonferroni; see Bonferroni (1936).

Define

${\displaystyle S_{1}:=\sum _{i=1}^{n}{\mathbb {P} }(A_{i}),}$

and

${\displaystyle S_{2}:=\sum _{1\leq i<j\leq n}{\mathbb {P} }(A_{i}\cap A_{j}),}$

as well as

${\displaystyle S_{k}:=\sum _{1\leq i_{1}<\cdots <i_{k}\leq n}{\mathbb {P} }(A_{i_{1}}\cap \cdots \cap A_{i_{k}})}$

for all integers k in {3, ..., n}.

Then, for odd k in {1, ..., n},

${\displaystyle {\mathbb {P} }\left(\bigcup _{i=1}^{n}A_{i}\right)\leq \sum _{j=1}^{k}(-1)^{j-1}S_{j},}$

and for even k in {2, ..., n},

${\displaystyle {\mathbb {P} }\left(\bigcup _{i=1}^{n}A_{i}\right)\geq \sum _{j=1}^{k}(-1)^{j-1}S_{j}.}$

Boole's inequality is the initial case, k = 1. When k = n, then equality holds and the resulting identity is the inclusion–exclusion principle.
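The alternating bounds can be computed directly for a small family of events. The sketch below assumes that the sum for S_k runs over all k-element subfamilies and adds up the probabilities of their intersections, as in the inclusion–exclusion principle. The uniform 12-point sample space, the three events, and the function names are illustrative assumptions.

```python
from fractions import Fraction
from itertools import combinations

def prob(event, n_outcomes):
    """Uniform probability of an event in a space with n_outcomes outcomes."""
    return Fraction(len(event), n_outcomes)

def S(events, k, n_outcomes):
    """S_k: sum of P(intersection) over all k-element subfamilies of events."""
    total = Fraction(0)
    for combo in combinations(events, k):
        total += prob(set.intersection(*combo), n_outcomes)
    return total

def bonferroni_partial_sums(events, n_outcomes):
    """Return the list of sums over j = 1..k of (-1)^(j-1) S_j, for k = 1..n."""
    sums, running = [], Fraction(0)
    for k in range(1, len(events) + 1):
        running += (-1) ** (k - 1) * S(events, k, n_outcomes)
        sums.append(running)
    return sums

N = 12  # assumed size of the sample space
A = [{0, 1, 2, 3}, {2, 3, 4}, {3, 4, 5, 6}]
p_union = prob(set.union(*A), N)
partial = bonferroni_partial_sums(A, N)
for k, bound in enumerate(partial, start=1):
    relation = "<=" if k % 2 == 1 else ">="
    print(f"k={k}: {p_union} {relation} {bound}")
```

For these events the partial sums are 11/12, 1/2 and 7/12, while the union has probability 7/12. The odd-indexed sums bound it from above, the even-indexed sum bounds it from below, and the last sum equals it.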

Example

Suppose that you are estimating five parameters from a random sample, and that you can control the accuracy of each estimate separately. If you want all five estimates to be good simultaneously with probability at least 95%, how reliable must each individual estimate be?

Making each estimate good with probability 95% is not enough, because the event "all estimates are good" is a subset of each event "estimate i is good" and can therefore have a probability smaller than 95%. We can use Boole's inequality to solve this problem. Taking the complement of the event "all five estimates are good", the requirement becomes P(at least one estimate is bad) ≤ 0.05, and by Boole's inequality

P(at least one estimate is bad) ≤ P(A1 is bad) + P(A2 is bad) + P(A3 is bad) + P(A4 is bad) + P(A5 is bad),

so it suffices to make the sum on the right-hand side at most 0.05.

One way is to make each of them equal to 0.05/5 = 0.01, that is, 1%. In other words, you have to guarantee that each estimate is good with probability 99% (for example, by constructing a 99% confidence interval) to make sure that all five estimates are good simultaneously with probability at least 95%. This is called the Bonferroni method of simultaneous inference.
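The arithmetic of this example can be written as a small helper. This is a sketch under the assumption that the error budget is split evenly across the estimates; the function name `bonferroni_split` is hypothetical. The budget is passed as a string so that `Fraction` represents 0.05 exactly.

```python
from fractions import Fraction

def bonferroni_split(family_alpha, m):
    """Split a total error budget evenly over m estimates.

    family_alpha: allowed probability that at least one estimate is bad,
                  given as a string or Fraction so the value is exact.
    Returns (per-estimate error probability, per-estimate confidence level).
    """
    per_alpha = Fraction(family_alpha) / m
    return per_alpha, 1 - per_alpha

per_alpha, per_conf = bonferroni_split("0.05", 5)
print(per_alpha, per_conf)  # prints: 1/100 99/100

# Boole's inequality bounds the probability of any bad estimate by the sum
# of the individual error probabilities, which here is exactly the budget.
family_bound = 5 * per_alpha
print(float(family_bound))  # prints: 0.05
```

An even split is only one choice. Any allocation of individual error probabilities whose sum is at most 0.05 gives the same guarantee under Boole's inequality.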