The motivation for studying empirical measures is that it is often impossible to know the true underlying probability measure $P$. We collect observations $X_1, X_2, \dots, X_n$ and compute relative frequencies. We can estimate $P$, or a related distribution function $F$, by means of the empirical measure or the empirical distribution function, respectively. These are uniformly good estimates under certain conditions. Theorems in the area of empirical processes provide rates of this convergence.
Definition
Let $X_1, X_2, \dots$ be a sequence of independent identically distributed random variables with values in the state space $S$ with probability distribution $P$.
Definition
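The empirical measure $P_n$ is defined for measurable subsets $A$ of $S$ and given by
$$P_n(A) = \frac{1}{n} \sum_{i=1}^n I_A(X_i) = \frac{1}{n} \sum_{i=1}^n \delta_{X_i}(A),$$
where $I_A$ is the indicator function of $A$ and $\delta_X$ is the Dirac measure at $X$.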
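More generally, the empirical measure $P_n$ maps a measurable function $f : S \to \mathbb{R}$ to its empirical mean,
$$P_n f = \int_S f \, dP_n = \frac{1}{n} \sum_{i=1}^n f(X_i).$$
In particular, the empirical measure of $A$ is simply the empirical mean of the indicator function, $P_n(A) = P_n I_A$.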
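As a minimal sketch of these definitions, the following Python code computes the empirical mean of a function, (1/n) Σ f(X_i), and obtains the empirical measure of a set as the empirical mean of its indicator. The function names `empirical_mean` and `empirical_measure` and the sample values are illustrative assumptions, not part of the original text.

```python
from typing import Callable, Sequence


def empirical_mean(sample: Sequence[float], f: Callable[[float], float]) -> float:
    """Empirical mean of f over the sample: (1/n) * sum of f(X_i)."""
    if len(sample) == 0:
        raise ValueError("sample must be non-empty")
    return sum(f(x) for x in sample) / len(sample)


def empirical_measure(sample: Sequence[float], in_set: Callable[[float], bool]) -> float:
    """Empirical measure of a set A, given as a membership test.

    Computed as the empirical mean of the indicator function of A,
    i.e. the fraction of observations that fall in A.
    """
    return empirical_mean(sample, lambda x: 1.0 if in_set(x) else 0.0)


# Hypothetical observations, chosen only for illustration.
sample = [0.5, 1.5, 2.5, 3.5]

# A = (-inf, 2]: two of the four observations fall in A.
p_half = empirical_measure(sample, lambda x: x <= 2.0)

# Empirical mean of f(x) = x^2.
mean_sq = empirical_mean(sample, lambda x: x * x)
```

Routing the set case through the indicator function keeps a single code path for both sets and functions, mirroring the identity between the empirical measure of a set and the empirical mean of its indicator.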
For a fixed measurable function $f$, $P_n f$ is a random variable with mean $\mathbb{E} f$ and variance $\frac{1}{n} \mathbb{E}(f - \mathbb{E} f)^2$.
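The mean and variance of the empirical mean follow from linearity of expectation and independence of the observations. A short derivation, assuming $\mathbb{E} f(X_1)^2 < \infty$ and writing $\mathbb{E} f$ for $\mathbb{E} f(X_1)$:

$$
\begin{aligned}
\mathbb{E}[P_n f] &= \frac{1}{n} \sum_{i=1}^n \mathbb{E} f(X_i) = \mathbb{E} f, \\
\operatorname{Var}(P_n f) &= \frac{1}{n^2} \sum_{i=1}^n \operatorname{Var}\bigl(f(X_i)\bigr) = \frac{1}{n} \mathbb{E}(f - \mathbb{E} f)^2 .
\end{aligned}
$$

The variance step uses independence, so that the variance of the sum is the sum of the variances, and identical distribution, so that every term equals $\mathbb{E}(f - \mathbb{E} f)^2$.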
By the strong law of large numbers, $P_n(A)$ converges to $P(A)$ almost surely for fixed $A$. Similarly, $P_n f$ converges to $\mathbb{E} f$ almost surely for a fixed measurable function $f$. The problem of uniform convergence of $P_n$ to $P$ was open until Vapnik and Chervonenkis solved it in 1968.[1]
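If a class $\mathcal{C}$ of measurable subsets of $S$ (or a class $\mathcal{F}$ of measurable functions on $S$) is Glivenko–Cantelli with respect to $P$, then $P_n$ converges to $P$ uniformly over $c \in \mathcal{C}$ (or $f \in \mathcal{F}$). In other words, with probability 1 we have
$$\|P_n - P\|_{\mathcal{C}} = \sup_{c \in \mathcal{C}} |P_n(c) - P(c)| \to 0,$$
$$\|P_n - P\|_{\mathcal{F}} = \sup_{f \in \mathcal{F}} |P_n f - \mathbb{E} f| \to 0.$$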
Empirical distribution function
The empirical distribution function provides an example of empirical measures. For real-valued iid random variables $X_1, \dots, X_n$ it is given by
$$F_n(x) = P_n((-\infty, x]) = P_n I_{(-\infty, x]}.$$
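In this case, empirical measures are indexed by the class $\mathcal{C} = \{(-\infty, x] : x \in \mathbb{R}\}$. It has been shown that $\mathcal{C}$ is a uniform Glivenko–Cantelli class; in particular, for any distribution function $F$,
$$\|F_n - F\|_\infty = \sup_{x \in \mathbb{R}} |F_n(x) - F(x)| \to 0$$
with probability 1.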
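The empirical distribution function and its sup-norm distance to a candidate distribution function can be sketched in Python as follows. This is an illustration under stated assumptions, not a reference implementation: the names `ecdf`, `sup_distance`, and `uniform_cdf` are hypothetical, the sample is made up, and the distance formula assumes the candidate distribution function is continuous, so that the supremum is attained at a jump of the empirical distribution function.

```python
from bisect import bisect_right
from typing import Callable, Sequence


def ecdf(sample: Sequence[float]) -> Callable[[float], float]:
    """Return F_n, where F_n(x) is the fraction of observations <= x."""
    xs = sorted(sample)
    n = len(xs)
    if n == 0:
        raise ValueError("sample must be non-empty")

    def F_n(x: float) -> float:
        # bisect_right counts how many sorted observations are <= x.
        return bisect_right(xs, x) / n

    return F_n


def sup_distance(sample: Sequence[float], cdf: Callable[[float], float]) -> float:
    """sup over x of |F_n(x) - F(x)|, assuming F (cdf) is continuous.

    F_n is a step function, so it suffices to compare F with the value
    of F_n just after and just before each sorted observation.
    """
    xs = sorted(sample)
    n = len(xs)
    if n == 0:
        raise ValueError("sample must be non-empty")
    return max(
        max((i + 1) / n - cdf(x), cdf(x) - i / n)
        for i, x in enumerate(xs)
    )


def uniform_cdf(x: float) -> float:
    """Distribution function of the uniform distribution on [0, 1]."""
    return min(max(x, 0.0), 1.0)


# Hypothetical observations, chosen only for illustration.
sample = [0.1, 0.4, 0.7]
F_n = ecdf(sample)
d = sup_distance(sample, uniform_cdf)
```

For this sample the largest gap occurs at the last observation, where the empirical distribution function jumps to 1 while the uniform distribution function equals 0.7.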
References
[1] Vapnik, V.; Chervonenkis, A. (1968). "Uniform convergence of frequencies of occurrence of events to their probabilities". Dokl. Akad. Nauk SSSR. 181.
Further reading
Billingsley, P. (1995). Probability and Measure (Third ed.). New York: John Wiley and Sons. ISBN 0-471-80478-9.
Donsker, M. D. (1952). "Justification and extension of Doob's heuristic approach to the Kolmogorov–Smirnov theorems". Annals of Mathematical Statistics. 23 (2): 277–281. doi:10.1214/aoms/1177729445.
Dudley, R. M. (1978). "Central limit theorems for empirical measures". Annals of Probability. 6 (6): 899–929. doi:10.1214/aop/1176995384. JSTOR 2243028.
Dudley, R. M. (1999). Uniform Central Limit Theorems. Cambridge Studies in Advanced Mathematics. Vol. 63. Cambridge, UK: Cambridge University Press. ISBN 0-521-46102-2.
Wolfowitz, J. (1954). "Generalization of the theorem of Glivenko–Cantelli". Annals of Mathematical Statistics. 25 (1): 131–138. doi:10.1214/aoms/1177728852. JSTOR 2236518.