Jackknife resampling

## Summary

In statistics, the jackknife is a resampling technique especially useful for variance and bias estimation. The jackknife pre-dates other common resampling methods such as the bootstrap. The jackknife estimator of a parameter is found by systematically leaving out each observation from a dataset in turn, computing the estimate on each reduced dataset, and then averaging these estimates. Given a sample of size ${\displaystyle n}$, the jackknife estimate is found by aggregating the estimates from each of the ${\displaystyle n}$ sub-samples of size ${\displaystyle (n-1)}$.

The jackknife technique was developed by Maurice Quenouille (1924–1973) starting in 1949 and refined in 1956. John Tukey expanded on the technique in 1958 and proposed the name "jackknife" because, like a physical jack-knife (a compact folding knife), it is a rough-and-ready tool that can improvise a solution for a variety of problems, even though specific problems may be solved more efficiently with a purpose-designed tool.[1]

The jackknife is a linear approximation of the bootstrap.[1]

## Estimation

The jackknife estimate of a parameter can be found by estimating the parameter for each subsample omitting the i-th observation.[2] For example, if the parameter to be estimated is the population mean of x, we compute the mean ${\displaystyle {\bar {x}}_{i}}$ for each subsample consisting of all but the i-th data point:

${\displaystyle {\bar {x}}_{i}={\frac {1}{n-1}}\sum _{j=1,j\neq i}^{n}x_{j},\quad \quad i=1,\dots ,n.}$

These n leave-one-out estimates serve as an approximation to the sampling distribution of the statistic, that is, the distribution it would have if it were computed over a large number of samples. In particular, the mean of this sampling distribution is estimated by the average of these n estimates:

${\displaystyle {\bar {x}}={\frac {1}{n}}\sum _{i=1}^{n}{\bar {x}}_{i}.}$

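One can show explicitly that this ${\displaystyle {\bar {x}}}$ equals the usual estimate ${\displaystyle {\frac {1}{n}}\sum _{i=1}^{n}x_{i}}$, so the jackknife adds nothing for the mean itself; its real value emerges for higher moments. A jackknife estimate of the variance of the estimator can be calculated from the variance of this distribution of ${\displaystyle {\bar {x}}_{i}}$; because ${\displaystyle {\bar {x}}_{i}-{\bar {x}}={\frac {{\bar {x}}-x_{i}}{n-1}}}$, it reduces to the familiar estimate of the variance of the sample mean:[3][4]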

${\displaystyle \operatorname {Var} ({\bar {x}})={\frac {n-1}{n}}\sum _{i=1}^{n}({\bar {x}}_{i}-{\bar {x}})^{2}={\frac {1}{n(n-1)}}\sum _{i=1}^{n}(x_{i}-{\bar {x}})^{2}.}$
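
The two identities above can be checked numerically. The following is a minimal sketch using only the Python standard library; the function name `jackknife_mean_variance` and the data values are illustrative assumptions, not taken from the cited sources.

```python
from math import isclose
from statistics import mean, variance


def jackknife_mean_variance(x):
    """Return the leave-one-out means, their average, and the
    jackknife estimate of the variance of the sample mean."""
    n = len(x)
    total = sum(x)
    # Leave-one-out mean: drop x_i from the total, divide by n - 1
    loo_means = [(total - xi) / (n - 1) for xi in x]
    # Average of the n leave-one-out means
    center = mean(loo_means)
    # Jackknife variance: (n - 1) / n times the sum of squared deviations
    var_jack = (n - 1) / n * sum((m - center) ** 2 for m in loo_means)
    return loo_means, center, var_jack


# Illustrative data (an assumption for this example)
x = [2.0, 4.0, 4.0, 5.0, 7.0, 9.0]
loo_means, center, var_jack = jackknife_mean_variance(x)

# The average of the leave-one-out means is the ordinary sample mean
print(isclose(center, mean(x)))
# The jackknife variance equals s^2 / n, with s^2 the sample variance
print(isclose(var_jack, variance(x) / len(x)))
```

Here `statistics.variance` is the sample variance with divisor ${\displaystyle n-1}$, so the second check corresponds to the right-hand side of the variance formula.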

## Bias estimation and correction

The jackknife technique can be used to estimate the bias of an estimator calculated over the entire sample. Say ${\displaystyle {\hat {\theta }}}$ is the calculated estimator of the parameter of interest based on all ${\displaystyle {n}}$ observations. Let

${\displaystyle {\hat {\theta }}_{\mathrm {(.)} }={\frac {1}{n}}\sum _{i=1}^{n}{\hat {\theta }}_{(i)}}$

where ${\displaystyle {\hat {\theta }}_{(i)}}$ is the estimate of interest based on the sample with the i-th observation removed, and ${\displaystyle {\hat {\theta }}_{\mathrm {(.)} }}$ is the average of these "leave-one-out" estimates. The jackknife estimate of the bias of ${\displaystyle {\hat {\theta }}}$ is given by:

${\displaystyle {\widehat {\text{bias}}}_{\mathrm {(\theta )} }=(n-1)({\hat {\theta }}_{\mathrm {(.)} }-{\hat {\theta }})}$

and the resulting bias-corrected jackknife estimate of ${\displaystyle \theta }$ is given by:

${\displaystyle {\hat {\theta }}_{\text{jack}}={\hat {\theta }}-{\widehat {\text{bias}}}_{\mathrm {(\theta )} }=n{\hat {\theta }}-(n-1){\hat {\theta }}_{\mathrm {(.)} }}$

This removes the bias exactly in the special case that the bias is of the form ${\displaystyle a/n}$ for some constant ${\displaystyle a}$, and otherwise reduces it from ${\displaystyle O(n^{-1})}$ to ${\displaystyle O(n^{-2})}$.[1]
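
As an illustration of the correction, consider the plug-in variance estimator (divisor ${\displaystyle n}$), whose bias is exactly of order ${\displaystyle n^{-1}}$; applying the formulas above to it recovers the unbiased sample variance (divisor ${\displaystyle n-1}$). The sketch below assumes a hypothetical helper `jackknife_bias_correct` and illustrative data; it is one straightforward way to code the definitions, not a reference implementation.

```python
from math import isclose
from statistics import pvariance, variance


def jackknife_bias_correct(x, estimator):
    """Return (theta_hat, jackknife bias estimate, bias-corrected estimate)
    for an estimator that maps a list of observations to a number."""
    n = len(x)
    theta_hat = estimator(x)
    # Leave-one-out estimates: recompute with the i-th observation removed
    loo = [estimator(x[:i] + x[i + 1:]) for i in range(n)]
    # Average of the leave-one-out estimates
    theta_dot = sum(loo) / n
    # Jackknife bias estimate: (n - 1) * (theta_dot - theta_hat)
    bias = (n - 1) * (theta_dot - theta_hat)
    # Bias-corrected estimate: n * theta_hat - (n - 1) * theta_dot
    theta_jack = theta_hat - bias
    return theta_hat, bias, theta_jack


# Illustrative data (an assumption for this example)
x = [2.0, 4.0, 4.0, 5.0, 7.0, 9.0]
theta_hat, bias, theta_jack = jackknife_bias_correct(x, pvariance)

# The plug-in variance underestimates, so the estimated bias is negative
print(bias < 0)
# The corrected value coincides with the unbiased sample variance
print(isclose(theta_jack, variance(x)))
```

Recomputing the estimator on each subsample costs ${\displaystyle n}$ evaluations; for simple statistics such as the mean, the leave-one-out values can instead be obtained from running totals.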

## Literature

• Berger, Y.G. (2007). "A jackknife variance estimator for unistage stratified samples with unequal probabilities". Biometrika. 94 (4): 953–964. doi:10.1093/biomet/asm072.
• Berger, Y.G.; Rao, J.N.K. (2006). "Adjusted jackknife for imputation under unequal probability sampling without replacement". Journal of the Royal Statistical Society, Series B. 68 (3): 531–547. doi:10.1111/j.1467-9868.2006.00555.x.
• Berger, Y.G.; Skinner, C.J. (2005). "A jackknife variance estimator for unequal probability sampling". Journal of the Royal Statistical Society, Series B. 67 (1): 79–89. doi:10.1111/j.1467-9868.2005.00489.x.
• Jiang, J.; Lahiri, P.; Wan, S-M. (2002). "A unified jackknife theory for empirical best prediction with M-estimation". The Annals of Statistics. 30 (6): 1782–1810. doi:10.1214/aos/1043351257.
• Jones, H.L. (1974). "Jackknife estimation of functions of stratum means". Biometrika. 61 (2): 343–348. doi:10.2307/2334363. JSTOR 2334363.
• Kish, L.; Frankel, M.R. (1974). "Inference from complex samples". Journal of the Royal Statistical Society, Series B. 36 (1): 1–37.
• Krewski, D.; Rao, J.N.K. (1981). "Inference from stratified samples: properties of the linearization, jackknife and balanced repeated replication methods". The Annals of Statistics. 9 (5): 1010–1019. doi:10.1214/aos/1176345580.
• Quenouille, M.H. (1956). "Notes on bias in estimation". Biometrika. 43 (3–4): 353–360. doi:10.1093/biomet/43.3-4.353.
• Rao, J.N.K.; Shao, J. (1992). "Jackknife variance estimation with survey data under hot deck imputation". Biometrika. 79 (4): 811–822. doi:10.1093/biomet/79.4.811.
• Rao, J.N.K.; Wu, C.F.J.; Yue, K. (1992). "Some recent work on resampling methods for complex surveys". Survey Methodology. 18 (2): 209–217.
• Shao, J.; Tu, D. (1995). The Jackknife and Bootstrap. Springer-Verlag.
• Tukey, J.W. (1958). "Bias and confidence in not-quite large samples (abstract)". The Annals of Mathematical Statistics. 29 (2): 614.
• Wu, C.F.J. (1986). "Jackknife, Bootstrap and other resampling methods in regression analysis". The Annals of Statistics. 14 (4): 1261–1295. doi:10.1214/aos/1176350142.

## Notes

1. Cameron & Trivedi 2005, p. 375.
2. Efron 1982, p. 2.
3. Efron 1982, p. 14.
4. McIntosh, Avery I. "The Jackknife Estimation Method". Boston University. Retrieved 2016-04-30. p. 3.

## References

• Cameron, Adrian; Trivedi, Pravin K. (2005). Microeconometrics: Methods and Applications. Cambridge; New York: Cambridge University Press. ISBN 9780521848053.
• Efron, Bradley; Stein, Charles (May 1981). "The Jackknife Estimate of Variance". The Annals of Statistics. 9 (3): 586–596. doi:10.1214/aos/1176345462. JSTOR 2240822.
• Efron, Bradley (1982). The jackknife, the bootstrap, and other resampling plans. Philadelphia, PA: Society for Industrial and Applied Mathematics. ISBN 9781611970319.
• Quenouille, Maurice H. (September 1949). "Problems in Plane Sampling". The Annals of Mathematical Statistics. 20 (3): 355–375. doi:10.1214/aoms/1177729989. JSTOR 2236533.
• Quenouille, Maurice H. (1956). "Notes on Bias in Estimation". Biometrika. 43 (3–4): 353–360. doi:10.1093/biomet/43.3-4.353. JSTOR 2332914.
• Tukey, John W. (1958). "Bias and confidence in not quite large samples (abstract)". The Annals of Mathematical Statistics. 29 (2): 614. doi:10.1214/aoms/1177706647.