Generalized linear mixed model

Summary

In statistics, a generalized linear mixed model (GLMM) is an extension of the generalized linear model (GLM) in which the linear predictor contains random effects in addition to the usual fixed effects.[1][2][3] GLMMs also inherit from GLMs the idea of extending linear mixed models to non-normal data.

GLMMs provide a broad range of models for the analysis of grouped data, since the differences between groups can be modelled as a random effect. These models are useful in the analysis of many kinds of data, including longitudinal data.[4]

Model

GLMMs are generally defined such that, conditioned on the random effects $u$, the dependent variable $y$ is distributed according to the exponential family with its expectation related to the linear predictor $X\beta + Zu$ via a link function $g$:

$$g(E[y \mid u]) = X\beta + Zu.$$

Here $X$ and $\beta$ are the fixed-effects design matrix and the fixed effects, respectively; $Z$ and $u$ are the random-effects design matrix and the random effects, respectively. This brief definition presupposes the definitions of a generalized linear model and of a mixed model.
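The definition above can be illustrated by simulating data from one simple GLMM. The following is a minimal sketch, assuming a logistic link, a single covariate, and one normally distributed random intercept per group; the function name and all parameter values are hypothetical and chosen only for illustration.

```python
import math
import random


def simulate_random_intercept_logit(n_groups, n_per_group, beta0, beta1,
                                    sigma_u, seed=0):
    """Simulate binary data from a logistic GLMM with a random intercept.

    Assumed model (illustrative only):
        logit(E[y_ij | u_i]) = beta0 + beta1 * x_ij + u_i,
        u_i ~ N(0, sigma_u^2).
    Returns a list of (group, x, y) tuples.
    """
    rng = random.Random(seed)
    rows = []
    for i in range(n_groups):
        u_i = rng.gauss(0.0, sigma_u)          # random effect of group i
        for _ in range(n_per_group):
            x = rng.uniform(-1.0, 1.0)         # single fixed-effect covariate
            eta = beta0 + beta1 * x + u_i      # linear predictor X*beta + Z*u
            mu = 1.0 / (1.0 + math.exp(-eta))  # inverse of the logit link
            y = 1 if rng.random() < mu else 0  # Bernoulli draw given u_i
            rows.append((i, x, y))
    return rows


data = simulate_random_intercept_logit(
    n_groups=5, n_per_group=4, beta0=-0.5, beta1=1.0, sigma_u=1.0
)
print(data[:3])
```

Observations in the same group share the same draw of the random intercept, which is what induces within-group correlation in the responses.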

Generalized linear mixed models are a special case of hierarchical generalized linear models in which the random effects are normally distributed.

The complete likelihood[5]

$$\ln p(y) = \ln \int p(y \mid u)\, p(u)\, du$$

has no general closed form, and integrating over the random effects is usually extremely computationally intensive. In addition to numerically approximating this integral (e.g. via Gauss–Hermite quadrature), methods motivated by Laplace approximation have been proposed.[6] For example, the penalized quasi-likelihood method, which essentially involves repeatedly fitting a weighted normal mixed model with a working variate (i.e. a doubly iterative procedure),[7] is implemented by various commercial and open source statistical programs.
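As an illustration of the quadrature approach, the sketch below approximates the integral over a single scalar random intercept for one group of binary responses with a logit link. It assumes a fixed 5-point Gauss–Hermite rule with hard-coded nodes and weights; the function names and the example values of the fixed part of the linear predictor are hypothetical. Practical software typically uses more nodes, often adaptively placed.

```python
import math

# 5-point Gauss-Hermite rule for integrals of the form
# integral of exp(-x^2) * f(x) dx over the real line.
GH_NODES = [-2.0201828704560856, -0.9585724646138185, 0.0,
            0.9585724646138185, 2.0201828704560856]
GH_WEIGHTS = [0.01995324205904591, 0.3936193231522412, 0.9453087204829419,
              0.3936193231522412, 0.01995324205904591]


def conditional_likelihood(ys, etas_fixed, u):
    """p(y | u) for Bernoulli responses, logit link, random intercept u."""
    lik = 1.0
    for y, eta_fixed in zip(ys, etas_fixed):
        mu = 1.0 / (1.0 + math.exp(-(eta_fixed + u)))
        lik *= mu if y == 1 else 1.0 - mu
    return lik


def marginal_likelihood_gh(ys, etas_fixed, sigma_u):
    """Approximate p(y) = integral of p(y | u) N(u; 0, sigma_u^2) du.

    The substitution u = sqrt(2) * sigma_u * x turns the normal density
    into the exp(-x^2) weight that Gauss-Hermite quadrature expects.
    """
    total = 0.0
    for x, w in zip(GH_NODES, GH_WEIGHTS):
        u = math.sqrt(2.0) * sigma_u * x
        total += w * conditional_likelihood(ys, etas_fixed, u)
    return total / math.sqrt(math.pi)


ys = [1, 0, 1, 1]
etas_fixed = [0.2, -0.4, 0.1, 0.7]  # made-up values of X*beta per observation
print(marginal_likelihood_gh(ys, etas_fixed, sigma_u=1.0))
```

With several independent groups, the full likelihood would be the product of such per-group integrals; with higher-dimensional random effects, the number of quadrature points grows quickly, which is one reason the computation becomes expensive.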

Fitting a model

Fitting GLMMs via maximum likelihood involves integrating over the random effects. In general, those integrals cannot be expressed in analytical form. Various approximate methods have been developed, but none has good properties for all possible models and data sets (e.g. ungrouped binary data are particularly problematic). For this reason, methods involving numerical quadrature or Markov chain Monte Carlo have increased in use, as increasing computing power and advances in methods have made them more practical.
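One family of approximate methods replaces the integral with a Laplace approximation. The sketch below is a minimal illustration, assuming a single group of binary responses, a logit link, and one scalar random intercept; the function name and inputs are hypothetical. It locates the mode of the integrand's logarithm with Newton's method and then integrates the resulting second-order expansion exactly.

```python
import math


def laplace_log_marginal(ys, etas_fixed, sigma_u, n_iter=50, tol=1e-10):
    """Laplace approximation to log of the integral of
    p(y | u) * N(u; 0, sigma_u^2) over u, for one group.

    Assumes Bernoulli responses, a logit link and a scalar random intercept.
    """
    def h_and_derivs(u):
        # h(u) = log p(y | u) + log N(u; 0, sigma_u^2), with h'(u) and h''(u)
        h = (-0.5 * math.log(2.0 * math.pi * sigma_u ** 2)
             - 0.5 * u ** 2 / sigma_u ** 2)
        d1 = -u / sigma_u ** 2
        d2 = -1.0 / sigma_u ** 2
        for y, eta_fixed in zip(ys, etas_fixed):
            mu = 1.0 / (1.0 + math.exp(-(eta_fixed + u)))
            h += math.log(mu) if y == 1 else math.log(1.0 - mu)
            d1 += y - mu
            d2 -= mu * (1.0 - mu)
        return h, d1, d2

    u = 0.0
    for _ in range(n_iter):  # Newton's method for the mode of h
        _, d1, d2 = h_and_derivs(u)
        step = d1 / d2
        u -= step
        if abs(step) < tol:
            break

    h, _, d2 = h_and_derivs(u)
    # Expanding h to second order around its mode gives a Gaussian integral.
    return h + 0.5 * math.log(2.0 * math.pi) - 0.5 * math.log(-d2)


ys = [1, 0, 1, 1]
etas_fixed = [0.2, -0.4, 0.1, 0.7]  # made-up values of X*beta per observation
print(laplace_log_marginal(ys, etas_fixed, sigma_u=1.0))
```

The approximation is exact when the integrand is Gaussian in the random effect and tends to be least accurate when each group carries little information, which is consistent with the remark above about ungrouped binary data.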

The Akaike information criterion (AIC) is a common criterion for model selection. Estimators of the conditional AIC for GLMMs based on certain exponential family distributions have been derived.[8]
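For reference, the standard definition of the criterion, which is not specific to GLMMs, can be written as follows. The symbols $k$ (the number of estimated parameters) and $\hat{L}$ (the maximized value of the likelihood) are introduced here for illustration only:

$$\mathrm{AIC} = 2k - 2\ln \hat{L}.$$

Among candidate models fitted to the same data, the one with the lowest AIC is preferred.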

Software

  • Several contributed packages in R provide GLMM functionality,[9][10] including lme4[11] and glmm.[12]
  • GLMMs can be fitted using SAS and SPSS.[13]
  • MATLAB provides a function called "fitglme" to fit GLMMs.
  • The Python package statsmodels supports binomial and Poisson GLMMs.[14]
  • The Julia package MixedModels.jl provides a function called GeneralizedLinearMixedModel that fits a GLMM to provided data.[15]
  • The R package DHARMa provides residual diagnostics for hierarchical (multi-level/mixed) regression models.[16]

References

  1. ^ Breslow, N. E.; Clayton, D. G. (1993), "Approximate Inference in Generalized Linear Mixed Models", Journal of the American Statistical Association, 88 (421): 9–25, doi:10.2307/2290687, JSTOR 2290687
  2. ^ Stroup, W.W. (2012), Generalized Linear Mixed Models, CRC Press
  3. ^ Jiang, J. (2007), Linear and Generalized Linear Mixed Models and Their Applications, Springer
  4. ^ Fitzmaurice, G. M.; Laird, N. M.; Ware, J. H. (2011), Applied Longitudinal Analysis (2nd ed.), John Wiley & Sons, ISBN 978-0-471-21487-8
  5. ^ Pawitan, Yudi. In All Likelihood: Statistical Modelling and Inference Using Likelihood (Paperback ed.). OUP Oxford. p. 459. ISBN 978-0199671229.
  6. ^ Breslow, N. E.; Clayton, D. G. (1993). "Approximate Inference in Generalized Linear Mixed Models". Journal of the American Statistical Association. 88 (421): 9–25. doi:10.1080/01621459.1993.10594284.
  7. ^ Wolfinger, Russ; O'Connell, Michael (December 1993). "Generalized linear mixed models: a pseudo-likelihood approach". Journal of Statistical Computation and Simulation. 48 (3–4): 233–243. doi:10.1080/00949659308811554.
  8. ^ Saefken, B.; Kneib, T.; van Waveren, C.-S.; Greven, S. (2014), "A unifying approach to the estimation of the conditional Akaike information in generalized linear mixed models" (PDF), Electronic Journal of Statistics, 8: 201–225, doi:10.1214/14-EJS881
  9. ^ Pinheiro, J. C.; Bates, D. M. (2000), Mixed-effects models in S and S-PLUS, Springer, New York
  10. ^ Berridge, D. M.; Crouchley, R. (2011), Multivariate Generalized Linear Mixed Models Using R, CRC Press
  11. ^ "lme4 package - RDocumentation". www.rdocumentation.org. Retrieved 15 September 2022.
  12. ^ "glmm package - RDocumentation". www.rdocumentation.org. Retrieved 15 September 2022.
  13. ^ "IBM Knowledge Center". www.ibm.com. Retrieved 6 December 2017.
  14. ^ "Statsmodels Documentation". www.statsmodels.org. Retrieved 17 March 2021.
  15. ^ "Details of the parameter estimation · MixedModels". juliastats.org. Retrieved 16 June 2021.
  16. ^ "Installing, loading and citing the package". Retrieved 24 August 2022.