Approaches for dealing with various sources of overdispersion in modeling count data: Scale adjustment versus modeling. Stat Methods Med Res 2017 Aug;26(4):1802-1823
Date
06/03/2015
Pubmed ID
26031359
DOI
10.1177/0962280215588569
Scopus ID
2-s2.0-85027708622 (requires institutional sign-in at Scopus site)
43 Citations
Abstract
Overdispersion is a common problem in count data. It can occur due to extra population-heterogeneity, omission of key predictors, and outliers. Unless properly handled, this can lead to invalid inference. Our goal is to assess the differential performance of methods for dealing with overdispersion from several sources. We considered six different approaches: unadjusted Poisson regression (Poisson), deviance-scale-adjusted Poisson regression (DS-Poisson), Pearson-scale-adjusted Poisson regression (PS-Poisson), negative-binomial regression (NB), and two generalized linear mixed models (GLMM) with random intercept, log-link and Poisson (Poisson-GLMM) and negative-binomial (NB-GLMM) distributions. To rank order the preference of the models, we used Akaike's information criteria/Bayesian information criteria values, standard error, and 95% confidence-interval coverage of the parameter values. To compare these methods, we used simulated count data with overdispersion of different magnitude from three different sources. Mean of the count response was associated with three predictors. Data from two real-case studies are also analyzed. The simulation results showed that NB and NB-GLMM were preferred for dealing with overdispersion resulting from any of the sources we considered. Poisson and DS-Poisson often produced smaller standard-error estimates than expected, while PS-Poisson conversely produced larger standard-error estimates. Thus, it is good practice to compare several model options to determine the best method of modeling count data.
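The scale adjustments named in the abstract (DS-Poisson and PS-Poisson) can be illustrated with a minimal sketch. It assumes an intercept-only Poisson model with a log link and uses made-up counts, not data from the paper; the names `counts` and `poisson_dispersion` are hypothetical. The dispersion is estimated as the Pearson chi-square or the deviance divided by the residual degrees of freedom, and the model-based standard error is multiplied by the square root of that estimate.

```python
import math

# Hypothetical counts (not from the paper): a few large values inflate the variance.
counts = [0, 1, 0, 2, 9, 0, 1, 14, 0, 3]


def poisson_dispersion(y, n_params=1):
    """Pearson and deviance dispersion estimates for an intercept-only Poisson model."""
    n = len(y)
    mu = sum(y) / n  # MLE of the Poisson mean when there are no predictors
    pearson = sum((yi - mu) ** 2 / mu for yi in y)
    # Deviance uses the convention y * log(y / mu) = 0 when y = 0.
    deviance = 2 * sum(
        (yi * math.log(yi / mu) if yi > 0 else 0.0) - (yi - mu) for yi in y
    )
    df = n - n_params  # residual degrees of freedom
    return mu, pearson / df, deviance / df


mu, phi_pearson, phi_deviance = poisson_dispersion(counts)

# Model-based standard error of the log-mean under a plain Poisson likelihood:
# the Fisher information for the intercept is n * mu.
se_naive = math.sqrt(1.0 / (len(counts) * mu))

# Scale adjustment: inflate the standard error by sqrt(dispersion).
se_pearson = se_naive * math.sqrt(phi_pearson)
se_deviance = se_naive * math.sqrt(phi_deviance)

print(f"mean={mu:.2f} pearson_phi={phi_pearson:.2f} deviance_phi={phi_deviance:.2f}")
print(f"SE naive={se_naive:.3f} PS-adjusted={se_pearson:.3f} DS-adjusted={se_deviance:.3f}")
```

A dispersion estimate well above 1 signals overdispersion. The scale adjustment changes only the standard errors, not the point estimates, whereas the NB and GLMM approaches in the abstract model the extra variation directly.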
Author List
Payne EH, Hardin JW, Egede LE, Ramakrishnan V, Selassie A, Gebregziabher M
Author
Leonard E. Egede MD Center Director, Chief, Professor in the Medicine department at Medical College of Wisconsin
MESH terms used to index this publication - Major topics in bold
Aged
Bayes Theorem
Female
Humans
Linear Models
Lung Neoplasms
Male
Middle Aged
Models, Statistical
Poisson Distribution
Regression Analysis
Salmonella