Medical College of Wisconsin

Comparison of multiple imputation and other methods for the analysis of imputed genotypes. BMC Genomics 2023 Jun 06;24(1):303

Date

06/06/2023

Pubmed ID

37277705

Pubmed Central ID

PMC10242917

DOI

10.1186/s12864-023-09415-0

Scopus ID

2-s2.0-85160972672 (requires institutional sign-in at Scopus site)

Abstract

BACKGROUND: Analysis of imputed genotypes is an important and routine component of genome-wide association studies and the increasing size of imputation reference panels has facilitated the ability to impute and test low-frequency variants for associations. In the context of genotype imputation, the true genotype is unknown and genotypes are inferred with uncertainty using statistical models. Here, we present a novel method for integrating imputation uncertainty into statistical association tests using a fully conditional multiple imputation (MI) approach which is implemented using the Substantive Model Compatible Fully Conditional Specification (SMCFCS). We compared the performance of this method to an unconditional MI and two additional approaches that have been shown to demonstrate excellent performance: regression with dosages and a mixture of regression models (MRM).

RESULTS: Our simulations considered a range of allele frequencies and imputation qualities based on data from the UK Biobank. We found that the unconditional MI was computationally costly and overly conservative across a wide range of settings. Analyzing data with Dosage, MRM, or MI SMCFCS resulted in greater power, including for low-frequency variants, compared to unconditional MI while effectively controlling type I error rates. MRM and MI SMCFCS are both more computationally intensive than using Dosage.

CONCLUSIONS: The unconditional MI approach for association testing is overly conservative and we do not recommend its use in the context of imputed genotypes. Given its performance, speed, and ease of implementation, we recommend using Dosage for imputed genotypes with MAF ≥ 0.001 and Rsq ≥ 0.3.
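
As an illustration of the Dosage approach recommended above, the following is a minimal sketch, not the authors' implementation. It assumes a quantitative trait analyzed by simple linear regression with no covariates, per-sample genotype probabilities for 0, 1, or 2 copies of the alternate allele, and an Rsq value supplied by the imputation software. The function names are hypothetical, and the MAF and Rsq thresholds are treated as inclusive lower bounds.

```python
from math import sqrt

def expected_dosage(probs):
    """Convert per-sample genotype probabilities (p0, p1, p2) to expected
    allele counts: dosage = 1 * p1 + 2 * p2."""
    return [p1 + 2.0 * p2 for (_p0, p1, p2) in probs]

def dosage_regression(dosage, phenotype):
    """Simple linear regression of phenotype on dosage.
    Returns (slope estimate, standard error of the slope)."""
    n = len(dosage)
    mean_x = sum(dosage) / n
    mean_y = sum(phenotype) / n
    sxx = sum((x - mean_x) ** 2 for x in dosage)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosage, phenotype))
    beta = sxy / sxx
    alpha = mean_y - beta * mean_x
    # Residual sum of squares with n - 2 degrees of freedom.
    rss = sum((y - alpha - beta * x) ** 2 for x, y in zip(dosage, phenotype))
    se = sqrt(rss / (n - 2) / sxx)
    return beta, se

def passes_filter(dosage, rsq, maf_min=0.001, rsq_min=0.3):
    """Variant-level filter on minor allele frequency and imputation quality.
    Thresholds are assumed to be inclusive lower bounds."""
    freq = sum(dosage) / (2.0 * len(dosage))
    maf = min(freq, 1.0 - freq)
    return maf >= maf_min and rsq >= rsq_min

if __name__ == "__main__":
    probs = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0), (0.5, 0.5, 0.0)]
    d = expected_dosage(probs)
    y = [1.1, 2.9, 5.2, 1.8]  # made-up phenotype values for illustration only
    if passes_filter(d, rsq=0.9):
        print(dosage_regression(d, y))
```

In practice a GWAS would fit this model per variant with covariates such as ancestry principal components, and would use logistic regression for binary traits; those extensions are omitted here to keep the sketch short.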

Author List

Auer PL, Wang G, Li G, DeWan AT, Leal SM

Author

Paul L. Auer, PhD, Professor in the Institute for Health and Equity department at Medical College of Wisconsin




MeSH terms used to index this publication

Gene Frequency
Genome-Wide Association Study
Genotype
Models, Statistical
Polymorphism, Single Nucleotide