Medical College of Wisconsin

Bi-level variable selection for case-cohort studies with group variables. Stat Methods Med Res 2019;28(10-11):3404-3414

Date

10/12/2018

Pubmed ID

30306838

Pubmed Central ID

PMC6748310

DOI

10.1177/0962280218803654

Scopus ID

2-s2.0-85060332349 (requires institutional sign-in at Scopus site); 4 citations

Abstract

The case-cohort design is an economical approach to estimating the effect of risk factors on a survival outcome when collecting exposure information or covariates on all patients in a large cohort study is expensive. Variables often have a group structure, as with categorical variables and highly correlated continuous variables. The existing literature for case-cohort data is limited to identifying non-zero variables at the individual level only. In this article, we propose a bi-level variable selection method that selects non-zero groups and non-zero within-group variables for case-cohort data when variables have a group structure. The proposed method allows the number of variables to diverge as the sample size increases. The asymptotic properties of the estimator, including bi-level variable selection consistency and asymptotic normality, are shown. We also conduct simulations to compare our proposed method with some existing methods and apply them to the Busselton Health data.

Author List

Kim S, Woo Ahn K

Authors

Kwang Woo Ahn PhD Director, Professor in the Data Science Institute department at Medical College of Wisconsin
Soyoung Kim PhD, BS, MS Associate Professor in the Data Science Institute department at Medical College of Wisconsin




MeSH terms used to index this publication

Biomarkers
Cohort Studies
Ferritins
Humans
Models, Statistical
Research Design
Risk Assessment
Risk Factors
Sample Size
Stroke