Semi-Automatically Inducing Semantic Classes of Clinical Research Eligibility Criteria Using UMLS and Hierarchical Clustering. AMIA Annu Symp Proc 2010 Nov 13;2010:487-91
Date
02/25/2011
Pubmed ID
21347026
Pubmed Central ID
PMC3041461
Scopus ID
2-s2.0-84964928019 (requires institutional sign-in at Scopus site) 20 Citations
Abstract
This paper presents a novel approach to learning semantic classes of clinical research eligibility criteria. It uses the UMLS Semantic Types to represent semantic features and the Hierarchical Clustering method to group similar eligibility criteria. By establishing a gold standard using two independent raters, we evaluated the coverage and accuracy of the induced semantic classes. On 2,718 random eligibility criteria sentences, the inter-rater classification agreement was 85.73%. In a 10-fold validation test, the average Precision, Recall and F-score of the classification results of a decision-tree classifier were 87.8%, 88.0%, and 87.7% respectively. Our induced classes aligned well with 16 out of 17 eligibility criteria classes defined by the BRIDGE model. We discuss the potential of this method and our future work.
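The grouping step described in the abstract can be illustrated with a minimal sketch. It assumes each criterion has already been mapped to a set of UMLS Semantic Type labels, and it uses Jaccard distance with average linkage and a fixed merge threshold. The abstract does not state the distance measure, linkage, or stopping rule, so those choices, the function names, and the toy criteria below are all hypothetical and are not the authors' implementation.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """Return 1 - |a & b| / |a | b| for two sets of semantic-type labels."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def average_linkage(c1, c2, features):
    """Mean pairwise Jaccard distance between the members of two clusters."""
    total = sum(jaccard_distance(features[i], features[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))


def agglomerative_clusters(features, threshold):
    """Repeatedly merge the closest pair of clusters; stop once the
    closest pair is farther apart than `threshold` (an assumed cutoff)."""
    clusters = [[i] for i in range(len(features))]
    while len(clusters) > 1:
        (a, b), dist = min(
            (
                ((x, y), average_linkage(clusters[x], clusters[y], features))
                for x, y in combinations(range(len(clusters)), 2)
            ),
            key=lambda item: item[1],
        )
        if dist > threshold:
            break
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)] + [merged]
    return [sorted(c) for c in clusters]


# Hypothetical criteria, hand-tagged with illustrative UMLS Semantic Types.
criteria = [
    {"Disease or Syndrome", "Finding"},
    {"Disease or Syndrome"},
    {"Pharmacologic Substance", "Therapeutic or Preventive Procedure"},
    {"Pharmacologic Substance"},
    {"Age Group"},
]

clusters = agglomerative_clusters(criteria, threshold=0.6)
print(sorted(clusters))
```

In this toy input the two disease-related criteria and the two drug-related criteria each share half of their combined labels, so they merge under the assumed 0.6 cutoff, while the age criterion stays on its own. A real pipeline would obtain the semantic types from a UMLS-based tagger rather than by hand.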
Author List
Luo Z, Johnson SB, Weng C
Author
Jake Luo Ph.D. Associate Professor; Director, Center for Biomedical Data and Language Processing (BioDLP) in the Health Informatics & Administration department at University of Wisconsin - Milwaukee
MESH terms used to index this publication - Major topics in bold
Biomedical Research
Cluster Analysis
Humans
Models, Theoretical
Natural Language Processing
Semantics
Unified Medical Language System