Medical College of Wisconsin

Hotelling's T2 multivariate profiling for detecting differential expression in microarrays. Bioinformatics 2005 Jul 15;21(14):3105-13

Date

05/21/2005

Pubmed ID

15905280

DOI

10.1093/bioinformatics/bti496

Scopus ID

2-s2.0-25144474547 (70 citations)

Abstract

The most widely used statistical methods for finding differentially expressed genes (DEGs) are essentially univariate. In this study, we present a new T2 statistic for analyzing microarray data. We implemented our method using a multiple forward search (MFS) algorithm designed for selecting a subset of feature vectors in high-dimensional microarray datasets. The proposed T2 statistic is a corollary to that originally developed for multivariate analyses and possesses two prominent statistical properties. First, our method takes into account the multidimensional structure of microarray data. Using the information hidden in gene interactions allows for finding genes whose differential expression is not marginally detectable by univariate testing methods. Second, the statistic has a close relationship to discriminant analyses for classification of gene expression patterns. Our search algorithm sequentially maximizes gene expression difference/distance between two groups of genes. Including such a set of DEGs in the initial feature variables may increase the power of classification rules. We validated our method using a spike-in HGU95 dataset from Affymetrix. The utility of the new method was demonstrated by applying it to analyses of gene expression patterns in human liver cancers and breast cancers. Extensive bioinformatics analyses and cross-validation of DEGs identified in the application datasets showed the significant advantages of our new algorithm.
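To make the idea in the abstract concrete, the following is a minimal sketch of the classical two-sample Hotelling's T2 statistic combined with a simple greedy forward search over genes. It is not the authors' MFS implementation: the function names `hotelling_t2` and `forward_search`, the fixed `max_genes` stopping rule, and the use of a pseudo-inverse for a possibly singular pooled covariance are all assumptions made for illustration.

```python
import numpy as np


def hotelling_t2(x, y):
    """Classical two-sample Hotelling's T2.

    x: (n1, p) array, y: (n2, p) array; rows are samples, columns are genes.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n1, n2 = x.shape[0], y.shape[0]
    diff = x.mean(axis=0) - y.mean(axis=0)
    # np.cov returns a 0-d array when p == 1, so force a 2-D matrix.
    s1 = np.atleast_2d(np.cov(x, rowvar=False))
    s2 = np.atleast_2d(np.cov(y, rowvar=False))
    pooled = ((n1 - 1) * s1 + (n2 - 1) * s2) / (n1 + n2 - 2)
    # Assumption of this sketch: use the pseudo-inverse so that a singular
    # pooled covariance does not raise an error.
    return float(n1 * n2 / (n1 + n2) * diff @ np.linalg.pinv(pooled) @ diff)


def forward_search(x, y, max_genes=3):
    """Greedy forward selection: at each step add the gene that maximizes T2.

    The fixed max_genes stopping rule is hypothetical, chosen for simplicity.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    selected = []
    best_t2 = 0.0
    while len(selected) < min(max_genes, x.shape[1]):
        candidates = [g for g in range(x.shape[1]) if g not in selected]
        scores = [
            hotelling_t2(x[:, selected + [g]], y[:, selected + [g]])
            for g in candidates
        ]
        best = int(np.argmax(scores))
        selected.append(candidates[best])
        best_t2 = scores[best]
    return selected, best_t2


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    group_a = rng.normal(size=(10, 5))
    group_b = rng.normal(size=(10, 5))
    group_b[:, 2] += 10.0  # simulate one strongly differentially expressed gene
    genes, t2 = forward_search(group_a, group_b, max_genes=3)
    print("selected genes:", genes, "T2:", t2)
```

With a single gene, this statistic reduces to the square of the pooled-variance two-sample t statistic, which is why the multivariate version can be viewed as a generalization of the univariate tests mentioned in the abstract. In practice the number of selected genes should stay well below n1 + n2 - 2 so that the pooled covariance remains invertible.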

Author List

Lu Y, Liu PY, Xiao P, Deng HW

Author

Pengyuan Liu, PhD, Adjunct Professor in the Physiology department at Medical College of Wisconsin




MESH terms used to index this publication - Major topics in bold

Algorithms
Computer Simulation
Data Interpretation, Statistical
Gene Expression Profiling
Models, Genetic
Models, Statistical
Multivariate Analysis
Oligonucleotide Array Sequence Analysis
Sample Size