Medical College of Wisconsin

Multivariate exploratory tools for microarray data analysis. Biostatistics 2003 Oct;4(4):555-67

Date

10/15/2003

Pubmed ID

14557111

DOI

10.1093/biostatistics/4.4.555

Scopus ID

2-s2.0-2942677611 (31 citations)

Abstract

The ultimate success of microarray technology in basic and applied biological sciences depends critically on the development of statistical methods for gene expression data analysis. The most widely used tests for differential expression of genes are essentially univariate. Such tests disregard the multidimensional structure of microarray data. Multivariate methods are needed to utilize the information hidden in gene interactions and hence to provide more powerful and biologically meaningful methods for finding subsets of differentially expressed genes. The objective of this paper is to develop methods of multidimensional search for biologically significant genes, considering expression signals as mutually dependent random variables. To attain these ends, we consider the utility of a pertinent distance between random vectors and its empirical counterpart constructed from gene expression data. The distance furnishes exploratory procedures aimed at finding a target subset of differentially expressed genes. To determine the size of the target subset, we resort to successive elimination of smaller subsets resulting from each step of a random search algorithm based on maximization of the proposed distance. Different stopping rules associated with this procedure are evaluated. The usefulness of the proposed approach is illustrated with an application to the analysis of two sets of gene expression data.
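
The search described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the distance or the search moves, so everything below is an assumption made for illustration: the empirical energy distance stands in for the paper's "pertinent distance", the search is a simple single-gene swap that accepts only improvements, and the names `energy_distance` and `random_search` are hypothetical. Successive elimination and the stopping rules evaluated in the paper are not shown.

```python
import numpy as np


def mean_pairwise_dist(a, b):
    """Mean Euclidean distance over all pairs of rows from a (n, k) and b (m, k)."""
    diff = a[:, None, :] - b[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2)).mean()


def energy_distance(x, y):
    """Empirical energy distance between two samples (rows are arrays, columns are genes).

    Assumption: used here only as a stand-in for the distance proposed in the paper.
    """
    return (2.0 * mean_pairwise_dist(x, y)
            - mean_pairwise_dist(x, x)
            - mean_pairwise_dist(y, y))


def random_search(x, y, k, n_iter=2000, seed=0):
    """Look for a k-gene subset that maximizes the distance between groups x and y.

    Each iteration swaps one randomly chosen gene in the current subset for a
    randomly chosen gene outside it and keeps the swap only if the distance grows.
    """
    n_genes = x.shape[1]
    if not 0 < k < n_genes:
        raise ValueError("k must satisfy 0 < k < number of genes")
    rng = np.random.default_rng(seed)
    subset = rng.choice(n_genes, size=k, replace=False)
    best = energy_distance(x[:, subset], y[:, subset])
    for _ in range(n_iter):
        outside = np.setdiff1d(np.arange(n_genes), subset)
        candidate = subset.copy()
        candidate[rng.integers(k)] = rng.choice(outside)
        d = energy_distance(x[:, candidate], y[:, candidate])
        if d > best:
            subset, best = candidate, d
    return np.sort(subset), best
```

Because the distance is computed on the whole subset at once, genes are scored jointly rather than one at a time, which is the contrast with univariate tests that the abstract draws. A greedy swap search like this one can stall in a local maximum; restarting from several seeds is one simple mitigation.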

Author List

Szabo A, Boucher K, Jones D, Tsodikov AD, Klebanov LB, Yakovlev AY

Author

Aniko Szabo, PhD, Professor in the Institute for Health and Equity at the Medical College of Wisconsin




MeSH terms used to index this publication

Algorithms
Computer Simulation
Confidence Intervals
Data Interpretation, Statistical
Discriminant Analysis
Gene Expression Profiling
Humans
Leukemia
Multivariate Analysis
Oligonucleotide Array Sequence Analysis
Probability
Statistics, Nonparametric