Cancer classification and prediction using logistic regression with Bayesian gene selection

Xiaobo Zhou, Kuang Yu Liu, Stephen T.C. Wong

Research output: Contribution to journal › Article › peer-review

124 Scopus citations

Abstract

In microarray-based cancer classification and prediction, gene selection is an important research problem owing to the large number of genes and the small number of experimental conditions. In this paper, we propose a Bayesian approach to gene selection and classification using the logistic regression model. The basic idea of our approach is to use a logistic regression model to relate gene expression to the class labels. We use Gibbs sampling and Markov chain Monte Carlo (MCMC) methods to discover important genes. To implement the Gibbs sampler and MCMC search, we derive a posterior distribution of the selected genes given the observed data. After the important genes are identified, the same logistic regression model is used for cancer classification and prediction. Issues of efficient implementation of the proposed method are discussed. The proposed method is evaluated on several large microarray data sets, including hereditary breast cancer, small round blue-cell tumors, and acute leukemia. The results show that the method can effectively identify important genes consistent with known biological findings while also achieving high classification accuracy. Finally, the robustness and sensitivity properties of the proposed method are investigated.
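
The general idea in the abstract, sampling gene-inclusion indicators with a Gibbs sampler and scoring each candidate subset with a logistic regression model, can be illustrated with a small sketch. It is not the authors' algorithm: the posterior derived in the paper is not given in the abstract, so the sketch substitutes a BIC-style approximation to the log posterior of a gene subset. The function names (`fit_logistic`, `subset_score`, `gibbs_gene_selection`), the ridge penalty, and the prior inclusion probability of 0.2 are all assumptions made for illustration.

```python
import numpy as np


def fit_logistic(Z, y, ridge=1.0, n_iter=30):
    """Ridge-penalized logistic regression fitted by Newton's method.

    Returns the weights and the (unpenalized) log-likelihood at the fit.
    The ridge term is an assumption that keeps the Hessian invertible.
    """
    w = np.zeros(Z.shape[1])
    for _ in range(n_iter):
        eta = np.clip(Z @ w, -30.0, 30.0)
        p = 1.0 / (1.0 + np.exp(-eta))
        grad = Z.T @ (y - p) - ridge * w
        hess = (Z * (p * (1.0 - p))[:, None]).T @ Z + ridge * np.eye(Z.shape[1])
        w = w + np.linalg.solve(hess, grad)
    eta = Z @ w
    loglik = float(np.sum(y * eta - np.logaddexp(0.0, eta)))
    return w, loglik


def subset_score(X, y, gamma, prior_incl=0.2):
    """Approximate log posterior of an inclusion vector `gamma`.

    Assumption: a BIC-style penalty stands in for the exact posterior,
    plus an independent Bernoulli(prior_incl) prior on each indicator.
    """
    n = X.shape[0]
    k = int(gamma.sum())
    Z = np.column_stack([np.ones(n), X[:, gamma]])  # intercept + selected genes
    _, loglik = fit_logistic(Z, y)
    return loglik - 0.5 * k * np.log(n) + k * np.log(prior_incl / (1.0 - prior_incl))


def gibbs_gene_selection(X, y, n_sweeps=40, burn_in=10, seed=0):
    """Gibbs sampler over gene-inclusion indicators.

    Each indicator is resampled from its conditional distribution given
    the others; the return value is the per-gene inclusion frequency
    after burn-in.
    """
    rng = np.random.default_rng(seed)
    n_genes = X.shape[1]
    gamma = np.zeros(n_genes, dtype=bool)
    counts = np.zeros(n_genes)
    for sweep in range(n_sweeps):
        for j in range(n_genes):
            gamma[j] = True
            score_in = subset_score(X, y, gamma)
            gamma[j] = False
            score_out = subset_score(X, y, gamma)
            diff = np.clip(score_in - score_out, -50.0, 50.0)
            gamma[j] = rng.random() < 1.0 / (1.0 + np.exp(-diff))
        if sweep >= burn_in:
            counts += gamma
    return counts / (n_sweeps - burn_in)


# Synthetic example (not real microarray data): genes 0 and 1 carry signal.
rng = np.random.default_rng(0)
X = rng.normal(size=(80, 6))
y = (2.5 * X[:, 0] - 2.5 * X[:, 1] + rng.logistic(size=80) > 0).astype(float)
inclusion_freq = gibbs_gene_selection(X, y)
print(np.round(inclusion_freq, 2))
```

Genes with a high inclusion frequency would then be used as the predictors of the final logistic regression classifier. The sketch refits the model twice for every gene in every sweep, which is practical only for a handful of genes; real microarray data with thousands of genes needs a more efficient implementation, one of the issues the abstract says the paper discusses.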

Original language: English (US)
Pages (from-to): 249-259
Number of pages: 11
Journal: Journal of Biomedical Informatics
Volume: 37
Issue number: 4
State: Published - Aug 2004

Keywords

  • Bayesian gene selection
  • Cancer classification
  • Gene microarray
  • Logistic regression

ASJC Scopus subject areas

  • Computer Science Applications
  • Health Informatics
