Partitioning large-sample microarray-based gene expression profiles using principal components analysis

Leif E. Peterson

Research output: Contribution to journal › Article

33 Scopus citations

Abstract

Principal components analysis (PCA) is useful for reproducing the total variation among hundreds or thousands of continuously scaled variables with a much smaller number of unobservable variables called 'latent factors'. The CLUSFAVOR computer program was used to implement PCA for identifying groups of genes with similar expression profiles from a large number of genes used on DNA microarrays. This paper describes the principal components solution to the factor model of the correlation matrix R, calculation of eigenvalues and eigenvectors of R, extraction of factors, and calculation of factor loadings and identification of genes with similar loading patterns to construct groups of genes with similar expression profiles. With regard to extraction of factors, it was found that more than 90% of the total variance in input data could be accounted for by extracting factors whose eigenvalues exceed unity. Bipolar factors containing strong positive and negative loadings can also be used for identifying two unique groups of genes, since expression profiles of genes that load positively are unlike expression profiles of genes that load negatively on the same factor. While PCA does not provide the absolute answer to a multidimensional problem, it nevertheless can provide a heuristic with which natural groupings of genes with similar expression profiles can be assembled. While cluster analysis essentially generates a single dendrogram (tree branch) containing every gene in the input data, PCA can be used to assemble gene expression profiles that strongly correlate with the latent factors accounting for a majority of total variance. Example results for CLUSFAVOR computer program runs are provided.
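The steps named in the abstract (eigenanalysis of R, retaining factors whose eigenvalues exceed unity, computing loadings, and splitting bipolar factors by sign) can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the CLUSFAVOR implementation: it assumes genes are the variables (columns) and arrays are the observations (rows), the function names `pca_factor_loadings` and `group_genes` are hypothetical, the 0.7 loading cutoff is an arbitrary choice, and the demo data are synthetic.

```python
import numpy as np

def pca_factor_loadings(X):
    """Principal components solution for the correlation matrix R.

    X is assumed to be (arrays x genes), so R is (genes x genes).
    """
    R = np.corrcoef(X, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(R)        # R is symmetric
    order = np.argsort(eigvals)[::-1]           # largest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    keep = eigvals > 1.0                        # eigenvalues exceeding unity
    # Loading of gene i on factor j = eigenvector entry * sqrt(eigenvalue)
    loadings = eigvecs[:, keep] * np.sqrt(eigvals[keep])
    explained = eigvals[keep].sum() / eigvals.sum()
    return loadings, eigvals[keep], explained

def group_genes(loadings, cutoff=0.7):
    """Assign each gene to the factor it loads on most strongly.

    Genes are keyed by (factor index, sign), so a bipolar factor yields
    two groups. The cutoff is an assumed value, not taken from the paper.
    """
    groups = {}
    for gene, row in enumerate(loadings):
        factor = int(np.argmax(np.abs(row)))
        if abs(row[factor]) >= cutoff:
            sign = "+" if row[factor] > 0 else "-"
            groups.setdefault((factor, sign), []).append(gene)
    return groups

# Synthetic demo: 200 arrays, 9 genes driven by two latent factors.
# Genes 0-2 follow f1, genes 3-5 follow -f1 (bipolar), genes 6-8 follow f2.
rng = np.random.default_rng(0)
f1, f2 = rng.normal(size=(2, 200))
noise = 0.3 * rng.normal(size=(200, 9))
X = np.column_stack([f1] * 3 + [-f1] * 3 + [f2] * 3) + noise

loadings, kept_eigvals, explained = pca_factor_loadings(X)
groups = group_genes(loadings)
```

Because the sign of an eigenvector is arbitrary, only the relative signs of loadings on a factor are meaningful: which pole of a bipolar factor is labelled "+" can differ between runs or libraries, but genes on opposite poles still land in separate groups.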

Original language: English (US)
Pages (from-to): 107-119
Number of pages: 13
Journal: Computer Methods and Programs in Biomedicine
Volume: 70
Issue number: 2
State: Published - Feb 2003

Keywords

  • cDNA microarrays
  • Eigenanalysis
  • Eigenvalues
  • Eigenvectors
  • Factor analysis
  • Gene expression
  • Principal components analysis

ASJC Scopus subject areas

  • Software
