A fully Bayesian latent variable model for integrative clustering analysis of multi-type omics data

Qianxing Mo, Ronglai Shen, Cui Guo, Marina Vannucci, Keith S. Chan, Susan G. Hilsenbeck

    Research output: Contribution to journal › Article › peer-review

    169 Scopus citations

    Abstract

    Identification of clinically relevant tumor subtypes and omics signatures is an important task in cancer translational research for precision medicine. Large-scale genomic profiling studies such as The Cancer Genome Atlas (TCGA) Research Network have generated vast amounts of genomic, transcriptomic, epigenomic, and proteomic data. While these studies have provided great resources for researchers to discover clinically relevant tumor subtypes and driver molecular alterations, there are few computationally efficient methods and tools for integrative clustering analysis of these multi-type omics data. Therefore, the aim of this article is to develop a fully Bayesian latent variable method (called iClusterBayes) that can jointly model omics data of continuous and discrete data types for identification of tumor subtypes and relevant omics features. Specifically, the proposed method uses a few latent variables to capture the inherent structure of multiple omics data sets to achieve joint dimension reduction. As a result, the tumor samples can be clustered in the latent variable space, and relevant omics features that drive the sample clustering are identified through Bayesian variable selection. This method significantly improves on the existing integrative clustering method iClusterPlus in terms of statistical inference and computational speed. By analyzing TCGA and simulated data sets, we demonstrate the excellent performance of the proposed method in revealing clinically meaningful tumor subtypes and driver omics features.
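
The general idea in the abstract (reduce several omics data sets to a few shared latent variables, then cluster samples in that latent space) can be illustrated with a deliberately simplified sketch. This is not iClusterBayes: it is not Bayesian, performs no variable selection, and treats binary data as if it were continuous. As a stand-in, it assumes principal components of the standardized, concatenated data sets serve as the latent variables and uses plain k-means for clustering. The function names `simulate_two_subtypes` and `joint_latent_clusters`, and all simulation settings, are hypothetical and chosen only for illustration.

```python
import numpy as np


def simulate_two_subtypes(n_per_group=30, seed=0):
    """Simulate one continuous and one binary data set sharing two subtypes.

    Hypothetical toy data: the first 20 features of each data set differ
    between the two groups; the remaining features are pure noise.
    """
    rng = np.random.default_rng(seed)
    group = np.repeat([0, 1], n_per_group)
    n = 2 * n_per_group
    # Continuous data (e.g. expression-like): mean shift in group 1
    expr = rng.normal(size=(n, 50))
    expr[group == 1, :20] += 4.0
    # Binary data (e.g. mutation-like): higher event rate in group 1
    prob = np.full((n, 40), 0.1)
    prob[group == 1, :20] = 0.9
    mut = rng.binomial(1, prob)
    return expr, mut, group


def joint_latent_clusters(datasets, n_latent=1, n_clusters=2, n_iter=100):
    """Joint dimension reduction followed by clustering in the latent space.

    Assumption: PCA on the concatenated data replaces the Bayesian latent
    variable model described in the abstract.
    """
    # Standardize every feature so that no data type dominates by scale
    blocks = []
    for X in datasets:
        X = np.asarray(X, dtype=float)
        sd = X.std(axis=0)
        sd[sd == 0] = 1.0  # guard against constant features
        blocks.append((X - X.mean(axis=0)) / sd)
    stacked = np.hstack(blocks)

    # Latent scores: leading principal components of the stacked matrix
    U, S, _ = np.linalg.svd(stacked, full_matrices=False)
    Z = U[:, :n_latent] * S[:n_latent]

    # Deterministic farthest-point initialization of the cluster centers
    centers = [Z[0]]
    for _ in range(1, n_clusters):
        d = np.min([((Z - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(Z[d.argmax()])
    centers = np.array(centers)

    # Lloyd's k-means iterations in the latent space
    for _ in range(n_iter):
        dist = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        new_centers = np.array([
            Z[labels == k].mean(axis=0) if np.any(labels == k) else centers[k]
            for k in range(n_clusters)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return Z, labels


expr, mut, group = simulate_two_subtypes()
Z, labels = joint_latent_clusters([expr, mut], n_latent=1, n_clusters=2)
print(Z.shape, np.bincount(labels))
```

In this toy setting a single latent variable is enough to separate the two simulated subtypes. The method described in the abstract differs in the parts that matter most: it models continuous and discrete data types jointly and identifies the driving features through Bayesian variable selection, neither of which this sketch attempts.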

    Original language: English (US)
    Pages (from-to): 71-86
    Number of pages: 16
    Journal: Biostatistics
    Volume: 19
    Issue number: 1
    State: Published - Jan 1 2018

    Keywords

    • Bayesian variable selection
    • Integrative clustering
    • Latent variable model
    • Multi-type omics data
    • iCluster
    • iClusterBayes
    • iClusterPlus

    ASJC Scopus subject areas

    • General Medicine
