TY - JOUR
T1 - Integrating spatial transcriptomics and bulk RNA-seq
T2 - predicting gene expression with enhanced resolution through graph attention networks
AU - Baul, Sudipto
AU - Ahmed, Khandakar Tanvir
AU - Jiang, Qibing
AU - Wang, Guangyu
AU - Li, Qian
AU - Yong, Jeongsik
AU - Zhang, Wei
N1 - Publisher Copyright:
© The Author(s) 2024. Published by Oxford University Press.
PY - 2024/7/1
Y1 - 2024/7/1
N2 - Spatial transcriptomics data play a crucial role in cancer research, providing a nuanced understanding of the spatial organization of gene expression within tumor tissues. Unraveling the spatial dynamics of gene expression can unveil key insights into tumor heterogeneity and aid in identifying potential therapeutic targets. However, in many large-scale cancer studies, spatial transcriptomics data are limited, with bulk RNA-seq and corresponding Whole Slide Image (WSI) data being more common (e.g. the TCGA project). To address this gap, there is a critical need to develop methodologies that can estimate gene expression at near-cell (spot) level resolution from existing WSI and bulk RNA-seq data. This approach is essential for reanalyzing expansive cohort studies and uncovering novel biomarkers that have been overlooked in the initial assessments. In this study, we present STGAT (Spatial Transcriptomics Graph Attention Network), a novel approach leveraging Graph Attention Networks (GAT) to discern spatial dependencies among spots. Trained on spatial transcriptomics data, STGAT is designed to estimate gene expression profiles at spot-level resolution and predict whether each spot represents tumor or non-tumor tissue, especially in patient samples where only WSI and bulk RNA-seq data are available. Comprehensive tests on two breast cancer spatial transcriptomics datasets demonstrated that STGAT outperformed existing methods in accurately predicting gene expression. Further analyses using the TCGA breast cancer dataset revealed that gene expression estimated from tumor-only spots (predicted by STGAT) provides more accurate molecular signatures for breast cancer sub-type and tumor stage prediction, and also leads to improved patient survival and disease-free analysis. Availability: Code is available at https://github.com/compbiolabucf/STGAT.
KW - Graph Attention Network
KW - spatial transcriptomics
KW - spot-level gene expression estimation
KW - whole slide image
UR - http://www.scopus.com/inward/record.url?scp=85197513889&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85197513889&partnerID=8YFLogxK
U2 - 10.1093/bib/bbae316
DO - 10.1093/bib/bbae316
M3 - Article
C2 - 38960406
AN - SCOPUS:85197513889
SN - 1467-5463
VL - 25
JO - Briefings in Bioinformatics
JF - Briefings in Bioinformatics
IS - 4
M1 - bbae316
ER -