Large Language Models for Translational Cancer Informatics

Yining Pan, Yanfei Wang, Guangyu Wang, Jing Su, Umit Topaloglu, Qianqian Song

Research output: Contribution to journal › Review article › peer-review

Abstract

PURPOSE – Cancer remains a leading cause of death worldwide. The growing volume of high-throughput single-cell and spatial transcriptomic data sets, particularly those related to cancer, offers immense opportunities as well as analytical challenges for effective data analysis and interpretation. Large language models (LLMs), pretrained on vast data sets and capable of various biomedical tasks, offer a promising solution. This review explores the application of LLMs in cancer research from both cellular and pathologic perspectives, aiming to showcase their potential in advancing precision oncology.

MATERIALS AND METHODS – We systematically review current LLMs in analyzing single-cell RNA sequencing, spatial transcriptomic, and histology image data, emphasizing their relevance to cancer biology and translational research.

RESULTS – A total of 24 LLMs, published or in preprint between 2022 and 2025, were selected for review. In single-cell transcriptomics, LLMs have primarily been used for cell type annotation, batch integration, and drug-response prediction. In spatial transcriptomics, LLMs support multislide and multimodal spatial data integration, gene expression imputation, niche and region label prediction, spatial domain identification, cell-cell communication inference, and marker gene detection. In computational pathology, LLMs have been applied to cancer subtyping, detection of rare malignancies, genomic mutation prediction, image segmentation, and cross-modal retrieval. Despite these advances, many models remain underoptimized for cancer-specific applications, highlighting the need for domain-specific fine-tuning and scalable adaptation strategies.

CONCLUSION – LLMs have the potential to significantly advance cancer research by providing scalable and effective tools for analyzing and interpreting single-cell, spatial transcriptomic, and pathology data. Future efforts should prioritize tailoring these models to cancer-specific contexts to enhance their utility in uncovering disease mechanisms, identifying biomarkers, and informing therapeutic strategies.

Original language: English (US)
Pages (from-to): 1-14
Number of pages: 14
Journal: JCO Clinical Cancer Informatics
Volume: 9
State: Published - Oct 2025

ASJC Scopus subject areas

  • Oncology
  • Health Informatics
  • Cancer Research
