TY - JOUR
T1 - Diverse data augmentation for learning image segmentation with cross-modality annotations
AU - Chen, Xu
AU - Lian, Chunfeng
AU - Wang, Li
AU - Deng, Hannah
AU - Kuang, Tianshu
AU - Fung, Steve H.
AU - Gateno, Jaime
AU - Shen, Dinggang
AU - Xia, James J.
AU - Yap, Pew Thian
N1 - Funding Information:
This work was supported in part by NIH/NIDCR grants R01 DE022676, R01 DE027251, and R01 DE021863.
Publisher Copyright:
© 2021 Elsevier B.V.
PY - 2021/7
Y1 - 2021/7
N2 - The dearth of annotated data is a major hurdle in building reliable image segmentation models. Manual annotation of medical images is tedious, time-consuming, and significantly variable across imaging modalities. The need for annotation can be ameliorated by leveraging an annotation-rich source modality in learning a segmentation model for an annotation-poor target modality. In this paper, we introduce a diverse data augmentation generative adversarial network (DDA-GAN) to train a segmentation model for an unannotated target image domain by borrowing information from an annotated source image domain. This is achieved by generating diverse augmented data for the target domain via one-to-many source-to-target translation. The DDA-GAN uses unpaired images from the source and target domains and is an end-to-end convolutional neural network that (i) explicitly disentangles domain-invariant structural features related to segmentation from domain-specific appearance features, (ii) combines structural features from the source domain with appearance features randomly sampled from the target domain for data augmentation, and (iii) trains the segmentation model with the augmented data in the target domain and the annotations from the source domain. The effectiveness of our method is demonstrated both qualitatively and quantitatively in comparison with the state of the art for segmentation of craniomaxillofacial bony structures via MRI and cardiac substructures via CT.
AB - The dearth of annotated data is a major hurdle in building reliable image segmentation models. Manual annotation of medical images is tedious, time-consuming, and significantly variable across imaging modalities. The need for annotation can be ameliorated by leveraging an annotation-rich source modality in learning a segmentation model for an annotation-poor target modality. In this paper, we introduce a diverse data augmentation generative adversarial network (DDA-GAN) to train a segmentation model for an unannotated target image domain by borrowing information from an annotated source image domain. This is achieved by generating diverse augmented data for the target domain via one-to-many source-to-target translation. The DDA-GAN uses unpaired images from the source and target domains and is an end-to-end convolutional neural network that (i) explicitly disentangles domain-invariant structural features related to segmentation from domain-specific appearance features, (ii) combines structural features from the source domain with appearance features randomly sampled from the target domain for data augmentation, and (iii) trains the segmentation model with the augmented data in the target domain and the annotations from the source domain. The effectiveness of our method is demonstrated both qualitatively and quantitatively in comparison with the state of the art for segmentation of craniomaxillofacial bony structures via MRI and cardiac substructures via CT.
KW - Data augmentation
KW - Disentangled representation learning
KW - Generative adversarial learning
KW - Medical image segmentation
UR - http://www.scopus.com/inward/record.url?scp=85105066992&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85105066992&partnerID=8YFLogxK
U2 - 10.1016/j.media.2021.102060
DO - 10.1016/j.media.2021.102060
M3 - Article
C2 - 33957558
AN - SCOPUS:85105066992
VL - 71
JO - Medical Image Analysis
JF - Medical Image Analysis
SN - 1361-8415
M1 - 102060
ER -