TY - GEN
T1 - Evaluating the Performance of StyleGAN2-ADA on Medical Images
AU - Woodland, McKell K.
AU - Wood, John
AU - Anderson, Brian M.
AU - Kundu, Suprateek
AU - Lin, Ethan
AU - Koay, Eugene
AU - Odisio, Bruno
AU - Chung, Caroline
AU - Kang, Hyunseon Christine
AU - Venkatesan, Aradhana M.
AU - Yedururi, Sireesha
AU - De, Brian
AU - Lin, Yuan Mao
AU - Patel, Ankit B.
AU - Brock, Kristy K.
N1 - Funding Information:
Acknowledgements. This work was supported by the Tumor Measurement Initiative through the MD Anderson Strategic Initiative Development Program (STRIDE). We thank the NIH Clinical Center for the ChestX-ray14 dataset.
Publisher Copyright:
© 2022, The Author(s), under exclusive license to Springer Nature Switzerland AG.
PY - 2022
Y1 - 2022
AB - Although generative adversarial networks (GANs) have shown promise in medical imaging, they have four main limitations that impede their utility: computational cost, data requirements, reliable evaluation measures, and training complexity. Our work investigates each of these obstacles in a novel application of StyleGAN2-ADA to high-resolution medical imaging datasets. Our dataset is comprised of liver-containing axial slices from non-contrast and contrast-enhanced computed tomography (CT) scans. Additionally, we utilized four public datasets composed of various imaging modalities. We trained a StyleGAN2 network with transfer learning (from the Flickr-Faces-HQ dataset) and data augmentation (horizontal flipping and adaptive discriminator augmentation). The network’s generative quality was measured quantitatively with the Fréchet Inception Distance (FID) and qualitatively with a visual Turing test given to seven radiologists and radiation oncologists. The StyleGAN2-ADA network achieved a FID of 5.22 (±0.17) on our liver CT dataset. It also set new record FIDs of 10.78, 3.52, 21.17, and 5.39 on the publicly available SLIVER07, ChestX-ray14, ACDC, and Medical Segmentation Decathlon (brain tumors) datasets. In the visual Turing test, the clinicians rated generated images as real 42% of the time, approaching random guessing. Our computational ablation study revealed that transfer learning and data augmentation stabilize training and improve the perceptual quality of the generated images. We observed the FID to be consistent with human perceptual evaluation of medical images. Finally, our work found that StyleGAN2-ADA consistently produces high-quality results without hyperparameter searches or retraining.
KW - Data augmentation
KW - Fréchet Inception Distance
KW - StyleGAN2-ADA
KW - Transfer learning
KW - Visual Turing test
UR - http://www.scopus.com/inward/record.url?scp=85140450274&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85140450274&partnerID=8YFLogxK
U2 - 10.1007/978-3-031-16980-9_14
DO - 10.1007/978-3-031-16980-9_14
M3 - Conference contribution
AN - SCOPUS:85140450274
SN - 9783031169793
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 142
EP - 153
BT - Simulation and Synthesis in Medical Imaging - 7th International Workshop, SASHIMI 2022, Held in Conjunction with MICCAI 2022, Proceedings
A2 - Zhao, Can
A2 - Svoboda, David
A2 - Wolterink, Jelmer M.
A2 - Escobar, Maria
PB - Springer Science and Business Media Deutschland GmbH
T2 - 7th International Workshop on Simulation and Synthesis in Medical Imaging, SASHIMI 2022, held in conjunction with 25th International Conference on Medical Image Computing and Computer-Assisted Intervention, MICCAI 2022
Y2 - 18 September 2022
ER -