Using machine learning to predict antimicrobial MICs and associated genomic features for nontyphoidal Salmonella

Marcus Nguyen, S. Wesley Long, Patrick F. McDermott, Randall J. Olsen, Robert Olson, Rick L. Stevens, Gregory H. Tyson, Shaohua Zhao, James J. Davis

Research output: Contribution to journal › Article › peer-review

155 Scopus citations


Nontyphoidal Salmonella species are the leading bacterial cause of foodborne disease in the United States. Whole-genome sequences and paired antimicrobial susceptibility data are available for Salmonella strains because of surveillance efforts from public health agencies. In this study, a collection of 5,278 nontyphoidal Salmonella genomes, collected over 15 years in the United States, was used to generate extreme gradient boosting (XGBoost)-based machine learning models for predicting MICs for 15 antibiotics. The MIC prediction models had an overall average accuracy of 95% within ±1 2-fold dilution step (confidence interval, 95% to 95%), an average very major error rate of 2.7% (confidence interval, 2.4% to 3.0%), and an average major error rate of 0.1% (confidence interval, 0.1% to 0.2%). The model predicted MICs with no a priori information about the underlying gene content or resistance phenotypes of the strains. By selecting diverse genomes for the training sets, we show that highly accurate MIC prediction models can be generated with fewer than 500 genomes. We also show that our approach for predicting MICs is stable over time, despite annual fluctuations in antimicrobial resistance gene content in the sampled genomes. Finally, using feature selection, we explore the important genomic regions identified by the models for predicting MICs. To date, this is one of the largest MIC modeling studies to be published. Our strategy for developing whole-genome sequence-based models for surveillance and clinical diagnostics can be readily applied to other important human pathogens.
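The accuracy figure in the abstract counts a prediction as correct when it falls within one 2-fold dilution step of the measured MIC. A minimal sketch of how such a metric could be computed is shown below. It is not the authors' code: the function name `within_one_dilution_accuracy`, the `tolerance` parameter, and the MIC values are hypothetical and chosen only for illustration.

```python
import math


def within_one_dilution_accuracy(true_mics, predicted_mics, tolerance=1.0):
    """Fraction of predictions within `tolerance` 2-fold dilution steps.

    MICs lie on a 2-fold dilution series (e.g. 0.5, 1, 2, 4 ...), so the
    distance between two MICs in dilution steps is the absolute difference
    of their base-2 logarithms.
    """
    if len(true_mics) != len(predicted_mics):
        raise ValueError("true and predicted MIC lists must have equal length")
    if not true_mics:
        raise ValueError("at least one MIC pair is required")

    hits = 0
    for measured, predicted in zip(true_mics, predicted_mics):
        steps = abs(math.log2(predicted) - math.log2(measured))
        # Small epsilon guards against floating-point round-off at the boundary.
        if steps <= tolerance + 1e-9:
            hits += 1
    return hits / len(true_mics)


# Hypothetical MICs (same units for both lists), for illustration only.
true_mics = [0.5, 2, 8, 16, 4]
predicted_mics = [1, 2, 32, 8, 4]

accuracy = within_one_dilution_accuracy(true_mics, predicted_mics)
print(f"Accuracy within ±1 dilution step: {accuracy:.2f}")
```

In this toy data only the third pair (8 versus 32, two dilution steps apart) counts as a miss. The very major and major error rates reported in the abstract additionally depend on clinical breakpoints for each antibiotic, which this sketch does not model.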

Original language: English (US)
Article number: e01260
Journal: Journal of Clinical Microbiology
Issue number: 2
Early online date: Oct 17 2018
State: Published - Feb 1 2019


Keywords

  • Antimicrobial susceptibility testing
  • Deep learning
  • Diagnostics
  • Genome sequencing
  • Machine learning

ASJC Scopus subject areas

  • Microbiology (medical)

