Screen efficiency comparisons of decision tree and neural network algorithms in machine learning assisted drug design

Qiumei Pu, Yinghao Li, Hong Zhang, Haodong Yao, Bo Zhang, Bingji Hou, Lin Li, Yuliang Zhao, Lina Zhao

Research output: Contribution to journal › Article › peer-review

10 Scopus citations


Given the huge search space in drug design, and with the development of artificial intelligence technology, machine learning has become a powerful method for predicting the affinity between small-molecule drugs and their target proteins. However, the variety of machine learning algorithms, each with many tunable parameters, makes choosing a prediction framework difficult. In this work, we took a recent drug design competition (hosted by the XtalPi company on the DataCastle platform) as a typical case to find the optimal parameters for different machine learning algorithms and to identify the most effective algorithm. After parameter optimization, we compared typical machine learning methods, namely decision trees (XGBoost, LightGBM) and artificial neural networks (MLP, CNN), using root-mean-square error (RMSE) and the coefficient of determination (R²) as evaluation metrics. The decision tree methods proved more effective than the neural networks, ranking LightGBM > XGBoost > CNN > MLP in affinity prediction for this specific drug design problem with ~160000 samples. For a much larger screening task in a more complicated drug design study, a sophisticated neural network model may outperform the decision tree algorithms once its generalization is improved and its overfitting is reduced. The advanced machine learning methods could extract more information about protein-ligand binding than traditional ones and improve the screening efficiency of drug design by a factor of 200–1000.
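The two evaluation metrics named in the abstract can be illustrated with a minimal sketch. The function names `rmse` and `r_squared` and the sample values are assumptions made for illustration only; they are not taken from the paper or the competition data, and the authors' actual evaluation code is not shown in this record.

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-square error between observed and predicted values."""
    n = len(y_true)
    squared_errors = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared_errors / n)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical affinity values, for illustration only (not from the paper).
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 2.0, 2.5, 4.0]

print(f"RMSE = {rmse(y_true, y_pred):.4f}")
print(f"R^2  = {r_squared(y_true, y_pred):.4f}")
```

A lower RMSE and an R² closer to 1 both indicate a better fit, which is the sense in which a ranking such as LightGBM > XGBoost > CNN > MLP would be read.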

Original language: English (US)
Pages (from-to): 506-514
Number of pages: 9
Journal: Science China Chemistry
Issue number: 4
State: Published - Apr 1 2019


  • affinity prediction
  • drug design
  • machine learning
  • protein-ligand binding

ASJC Scopus subject areas

  • Chemistry(all)

