TY - JOUR
T1 - Comparison of reversible-jump Markov-chain-Monte-Carlo learning approach with other methods for missing enzyme identification
AU - Geng, Bo
AU - Zhou, Xiaobo
AU - Zhu, Jinmin
AU - Hung, Y. S.
AU - Wong, Stephen T.C.
N1 - Funding Information:
The authors would like to acknowledge the excellent collaboration with their biology collaborators in this research effort, in particular Dr. Santosh Kesari of the Dana-Farber Cancer Institute, Harvard Medical School, as well as helpful discussions with Dr. Yufei Huang in the Department of Electrical Engineering, The University of Texas at San Antonio. They would like to thank other members of the Life Science Imaging Group of the Center for Bioinformatics, Harvard Center for Neurodegeneration and Repair (HCNR) and Brigham and Women’s Hospital, HMS for their technical comments. This research is partly sponsored by the Center for Bioinformatics Program Grant from HCNR, Harvard Medical School.
PY - 2008/4
Y1 - 2008/4
N2 - Computational identification of missing enzymes plays a significant role in the accurate and complete reconstruction of metabolic networks for both newly sequenced and well-studied organisms. For a metabolic reaction, given a set of candidate enzymes identified according to certain biological evidence, a powerful mathematical model is required to predict the actual enzyme(s) catalyzing the reaction. In this study, several plausible predictive methods are considered for the classification problem in missing enzyme identification, and comparisons are performed with the aim of identifying a method with better performance than the Bayesian model used in previous work. In particular, a regression model consisting of a linear term and a nonlinear term is proposed for the problem, in which the reversible jump Markov-chain-Monte-Carlo (MCMC) learning technique (developed in [Andrieu C, Freitas Nando de, Doucet A. Robust full Bayesian learning for radial basis networks 2001;13:2359-407.]) is adopted to estimate the model order and the parameters. We evaluated the models using known reactions in the bacteria Escherichia coli, Mycobacterium tuberculosis, Vibrio cholerae and Caulobacter crescentus, as well as one eukaryotic organism, Saccharomyces cerevisiae. Although support vector regression also exhibits comparable performance in this application, it was demonstrated that the proposed model achieves favorable prediction performance, particularly sensitivity, compared with the Bayesian method.
AB - Computational identification of missing enzymes plays a significant role in the accurate and complete reconstruction of metabolic networks for both newly sequenced and well-studied organisms. For a metabolic reaction, given a set of candidate enzymes identified according to certain biological evidence, a powerful mathematical model is required to predict the actual enzyme(s) catalyzing the reaction. In this study, several plausible predictive methods are considered for the classification problem in missing enzyme identification, and comparisons are performed with the aim of identifying a method with better performance than the Bayesian model used in previous work. In particular, a regression model consisting of a linear term and a nonlinear term is proposed for the problem, in which the reversible jump Markov-chain-Monte-Carlo (MCMC) learning technique (developed in [Andrieu C, Freitas Nando de, Doucet A. Robust full Bayesian learning for radial basis networks 2001;13:2359-407.]) is adopted to estimate the model order and the parameters. We evaluated the models using known reactions in the bacteria Escherichia coli, Mycobacterium tuberculosis, Vibrio cholerae and Caulobacter crescentus, as well as one eukaryotic organism, Saccharomyces cerevisiae. Although support vector regression also exhibits comparable performance in this application, it was demonstrated that the proposed model achieves favorable prediction performance, particularly sensitivity, compared with the Bayesian method.
KW - Markov-chain-Monte-Carlo
KW - Metabolic network
KW - Missing enzymes identification
KW - Regression model
UR - http://www.scopus.com/inward/record.url?scp=40049107547&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=40049107547&partnerID=8YFLogxK
U2 - 10.1016/j.jbi.2007.09.002
DO - 10.1016/j.jbi.2007.09.002
M3 - Article
C2 - 17950040
AN - SCOPUS:40049107547
SN - 1532-0464
VL - 41
SP - 272
EP - 281
JO - Journal of Biomedical Informatics
JF - Journal of Biomedical Informatics
IS - 2
ER -