Computational identification of missing enzymes plays a significant role in accurate and complete reconstruction of metabolic network for both newly sequenced and well-studied organisms. For a metabolic reaction, given a set of candidate enzymes identified according to certain biological evidences, a powerful mathematical model is required to predict the actual enzyme(s) catalyzing the reactions. In this study, several plausible predictive methods are considered for the classification problem in missing enzyme identification, and comparisons are performed with an aim to identify a method with better performance than the Bayesian model used in previous work. In particular, a regression model consisting of a linear term and a nonlinear term is proposed to apply to the problem, in which the reversible jump Markov-chain-Monte-Carlo (MCMC) learning technique (developed in [Andrieu C, Freitas Nando de, Doucet A. Robust full Bayesian learning for radial basis networks 2001;13:2359-407.]) is adopted to estimate the model order and the parameters. We evaluated the models using known reactions in Escherichia coli, Mycobacterium tuberculosis, Vibrio cholerae and Caulobacter cresentus bacteria, as well as one eukaryotic organism, Saccharomyces Cerevisiae. Although support vector regression also exhibits comparable performance in this application, it was demonstrated that the proposed model achieves favorable prediction performance, particularly sensitivity, compared with the Bayesian method.
- Metabolic network
- Missing enzymes identification
- Regression model
ASJC Scopus subject areas
- Computer Science Applications
- Health Informatics
- Computer Science (miscellaneous)