S-glutathionylation, the reversible formation of mixed disulfides between glutathione(GSH) and cysteine residues in proteins, is a specific form of post-translational modification that plays important roles in various biological processes, including signal transduction, redox homeostasis, and metabolism inside cells. Experimentally identifying S-glutathionylation sites is labor-intensive and time consuming, whereas bioinformatics methods provide an alternative way to this problem by predicting S-glutathionylation sites in silico. The bioinformatics approaches give not only candidate sites for further experimental verification but also bio-chemical insights into the mechanism of S-glutathionylation. In this paper, we firstly collect experimentally determined S-glutathionylated proteins and their corresponding modification sites from the literature, and then propose a new method for predicting S-glutathionylation sites by employing machine learning methods based on protein sequence data. Promising results are obtained by our method with an AUC (area under ROC curve) score of 0.879 in 5-fold cross-validation, which demonstrates the predictive power of our proposed method. The datasets used in this work are available at http://csb.shu.edu.cn/SGDB.
ASJC Scopus subject areas
- Biochemistry, Genetics and Molecular Biology(all)
- Agricultural and Biological Sciences(all)