Prediction of carbohydrate-binding proteins from sequences using support vector machines

Kentaro Shimizu, Seizi Someya, Masanori Kakuta, Mizuki Morita, Kazuya Sumikoshi, Wei Cao, Zhenyi Ge, Osamu Hirose, Shugo Nakamura, Tohru Terada

Research output: Contribution to journal › Article

8 Citations (Scopus)

Abstract

Carbohydrate-binding proteins are proteins that can interact with sugar chains but do not modify them. They are involved in many physiological functions. We have developed a method, based on support vector machines (SVMs), for predicting them from their amino acid sequences. We first clarified the definition of carbohydrate-binding proteins and then constructed the positive and negative datasets on which the SVMs were trained. In a leave-one-out test on these datasets, our method achieved an area under the receiver operating characteristic (ROC) curve of 0.92. We also examined two amino acid grouping methods that enable effective learning of sequence patterns and evaluated their performance. When we applied our method in combination with a homology-based prediction method to the annotated human genome database, H-invDB, we found that the true positive rate of prediction improved.
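As a rough illustration of the two ideas the abstract names, amino acid grouping and ROC-based evaluation, the following minimal sketch maps a sequence onto a reduced alphabet, counts k-mers over it, and computes the area under the ROC curve from a list of scores. The grouping in `GROUPS`, the choice of k-mer frequencies as features, and the function names `reduce_sequence`, `kmer_features`, and `roc_auc` are all assumptions made for illustration; the abstract does not specify the groupings or the feature encoding used in the paper. Training the SVM itself is not shown.

```python
from collections import Counter
from itertools import product

# Hypothetical grouping by rough physicochemical class. The two groupings
# examined in the paper are not given in the abstract.
GROUPS = {
    "a": "AVLIMC",  # aliphatic / hydrophobic
    "r": "FWYH",    # aromatic
    "p": "STNQ",    # polar, uncharged
    "k": "KR",      # positively charged
    "d": "DE",      # negatively charged
    "g": "GP",      # glycine and proline
}
RESIDUE_TO_GROUP = {res: g for g, members in GROUPS.items() for res in members}


def reduce_sequence(seq):
    """Map each residue to its group letter; unknown residues become 'x'."""
    return "".join(RESIDUE_TO_GROUP.get(res, "x") for res in seq.upper())


def kmer_features(seq, k=2):
    """Return normalized k-mer frequencies over the reduced alphabet."""
    reduced = reduce_sequence(seq)
    kmers = ["".join(p) for p in product(sorted(GROUPS), repeat=k)]
    counts = Counter(reduced[i:i + k] for i in range(len(reduced) - k + 1))
    # k-mers containing the unknown symbol 'x' are ignored.
    total = sum(counts[m] for m in kmers) or 1
    return [counts[m] / total for m in kmers]


def roc_auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly.

    Ties between a positive and a negative score count as half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

In a leave-one-out setting, each sequence would be scored by a classifier trained on all the others, and the collected scores would then be passed to a function like `roc_auc` together with the true labels. Grouping shrinks the 20-letter alphabet (here to 6 symbols, giving 36 possible 2-mers instead of 400), which is one plausible reason such groupings help a classifier learn sequence patterns from limited data.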

Original language: English
Article number: 289301
Journal: Advances in Bioinformatics
Volume: 2010
DOIs: https://doi.org/10.1155/2010/289301
Publication status: Published - Nov 22 2010
Externally published: Yes

ASJC Scopus subject areas

  • Biomedical Engineering
  • Biochemistry, Genetics and Molecular Biology (miscellaneous)
  • Computer Science Applications

Cite this

Shimizu, K., Someya, S., Kakuta, M., Morita, M., Sumikoshi, K., Cao, W., Ge, Z., Hirose, O., Nakamura, S., & Terada, T. (2010). Prediction of carbohydrate-binding proteins from sequences using support vector machines. Advances in Bioinformatics, 2010, Article 289301. https://doi.org/10.1155/2010/289301