Abstract
We transformed human protein-protein interaction (PPI) data into documents and text-mined them using concept clusters. The advantage of text-mining PPI data is that words (proteins) that are very sparse or over-abundant can be dropped, leaving the remaining bulk of the data for clustering and rule mining. Libraries of tissue-specific binary PPIs were constructed from a list of 36,137 binary PPIs in the Human Protein Reference Database (HPRD). Using scaled factorial moments, we developed a randomization test for intermittency, which appears as spikes and holes in the frequency distributions of cluster-specific word frequencies. The test was based on a permutation form of a log-linear regression model that determines whether the slopes of ln(F2) vs. ln(M) differ between the intermittent and null distributions. Significant intermittency (p < 0.0005) in PPI was detected for prostate and testis tissue after a Bonferroni adjustment for multiple tests. The presence of intermittency reflects spikes and holes in histograms of cluster-specific word frequencies and may point to novel large signal transduction pathways or networks.
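The slope comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the commonly used second-order scaled factorial moment, F2(M) = M * sum_m n_m(n_m - 1) / (N(N - 1)) for N values histogrammed into M equal-width bins, assumes the data are rescaled to [0, 1), and assumes a label-shuffling permutation test on the difference in fitted slopes. The function names (`f2`, `f2_slope`, `slope_permutation_test`) and all parameters are hypothetical.

```python
import numpy as np


def f2(values, n_bins, lo=0.0, hi=1.0):
    """Second-order scaled factorial moment for one binning (assumed form)."""
    counts, _ = np.histogram(values, bins=n_bins, range=(lo, hi))
    n = counts.sum()
    # Number of ordered pairs sharing a bin, normalized by all ordered pairs.
    return n_bins * np.sum(counts * (counts - 1)) / (n * (n - 1))


def f2_slope(values, bin_counts):
    """Least-squares slope of ln(F2) against ln(M).

    Assumes F2 > 0 for every M, i.e. at least one bin holds two or more values.
    """
    ln_m = np.log(np.asarray(bin_counts, dtype=float))
    ln_f2 = np.log([f2(values, m) for m in bin_counts])
    slope, _intercept = np.polyfit(ln_m, ln_f2, 1)
    return slope


def slope_permutation_test(observed, null, bin_counts, n_perm=2000, seed=0):
    """One-sided permutation test for slope(observed) > slope(null).

    Values are pooled and randomly reassigned to the two groups; the p-value
    is the share of reassignments whose slope difference is at least as large
    as the one actually observed (with the usual +1 correction).
    """
    rng = np.random.default_rng(seed)
    observed = np.asarray(observed, dtype=float)
    null = np.asarray(null, dtype=float)
    stat = f2_slope(observed, bin_counts) - f2_slope(null, bin_counts)
    pooled = np.concatenate([observed, null])
    n_obs = len(observed)
    hits = 0
    for _ in range(n_perm):
        perm = rng.permutation(pooled)
        diff = f2_slope(perm[:n_obs], bin_counts) - f2_slope(perm[n_obs:], bin_counts)
        if diff >= stat:
            hits += 1
    return stat, (hits + 1) / (n_perm + 1)
```

Under this definition a uniform sample gives F2 close to 1 at every M, so its slope is near zero, while a sample concentrated in a narrow spike gives F2 that grows with M and hence a positive slope. When several tissues are tested, a Bonferroni adjustment would compare each p-value with the chosen significance level divided by the number of tests.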
| Original language | English (US) |
|---|---|
| Title of host publication | 2008 International Joint Conference on Neural Networks, IJCNN 2008 |
| Pages | 3634-3640 |
| Number of pages | 7 |
| State | Published - Nov 24 2008 |
| Event | 2008 International Joint Conference on Neural Networks, IJCNN 2008 - Hong Kong, China. Duration: Jun 1 2008 → Jun 8 2008 |
Other
| Other | 2008 International Joint Conference on Neural Networks, IJCNN 2008 |
|---|---|
| Country/Territory | China |
| City | Hong Kong |
| Period | 6/1/08 → 6/8/08 |
ASJC Scopus subject areas
- Software
- Artificial Intelligence
Title
Text-mining protein-protein interaction corpus using concept clustering to identify intermittency