Inferring Protein-Protein Interactions from Protein Domain Combinations

Master's Thesis

Abstract

A goal of contemporary proteome research is the elucidation of protein-protein interactions in a cell. Based on currently available protein-protein interaction and domain data of S. cerevisiae, we introduce a novel method, Maximum Specificity Set Cover (MSSC), to predict protein-protein interactions. MSSC has three stages. First, MSSC selects high-quality protein-protein interactions based on a clustering measure. Second, MSSC assigns probabilities to domain pairs. Third, MSSC uses the domain pairs to infer protein-protein interactions. We also modified MSSC to allow more than one domain from each protein to cause the protein-protein interaction. MSSC allows us to predict previously unknown protein-protein interactions with sensitivity and specificity that clearly exceed those of other approaches. MSSC achieved 86% sensitivity and 62% specificity using 80% of the high-quality interactions in the DIP database. The predicted interaction network preserves the characteristics of the initial web of known protein interactions. We also observe high levels of co-expression among putative interactions. We extend our method to infer protein-protein interactions in multicellular organisms for which protein-protein interaction data do not currently exist. Starting from predictions in yeast, we find a set of orthologous interactions in A. thaliana, C. elegans, D. melanogaster, M. musculus, and H. sapiens.
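
The second and third stages described above can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the abstract does not give the scoring formula or the selection rule, so the greedy set cover, the score (the fraction of protein pairs containing a domain pair that are known to interact), and the names `domain_pairs`, `greedy_specific_cover`, and `predict` are all assumptions made for illustration.

```python
from itertools import combinations, product


def domain_pairs(p, q, domains):
    """Unordered domain pairs taking one domain from each protein."""
    return {tuple(sorted((a, b))) for a, b in product(domains[p], domains[q])}


def greedy_specific_cover(interactions, domains):
    """Greedily choose domain pairs until the known interactions are covered.

    Assumed score of a domain pair: the fraction of protein pairs
    containing it that are known to interact. This is a stand-in for
    "specificity", not the measure used in the thesis.
    """
    known = {frozenset(pair) for pair in interactions}

    # Map each domain pair to every protein pair that contains it.
    # Self-interactions are ignored in this toy version.
    containing = {}
    for p, q in combinations(sorted(domains), 2):
        for dp in domain_pairs(p, q, domains):
            containing.setdefault(dp, set()).add(frozenset((p, q)))

    score = {dp: len(pairs & known) / len(pairs)
             for dp, pairs in containing.items()}

    uncovered = set(known)
    chosen = {}
    while uncovered:
        candidates = [dp for dp, pairs in containing.items()
                      if dp not in chosen and pairs & uncovered]
        if not candidates:
            break  # the remaining interactions share no annotated domain pair
        # Prefer the highest score, then the widest coverage, then name order.
        best = max(candidates,
                   key=lambda dp: (score[dp], len(containing[dp] & uncovered), dp))
        chosen[best] = score[best]
        uncovered -= containing[best]
    return chosen


def predict(p, q, domains, chosen):
    """Score a protein pair by its best chosen domain pair (0.0 if none)."""
    return max((chosen.get(dp, 0.0) for dp in domain_pairs(p, q, domains)),
               default=0.0)
```

Given a hypothetical mapping such as `domains = {"P1": {"A"}, "P2": {"B"}}` and a list of known interacting protein pairs, `greedy_specific_cover` returns the chosen domain pairs with their scores, and `predict` scores any protein pair that has domain annotations. A pair scoring above a chosen threshold would be reported as a putative interaction.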

Attributes

Attribute Name Values
URN
  • etd-04192012-124218

Author Simon Peter Kanaan
Advisor Jesus A. Izaguirre
Contributor Gregory R. Madey, Committee Member
Contributor Raul Santelices, Committee Member
Contributor Jesus A. Izaguirre, Committee Chair
Degree Level Master's Thesis
Degree Discipline Computer Science and Engineering
Degree Name MSCSE
Defense Date
  • 2012-04-09

Submission Date 2012-04-19
Country
  • United States of America

Subject
  • Set Cover Problem

  • Protein-Protein Interactions

  • Domain-Domain Interactions

  • MSSC

  • Domain Combinations

Publisher
  • University of Notre Dame

Language
  • English

Record Visibility Public
Content License
  • All rights reserved

Departments and Units
