Novel Computational Approaches for Multi-network Analysis to Improve Protein Function Prediction
Networks can be used to model complex real-world systems from many domains, including computational biology. A protein-protein interaction (PPI) network (PPIN), in which nodes are proteins and edges are PPIs, is a popular type of biological network. While PPIN data are becoming widely available thanks to biotechnological advancements, functions of many proteins remain unknown. As such, many computational techniques have been developed to analyze PPINs in order to gain insights into proteins' functions.
One such technique is biological network alignment (NA), which aims to find a node mapping between the molecular networks of different species that uncovers similar network regions, thus allowing functional knowledge to be transferred between aligned nodes. However, a major issue with NA methods is that aligned nodes (proteins) often do not actually share the same function. We aim to address this challenge by introducing several novel computational advances, including aligning heterogeneous biological networks for the first time and learning from -omics data which patterns of network topological relatedness (rather than similarity) correspond to functional relatedness between biological networks of different species. We show that these advances improve the accuracy of across-species protein function prediction compared to existing NA methods.
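To make the idea of transferring functional knowledge across an alignment concrete, the following is a minimal sketch, not the dissertation's method. It assumes an alignment is given as a one-to-one mapping from proteins of one species to proteins of another; the protein names, function labels, and the helper names `transfer_annotations` and `prediction_precision` are all hypothetical.

```python
from typing import Dict, Set


def transfer_annotations(
    alignment: Dict[str, str],
    known_functions: Dict[str, Set[str]],
) -> Dict[str, Set[str]]:
    """Predict functions for target proteins from their aligned source proteins.

    alignment: hypothetical node mapping, source protein -> target protein.
    known_functions: functions already known for source proteins.
    """
    predictions: Dict[str, Set[str]] = {}
    for source_protein, target_protein in alignment.items():
        functions = known_functions.get(source_protein, set())
        if functions:
            # Copy every known function of the source onto its aligned partner.
            predictions.setdefault(target_protein, set()).update(functions)
    return predictions


def prediction_precision(
    predictions: Dict[str, Set[str]],
    ground_truth: Dict[str, Set[str]],
) -> float:
    """Fraction of predicted (protein, function) pairs present in ground_truth."""
    total = 0
    correct = 0
    for protein, functions in predictions.items():
        for function in functions:
            total += 1
            if function in ground_truth.get(protein, set()):
                correct += 1
    return correct / total if total else 0.0


# Toy data (made up for illustration only).
alignment = {"yeastA": "humanX", "yeastB": "humanY"}
known_functions = {"yeastA": {"F1"}, "yeastB": {"F2"}}
ground_truth = {"humanX": {"F1"}, "humanY": {"F3"}}

predictions = transfer_annotations(alignment, known_functions)
precision = prediction_precision(predictions, ground_truth)
```

In this toy case one of the two transferred annotations is wrong, which mirrors the issue noted above: aligned proteins do not always share the same function.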
One limitation of across-species NA is that it considers biological networks at only one scale: PPINs. However, at a more fine-grained scale, a protein's 3D structure has important implications for its function. Such structures have been modeled with great success using protein structure networks (PSNs), in which nodes are amino acids and edges join those that are close in the 3D crystal structure. Thus, we argue that PPIN and PSN data should be integrated as a "network of networks" (NoN). We ask whether NoN-based data integration is effective by evaluating whether NoN-based protein function prediction, which fuses the complementary PPIN and PSN information, is more accurate than single-scale prediction that uses only PPIN or only PSN information. We show that NoN-based data integration has the potential to uncover novel biological knowledge compared to considering only a single scale, and thus is an exciting direction for future research.
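The two-scale data structure described above can be sketched as follows. This is an illustrative assumption rather than the dissertation's implementation: residues are reduced to single 3D points, "close" is taken to mean within a distance cutoff whose value is chosen arbitrarily here, and the names `build_psn` and `build_non` are hypothetical.

```python
import math
from itertools import combinations
from typing import Dict, Iterable, Set, Tuple

Coord = Tuple[float, float, float]
Edge = Tuple[int, int]


def build_psn(coords: Dict[int, Coord], cutoff: float) -> Set[Edge]:
    """Build a PSN edge set: join residues whose distance is at most cutoff.

    coords: residue index -> one representative 3D point (an assumption).
    cutoff: assumed distance threshold; the source only says "close".
    """
    edges: Set[Edge] = set()
    for (a, point_a), (b, point_b) in combinations(sorted(coords.items()), 2):
        if math.dist(point_a, point_b) <= cutoff:
            edges.add((a, b))
    return edges


def build_non(
    ppi_edges: Iterable[Tuple[str, str]],
    structures: Dict[str, Dict[int, Coord]],
    cutoff: float,
) -> dict:
    """Network of networks: a PPIN whose protein nodes each carry a PSN."""
    return {
        "ppin": set(ppi_edges),
        "psns": {
            protein: build_psn(coords, cutoff)
            for protein, coords in structures.items()
        },
    }


# Toy coordinates (made up for illustration only).
structures = {
    "P1": {1: (0.0, 0.0, 0.0), 2: (3.0, 0.0, 0.0), 3: (10.0, 0.0, 0.0)},
    "P2": {1: (0.0, 0.0, 0.0), 2: (0.0, 2.0, 0.0)},
}
non = build_non([("P1", "P2")], structures, cutoff=4.0)
```

A predictor working on this structure can draw features from the `ppin` level, the `psns` level, or both, which is the comparison the abstract describes.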
History
Date Modified
2022-02-10
Defense Date
2022-01-21
CIP Code
- 40.0501
Research Director(s)
Tijana Milenković
Degree
- Doctor of Philosophy
Degree Level
- Doctoral Dissertation
Alternate Identifier
1295845722
Library Record
6163779
OCLC Number
1295845722
Additional Groups
- Computer Science and Engineering
Program Name
- Computer Science and Engineering