University of Notre Dame

Novel Computational Approaches for Multi-network Analysis to Improve Protein Function Prediction

thesis
posted on 2022-02-03, 00:00 authored by Shawn Gu

Networks can be used to model complex real-world systems from many domains, including computational biology. A protein-protein interaction (PPI) network (PPIN), in which nodes are proteins and edges are PPIs, is a popular type of biological network. While PPIN data are becoming widely available thanks to biotechnological advancements, functions of many proteins remain unknown. As such, many computational techniques have been developed to analyze PPINs in order to gain insights into proteins' functions.

One such technique is biological network alignment (NA), which aims to find a node mapping between species' molecular networks that uncovers similar network regions, thus allowing for the transfer of functional knowledge between the aligned nodes. However, a major limitation of NA methods is that aligned nodes (proteins) often do not actually share the same function. We address this limitation by introducing several novel computational advances, including aligning heterogeneous biological networks for the first time and learning from -omics data which patterns of network topological relatedness (rather than similarity) correspond to functional relatedness between biological networks of different species. We show that these advances improve the accuracy of across-species protein functional prediction compared to existing NA methods.
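The idea of transferring functional knowledge between aligned nodes can be illustrated with a minimal sketch. It is not the thesis's method: the function name `transfer_annotations`, the protein identifiers, and the annotation labels are all hypothetical, and the alignment is assumed to be given as a simple one-to-one mapping.

```python
def transfer_annotations(alignment, known_functions):
    """Copy known functions from annotated proteins to their aligned partners.

    alignment: dict mapping a protein in species A to a protein in species B
               (assumed one-to-one; real NA methods may produce other mappings).
    known_functions: dict mapping species-A proteins to sets of function labels.
    """
    predictions = {}
    for source, target in alignment.items():
        if source in known_functions:
            # Predict that the aligned partner shares the source's functions.
            predictions[target] = set(known_functions[source])
    return predictions


# Hypothetical toy data, for illustration only.
toy_alignment = {"yeastP1": "humanQ1", "yeastP2": "humanQ2", "yeastP3": "humanQ3"}
toy_functions = {"yeastP1": {"DNA repair"}, "yeastP3": {"transport", "signaling"}}

predicted = transfer_annotations(toy_alignment, toy_functions)
```

In this sketch `humanQ2` receives no prediction because its aligned partner has no known function. A prediction is only as reliable as the alignment itself, which is why aligned nodes that do not share a function are a limitation.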

One limitation of across-species NA is that it only considers biological networks at the same scale: PPINs. However, at a more fine-grained scale, a protein's 3D structure has important implications for its function. Such structures have been modeled with great success using protein structure networks (PSNs), in which nodes are amino acids and edges join those that are close in the 3D crystal structure. Thus, we argue that PPIN and PSN data should be integrated as a "network of networks" (NoN). We evaluate whether NoN-based data integration is effective by testing whether NoN-based protein functional prediction, which fuses the complementary PPIN and PSN information, is more accurate than single-scale functional prediction that uses only PPIN or only PSN information. We show that NoN-based data integration has the potential to uncover novel biological knowledge compared to only considering a single scale, and thus is an exciting direction for future research.
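The PSN definition above (amino acids as nodes, edges between residues that are close in 3D) can be sketched as follows. This is an illustrative assumption rather than the thesis's construction: the abstract does not state a distance cutoff or which atoms are measured, so the 6.0 angstrom threshold, the one-point-per-residue coordinates, and the name `build_psn` are all hypothetical.

```python
from itertools import combinations
from math import dist


def build_psn(coords, cutoff=6.0):
    """Build a PSN as an adjacency dict from per-residue 3D coordinates.

    coords: dict mapping a residue label to an (x, y, z) tuple in angstroms.
    cutoff: assumed distance threshold; the source does not specify one.
    """
    adjacency = {residue: set() for residue in coords}
    for a, b in combinations(coords, 2):
        # Join two amino acids when they are spatially close.
        if dist(coords[a], coords[b]) <= cutoff:
            adjacency[a].add(b)
            adjacency[b].add(a)
    return adjacency


# Hypothetical toy coordinates, for illustration only.
toy_coords = {
    "ALA1": (0.0, 0.0, 0.0),
    "GLY2": (3.8, 0.0, 0.0),
    "SER3": (7.6, 0.0, 0.0),
    "LYS4": (20.0, 0.0, 0.0),
}
psn = build_psn(toy_coords, cutoff=6.0)
```

In a NoN, each protein node of the PPIN would carry one such PSN, so a predictor can draw on both the interaction neighborhood and the residue-level structure.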

History

Date Modified

2022-02-10

Defense Date

2022-01-21

CIP Code

  • 40.0501

Research Director(s)

Tijana Milenković

Degree

  • Doctor of Philosophy

Degree Level

  • Doctoral Dissertation

Alternate Identifier

1295845722

Library Record

6163779

OCLC Number

1295845722

Program Name

  • Computer Science and Engineering
