University of Notre Dame

File(s) under permanent embargo

Learning to Augment Data in Graphs

thesis
posted on 2022-04-10, 00:00 authored by Tong Zhao

Because graph-structured data is ubiquitous, graph machine learning has applications in many fields, such as social media, e-commerce platforms, cyber-physical systems, and chemical synthesis. Nonetheless, data-driven models for graph data face unique challenges, including over-smoothing caused by message-passing graph neural networks, structural data sparsity brought by power-law distributions, a lack of labelled data due to costly annotation, and noisy signals caused by spurious correlations. To address these challenges, rather than developing more advanced and complicated machine learning models, graph data augmentation allows researchers to improve graph machine learning from the perspective of data.

The work in this thesis develops advanced graph data augmentation techniques for several graph machine learning tasks: node classification, link prediction, and anomaly detection. Unlike traditional ad-hoc data augmentation techniques that integrated augmentation into the learning process of representations and decisions, this thesis introduces learn-to-augment approaches, which leverage machine learning models for data augmentation. The thesis therefore designs a holistic learning process that runs from data augmentation to representation learning to decision making. Such learn-to-augment approaches achieve superior downstream task performance and alleviate the above-mentioned challenges in graph machine learning. Furthermore, because they enhance machine learning from the data perspective, graph data augmentation solutions can be used with different graph machine learning models without significantly increasing model complexity.

History

Date Modified

2022-05-10

Defense Date

2022-03-23

CIP Code

  • 40.0501

Research Director(s)

Meng Jiang

Committee Members

Nitesh Chawla, Tim Weninger, Leman Akoglu, Neil Shah

Degree

  • Doctor of Philosophy

Degree Level

  • Doctoral Dissertation

Alternate Identifier

1314919342

Library Record

6209904

OCLC Number

1314919342

Program Name

  • Computer Science and Engineering
