University of Notre Dame

Healthcare Applications and Security Concerns of Speech Processing Systems

thesis
posted on 2020-07-16, 00:00 authored by Yuan Gong

Conventionally, most research on speech processing focuses on automatic speech recognition (ASR), i.e., transcribing speech to text. However, natural speech carries not only textual content but also other information, such as emotion and even the speaker's health status. This means we can extract more from speech than its text content and use it for novel applications. Specifically, we can develop speech processing systems for healthcare applications, such as convenient and low-cost diagnosis, screening, or monitoring solutions. In the first part of this thesis, I investigate how to build speech processing systems for healthcare applications. In particular, I explore the use of speech systems for monitoring and early diagnosis of autism spectrum disorders, emotional and behavioral disorders, and major depressive disorder.

On the other hand, with the fast-growing number of users and usage scenarios, the security of speech processing systems (e.g., Amazon Alexa) has become a new concern. Recent work has found that speech processing systems are vulnerable to multiple types of attacks. However, it is still unclear how dangerous these attacks are in realistic settings. Therefore, in the second part of the thesis, I first systematically explore the vulnerabilities of speech processing systems. Then, I conduct a focused study of adversarial attacks on deep neural network-based models, since deep neural networks are becoming the mainstream technique in a variety of speech applications such as speech recognition and speaker identification. Finally, I investigate effective defense strategies for protecting speech processing systems against malicious attacks in realistic settings.

Overall, this thesis aims to address two orthogonal problems in speech processing, and the goal of the research is to broaden the applications and improve the robustness of machine learning-based speech processing systems.

History

Date Modified

2020-07-30

Defense Date

2020-07-09

CIP Code

  • 40.0501

Research Director(s)

Christian Poellabauer

Committee Members

  • Meng Jiang
  • Adam Czajka
  • Taeho Jung

Degree

  • Doctor of Philosophy

Degree Level

  • Doctoral Dissertation

Language

  • English

Alternate Identifier

1178997203

Library Record

5780191

OCLC Number

1178997203

Program Name

  • Computer Science and Engineering
