Healthcare Applications and Security Concerns of Speech Processing Systems

Gong, Yuan

doi:10.7274/8c97kp81j7k

thesis

posted on 2020-07-16, 00:00 authored by Yuan Gong

Conventionally, most research on speech processing focuses on automatic speech recognition (ASR), i.e., transcribing speech to text. However, natural speech carries not only text content but also much other information, such as emotion and even the speaker's health status. This means we can extract more than text content from speech and use that information for novel applications. In particular, we can develop speech processing systems for healthcare applications, such as convenient and low-cost diagnosis, screening, or monitoring solutions. In the first part of this thesis, I investigate how to build speech processing systems for healthcare applications. Specifically, I explore the use of speech systems for monitoring and early diagnosis of autism spectrum disorders, emotional and behavioral disorders, and major depressive disorder.

On the other hand, with the fast-growing number of users and usage scenarios, the security of speech processing systems (e.g., Amazon Alexa) has become a new concern. Recent work has found that speech processing systems are vulnerable to multiple types of attacks. However, it is still unclear how dangerous these attacks are in realistic settings. Therefore, in the second part of the thesis, I first systematically explore the vulnerabilities of speech processing systems. Then, I conduct a focused study on adversarial attacks on deep neural network-based models, since deep neural networks are becoming the mainstream technique in a variety of speech applications such as speech recognition and speaker identification. Finally, I investigate effective defense strategies for protecting speech processing systems against malicious attacks in realistic settings.

Overall, this thesis aims to address two orthogonal problems in speech processing, and the goal of the research is to broaden the applications and improve the robustness of machine learning-based speech processing systems.

History

Date Modified

2020-07-30

Defense Date

2020-07-09

CIP Code

40.0501

Research Director(s)

Christian Poellabauer

Committee Members

Meng Jiang, Adam Czajka, Taeho Jung

Degree

Doctor of Philosophy

Degree Level

Doctoral Dissertation

Language

English

Alternate Identifier

1178997203

Library Record

5780191

OCLC Number

1178997203

Additional Groups

Computer Science and Engineering

Program Name

Computer Science and Engineering
