University of Notre Dame
File(s) under embargo

Exploring Trustworthy Concerns in Computer Vision: From Deterministic to Generative Domains

dataset
posted on 2024-07-17, 15:46 authored by Ziyi Kou
Computer vision (CV) research designs AI algorithms to approximate the human visual system in two primary domains: deterministic and generative. Numerous CV applications have been proposed for real-world use, such as face recognition and fauxtography detection with deterministic models, as well as text-guided image generation with foundational generative models. While contemporary research has largely concentrated on enhancing the overall accuracy of CV models, it is also crucial to explore the trustworthy concerns of application users, who prioritize their trust in AI models over the model outputs. Identifying such concerns in CV applications and refining CV models to better serve users is a long-standing challenge, as novel and advanced CV models keep emerging in an endless stream. In this dissertation, we explore the trustworthy concerns of CV models in both the deterministic and generative domains. In particular, we focus on three trustworthy concerns that are prevalent in major CV applications: i) the privacy issue in face recognition, ii) the explainability issue in multimodal misinformation detection, and iii) the faithfulness issue in text-guided image generation.
Along these lines, the dissertation consists of three specific aims: A1) we aim to protect user privacy when users' sensitive information is used as input by deterministic CV models, e.g., safeguarding the identity privacy of users in facial recognition applications with improved accuracy and fairness; A2) we aim to enable explainability for the decision-making process of deterministic models, e.g., accurately detecting online multimodal misinformation and providing reasonable explanations for the detection results; and A3) we aim to explore and enhance the faithfulness of image generative models, e.g., revealing the vulnerabilities of the diffusion model, a widely used text-driven image generation model, where slight alterations to the input text can cause the model to generate incorrect images. Moreover, given relevant context such as a storyline, the generation process must also be regulated to keep character appearance consistent across a series of generated images.

History

Date Created

2024-07-03

Date Modified

2024-07-17

Defense Date

2024-05-22

CIP Code

  • 14.0901

Research Director(s)

Xiangliang Zhang

Committee Members

Nitesh Chawla, Chaoli Wang, Adam Czajka

Degree

  • Doctor of Philosophy

Degree Level

  • Doctoral Dissertation

Language

  • English

Library Record

006603641

OCLC Number

1446444704

Publisher

University of Notre Dame

Additional Groups

  • Computer Science and Engineering

Program Name

  • Computer Science and Engineering
