University of Notre Dame

Uncertainty and Novelty in Machine Learning

dataset
posted on 2024-12-09, 16:45 authored by Derek Scott Prijatelj

Uncertainty and novelty are inherent in machine learning, especially as new information is encountered and the best model in the hypothesis set must be determined from the information currently available. Ideally, we could answer the following: what types of uncertainty and novelty could a predictor encounter and how do we measure them, how do uncertainty and novelty affect the information perceived from observations, and how can a predictor be evaluated when learning such phenomena?

This work answers these questions through both theory and application. We provide a Bayesian evaluation framework for subjective tasks where different sources of uncertainty are considered and the truth itself is uncertain. We introduce an abstraction of novelty that is then further developed in terms of information theory and algorithms.

This formalizes the concept of identifiable information that arises from the language used to express the relationship between distinct states. Through the computation of the indicator function, model identifiability and sample complexity are defined and their properties are described for different data-generating processes, ranging from deterministic to ergodic stationary stochastic processes. This demonstrates the identification of information from finite steps through to asymptotic statistics and PAC learning, where we recover identification within finite observations at the cost of uncertainty and error.

We explore the practical evaluation of novelty detection and adaptation with new benchmarks in handwriting recognition and human activity recognition.

History

Date Created

2024-12-01

Date Modified

2024-12-09

Defense Date

2024-11-18

CIP Code

  • 14.0901

Research Director(s)

Walter Scheirer

Committee Members

  • Kevin Bowyer
  • Tim Weninger
  • Joshua Alspector

Degree

  • Doctor of Philosophy

Degree Level

  • Doctoral Dissertation

Language

  • English

Library Record

006642275

OCLC Number

1477749769

Publisher

University of Notre Dame

Additional Groups

  • Computer Science and Engineering

Program Name

  • Computer Science and Engineering
