University of Notre Dame

Finding Problems In, Proposing Solutions to, and Performing Analysis on Imbalanced Data

thesis
posted on 2009-07-08, 00:00 authored by David Alan Cieslak
The data mining and machine learning research communities have focused on developing specialized algorithms and methods to handle a multitude of potential complications that may confront the traditional supervised learning task. Of these, class imbalance is among the most persistent in real-world applications. Less attention has been paid to the set of problems in which the data distribution changes, potentially wiping out the gains from expensive data mining methods. The combination of the two problems has been considered to an even lesser degree. The purpose of this dissertation is to explore concepts of distributional change, particularly within the context of imbalanced data problems, and its effects on the performance of solutions from this realm. Based on this exploration, the proposed dissertation will derive methods to identify and handle both problems simultaneously.

History

Date Modified

2017-06-02

Defense Date

2009-06-10

Research Director(s)

Nitesh Chawla

Committee Members

Aaron Striegel, Pat Flynn, Kevin Bowyer

Degree

  • Doctor of Philosophy

Degree Level

  • Doctoral Dissertation

Language

  • English

Alternate Identifier

etd-07082009-100035

Publisher

University of Notre Dame

Additional Groups

  • Computer Science and Engineering

Program Name

  • Computer Science and Engineering
