University of Notre Dame

File(s) under permanent embargo

Data Privacy via Integration of Differential Privacy and Data Synthesis

Thesis posted on 2018-04-03, authored by Claire McKay Bowen

When sharing data among collaborators or releasing data publicly, a crucial concern is the risk of exposing personal information about the individuals who contribute to the data. Many statistical methods for data privacy and confidentiality offer little to no means of measuring the privacy guarantee of an altered data set. Differential privacy, a condition on data-releasing algorithms, quantifies disclosure risk, but it is traditionally used in query-based privacy methods rather than in synthetic data set release. My dissertation develops and explores various methods of incorporating differential privacy into synthetic data generation using predicted values within a Bayesian framework. I call these methods differentially private data synthesis (DIPS) techniques. In my dissertation, I first conducted a comparative study of several DIPS approaches on various data types, as well as a case study on Male Fertility data. Next, I created a method (called SPECKS) to compare DIPS data to real-life data, and another method to improve the statistical inferences of non-parametric DIPS approaches. These methods were tested on voter registration data. Finally, I developed a DIPS technique for social network data called Noisy Edges and Traits (NET) and applied it to two real-life data sets.
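As a rough illustration of what releasing synthetic data under differential privacy can look like, the sketch below perturbs the category counts of a data set with Laplace noise and then samples synthetic records from the noisy counts. It is not one of the DIPS methods developed in the dissertation (the record does not describe them in detail), and the function name `dp_synthetic_sample`, its parameters, and the toy data are all hypothetical. It assumes that neighboring data sets differ by adding or removing one record (so each count changes by at most 1), that the category list is fixed in advance rather than read from the data, and that the synthetic sample size is public.

```python
import random
from collections import Counter


def dp_synthetic_sample(records, categories, epsilon, n_synth, rng):
    """Sample synthetic categorical records from Laplace-perturbed counts.

    Illustrative sketch only. Assumes add/remove-one-record neighbors,
    so the histogram has L1 sensitivity 1 and Laplace noise with scale
    1/epsilon is added to every count.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")

    counts = Counter(records)
    noisy_counts = []
    for category in categories:
        # The difference of two independent Exponential(rate=epsilon)
        # draws follows a Laplace distribution with scale 1/epsilon.
        noise = rng.expovariate(epsilon) - rng.expovariate(epsilon)
        # Clamping at zero is post-processing and does not weaken
        # the privacy guarantee.
        noisy_counts.append(max(counts[category] + noise, 0.0))

    # If every noisy count was clamped to zero, fall back to uniform weights.
    if sum(noisy_counts) == 0:
        noisy_counts = [1.0] * len(categories)

    # Sampling from the noisy histogram is also post-processing.
    return rng.choices(categories, weights=noisy_counts, k=n_synth)


# Toy usage with made-up data.
rng = random.Random(0)
records = ["a"] * 50 + ["b"] * 30 + ["c"] * 20
synthetic = dp_synthetic_sample(
    records, ["a", "b", "c"], epsilon=1.0, n_synth=100, rng=rng
)
```

Smaller values of `epsilon` add more noise and give a stronger privacy guarantee at the cost of accuracy. Floating-point noise sampling of this kind is fine for illustration but is known to be unsuitable for production-grade privacy guarantees.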

History

Alt Title

Differentially private data synthesis

Date Created

2018-04-03

Date Modified

2018-10-30

Defense Date

2018-03-27

Research Director(s)

Fang Liu

Degree

  • Doctor of Philosophy

Degree Level

  • Doctoral Dissertation

Program Name

  • Applied and Computational Mathematics and Statistics
