Posted on 2024-04-30, 17:46. Authored by Xingyuan Zhao.
Differential privacy (DP) formalizes privacy guarantees in a rigorous mathematical framework and is a state-of-the-art concept in data privacy research. DP mechanisms ensure the privacy of each individual in a sensitive dataset while releasing useful information about the population in that dataset. Since its debut in 2006, significant advances have been made in DP theory, methodology, and applications, and new research topics and questions have been proposed and studied. This dissertation contributes to the advancement of DP concepts and methods in three areas: the robustness of DP mechanisms to privacy attacks, privacy amplification through subsampling, and the DP guarantees of procedures with intrinsic randomness. Specifically, the dissertation comprises three research projects on DP. The first project examines the protection that DP mechanisms offer against homogeneity attacks (HA) by deriving analytical relations between measures of disclosure risk from HA and the privacy loss parameters; these relations help practitioners ground the abstract concepts of DP in a concrete privacy attack model and offer a perspective for choosing privacy loss parameters. The second project proposes a class of subsampling methods, the "MUltistage Sampling Technique" (MUST), for privacy amplification, and provides a privacy composition analysis over repeated applications of MUST via the Fourier accountant algorithm. Utility experiments show that MUST delivers utility and stability in privacy-preserving outputs comparable to one-stage subsampling methods at similar privacy loss, while improving the computational efficiency of algorithms that require complex function calculations on distinct data points. MUST can be seamlessly integrated into stochastic optimization algorithms and other procedures involving parallel or simultaneous subsampling that require DP guarantees.
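The abstract does not specify MUST's stages, so the following is only a minimal illustrative sketch of the general multistage-subsampling idea it describes: a first-stage draw without replacement followed by a second-stage draw with replacement from the stage-one subset. The function name, stage count, and replacement choices are assumptions for illustration, not the dissertation's actual construction.

```python
import random

def multistage_subsample(data, m, n, seed=None):
    """Illustrative two-stage subsample: draw m records without
    replacement (stage one), then n records with replacement from
    the stage-one subset (stage two)."""
    rng = random.Random(seed)
    stage_one = rng.sample(data, m)                        # without replacement
    stage_two = [rng.choice(stage_one) for _ in range(n)]  # with replacement
    return stage_two

# Example: a minibatch of 10 records drawn from a 100-record dataset.
batch = multistage_subsample(list(range(100)), m=20, n=10, seed=0)
```

Because the second stage samples with replacement from a small stage-one subset, repeated records can be deduplicated before any expensive per-record computation, which is the computational-efficiency point the abstract makes.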
The third project investigates the inherent DP guarantees of Bayesian posterior sampling. It establishes a new privacy loss bound for releasing a single posterior sample under any prior, given a bounded log ratio of the likelihood kernels on two neighboring datasets. The new bound is tighter than existing bounds and consistent with the likelihood principle. Experiments show that privacy-preserving synthetic data released from Bayesian models that leverage inherently private posterior samples offer improved utility over data generated by sanitizing the original information through explicit DP mechanisms.
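As a minimal sketch of the release mechanism the abstract describes, the code below draws a single sample from a conjugate Beta posterior for a Bernoulli proportion; the sample's intrinsic randomness is what carries the privacy guarantee when the log-likelihood-ratio between neighboring datasets is bounded. The Beta-Bernoulli model, function name, and default prior are assumptions for illustration; the dissertation's bound itself is not computed here.

```python
import random

def private_posterior_sample(data, alpha=1.0, beta=1.0, seed=None):
    """Release one draw from the Beta(alpha + s, beta + n - s)
    posterior of a Bernoulli proportion, where s is the number of
    successes in the 0/1 data. Releasing only this single draw is
    the 'inherently private' output discussed in the abstract."""
    rng = random.Random(seed)
    s = sum(data)   # number of successes
    n = len(data)   # number of records
    return rng.betavariate(alpha + s, beta + n - s)

# Example: one privacy-preserving posterior draw from binary data.
theta = private_posterior_sample([1, 0, 1, 1, 0], seed=42)
```

Only the single draw `theta` would be released; the raw data, the sufficient statistics, and the full posterior stay private, which is why no explicit noise-adding mechanism appears in the code.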
History
Date Created
2024-04-08
Date Modified
2024-04-30
Defense Date
2024-03-27
CIP Code
27.9999
Research Director(s)
Fang Liu
Committee Members
Xiufan Yu
Changbo Zhu
Degree
Doctor of Philosophy
Degree Level
Doctoral Dissertation
Language
English
Library Record
006582890
OCLC Number
1432178251
Publisher
University of Notre Dame
Additional Groups
Applied and Computational Mathematics and Statistics
Program Name
Applied and Computational Mathematics and Statistics