Publication

Practical Methods for High-Dimensional Data Publication with Differential Privacy

Abstract
In recent years, differential privacy has seen significant growth and has been widely embraced by the research community as the dominant privacy definition. Much progress has been made on designing theoretically principled and practically sound privacy mechanisms. There have even been some real-world deployments of differential privacy, although it has not yet seen widespread adoption. One challenge is that, for some problems, there is a gap between the privacy budget required for a meaningful privacy guarantee and the budget required to retain data utility. A second challenge is that many privacy mechanisms have trouble scaling to high-dimensional data, limiting their applicability to real-world data. In this work, we take significant steps towards addressing these challenges by designing mechanisms and tools that mitigate this gap and scale effectively to high-dimensional settings. This thesis consists of three high-level contributions. In Chapter 3, we present HDMM, a mechanism for linear query answering under differential privacy that scales effectively to large multi-dimensional domains while providing more utility than a large body of prior work. In Chapter 4, we present PrivatePGM, a general-purpose post-processing tool that can estimate a discrete data distribution from noisy observations, improving the utility and scalability of many existing mechanisms at no cost to privacy. In Chapter 5, we present AIM, a mechanism for differentially private synthetic data generation that leverages PrivatePGM to scale to high-dimensional settings, while introducing a number of novel components to overcome the utility limitations of prior work.
Type
dissertation
Date
2022-05
License
http://creativecommons.org/licenses/by/4.0/