
Author ORCID Identifier


Document Type

Open Access Dissertation

Degree Name

Doctor of Philosophy (PhD)

Degree Program

Computer Science

Year Degree Awarded


Month Degree Awarded


First Advisor

Gerome Miklau

Second Advisor

Daniel Sheldon

Subject Categories

Databases and Information Systems


In recent years, differential privacy has seen significant growth and has been widely embraced as the dominant privacy definition by the research community. Much progress has been made on designing theoretically principled and practically sound privacy mechanisms, and there have even been some real-world deployments of differential privacy, although it has not yet seen widespread adoption. One challenge is that, for some problems, there is a gap between the privacy budget needed for a meaningful privacy guarantee and the budget needed to retain data utility. A second challenge is that many privacy mechanisms have trouble scaling to high-dimensional data, limiting their applicability to real-world data.
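The budget–utility gap above can be made concrete with the classic Laplace mechanism (a standard textbook mechanism, not one of the contributions of this thesis): noise is drawn with scale sensitivity/epsilon, so shrinking the budget epsilon directly inflates the error. A minimal illustrative sketch, with hypothetical names:

```python
import numpy as np

def laplace_mechanism(true_answer, sensitivity, epsilon, rng):
    # Epsilon-DP release of a numeric query: add Laplace noise whose
    # scale grows as the privacy budget epsilon shrinks.
    scale = sensitivity / epsilon
    return true_answer + rng.laplace(loc=0.0, scale=scale)

rng = np.random.default_rng(0)
true_count = 1000  # e.g., number of records matching some predicate

# A generous budget (epsilon = 1) yields a fairly accurate answer,
# while a tight budget (epsilon = 0.01) adds 100x more noise on average.
loose = laplace_mechanism(true_count, sensitivity=1, epsilon=1.0, rng=rng)
tight = laplace_mechanism(true_count, sensitivity=1, epsilon=0.01, rng=rng)
```

For a counting query (sensitivity 1), the expected absolute error is exactly 1/epsilon, which is why small budgets can overwhelm the signal in the data.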

In this work, we take significant steps towards addressing these challenges by designing mechanisms and tools that mitigate this gap and scale effectively to high-dimensional settings. This thesis consists of three high-level contributions. In Chapter 3, we present HDMM, a mechanism for linear query answering under differential privacy that scales effectively to large multi-dimensional domains while providing more utility than a large body of prior work. In Chapter 4, we present PrivatePGM, a general-purpose post-processing tool that can estimate a discrete data distribution from noisy observations, improving the utility and scalability of many existing mechanisms at no cost to privacy. In Chapter 5, we present AIM, a mechanism for differentially private synthetic data generation that leverages PrivatePGM to scale to high-dimensional settings, while introducing a number of novel components to overcome the utility limitations of prior work.
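Mechanisms in this family commonly work by measuring low-dimensional marginals (contingency tables) of the data with calibrated noise, then post-processing the noisy measurements into a consistent estimate. The sketch below illustrates only that measurement step plus a trivial non-negativity projection; it is an illustrative toy with hypothetical names, not the actual HDMM, PrivatePGM, or AIM implementations (PrivatePGM, for instance, fits a full graphical model to many such measurements, which this code does not do):

```python
import numpy as np

def noisy_marginal(data, cols, domain_sizes, sigma, rng):
    # Count how often each combination of values of `cols` occurs,
    # then add Gaussian noise with std `sigma` to every cell
    # (as in Gaussian-mechanism measurement steps).
    shape = tuple(domain_sizes[c] for c in cols)
    counts = np.zeros(shape)
    for row in data:
        counts[tuple(row[c] for c in cols)] += 1
    return counts + rng.normal(0.0, sigma, size=shape)

def project_nonnegative(noisy):
    # A simple post-processing step: clip negative cells.
    # Post-processing costs no additional privacy budget.
    return np.clip(noisy, 0.0, None)

# Toy dataset over two binary attributes.
data = [{"A": 0, "B": 1}, {"A": 0, "B": 0}, {"A": 1, "B": 1}]
domain_sizes = {"A": 2, "B": 2}
rng = np.random.default_rng(0)
estimate = project_nonnegative(
    noisy_marginal(data, ("A", "B"), domain_sizes, sigma=1.0, rng=rng)
)
```

The choice of *which* marginals to measure, and how to reconcile them into one distribution, is precisely where mechanisms like AIM add their novel components.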


Creative Commons License

Creative Commons Attribution 4.0 License
This work is licensed under a Creative Commons Attribution 4.0 License.