Author ORCID Identifier

https://orcid.org/0000-0002-8050-377X

Document Type

Open Access Dissertation

Degree Name

Doctor of Philosophy (PhD)

Degree Program

Computer Science

Year Degree Awarded

2020

Month Degree Awarded

February

First Advisor

Daniel Sheldon

Abstract

Domains involving sensitive human data, such as health care, human mobility, and online activity, are becoming increasingly dependent upon machine learning algorithms. This leads to scenarios in which data owners wish to protect the privacy of the individuals represented in the sensitive data, while data modelers wish to analyze and draw conclusions from the same data. Thus there is a growing demand for effective private inference methods that can reconcile the needs of both parties. For this we turn to differential privacy, which provides a framework for executing algorithms in a private fashion by injecting specifically designed randomization at various points in the process. The majority of existing work proceeds by ignoring the injected randomization, potentially leading to pathologies in algorithmic performance. There is, however, a small body of existing work that performs inference over the injected randomization in an attempt to design more principled algorithms. This thesis summarizes the subfield of noise-aware differentially private inference and contributes novel algorithms for important problems.

The differential privacy literature provides a multitude of privacy mechanisms. We opt for sufficient statistics perturbation (SSP), in which the sufficient statistics, quantities that capture all the information in the data about the model parameters, are corrupted with random noise and released to the public. This mechanism offers desirable efficiency properties in comparison to alternatives. In this thesis we develop principled methods that directly account for the injected noise in three settings: maximum likelihood estimation of undirected graphical models, Bayesian inference of exponential family models, and Bayesian inference of conditional regression models.
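
As a rough illustration of the SSP idea described above, the following minimal sketch releases a noisy sufficient statistic for binary data and then computes a posterior mean that marginalizes over the unknown true count instead of treating the noisy value as exact. It is not the thesis's algorithm. The Bernoulli model, the Laplace mechanism with sensitivity 1, the uniform prior, the grid approximation, the assumption that the dataset size n is public, and all function names are illustrative assumptions.

```python
import math
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is distributed as Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def release_noisy_count(data, epsilon, rng):
    """Perturb the sufficient statistic sum(data) of binary data with Laplace noise."""
    s = sum(data)        # sufficient statistic of a Bernoulli model
    sensitivity = 1.0    # assumption: changing one binary record moves the sum by at most 1
    return s + laplace_noise(sensitivity / epsilon, rng)


def laplace_pdf(x, scale):
    return math.exp(-abs(x) / scale) / (2.0 * scale)


def noise_aware_posterior_mean(z, n, epsilon, grid_size=201):
    """Posterior mean of theta given the noisy count z, under a uniform prior.

    The unknown true count s is summed out:
        p(z | theta) = sum_s Binomial(s; n, theta) * Laplace(z - s; 1 / epsilon)
    The posterior over theta is approximated on an evenly spaced grid.
    """
    scale = 1.0 / epsilon
    thetas = [i / (grid_size - 1) for i in range(grid_size)]
    weights = []
    for t in thetas:
        likelihood = sum(
            math.comb(n, s) * t**s * (1.0 - t) ** (n - s) * laplace_pdf(z - s, scale)
            for s in range(n + 1)
        )
        weights.append(likelihood)
    total = sum(weights)
    return sum(t * w for t, w in zip(thetas, weights)) / total


# Hypothetical usage: 50 binary records, privacy parameter epsilon = 1.
rng = random.Random(0)
data = [1] * 30 + [0] * 20
z = release_noisy_count(data, epsilon=1.0, rng=rng)
estimate = noise_aware_posterior_mean(z, n=len(data), epsilon=1.0)
```

A plug-in estimate such as z / n can fall outside [0, 1] when the noise is large relative to n, whereas the posterior mean above always lies inside the unit interval because the noise is modeled explicitly.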

DOI

https://doi.org/10.7275/z2q6-6039

Creative Commons License

Creative Commons Attribution 4.0 License
