Document Type

Open Access Dissertation

Degree Name

Doctor of Philosophy (PhD)

Degree Program

Computer Science

Year Degree Awarded

2017

Month Degree Awarded

September

First Advisor

Neil Immerman

Subject Categories

Databases and Information Systems | Theory and Algorithms

Abstract

One focus area in data management research is understanding how changes in the data affect the output of a view or standing query. Example applications are explaining query results and propagating updates through views. In this thesis we study the complexity of the resilience problem: finding the minimum number of tuples that must be deleted from the database in order to change the result of a query. We show that resilience is closely related to the well-studied problems of deletion propagation and causal responsibility, and that analyzing its complexity offers important insight for solving those problems as well.
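To make the definition concrete, the following is a minimal brute-force sketch in Python, not the method developed in the thesis. It assumes a Boolean conjunctive query, so that "changing the result" means making the query false, and it tries every set of deletions in order of increasing size, which takes exponential time. The data representation and the names `satisfies`, `resilience`, `example_db`, and `example_query` are illustrative assumptions.

```python
from itertools import combinations


def satisfies(db, query):
    """Return True if the Boolean conjunctive query has at least one match in db.

    db:    dict mapping a relation name to a set of tuples
    query: list of atoms, each a pair (relation name, tuple of variable names)
    """
    def extend(i, binding):
        if i == len(query):
            return True
        rel, variables = query[i]
        for tup in db.get(rel, ()):
            new_binding = dict(binding)
            # A tuple matches if it agrees with every variable bound so far.
            if all(new_binding.setdefault(var, val) == val
                   for var, val in zip(variables, tup)):
                if extend(i + 1, new_binding):
                    return True
        return False

    return extend(0, {})


def resilience(db, query):
    """Return (k, deleted): a smallest set of tuples whose removal makes the query false.

    Illustrative exhaustive search only; exponential in the database size.
    """
    facts = [(rel, tup) for rel, tups in db.items() for tup in tups]
    for k in range(len(facts) + 1):
        for candidate in combinations(facts, k):
            deleted = set(candidate)
            remaining = {
                rel: {t for t in tups if (rel, t) not in deleted}
                for rel, tups in db.items()
            }
            if not satisfies(remaining, query):
                return k, deleted
    return None  # only reached for a query with no atoms


# Hypothetical example: q :- R(x, y), S(y, z)
example_db = {"R": {(1, 2), (3, 2)}, "S": {(2, 4)}}
example_query = [("R", ("x", "y")), ("S", ("y", "z"))]
print(resilience(example_db, example_query))
```

In this example, deleting either tuple of R alone leaves a match through the other, while deleting the single tuple of S removes every match, so the resilience is 1.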

Our contributions include the definition of triads for conjunctive queries, which is a crucial tool in our analysis, and the characterization of an NP-complete versus P dichotomy for the resilience problem over the class of conjunctive queries without self-joins. This result also allows us to show dichotomies, for the same class of queries, for both deletion propagation with source side-effects and causal responsibility. We also completely characterize how the presence of functional dependencies can change the complexity of these problems.

The class of conjunctive queries with self-joins is far richer and more complicated than the self-join-free class. We therefore focus on binary queries without variable repetition: queries formed from unary or binary relations only, in which no variable occurs more than once in any atom. For this restricted case, we identify three main query structures that help determine complexity: chains, permutations, and confluences. Using these, we characterize classes of queries for which resilience is NP-complete and some for which it is in P.
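The restriction described above can be stated as a simple syntactic check. The sketch below is an illustration only, assuming a query is represented as a list of atoms, each a pair of a relation name and a tuple of variable names; the function names are hypothetical and do not come from the thesis.

```python
def is_binary_without_variable_repetition(query):
    """True if every atom is unary or binary and repeats no variable."""
    return all(
        1 <= len(variables) <= 2 and len(set(variables)) == len(variables)
        for _, variables in query
    )


def has_self_join(query):
    """True if some relation name occurs in more than one atom."""
    names = [rel for rel, _ in query]
    return len(set(names)) < len(names)


# q1 :- R(x, y), R(y, z)   binary, no repetition, with a self-join
q1 = [("R", ("x", "y")), ("R", ("y", "z"))]
# q2 :- R(x, x)            x is repeated inside one atom
q2 = [("R", ("x", "x"))]
# q3 :- T(x, y, z)         ternary relation
q3 = [("T", ("x", "y", "z"))]

print(is_binary_without_variable_repetition(q1), has_self_join(q1))
print(is_binary_without_variable_repetition(q2))
print(is_binary_without_variable_repetition(q3))
```

Only `q1` falls in the restricted class; `q2` is excluded because a variable repeats within an atom, and `q3` because its relation has arity three.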
