Loading...
Citations
Altmetric:
Abstract
Experimentation increasingly drives everyday decisions in modern life, as it is considered by some to be the gold standard for determining cause and effect within any system. Digital experiments have expanded the scope and frequency of experiments, which can range in complexity from classic A/B tests to contextual bandits experiments, which share features with reinforcement learning. Although there exists a large body of prior work on estimating treatment effects using experiments, this prior work did not anticipate the new challenges and opportu- nities introduced by digital experimentation. Novel errors and threats to validity arise at the intersection of software and experimentation, especially when experimentation is in service of understanding humans behavior or autonomous black-box agents. We present several novel tools for automating aspects of the experimentation- analysis pipeline. We propose new methods for evaluating online field experimentation, automatically generating corresponding analyses of treatment effects. We then draw the connection between software testing and experimental design and argue that applying software testing techniques to a kind of autonomous agent—a deep reinforcement learning agent—to demonstrate the need for novel testing paradigms when a software stack uses learned components that may have emergent behavior. We show how our system may be used to evaluate claims made about the behavior of autonomous agents and find that some claims do not hold up under test. Finally, we show how to produce explanations of the behavior of black-box software-defined agents interacting with white-box environments via automated experimentation. We show how an automated system can be used for exploratory data analysis, with a human in the loop, to investigate a large space of possible counterfactual explanations.
Type
dissertation
Date
2020-12
Publisher
Degree
Advisors
License
License
http://creativecommons.org/licenses/by/4.0/