Author ORCID Identifier
https://orcid.org/0000-0003-4567-8627
Access Type
Open Access Dissertation
Document Type
Dissertation
Degree Name
Doctor of Philosophy (PhD)
Degree Program
Computer Science
Year Degree Awarded
2023
Month Degree Awarded
February
First Advisor
Philip S. Thomas
Subject Categories
Artificial Intelligence and Robotics
Abstract
Scientific fields advance by leveraging the knowledge created by others to push the boundary of understanding. The primary tool in many fields for generating knowledge is empirical experimentation. Although empirical experiments are common, generating accurate knowledge from them is often challenging due to inherent randomness in execution and confounding variables that can obscure the correct interpretation of results. As such, researchers must hold themselves and others to a high degree of rigor when designing experiments. Unfortunately, most reinforcement learning (RL) experiments lack this rigor, making the knowledge they generate dubious. This dissertation proposes methods to address central issues in RL experimentation.
Evaluating the performance of an RL algorithm is the most common type of experiment in the RL literature. Yet most performance evaluations are incapable of answering a specific research question and produce misleading results. Thus, the first issue we address is how to create a performance evaluation procedure that holds up to scientific standards.
Despite the prevalence of performance evaluation, such experiments produce limited knowledge, e.g., they can show how well an algorithm worked but not why, and they require significant time and computational resources. As an alternative, this dissertation proposes that scientific testing, the process of conducting carefully controlled experiments designed to further the knowledge and understanding of how an algorithm works, should be the primary form of experimentation.
Lastly, this dissertation provides a case study using policy gradient methods, showing how scientific testing can replace performance evaluation as the primary form of experimentation. In doing so, this dissertation aims to motivate others in the field to adopt more rigorous experimental practices.
DOI
https://doi.org/10.7275/32956067
Recommended Citation
Jordan, Scott M., "Rigorous Experimentation For Reinforcement Learning" (2023). Doctoral Dissertations. 2760.
https://doi.org/10.7275/32956067
https://scholarworks.umass.edu/dissertations_2/2760
Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.