Access Type
Open Access Dissertation
Document Type
Dissertation
Degree Name
Doctor of Philosophy (PhD)
Degree Program
Computer Science
Year Degree Awarded
2020
Month Degree Awarded
May
First Advisor
Philip S. Thomas
Second Advisor
Erik Learned-Miller
Third Advisor
Shlomo Zilberstein
Fourth Advisor
Melinda D. Dyar
Subject Categories
Artificial Intelligence and Robotics
Abstract
In this dissertation we develop techniques that leverage prior knowledge to improve the learning speed of existing reinforcement learning (RL) algorithms. RL systems can be expensive to train, which limits their applicability when a large number of agents must be trained to solve a large number of tasks, a situation that is common in industry but often ignored in the RL literature. We develop three methods that leverage the experience obtained from solving a small number of tasks to improve an agent's ability to learn the new tasks it might face in the future. First, we propose using compression algorithms to identify macros that are likely to be generated by an optimal policy. Because compression techniques identify sequences that occur frequently, they can be used to identify action patterns that are often required to solve a task. Second, we address some of the limitations of the first method by formalizing an optimization problem that allows an agent to learn a set of options appropriate for its tasks. Specifically, we propose an objective analogous to compression: minimizing the number of decisions an agent must make to generate the observed optimal behavior. This technique also addresses a question that is often ignored in the options literature: how many options are needed? Finally, we show that prior experience can also be leveraged to address the exploration-exploitation dilemma, a central problem in RL. We propose a framework in which a small number of tasks are used to train a meta-agent on how to explore; after training, any agent facing a new task can query the meta-agent for the action it should take to explore. We show empirically that, when facing a large number of tasks, leveraging prior experience can be an effective way of improving existing reinforcement learning techniques. At present, the application of RL in industrial settings remains limited, in part because training large-scale systems is costly and time-consuming. We hope this work provides some guidance and inspires new research on exploiting existing knowledge to make RL a practical alternative for tackling large-scale real-world problems.
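To make the first method concrete, below is a minimal sketch of how a compression primitive can surface candidate macros from demonstrated behavior. It applies a byte-pair-encoding-style merge over action sequences, repeatedly replacing the most frequent adjacent action pair with a single macro symbol. The trajectories, the function name find_macros, and the merge scheme are illustrative assumptions, not the dissertation's exact algorithm.

    from collections import Counter

    def find_macros(trajectories, num_macros=2):
        """Extract frequently recurring action sequences (candidate macros)
        from demonstration trajectories via byte-pair-encoding-style merges.
        Illustrative sketch only, not the dissertation's algorithm."""
        seqs = [list(t) for t in trajectories]
        macros = []
        for _ in range(num_macros):
            # Count adjacent symbol pairs across all trajectories.
            pairs = Counter()
            for s in seqs:
                pairs.update(zip(s, s[1:]))
            if not pairs:
                break
            best = pairs.most_common(1)[0][0]
            macros.append(best)
            # Replace every occurrence of the best pair with one macro symbol,
            # so later merges can build longer macros from shorter ones.
            for i, s in enumerate(seqs):
                merged, j = [], 0
                while j < len(s):
                    if j + 1 < len(s) and (s[j], s[j + 1]) == best:
                        merged.append(best)
                        j += 2
                    else:
                        merged.append(s[j])
                        j += 1
                seqs[i] = merged
        return macros

    # Hypothetical demonstrations in which "up, up, right" recurs.
    demos = [["up", "up", "right", "right"],
             ["up", "up", "right", "up", "up", "right"],
             ["left", "up", "up", "right"]]
    print(find_macros(demos))  # e.g. [('up', 'up'), (('up', 'up'), 'right')]

Here the nested tuple (('up', 'up'), 'right') encodes the three-step macro up-up-right, an action pattern an agent could invoke as a single decision when facing related tasks.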
DOI
https://doi.org/10.7275/q4mw-sh77
Recommended Citation
Garcia, Francisco M., "Improving Reinforcement Learning Techniques by Leveraging Prior Experience" (2020). Doctoral Dissertations. 1887.
https://doi.org/10.7275/q4mw-sh77
https://scholarworks.umass.edu/dissertations_2/1887
Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.