ScholarWorks@UMassAmherst
We are now able to accept submissions directly in ScholarWorks. For submissions that are not doctoral dissertations or master's theses, please log in with your NetID, click the + (plus) in the top left corner, and select the Submit Research option.
Graduate students filing for February 2025 degrees: We are now accepting submissions directly to ScholarWorks. Directions for submissions can be found in this guide. Please email scholarworks@library.umass.edu if you have any questions.
Request forms appear to be functional. If you do not receive a reply to a submitted request, please email scholarworks@library.umass.edu.
This site is still under construction; please see our ScholarWorks guide for updates.
Recent Submissions
Publication
2 Factors Challenging Faculty's Sense of Inclusion (2024-08-08)
Pandemic-related caregiving burdens and health concerns have played a particularly large role, write Shuyin Liu, Dessie Clark, Laurel Smith-Doerr and Joya Misra.

Publication
COVID’s Lasting Impacts on Faculty Inclusion (2024-08-01)
Think the pandemic is well behind us? Survey data show that feelings of inclusion have continued to drop as a result of it, write Laurel Smith-Doerr, Joya Misra, Shuyin Liu and Dessie Clark.

Publication
Exploiting Structures in Interactive Decision Making (2024-09)
In this thesis we study several problems in interactive decision making. Interactive decision making plays an important role in many applications, such as online advertising and autonomous driving. Two classical problems are multi-armed bandits and reinforcement learning. Here and more broadly, the central challenge is the exploration-exploitation tradeoff, whereby the agent must decide whether to explore uncertain actions that could potentially bring high reward or to stick to known good actions. Resolving this challenge is particularly difficult in settings with large or continuous state and action spaces. For reinforcement learning, function approximation is a prevalent structure for managing large state and action spaces, but misspecification of the function classes can have a detrimental effect on statistical outcomes. These structured settings are the focus of this thesis.

First, we study the combinatorial pure exploration problem in the multi-armed bandit framework. In this problem, we are given $K$ distributions and a collection of subsets $\mathcal{V} \subseteq 2^{[K]}$ of these distributions, and we would like to find the subset $v \in \mathcal{V}$ with the largest mean while collecting, in a sequential fashion, as few samples from the distributions as possible. We develop new algorithms with strong statistical and computational guarantees by leveraging precise concentration-of-measure arguments and a reduction to linear programming.

Second, we study reinforcement learning in continuous state and action spaces endowed with a metric. We provide a refined analysis of a variant of the algorithm of Sinclair, Banerjee, and Yu (2019) and show that its regret scales with the zooming dimension of the instance. Our results are the first provably adaptive guarantees for reinforcement learning in metric spaces.

Finally, we study the more fundamental problem of distribution shift, where training and deployment conditions for a machine learning model differ. We study the effect of distribution shift in the presence of model misspecification, focusing on $L_{\infty}$-misspecified regression and adversarial covariate shift, where the regression target remains fixed while the covariate distribution changes arbitrarily. We develop a new algorithm, inspired by robust optimization techniques, that avoids misspecification amplification while still obtaining optimal statistical rates. As applications, we use this regression procedure to obtain new guarantees in offline and online reinforcement learning with misspecification and establish new separations between previously studied structural conditions and notions of coverage.
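For orientation, the combinatorial pure exploration setup described in the abstract above can be stated compactly. The display below is our paraphrase rather than the thesis's own notation: we write $\mu_k$ for the unknown mean of distribution $k$ and, as one common convention the abstract leaves implicit, take the mean of a subset to be the sum of its members' means:

\[
v^\star \in \operatorname*{arg\,max}_{v \in \mathcal{V}} \; \sum_{k \in v} \mu_k, \qquad \mathcal{V} \subseteq 2^{[K]},
\]

and the goal is to identify $v^\star$ with high probability while drawing as few sequential samples from the $K$ distributions as possible.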
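In the same spirit, the $L_{\infty}$-misspecified regression condition mentioned at the end of the abstract is commonly written as follows; this is a standard reading on our part, and the thesis's exact definition may differ. The learner fits a function class $\mathcal{F}$ that approximates the true regression function $f^\star$ only up to uniform error $\varepsilon$:

\[
\inf_{f \in \mathcal{F}} \, \| f - f^\star \|_{\infty} \;\le\; \varepsilon.
\]

Under adversarial covariate shift, a naive least-squares fit can suffer error much larger than $\varepsilon$ on the deployment distribution; this blow-up is the misspecification amplification that the abstract's algorithm is designed to avoid.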
Publication
Improving Variational Inference through Advanced Stochastic Optimization Techniques (2024-09)
Black-box variational inference (VI) is crucial in probabilistic machine learning, offering an alternative method for Bayesian inference. By requiring only black-box access to the model and its gradients, it recasts complex inference tasks as more manageable optimization problems, aiding the approximation of intricate posterior distributions across a wide range of models. However, black-box VI faces a fundamental challenge: managing the noise introduced by stochastic gradient optimization methods, which limits efficient approximation. This thesis presents new approaches that enhance the efficiency of black-box VI by improving different aspects of its optimization process.

The first part of this thesis focuses on the importance-weighted evidence lower bound (IW-ELBO), an objective used in the VI optimization problem. By incorporating importance sampling, the IW-ELBO augments the expressive power of the approximating distributions used in VI. However, it also increases the variance of gradient estimates, complicating the optimization process. To mitigate this, we apply the theory of U-statistics, an approach that significantly reduces variance. Since fully implementing U-statistics can be impractical due to exponential growth in computation, we introduce approximate methods that effectively reduce variance with minimal computational overhead.

The second part addresses a central issue within black-box VI: its stochastic optimization process, i.e., stochastic gradient descent or its variants, is highly sensitive to user-specified hyperparameter choices, often leading to poor results. We address this issue by introducing an algorithm specifically designed for VI, based on the sample average approximation (SAA). This method, SAA for VI, transforms the stochastic optimization task into a sequence of deterministic problems that can be solved using standard optimization techniques. As a result, it simplifies and automates the optimization process, reduces the burden of hyperparameter tuning, and exhibits robust performance, particularly in complex statistical models involving hundreds of latent variables.

In the third part, we shift our focus from the objective and optimization process to the approximating distributions themselves and their gradient estimation. Specifically, we explore how to use reparameterization, a key technique in black-box VI, for mixture distributions. Because sampling from a mixture involves discrete choices, the standard reparameterization trick is not directly applicable. Although prior work has proposed several gradient estimators that use some form of reparameterization, there remains a noticeable lack of clarity regarding which estimators are available, in which contexts they apply, and how they compare. To address this gap, we introduce and evaluate the most relevant gradient estimators for mixture distributions within a consistent mathematical framework and, through this framework, extend existing estimators to new settings. We then give a comprehensive performance comparison: theoretical, where we can sometimes compare variance, and empirical, where we assess the estimators across different setups. Finally, we address the often-overlooked computational aspect by introducing novel, efficient algorithms for some of the estimators.

This thesis contributes to both the theoretical understanding and practical implementation of VI. By introducing new methods and techniques, we aim to enhance the accuracy and efficiency of VI and broaden its applicability.
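For context on the first part: a standard form of the importance-weighted ELBO with $M$ samples is shown below. The notation is ours, following the importance-weighted autoencoder literature, with variational distribution $q_\lambda$, joint model density $p(x, z)$, and observed data $x$; the thesis may set this up differently.

\[
\mathcal{L}_M(\lambda) \;=\; \mathbb{E}_{z_1, \dots, z_M \sim q_\lambda}\!\left[ \log \frac{1}{M} \sum_{m=1}^{M} \frac{p(x, z_m)}{q_\lambda(z_m)} \right].
\]

At $M = 1$ this reduces to the usual ELBO, and the bound tightens toward $\log p(x)$ as $M$ grows; the tradeoff noted in the abstract is that Monte Carlo estimates of its gradient become noisier, which is the variance the U-statistics machinery targets.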
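The sample average approximation idea in the second part can be illustrated concretely: fix the base randomness once, so that the estimated negative ELBO becomes a deterministic function of the variational parameters, then hand it to an off-the-shelf deterministic optimizer. The sketch below is a minimal illustration on a toy Gaussian target of our own choosing, not code from the thesis, and it solves a single deterministic problem where the thesis describes solving a sequence of them.

# Minimal SAA-for-VI sketch: fixing the base samples eps turns the
# Monte Carlo negative ELBO into a deterministic function, so a
# standard deterministic optimizer (L-BFGS-B here) applies directly.
# Toy setup (our assumption): target posterior N(3, 2^2), approximating
# family N(mu, exp(s)^2); the optimum should recover mean 3, std 2.
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(0)
eps = rng.standard_normal(256)  # fixed base samples, shared across all calls

def log_p(z):
    # Log density of the toy target N(3, 2^2).
    return norm.logpdf(z, loc=3.0, scale=2.0)

def neg_elbo(params):
    mu, s = params
    sigma = np.exp(s)                           # positivity via log-std
    z = mu + sigma * eps                        # reparameterized samples
    log_q = norm.logpdf(z, loc=mu, scale=sigma)
    return -np.mean(log_p(z) - log_q)           # deterministic given fixed eps

result = minimize(neg_elbo, x0=np.array([0.0, 0.0]), method="L-BFGS-B")
print(f"fitted q: mean={result.x[0]:.3f}, std={np.exp(result.x[1]):.3f}")

Because eps never changes, repeated calls to neg_elbo at the same parameters return the same value, so the line searches and curvature estimates inside L-BFGS-B behave as they would on any smooth deterministic objective, and no learning-rate schedule has to be tuned.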
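For the third part, one simple member of the reparameterization-style family for mixtures decomposes the expectation across components: for $q(z) = \sum_k w_k q_k(z)$, linearity gives $\mathbb{E}_q[f] = \sum_k w_k \mathbb{E}_{q_k}[f]$, and each component expectation can be reparameterized on its own, so no discrete component is ever sampled. The toy check below is our construction, and whether this particular estimator is among those compared in the thesis is an assumption on our part.

# Decomposition estimator for a two-component Gaussian mixture:
# gradient w.r.t. the component means of E_q[f(z)] with f(z) = z^2,
# estimated by reparameterizing each component separately
# (z = m_k + s_k * eps). For this f the exact gradient is
# w_k * 2 * m_k, which lets us verify the estimate.
import numpy as np

rng = np.random.default_rng(1)
w = np.array([0.3, 0.7])    # mixture weights
m = np.array([-1.0, 2.0])   # component means
s = np.array([0.5, 1.5])    # component standard deviations

eps = rng.standard_normal((2, 10_000))   # base samples per component
z = m[:, None] + s[:, None] * eps        # reparameterized samples

# f(z) = z^2 gives df/dz = 2z, and dz/dm_k = 1, so the per-component
# gradient estimate is w_k * mean(2 * z_k).
grad_mc = w * np.mean(2.0 * z, axis=1)
grad_exact = w * 2.0 * m

print("Monte Carlo:", grad_mc)     # approximately [-0.6, 2.8]
print("exact:      ", grad_exact)  # [-0.6, 2.8]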
Communities in ScholarWorks
Select a community to browse its collections.