ScholarWorks@UMassAmherst

Recent Submissions

  • Publication
    2 Factors Challenging Faculty's Sense of Inclusion
    (2024-08-08) Liu, Shuyin; Clark, Dessie; Smith-Doerr, Laurel; Misra, Joya
    Pandemic-related caregiving burdens and health concerns have played a particularly large role, write Shuyin Liu, Dessie Clark, Laurel Smith-Doerr and Joya Misra.
  • Publication
    COVID’s Lasting Impacts on Faculty Inclusion
    (2024-08-01) Smith-Doerr, Laurel; Misra, Joya; Liu, Shuyin; Clark, Dessie
    Think the pandemic is well behind us? Survey data shows feelings of inclusion have continued dropping as a result of it, write Laurel Smith-Doerr, Joya Misra, Shuyin Liu and Dessie Clark.
  • Publication
    Exploiting Structures in Interactive Decision Making
    (2024-09) Cao, Tongyi
In this thesis we study several problems in interactive decision making. Interactive decision making plays an important role in many applications such as online advertising and autonomous driving. Two classical problems are multi-armed bandits and reinforcement learning. Here and more broadly, the central challenge is the \emph{exploration-exploitation} tradeoff, whereby the agent must decide whether to explore uncertain actions that could potentially bring high reward or to stick to known good actions. Resolving this challenge is particularly difficult in settings with large or continuous state and action spaces. For reinforcement learning, function approximation is a prevalent structure for managing large state and action spaces. However, misspecification of the function classes can have a detrimental effect on the statistical outcomes. These structured settings are the focus of this thesis. First, we study the combinatorial pure exploration problem in the multi-armed bandit framework. In this problem, we are given $K$ distributions and a collection of subsets $\mathcal{V} \subset 2^{[K]}$ of these distributions, and we would like to find the subset $v \in \mathcal{V}$ with the largest mean while collecting, in a sequential fashion, as few samples from the distributions as possible. We develop new algorithms with strong statistical and computational guarantees by leveraging precise concentration-of-measure arguments and a reduction to linear programming. Second, we study reinforcement learning in continuous state and action spaces endowed with a metric. We provide a refined analysis of a variant of the algorithm of Sinclair, Banerjee, and Yu (2019) and show that its regret scales with the \emph{zooming dimension} of the instance. Our results are the first provably adaptive guarantees for reinforcement learning in metric spaces.
Finally, we study a more fundamental problem of \emph{distribution shift}, where training and deployment conditions for a machine learning model differ. We study the effect of distribution shift in the presence of model misspecification, focusing specifically on $L_{\infty}$-misspecified regression and \emph{adversarial covariate shift}, where the regression target remains fixed while the covariate distribution changes arbitrarily. We develop a new algorithm, inspired by robust optimization techniques, that avoids misspecification amplification while still obtaining optimal statistical rates. As applications, we use this regression procedure to obtain new guarantees in offline and online reinforcement learning with misspecification and to establish new separations between previously studied structural conditions and notions of coverage.
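The exploration-exploitation tradeoff described in the abstract above can be illustrated with a classic multi-armed bandit baseline. The sketch below is a minimal UCB1 implementation on Bernoulli arms; it is an illustrative example only, not an algorithm from the thesis, and the arm means, horizon, and seed are hypothetical values chosen for the demo.

```python
import numpy as np

def ucb1(means, horizon, seed=0):
    """Minimal UCB1 sketch for the multi-armed bandit problem.

    `means` are the Bernoulli arm means, unknown to the agent, which
    balances exploring uncertain arms against exploiting the best
    empirical arm via an upper-confidence-bound bonus.
    """
    rng = np.random.default_rng(seed)
    k = len(means)
    counts = np.zeros(k)
    sums = np.zeros(k)
    # Pull each arm once to initialize the estimates.
    for a in range(k):
        sums[a] += rng.random() < means[a]
        counts[a] += 1
    for t in range(k, horizon):
        # Upper confidence bound: empirical mean + exploration bonus.
        ucb = sums / counts + np.sqrt(2 * np.log(t + 1) / counts)
        a = int(np.argmax(ucb))
        sums[a] += rng.random() < means[a]
        counts[a] += 1
    return counts

counts = ucb1([0.3, 0.5, 0.7], horizon=5000)
```

Over 5,000 rounds the agent concentrates most of its pulls on the best arm (index 2) while still paying a roughly logarithmic exploration cost on the others.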
  • Publication
    Improving Variational Inference through Advanced Stochastic Optimization Techniques
    (2024-09) Burroni, Javier
Black-box variational inference (VI) is crucial in probabilistic machine learning, offering an alternative method for Bayesian inference. By requiring only black-box access to the model and its gradients, it recasts complex inference tasks as more manageable optimization problems, aiding the approximation of intricate posterior distributions across a wide range of models. However, black-box VI faces a fundamental challenge: managing the noise introduced by stochastic gradient optimization methods, which limits how efficiently good approximations can be found. This thesis presents new approaches that enhance the efficiency of black-box VI by improving different aspects of its optimization process. The first part of this thesis focuses on the importance-weighted evidence lower bound (IW-ELBO), an objective used in the VI optimization problem. The IW-ELBO, by incorporating importance sampling, augments the expressive power of the approximating distributions used in VI. However, it also increases the variance of gradient estimates, complicating the optimization process. To mitigate this, the thesis applies the theory of U-statistics, an approach that significantly reduces variance. Since fully implementing U-statistics can be impractical due to exponential growth in computation, we introduce approximate methods that effectively reduce variance with minimal computational overhead. The second part of this thesis addresses a central issue within black-box VI: its stochastic optimization process, i.e., stochastic gradient descent or its variants, is highly sensitive to user-specified hyperparameter choices, often leading to poor results. We address this issue by introducing an algorithm specifically designed for VI, based on the sample average approximation (SAA). This method, SAA for VI, transforms the stochastic optimization task into a sequence of deterministic problems that can be solved using standard optimization techniques.
As a result, it simplifies and automates the optimization process, reduces the burden of hyperparameter tuning, and exhibits robust performance, particularly in complex statistical models involving hundreds of latent variables. In the third part of this thesis, we shift our focus from the objective and the optimization process to the approximating distributions used in VI and their gradient estimation. Specifically, we explore how to use reparameterization---a key technique in black-box VI---for mixture distributions. Because sampling from a mixture involves discrete choices, the standard reparameterization trick is not directly applicable. Although prior work has proposed several gradient estimators that use some form of reparameterization, there remains a noticeable lack of clarity about which estimators are available, in which contexts they apply, and how they compare. To address this gap, we introduce and evaluate the most relevant gradient estimators for mixture distributions within a consistent mathematical framework and, through this framework, extend existing estimators to new settings. We then give a comprehensive performance comparison of the estimators---theoretical, where variances can sometimes be compared directly, and empirical, where we assess the estimators across different setups. Finally, we address the often-overlooked computational aspect by introducing novel, efficient algorithms for some of the estimators. This thesis contributes to both the theoretical understanding and the practical implementation of VI. By introducing new methods and techniques, we aim to enhance the accuracy and efficiency of VI and to broaden its applicability.
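The sample average approximation (SAA) idea from the second part of the abstract above can be sketched on a toy problem. In the sketch below, the target posterior N(3, 1), the Gaussian variational family, and the step sizes are all hypothetical choices for illustration, not the thesis's actual method; the key point is that fixing the base samples up front turns the noisy ELBO into a deterministic objective that ordinary gradient ascent can optimize.

```python
import numpy as np

def saa_elbo_fit(n_samples=200, steps=2000, lr=0.05, seed=0):
    """Sketch of the SAA idea for variational inference.

    Toy setting (illustrative only): target posterior N(3, 1),
    variational family N(mu, sigma^2). SAA fixes one set of base
    samples `eps` up front, so the reparameterized ELBO becomes a
    deterministic function of (mu, log_sigma).
    """
    rng = np.random.default_rng(seed)
    eps = rng.standard_normal(n_samples)  # fixed base samples (the SAA step)
    mu, log_sigma = 0.0, 0.0
    for _ in range(steps):
        sigma = np.exp(log_sigma)
        z = mu + sigma * eps                 # reparameterized samples
        resid = z - 3.0                      # from log p(z) = -0.5 * (z - 3)^2
        grad_mu = -resid.mean()              # d ELBO / d mu
        grad_ls = -(resid * sigma * eps).mean() + 1.0  # d ELBO / d log_sigma
        mu += lr * grad_mu
        log_sigma += lr * grad_ls
    return mu, np.exp(log_sigma)

mu, sigma = saa_elbo_fit()
```

Because the objective is deterministic once `eps` is fixed, any standard optimizer (here plain gradient ascent) applies without tuning for gradient noise; the fit recovers mu near 3 and sigma near 1 up to Monte Carlo error in the fixed sample.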
  • Publication
    PIT Regional Hub Development Guide
    (2024) Basiliere, Colette
