Abstract
The goal of this thesis is to develop novel statistical and computational techniques for inference in graphical models and game-theoretic statistics, with particular emphasis on settings where data are expensive to collect and scarce. Recent advances in computational resources have reshaped the field of statistics, enabling the analysis of problems at scales that were once infeasible. High-performance computing clusters and specialized hardware such as Graphics Processing Units (GPUs) and Tensor Processing Units (TPUs) have redefined the limits of computational feasibility, making it practical to tackle complex tasks, such as maximum a posteriori (MAP) inference, that were previously considered computationally prohibitive. The techniques developed in this thesis harness modern computational resources while specifically addressing the challenges posed by costly and limited data collection.
This thesis advances the field in three major ways. First, we introduce a novel algorithm designed for MAP inference in general Bayesian factor models. Our approach employs Benders’ decomposition to iteratively incorporate constraints into the fully relaxed dual problem, systematically refining the solution space. This method not only provides a certificate of convergence but also preserves essential domain constraints, thereby delivering more accurate and reliable inferential outcomes.
Second, we extend the testing-by-betting framework to multi-agent environments in which multiple agents, each with their own betting strategy, operate under the umbrella of a single firm (the Principal) whose objective is to maximize overall wealth. Our extension unfolds in two phases. In the first phase, we allow the agents to act independently while the Principal employs a dynamic wealth redistribution method that adjusts allocations at each time step. This proactive strategy minimizes exposure to significant fluctuations and helps prevent ruin, fostering consistent growth. In the second phase, we enhance the model by accounting for the correlation among the agents’ betting behaviors. Leveraging an optimization approach inspired by modern portfolio theory, we enable the Principal to balance risk and return more effectively, resulting in a smoother and more stable wealth trajectory over time. Together, these advancements capture the interplay between individual incentives and collective dynamics, broadening the framework’s applicability to complex, real-world scenarios where strategic interactions and effective risk management are critical.
Third, we tackle pivotal challenges at the intersection of sequential multiple hypothesis testing, high-dimensional variable selection, and genomic data simulation. Recognizing the critical need for efficient resource management in sequential testing, we introduce a Cost-Aware Expected $\alpha$-Wealth Reward (CAERO) framework. This novel approach is engineered to optimize sample allocation by incorporating finite-horizon constraints, thereby balancing immediate testing outcomes with long-term experimental goals in scenarios where data collection is costly. Complementing this framework, we present a novel method that fuses the model-X knockoffs technique with deep neural networks to accurately identify perturbation-responsive genes in large-scale biological experiments. This hybrid strategy enhances our ability to discern significant genetic responses in high-dimensional settings while rigorously controlling for false discoveries. Finally, we introduce SCSIM, a simulation tool designed to generate hierarchically structured single-cell and bulk DNA sequencing data. SCSIM addresses a critical gap by providing realistic synthetic datasets that are essential for developing and validating genomic analysis tools. Collectively, these contributions advance statistical methodologies and support robust, scalable analysis in modern genomic research.
Type
Dissertation (Open Access)
Date
2025-05
License
Attribution 4.0 International
http://creativecommons.org/licenses/by/4.0/
Embargo Lift Date
2026-05-16