
Author ORCID Identifier

https://orcid.org/0000-0002-1203-6022

Access Type

Campus-Only Access for One (1) Year

Document Type

Dissertation

Degree Name

Doctor of Philosophy (PhD)

Degree Program

Mathematics

Year Degree Awarded

2024

Month Degree Awarded

February

First Advisor

Ted Westling

Subject Categories

Statistical Methodology | Statistical Theory

Abstract

This dissertation concerns nonparametric estimation and inference in the presence of infinite-dimensional nuisance parameters. The first chapter focuses on assessing variable selection and ranking algorithms. The proposed parameters of interest are model-agnostic and permit the use of data-adaptive nuisance estimators. Several approaches to inference are considered, including a novel partial bootstrap procedure. Motivated by the success of the bootstrap procedure in the first chapter, the second chapter studies the theoretical properties of bootstrap procedures in the presence of data-adaptive nuisance estimators in a general setting. The proposed framework covers a range of estimator constructions, nuisance estimation methods, and bootstrap sampling distributions. The third chapter provides theoretical results for bootstrap procedures when cross-fitting is used, which weakens the conditions presented in Chapter 2.

In the first chapter of the thesis, we study methods of comparing variable selection and ranking algorithms. In many scientific fields, researchers may be interested in selecting from, or ranking, a set of candidate variables according to their capacity for predicting an outcome of interest. We first introduce measures of the quality of variable selection and ranking algorithms. We then define estimators of our proposed measures and establish asymptotic results for these estimators in the regime where the dimension of the covariates is fixed as the sample size grows. We use these results to conduct large-sample inference for our measures, and we propose a computationally efficient partial bootstrap procedure to potentially improve finite-sample inference. We assess the properties of the proposed methods using numerical studies, and we illustrate the methods with an analysis of data for predicting wine quality from its physicochemical properties.

In the second chapter of the thesis, we study the consistency of the bootstrap in the context of estimation of a differentiable functional in a nonparametric or semiparametric model when nuisance functions are estimated using machine learning, and we provide general conditions for consistency of the bootstrap in such scenarios. Our results cover a range of estimator constructions, nuisance estimation methods, bootstrap sampling distributions, and bootstrap confidence interval types, and they show that the bootstrap can work when standard methods fail due to excess bias in the candidate estimator. We provide refined results for the empirical bootstrap and smoothed bootstraps, and for one-step estimators, plug-in estimators, empirical mean plug-in estimators, and estimating equations-based estimators. We illustrate the use of our general results by demonstrating the asymptotic validity of bootstrap confidence intervals for the average density value parameter, and we compare their finite-sample performance using numerical studies.

In the third chapter of the thesis, we study the consistency of the bootstrap in a nonparametric or semiparametric model when the nuisance functions are estimated using machine learning and cross-fitting. Cross-fitting, also known as cross-validation or sample splitting, removes constraints on the complexity of the nuisance estimators; by using it, we are able to relax some of the conditions presented in Chapter 2.
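
As a point of reference for the second chapter, the average density value parameter mentioned above is commonly written, in standard notation that is not taken from the dissertation, as

\[
  \psi(P) = \int p(x)^2 \, dx, \qquad
  \phi_P(x) = 2\bigl(p(x) - \psi(P)\bigr),
\]

where \(\phi_P\) is the efficient influence function, and the one-step estimator built from a (possibly machine-learned) density estimate \(\hat p\) is

\[
  \hat\psi_{\mathrm{os}}
  = \psi(\hat P) + \frac{1}{n}\sum_{i=1}^{n} \phi_{\hat P}(X_i)
  = \frac{2}{n}\sum_{i=1}^{n} \hat p(X_i) - \int \hat p(x)^2 \, dx .
\]

The bootstrap procedures studied in Chapters 2 and 3 are applied to estimators of this kind; the exact constructions analyzed in the dissertation may differ in detail.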
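To make the cross-fitting and bootstrap ideas from Chapters 2 and 3 concrete, the following is a minimal Python sketch, not taken from the dissertation: it cross-fits a one-step estimator of the average density value and forms a percentile bootstrap confidence interval. The Gaussian kernel density estimator, the fold count, the integration grid, and the choice to re-estimate the nuisance on each bootstrap sample are all illustrative assumptions, representing only one of the estimator and bootstrap combinations covered by the general framework.

# Illustrative sketch (not the dissertation's code): cross-fitted one-step
# estimation of the average density value psi(P) = E_P[p(X)] = int p(x)^2 dx,
# with a simple percentile (empirical) bootstrap confidence interval.
# The Gaussian KDE stands in for a generic machine-learning nuisance estimator.
import numpy as np
from scipy.stats import gaussian_kde


def one_step_terms(train, eval_pts):
    """Per-observation one-step terms 2*p_hat(x) - int p_hat^2,
    with p_hat fit on `train` and evaluated at `eval_pts`."""
    p_hat = gaussian_kde(train)
    grid = np.linspace(train.min() - 3.0, train.max() + 3.0, 2000)
    plug_in = np.sum(p_hat(grid) ** 2) * (grid[1] - grid[0])  # int p_hat^2 dx
    return 2.0 * p_hat(eval_pts) - plug_in


def cross_fit_estimate(x, n_folds=5, rng=None):
    """Cross-fitted one-step estimate: each fold is evaluated using a
    nuisance estimator trained on the remaining folds."""
    rng = np.random.default_rng(rng)
    folds = np.array_split(rng.permutation(len(x)), n_folds)
    terms = np.empty(len(x))
    for k, eval_idx in enumerate(folds):
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != k])
        terms[eval_idx] = one_step_terms(x[train_idx], x[eval_idx])
    return terms.mean()


def percentile_bootstrap_ci(x, n_boot=200, alpha=0.05, seed=0):
    """Empirical bootstrap: resample the data with replacement and recompute
    the cross-fitted estimator on each bootstrap sample. (Re-estimating the
    nuisance on each resample is only one of several possible constructions.)"""
    rng = np.random.default_rng(seed)
    estimate = cross_fit_estimate(x, rng=rng)
    boot = np.array([
        cross_fit_estimate(rng.choice(x, size=len(x), replace=True), rng=rng)
        for _ in range(n_boot)
    ])
    lower, upper = np.quantile(boot, [alpha / 2.0, 1.0 - alpha / 2.0])
    return estimate, (lower, upper)


if __name__ == "__main__":
    data = np.random.default_rng(1).normal(size=300)
    # For N(0, 1) data, the true average density value is 1/(2*sqrt(pi)) ~ 0.282.
    est, (lo, hi) = percentile_bootstrap_ci(data)
    print(f"estimate: {est:.3f}, 95% bootstrap CI: ({lo:.3f}, {hi:.3f})")

This is intended only as a sanity check of the general recipe (cross-fit the nuisance, apply the one-step correction, bootstrap the resulting estimator); it does not reproduce the dissertation's numerical studies.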

DOI

https://doi.org/10.7275/36512728

Creative Commons License

Creative Commons Attribution-Noncommercial 4.0 License
This work is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 4.0 License.

Available for download on Saturday, February 01, 2025
