Author ORCID Identifier

https://orcid.org/0000-0003-0402-3771

Access Type

Open Access Dissertation

Document Type

Dissertation

Degree Name

Doctor of Philosophy (PhD)

Degree Program

Public Health

Year Degree Awarded

2022

Month Degree Awarded

February

First Advisor

Raji Balasubramanian

Second Advisor

Laura B. Balzer

Third Advisor

Roberta De Vito

Fourth Advisor

Susan E. Hankinson

Fifth Advisor

Laura D. Kubzansky

Subject Categories

Biostatistics

Abstract

Gaussian graphical models (GGMs) are useful network estimation tools for modeling direct dependencies that characterize multivariate data. The GGM modeling framework is one way to elucidate complex systems-level properties that can be difficult to detect in univariate analyses. In this dissertation, we begin by presenting a tutorial and review of the current state of the field of GGM theory and application. Next, we present a motivating application of GGMs in a study of metabolomic networks associated with chronic distress in women in the Women's Health Initiative (WHI) and in the Nurses' Health Study cohorts. In the third chapter, we present a tool called SpiderLearner, a SuperLearner-based ensemble method for GGM estimation that utilizes a range of existing GGM estimation approaches together with K-fold cross-validation to optimize a likelihood-based loss function. We show via simulation that SpiderLearner performs as well as or better than each individual method and present an application to risk prediction in ovarian cancer genomic data. In the fourth chapter, we present a factor analysis-based method that we have developed to estimate direct dependencies (GGMs) that are shared across studies (or conditions) and those that are study-specific in settings of multi-study data. We apply this method to analyze maternal response to an oral glucose tolerance test as assessed by targeted metabolomic profiles collected in the Hyperglycemia and Adverse Pregnancy Outcomes (HAPO) study. We investigate differences in glucose metabolism across ancestry groups, constructing a GGM that is shared across four ancestry groups and a GGM specific to each of these.
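
As a minimal illustration of what "direct dependencies" means in a GGM, the sketch below uses a toy three-variable chain rather than anything from the dissertation. It relies on the standard fact that, for multivariate Gaussian data, two variables are conditionally independent given the rest exactly when the corresponding entry of the precision (inverse covariance) matrix is zero. The covariance matrix, the function names (`invert`, `partial_correlations`, `edges`), and the threshold are all assumptions made for this example; the estimation methods described in the abstract are not reproduced here.

```python
import math


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    # Augment the matrix with the identity: [A | I] becomes [I | A^-1].
    aug = [
        [float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
        for i, row in enumerate(matrix)
    ]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def partial_correlations(covariance):
    """Partial correlation of i and j given all others: -T_ij / sqrt(T_ii * T_jj)."""
    theta = invert(covariance)  # precision matrix
    n = len(theta)
    return [
        [
            1.0 if i == j else -theta[i][j] / math.sqrt(theta[i][i] * theta[j][j])
            for j in range(n)
        ]
        for i in range(n)
    ]


def edges(pcor, threshold=1e-8):
    """GGM edges: pairs whose partial correlation is non-negligible (hypothetical cutoff)."""
    n = len(pcor)
    return [(i, j) for i in range(n) for j in range(i + 1, n) if abs(pcor[i][j]) > threshold]


# Toy chain X0 - X1 - X2: covariance rho^|i-j|, so X0 and X2 are marginally
# correlated (0.25) but conditionally independent given X1.
rho = 0.5
sigma = [[rho ** abs(i - j) for j in range(3)] for i in range(3)]

pcor = partial_correlations(sigma)
graph = edges(pcor)
print("marginal correlation X0, X2:", sigma[0][2])
print("partial correlation X0, X2:", round(pcor[0][2], 6))
print("edges:", graph)
```

In this toy setting the variables at the two ends of the chain have a nonzero marginal correlation but no edge in the graph, which is the kind of distinction a univariate or pairwise analysis would miss. With real data the precision matrix must be estimated, typically with regularization, and that estimation step is what the methods in the abstract address.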

DOI

https://doi.org/10.7275/25702304.0

Included in

Biostatistics Commons
