
Access Type

Open Access Thesis

Document Type


Degree Program

Public Health

Degree Type

Master of Science (M.S.)

Year Degree Awarded


Month Degree Awarded



More than 6 million people are estimated to have been living with Alzheimer’s Disease (AD) in 2020, with another 12 million living with Mild Cognitive Impairment (MCI). Genetic links to AD have been studied, but more research is needed. The Alzheimer’s Disease Neuroimaging Initiative (ADNI) has conducted a longitudinal study of AD and MCI since 2004 and offers its data to research teams around the world. Diagnostic and demographic data were collected from participants, along with data on single nucleotide polymorphisms (SNPs). Each SNP was recoded as a binary variable indicating whether the alternative allele for that SNP was present. We used cross-validation to choose the alpha and lambda values for elastic net regularization; the selected values called for LASSO regression. We then used LASSO to perform feature selection on the SNPs and the other predictors: systolic and diastolic blood pressure, age, gender, years of education, race, marital status, and handedness. The LASSO regression reduced the number of SNPs from 55,106 to 13 and removed all non-SNP predictors except years of education and marital status.
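The binary SNP coding and the elastic net penalty described above can be sketched as follows. This is a minimal illustration, not the thesis code: the genotype string format ("A/G"), the function names, and the penalty parameterization (the common convention in which alpha = 1 gives LASSO and alpha = 0 gives ridge) are all assumptions made for the example.

```python
from typing import Sequence


def has_alt_allele(genotype: str, alt: str) -> int:
    """Return 1 if the genotype carries at least one alternative allele, else 0.

    The "A/G" or "A|G" string format is a hypothetical input format.
    """
    alleles = genotype.replace("|", "/").split("/")
    return int(alt in alleles)


def elastic_net_penalty(coefs: Sequence[float], lam: float, alpha: float) -> float:
    """Elastic net penalty: lam * (alpha * L1 + (1 - alpha) / 2 * L2).

    Assumed parameterization: alpha = 1 is the pure L1 (LASSO) penalty,
    alpha = 0 is the pure L2 (ridge) penalty.
    """
    l1 = sum(abs(b) for b in coefs)
    l2 = sum(b * b for b in coefs)
    return lam * (alpha * l1 + (1.0 - alpha) / 2.0 * l2)


# Toy genotypes for one SNP whose alternative allele is "G" (made-up data).
genotypes = ["A/A", "A/G", "G/G"]
encoded = [has_alt_allele(g, alt="G") for g in genotypes]

# With alpha = 1 the squared term drops out, leaving only the L1 penalty.
coefs = [0.5, -0.25, 0.0]
lasso = elastic_net_penalty(coefs, lam=0.1, alpha=1.0)
ridge = elastic_net_penalty(coefs, lam=0.1, alpha=0.0)
```

Under this convention, a cross-validated alpha of 1 is what it means for the elastic net search to select LASSO: the L1 term can shrink coefficients exactly to zero, which is what makes it usable for feature selection.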

We used logistic regression to assess the relationship between variation in the 13 selected SNPs (along with years of education and marital status) and diagnosis of AD/MCI, and we used a separate LASSO regression with conditional selective inference to calculate the significance of the variables more accurately. The adjusted odds ratios (95% CI) for the SNPs are: rs11086694, 1.59 (1.23, 2.05); rs2075650, 2.37 (1.81, 3.12); rs2094277, 0.71 (0.54, 0.93); rs2261682, 1.59 (1.21, 2.09); rs31887, 0.55 (0.38, 0.79); rs4745514, 2.03 (1.27, 3.23); rs4816158, 0.31 (0.18, 0.50); rs4826619, 0.43 (0.30, 0.60); rs6640551, 0.69 (0.53, 0.89); rs6809370, 1.95 (1.46, 2.60); rs7312407, 1.89 (1.22, 2.90); rs919751, 1.47 (1.13, 1.90); and rs9857853, 0.52 (0.37, 0.72). The SNPs are located in genes that have clinical significance and may be associated with various diseases that affect cognitive performance. The results suggest that the alternative alleles of seven SNPs are associated with an increased risk of an Alzheimer’s Disease/Mild Cognitive Impairment diagnosis, while those of six SNPs are associated with a decreased risk. This research may have clinical implications and warrants further study.
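As a reading aid, the reported odds ratios can be tabulated and split by direction of effect. The sketch below only re-uses the numbers stated in the abstract; the variable names are illustrative, and it does not reproduce the selective inference procedure that generated the intervals.

```python
import math

# SNP -> (adjusted odds ratio, 95% CI lower, 95% CI upper), as reported above.
reported = {
    "rs11086694": (1.59, 1.23, 2.05),
    "rs2075650": (2.37, 1.81, 3.12),
    "rs2094277": (0.71, 0.54, 0.93),
    "rs2261682": (1.59, 1.21, 2.09),
    "rs31887": (0.55, 0.38, 0.79),
    "rs4745514": (2.03, 1.27, 3.23),
    "rs4816158": (0.31, 0.18, 0.50),
    "rs4826619": (0.43, 0.30, 0.60),
    "rs6640551": (0.69, 0.53, 0.89),
    "rs6809370": (1.95, 1.46, 2.60),
    "rs7312407": (1.89, 1.22, 2.90),
    "rs919751": (1.47, 1.13, 1.90),
    "rs9857853": (0.52, 0.37, 0.72),
}

# An odds ratio above 1 means higher odds of diagnosis when the
# alternative allele is present; below 1 means lower odds.
increased = sorted(s for s, (odds, _, _) in reported.items() if odds > 1)
decreased = sorted(s for s, (odds, _, _) in reported.items() if odds < 1)

# The logistic regression coefficient is the natural log of the odds ratio.
log_odds = {s: math.log(odds) for s, (odds, _, _) in reported.items()}

# A 95% interval that excludes 1 lies entirely on one side of "no effect".
excludes_one = all(lo > 1 or hi < 1 for _, lo, hi in reported.values())

print(len(increased), len(decreased), excludes_one)  # 7 6 True
```

The counts match the seven risk-increasing and six risk-decreasing SNPs stated in the abstract, and every reported interval excludes 1.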


First Advisor

Zhengqing Ouyang

Second Advisor

Chi Hyun Lee

Third Advisor

Jing Qian

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.

Included in

Biostatistics Commons