Covariate adjustment, model misspecification, and goodness-of-fit in logistic regression

Annette M Bucher, University of Massachusetts Amherst


A commonly used method for confounder selection is to determine the percent difference between the crude and adjusted odds ratios of the covariate of interest, and to include the adjusting variable if the difference is greater than 10-15%. However, in logistic regression the crude and adjusted odds ratios may differ even in the absence of confounding, a phenomenon called modification. This research shows through simulations that the change-in-odds-ratio rule often leads to incorrect inclusion or exclusion of a covariate. Alternative approaches to covariate selection are suggested that take confounding and modification as well as bias and variability of the estimated odds ratio into account. In addition, this research investigates the theoretical performance of the logistic regression model in terms of model fit by examining the discrepancy between misspecified logistic models and the true model using the Kullback-Leibler discrepancy (KLIC) and the Pearson $\chi^2$. It is found that even though the discrepancy measures increase with the degree of model misspecification, large increases in misspecification often result in only small changes in the discrepancy measures. The results suggest that statistics measuring lack of fit are large only if the misspecification is severe. The use of an empirical estimator of the Kullback-Leibler discrepancy based on non-parametric kernel estimation is examined. Its performance in approximating the KLIC is compared to the performance of the empirical Pearson $\chi^2$ statistic and the Hosmer-Lemeshow statistic as estimators of the true Pearson $\chi^2$ discrepancy. It is shown that the empirical estimator of the KLIC approximates the true discrepancy more closely than the other two statistics, but that it can distinguish only between highly different levels of model fit.
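The claim that crude and adjusted odds ratios can differ without any confounding can be illustrated with a small exact calculation. The sketch below is not taken from the dissertation; it assumes a hypothetical binary exposure X, a binary covariate Z that is independent of X (so Z cannot confound the X effect), and a logistic model with illustrative coefficients b0, b1, b2. The function name and the convention of expressing the change relative to the adjusted odds ratio are also assumptions made for this example.

```python
import math


def expit(t):
    """Inverse logit: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-t))


def odds(p):
    """Odds corresponding to a probability p."""
    return p / (1.0 - p)


def crude_and_adjusted_or(b0, b1, b2, p_z=0.5):
    """Exact crude and adjusted odds ratios for exposure X.

    Assumed (hypothetical) model: logit P(Y=1 | X, Z) = b0 + b1*X + b2*Z,
    with Z ~ Bernoulli(p_z) independent of X, i.e. no confounding by Z.
    """
    # Adjusted (conditional on Z) odds ratio for X.
    adjusted = math.exp(b1)

    def marginal_risk(x):
        # Average the conditional risk over the distribution of Z.
        # Independence of X and Z means the same weights apply for x=0 and x=1.
        return (1.0 - p_z) * expit(b0 + b1 * x) + p_z * expit(b0 + b1 * x + b2)

    # Crude (Z ignored) odds ratio for X.
    crude = odds(marginal_risk(1)) / odds(marginal_risk(0))
    return crude, adjusted


# Illustrative coefficients only: Z is a strong predictor of the outcome.
crude, adjusted = crude_and_adjusted_or(b0=-1.0, b1=1.0, b2=3.0)
pct_change = 100.0 * abs(crude - adjusted) / adjusted

print(f"crude OR    = {crude:.3f}")
print(f"adjusted OR = {adjusted:.3f}")
print(f"change      = {pct_change:.1f}%")
```

With these illustrative values the crude odds ratio is closer to 1 than the adjusted one, and the percent change exceeds the 10-15% threshold even though Z is unrelated to the exposure. Setting b2 to 0 makes the two odds ratios coincide, which suggests that the gap here is driven by the strength of the covariate-outcome relationship rather than by confounding.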

Subject Area

Biostatistics|Public health|Statistics

Recommended Citation

Bucher, Annette M, "Covariate adjustment, model misspecification, and goodness-of-fit in logistic regression" (1997). Doctoral Dissertations Available from Proquest. AAI9809313.