Linearization, variable selection and diagnostics in generalized linear models

Borko D Jovanovic, University of Massachusetts Amherst


In this thesis we develop a method for efficient model building in nonlinear members of the GLM family, with emphasis on best subset selection and on the use of existing linear regression software. The method is based on a "linearized" estimator of a subset of regression parameters, under the assumption that the remaining parameters are zero. This estimator was introduced by Lawless and Singhal (1978) in a form which requires special software. We define an estimator which has the same functional form as the maximum likelihood estimator of regression parameters obtained from the IRLS procedure, and show that it is identical to the estimator proposed by Lawless and Singhal. The same estimator has been discussed by Hosmer, Jovanovic and Lemeshow (1989) in the context of best subset logistic regression, by Nordberg (1982) in the context of stepwise selection and by Gilks (1986) in a broader context of model selection. Asymptotic results are developed for quadratic forms, F-statistics and Mallows' Cp used in weighted linear regression best subset selection on the vector of pseudo-data. Based on these, practical guidelines for the use of the method in nonlinear GLM members are provided. It is shown how the linearized estimator can be used to obtain diagnostic measures and to estimate the bias in regression parameter estimates, for nonlinear GLM members, from existing linear regression software. Simulation results are provided for the logistic and Poisson regression models with uniformly distributed independent regressors, for sample sizes 100, 200 and 400. Simulation results closely follow theoretical results developed for quadratic forms, F-statistics and Mallows' Cp, and the upper percentiles of F-statistics are well approximated by the percentiles of the corresponding F distributions. Correction factors for the moments of the Pearson Chi-square statistic as discussed by McCullagh and Nelder (1989) are examined.
Evidence shows that the correction factors depend on the true value of the underlying parameter but not on the sample size. For the Pearson Chi-square statistic, simulation results adhere more closely to the theoretical moments under the Poisson regression model than under the logistic regression model.
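
The idea of the linearized estimator described above can be illustrated with a minimal sketch. It is not code from the thesis; it assumes a logistic model, NumPy, and hypothetical function names (fit_logistic_irls, linearized_subset). The full model is fitted once by IRLS, the converged weights and pseudo-data are kept, and a subset estimate is then obtained by ordinary weighted least squares on the pseudo-data. For comparison, the sketch also evaluates the algebraic form b1 - C12 C22^{-1} b2, which is how the Lawless and Singhal estimator is commonly written (an assumption about notation, with C the estimated covariance matrix of the full-model estimate).

```python
import numpy as np

def fit_logistic_irls(X, y, n_iter=25, tol=1e-10):
    """Fit the full logistic model by IRLS; return the estimate,
    the converged weights w and the pseudo-data z."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        eta = X @ beta
        mu = 1.0 / (1.0 + np.exp(-eta))
        w = mu * (1.0 - mu)              # IRLS weights
        z = eta + (y - mu) / w           # working response (pseudo-data)
        beta_new = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * z))
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    # Recompute weights and pseudo-data at the final estimate.
    eta = X @ beta
    mu = 1.0 / (1.0 + np.exp(-eta))
    w = mu * (1.0 - mu)
    z = eta + (y - mu) / w
    return beta, w, z

def linearized_subset(X, w, z, cols):
    """Weighted least squares of the pseudo-data on a subset of columns."""
    X1 = X[:, cols]
    return np.linalg.solve(X1.T @ (w[:, None] * X1), X1.T @ (w * z))

# Illustrative data only: uniform regressors, two of them with zero effect.
rng = np.random.default_rng(0)
n = 200
X = np.column_stack([np.ones(n), rng.uniform(-1.0, 1.0, size=(n, 3))])
true_beta = np.array([0.5, 1.0, 0.0, 0.0])
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-X @ true_beta))).astype(float)

beta_full, w, z = fit_logistic_irls(X, y)
keep, drop = [0, 1], [2, 3]

# Linearized subset estimate from weighted linear regression on pseudo-data.
beta_lin = linearized_subset(X, w, z, keep)

# Assumed covariance-based form: b1 - C12 C22^{-1} b2.
C = np.linalg.inv(X.T @ (w[:, None] * X))
beta_ls = beta_full[keep] - C[np.ix_(keep, drop)] @ np.linalg.solve(
    C[np.ix_(drop, drop)], beta_full[drop]
)

print(beta_lin)
print(beta_ls)
```

Under these assumptions the two printed vectors should agree up to numerical error, which is the sense in which a best subset search could be run with standard weighted linear regression software on the pseudo-data rather than by refitting each candidate model.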

Subject Area

Biostatistics|Statistics|Public health|Computer science

Recommended Citation

Jovanovic, Borko D, "Linearization, variable selection and diagnostics in generalized linear models" (1991). Doctoral Dissertations Available from Proquest. AAI9120897.