Empirical comparisons of logistic regression, Poisson regression, and Cox proportional hazards modeling in analysis of occupational cohort data

Peter W Callas, University of Massachusetts Amherst


Three multiplicative models commonly used in the analysis of occupational cohort studies are logistic, Poisson, and Cox proportional hazards regression. Although the underlying theories behind these are well known, this has not always led to clear decisions about which to use in practice. This research was conducted to examine the effect model choice has on the epidemiologic interpretation of occupational cohort data.

The three models were applied to a National Cancer Institute historical cohort of formaldehyde-exposed workers. Samples were taken from this dataset to create scenarios for model comparisons, varying the study size (n = 600, 3000, 6000), proportion of subjects experiencing the outcome (2.5%, 10%, 50%), strength of association between exposure and outcome (weak, moderate, strong), follow-up length (5, 15, 30 years), and proportion of subjects lost to follow-up (0%, 10%, 17.5%). Other factors investigated included how to handle subjects lost to follow-up in logistic regression. Models were compared on risk estimates, confidence intervals, and practical issues such as ease of use.

The Poisson and Cox models yielded nearly identical relative risks and confidence intervals in all situations except when confounding by age could not be closely controlled in the Poisson analysis, which occurred when the sample size was small or the outcome was rare. Logistic regression findings were more variable, with risk estimates differing most from the Cox results when there was a common outcome or strong relative risk. Logistic regression was also less precise than the others. Thus, although logistic regression was the easiest model to implement, it should only be used in occupational cohort studies when the outcome is rare (5% or less) and the relative risk is less than about 2. Even then, since it does not account for follow-up time differences between subjects or changes in risk factor values over time, the Cox or Poisson models are better choices. Selecting between these can usually be based on convenience, except when confounding cannot be closely controlled in the Poisson model but can in the Cox model, or when the Poisson assumption of exponential baseline survival is not met. In these cases Cox should be used.
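
One reason logistic estimates can drift from Cox or Poisson estimates when the outcome is common or the association is strong is that logistic regression estimates an odds ratio, which approximates a relative risk only when the outcome is rare. The following is a minimal illustrative sketch of that arithmetic, not the dissertation's analysis: the function name, the baseline risks, and the relative risks are hypothetical values chosen for illustration, and the calculation ignores follow-up time and censoring entirely.

```python
def odds_ratio(baseline_risk: float, relative_risk: float) -> float:
    """Odds ratio implied by a given risk in the unexposed and a relative risk.

    Hypothetical helper for illustration only; it works on simple
    cumulative risks and does not model follow-up time.
    """
    exposed_risk = baseline_risk * relative_risk
    if not 0 < baseline_risk < 1 or not 0 < exposed_risk < 1:
        raise ValueError("risks must lie strictly between 0 and 1")
    odds_exposed = exposed_risk / (1 - exposed_risk)
    odds_unexposed = baseline_risk / (1 - baseline_risk)
    return odds_exposed / odds_unexposed


# Assumed (hypothetical) risks in the unexposed group and relative risks.
for baseline in (0.025, 0.10, 0.30):
    for rr in (1.2, 2.0):
        print(f"baseline risk {baseline:.3f}, RR {rr:.1f} "
              f"-> OR {odds_ratio(baseline, rr):.2f}")
```

Under these assumed values, the odds ratio stays close to the relative risk when the baseline risk is 2.5% and moves well away from it when the baseline risk is 30% and the relative risk is 2, which is consistent with the direction of the findings summarized above.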

Subject Area

Biostatistics|Occupational safety|Public health

Recommended Citation

Callas, Peter W, "Empirical comparisons of logistic regression, Poisson regression, and Cox proportional hazards modeling in analysis of occupational cohort data" (1994). Doctoral Dissertations Available from Proquest. AAI9510451.