The choice of reference subclass in categorical regression models matters
Date
2014
Abstract
In parametric regression models with categorical covariates, it is well known that many key quantities of interest are invariant to the choice of reference subclass. Surprisingly, however, not all quantities are invariant, and some choices may lead to models with inferior properties when judged against particular criteria. We propose a set of secondary criteria on which the choice of reference subclass may be based. This secondary set comprises: (a) precision of the estimates, (b) a measure of multicollinearity and (c) subject matter considerations. The elements of this set are clearly inter-related. We explore the development and use of the proposed criteria in generalized linear models (GLMs) with categorical covariates. Our approach is based on analysis, simulation studies and a detailed analysis of a real data set. The results show clearly that it is possible to improve the characteristics of the model by selecting the reference subclass judiciously. This finding is based on the close relationship between the measure of precision of the estimates and the measure of multicollinearity, so it is natural to wish to evaluate any choice based on subject matter considerations in terms of the former two criteria. Our approach is to develop a measure of the precision of the regression estimates, Vr, the total variance, to adopt a measure of the condition of Vr, namely Kr, and to consider the dependence between the pair (Vr, Kr) as we vary the reference subclass r.
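As a rough illustration of how the pair (Vr, Kr) can change with the reference subclass, the following minimal sketch uses an ordinary linear model with a single categorical covariate. It rests on several assumptions that are not taken from the paper: Vr is read as the trace of the covariance matrix of the coefficient estimates, Kr as the 2-norm condition number of that matrix, the error variance is set to 1, and the class sizes and function names are hypothetical.

```python
import numpy as np

def design_matrix(levels, n_levels, ref):
    """Intercept plus treatment-coded dummies, omitting the reference level `ref`."""
    cols = [np.ones(len(levels))]
    for k in range(n_levels):
        if k != ref:
            cols.append((levels == k).astype(float))
    return np.column_stack(cols)

def total_variance_and_condition(levels, n_levels, ref):
    """Return (trace, condition number) of the coefficient covariance matrix.

    Assumption: linear model with unit error variance, so the covariance
    of the least-squares estimates is (X'X)^{-1}.
    """
    X = design_matrix(levels, n_levels, ref)
    cov = np.linalg.inv(X.T @ X)
    return float(np.trace(cov)), float(np.linalg.cond(cov))

# Hypothetical unbalanced factor: three subclasses with 50, 10 and 5 observations.
counts = [50, 10, 5]
levels = np.repeat(np.arange(3), counts)

results = {r: total_variance_and_condition(levels, 3, r) for r in range(3)}
for r, (v_r, k_r) in results.items():
    print(f"reference subclass {r} (n = {counts[r]}): V_r = {v_r:.3f}, K_r = {k_r:.1f}")
```

In this toy setting the fitted values are the same for every choice of r, yet the total variance of the coefficient estimates is smallest when the largest subclass serves as the reference. For a GLM the matrix X'X would be replaced by the weighted information matrix, which this sketch does not attempt.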
Description
peer-reviewed
Citation
29th International Workshop on Statistical Modelling;
Funding Information
Science Foundation Ireland (SFI)
