Interactions: log-linear models in sparse contingency tables with stepwise algorithms and penalised likelihood
This thesis deals with hierarchical log-linear models in sparse high-dimensional contingency tables formed with binary variables.
We constructed, in R, new stepwise search algorithms for hierarchical log-linear models in multi-dimensional contingency tables; they can be used with any number of variables. The algorithms are Backwards Elimination (with variants) and Forward Selection. Backwards Elimination is based on a reimplementation in R of the automatic SPSS algorithm. We successfully tested the performance of the algorithms on clinical comorbidity data and, more comprehensively, in a simulation study.
We also prove a result first conjectured by MacKenzie, concerning the identification of effects with nonexistent maximum likelihood estimates. Furthermore, we prove that, of the two coding schemes considered, Yates’ and binary, the former is D-optimal.
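The D-optimality claim can be illustrated numerically with a minimal sketch. It assumes the criterion is the determinant of X'X for the saturated-model design matrix X, with Yates' coding taken as (-1, +1) and binary coding as (0, 1); the thesis may use a different (for example information-weighted) criterion, and the function names below are hypothetical, not taken from the thesis code.

```python
from fractions import Fraction
from itertools import combinations, product


def design_matrix(k, levels):
    """Saturated-model design matrix for k binary variables.

    `levels` is the pair of numeric codes for the two categories:
    (-1, 1) for Yates' coding, (0, 1) for binary coding (assumed here).
    Rows are cells of the 2^k table; columns are all main effects and
    interactions, each the product of the coded variables involved.
    """
    cells = list(product(levels, repeat=k))
    terms = [t for r in range(k + 1) for t in combinations(range(k), r)]
    matrix = []
    for cell in cells:
        row = []
        for term in terms:
            value = 1
            for j in term:
                value *= cell[j]
            row.append(value)
        matrix.append(row)
    return matrix


def gram(X):
    """Return X'X as a list of lists."""
    n = len(X[0])
    return [[sum(r[i] * r[j] for r in X) for j in range(n)] for i in range(n)]


def det(M):
    """Exact determinant by Gaussian elimination over Fractions."""
    A = [[Fraction(x) for x in row] for row in M]
    n = len(A)
    result = Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if A[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            A[c], A[pivot] = A[pivot], A[c]
            result = -result
        result *= A[c][c]
        for r in range(c + 1, n):
            factor = A[r][c] / A[c][c]
            for j in range(c, n):
                A[r][j] -= factor * A[c][j]
    return result


def d_criterion(k, levels):
    """det(X'X) for the saturated model under the given coding."""
    return det(gram(design_matrix(k, levels)))


if __name__ == "__main__":
    for k in (2, 3):
        print(k, d_criterion(k, (-1, 1)), d_criterion(k, (0, 1)))
```

Under these assumptions the Yates design matrix has orthogonal columns, so for two variables det(X'X) is 4^4 = 256, whereas the binary design matrix is triangular up to row order and gives det(X'X) = 1. This is only a small-table check of the direction of the result, not the proof given in the thesis.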
We highlight with diverse examples the importance of considering the hierarchical class.
For sparse contingency tables, we constructed a new, more intelligent search algorithm: MacKenzie-Conde Backwards Elimination (MCBE, with a variant for a larger number of variables), which removes all the nonexistent effects.
We also used a least absolute shrinkage and selection operator (LASSO)-type penalised likelihood with three penalties: a LASSO penalty, a LASSO penalty on the interactions only, and a smooth parametric approximation to the LASSO, the last being a new development. When the solutions of the stepwise algorithms were compared with those of the LASSO, the stepwise algorithms always found sparser (and hierarchical) models and, in the case of MCBE, models free from inestimable effects.
In summary, we have improved existing methodology for analysing multivariate binary data.
Faculty
- Faculty of Science and Engineering
Degree
- Doctoral
First supervisor
- Gilbert MacKenzie
Second supervisor
- Peter Egger
Department or School
- Mathematics & Statistics