Missing data occur regularly during data collection, for a variety of reasons such
as participants refusing to answer questions in surveys or machines failing to record
measurements in a manufacturing process. The fact that data are missing cannot be
ignored. Removing incomplete observations and analysing only the complete cases can
affect the results of any subsequent analysis. Many methods have been developed to deal
with the problems that arise from missing values, including the widely
used method of multiple imputation.
This thesis examines one such method, which generates imputed datasets using multiple
imputation and a distance measure known as the Mahalanobis distance. The
Mahalanobis distance is used to identify fully observed observations that are similar to
those with missing values, and estimates of the missing values are drawn from them.
Amendments to a currently used method are proposed, assessed on
simulated data and applied to a real dataset. The thesis also outlines the importance and
usefulness of visualisation in missing data analysis.
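
As a rough illustration of the idea of drawing imputed values from fully observed observations that are close in Mahalanobis distance, a minimal Python sketch follows. It is not the method proposed in this thesis. The function name mahalanobis_donor_impute, the n_donors parameter, the estimation of the covariance matrix from complete cases only, and the random choice among the nearest donors are all assumptions made for illustration. The sketch also assumes at least two variables and that every incomplete row has at least one observed value.

```python
import numpy as np


def mahalanobis_donor_impute(X, n_donors=3, rng=None):
    """Fill NaNs in X with values from a nearby fully observed row (hypothetical helper)."""
    rng = np.random.default_rng(rng)
    X = np.asarray(X, dtype=float)

    # Rows with no missing values act as the pool of potential donors.
    complete = ~np.isnan(X).any(axis=1)
    donors = X[complete]

    # Assumption of this sketch: covariance is estimated from complete cases only.
    cov = np.cov(donors, rowvar=False)

    imputed = X.copy()
    for i in np.where(~complete)[0]:
        obs = ~np.isnan(X[i])  # variables observed for row i

        # Squared Mahalanobis distance to each donor, using only the
        # variables that are observed for row i.
        diff = donors[:, obs] - X[i, obs]
        cov_inv = np.linalg.pinv(cov[np.ix_(obs, obs)])
        d2 = np.einsum("ij,jk,ik->i", diff, cov_inv, diff)

        # Draw one donor at random from the n_donors closest rows and
        # copy its values into the missing positions of row i.
        nearest = np.argsort(d2)[:n_donors]
        donor = donors[rng.choice(nearest)]
        imputed[i, ~obs] = donor[~obs]
    return imputed
```

Calling the function several times with different seeds, for example `[mahalanobis_donor_impute(X, rng=s) for s in range(5)]`, gives several imputed datasets in the spirit of multiple imputation, since the donor is drawn at random on each call.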
In addition to this missing data work, a study was carried out using Growing Up in
Ireland data on the ability of both children and their primary caregivers to rate
their BMI, whilst simultaneously accounting for the missing data that exist in this
dataset.