Publication

A statistical approach to model selection and uncertainty quantification in neural networks

Date
2024
Abstract
Feedforward neural networks (FNNs) can be viewed as non-linear regression models, where covariates enter the model through a combination of weighted summations and non-linear functions. Although these models have some similarities to the approaches used within statistical modelling, they are typically regarded as pure prediction algorithms, and the majority of neural network research has been conducted outside of the field of statistics. This has resulted in a lack of statistically-based methodology, and, in particular, there has been little emphasis on uncertainty quantification and model parsimony. Supplementing FNNs with methods of statistical inference, and statistically-based model selection, can shift the focus away from black-box prediction and make FNNs more akin to traditional statistical models. For the interpretation of FNNs, we propose a new Wald test in the context of penalised FNNs, and also covariate-effect plots that emulate regression coefficients. This can allow for more inferential analysis, and, hence, make FNNs more accessible within the statistical-modelling context. In terms of model selection, two procedures are proposed. The first is a stepwise BIC-based procedure that performs both input- and hidden-node selection, and is analogous to traditional model selection methods. The choice of BIC over out-of-sample performance as the model selection objective function leads to an increased probability of recovering the true model, while parsimoniously achieving favourable out-of-sample performance. The second is a more modern smooth-information-criterion (SIC) approach, which can be optimised directly and avoids the computationally demanding search for the tuning parameter. Our SIC procedure incorporates structured sparsity, allowing for automatic variable selection among the input nodes and the determination of model complexity through the hidden nodes. 
Simulation studies are used throughout to extensively evaluate the performance of the proposed methods, and several real data applications are used to demonstrate this statistically-based approach to FNNs.
Supervisor
Kevin Burke