Date of Award
Doctor of Philosophy
Robert E. Uhrig
J. Wesley Hines, Hamparsum Bozdogan, Laurence F. Miller
Learning from data is fast becoming the rule rather than the exception for many science and engineering research problems, particularly those encountered in nuclear engineering. Problems associated with learning from data fall under the more general category of inverse problems. A data-driven inverse problem involves constructing a predictive model of a target system from a collection of input/output observations. One difficulty in constructing a model that approximates unknown causes based solely on observations of their effects is that collinearities in the input data make the problem ill-posed. Ill-posed problems cause models obtained by conventional techniques, such as linear regression, neural networks and kernel techniques, to become unstable and produce unreliable results. Regularization methods using ordinary ridge regression (ORR) and kernel regression (KR) have been proposed as viable solutions to ill-posed problems. Successful application of ORR and KR requires the selection of optimal parameter values: ridge parameters for ORR and bandwidth parameters for KR. The common practice for both methods is to select a single parameter by minimizing an objective function that estimates empirical risk. The single parameter value is then applied to all predictor variables indiscriminately, in a one-size-fits-all fashion. Versions of ORR and KR have been proposed that use individual localized ridge parameters and a matrix of localized bandwidth parameters, respectively, each optimally selected based on the relevance of its associated predictor variable to reducing empirical risk.
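For orientation, the contrast between a single shared parameter and localized parameters can be written out as follows. This is a sketch in standard textbook notation, not notation taken from the dissertation: the symbols X (predictor matrix), y (response vector), k and k_j (ridge parameters), Λ (diagonal ridge matrix), H (bandwidth matrix) and the Gaussian kernel are all assumptions made here for illustration.

```latex
\begin{align}
% Ordinary ridge regression: one ridge parameter k shared by all p predictors
\hat{\beta}_{\mathrm{ORR}} &= (X^{\top}X + kI)^{-1}X^{\top}y, \qquad k \ge 0 \\
% Localized ridge regression: one ridge parameter per predictor
\hat{\beta}_{\mathrm{LRR}} &= (X^{\top}X + \Lambda)^{-1}X^{\top}y, \qquad
  \Lambda = \operatorname{diag}(k_1,\dots,k_p) \\
% Kernel regression (Nadaraya-Watson form) with bandwidth matrix H
\hat{y}(x) &= \frac{\sum_{i=1}^{n} K_H(x - x_i)\,y_i}{\sum_{i=1}^{n} K_H(x - x_i)}, \qquad
  K_H(u) = \exp\!\left(-\tfrac{1}{2}\,u^{\top}H^{-1}u\right)
\end{align}
```

Under these assumptions, the single-parameter kernel case corresponds to H = h^2 I, a vector of bandwidths corresponds to H = diag(h_1^2, ..., h_p^2), and a full bandwidth matrix corresponds to a general symmetric positive-definite H.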
While the practical and theoretical value of both localized regression techniques is recognized, they have seen limited use because of the difficulty of selecting multiple optimal ridge parameters for localized ridge regression (LRR), defined here as the localized ridge regression problem, and multiple optimal bandwidth parameters for localized kernel regression (LKR), defined here as the localized kernel regression problem, particularly for multivariate predictor data with more than four variables.
This dissertation introduces a method of selecting optimal ridge parameters for LRR and a method of selecting a matrix of optimal bandwidth parameters for LKR, both based on Differential Evolution (DE), a population-based, direct-search global optimization technique. Three different objective functions, selected as prediction risk estimators, were developed and evaluated for LRR: Mallows' C_L, an Information Complexity (ICOMP) based method of regression parameter selection (ICOMPRPS), and Generalized Cross-Validation (GCV). Leave-one-out cross-validation (LOO-CV) was used as the objective function for LKR.
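The idea of searching for per-variable ridge parameters with DE can be illustrated with a small sketch. It is not the dissertation's implementation: the DE variant (rand/1/bin with fixed F and CR), the search over log-ridge values, the bounds, the synthetic collinear data, and the function names `gcv_score` and `differential_evolution` are all assumptions chosen for illustration. Only GCV, one of the three objective functions named above, is shown.

```python
import numpy as np

def gcv_score(log_k, X, y):
    """GCV for ridge regression with one penalty per predictor, k = exp(log_k)."""
    n = X.shape[0]
    K = np.diag(np.exp(log_k))                 # diagonal matrix of ridge parameters
    H = X @ np.linalg.solve(X.T @ X + K, X.T)  # hat matrix X (X'X + K)^-1 X'
    resid = y - H @ y
    return n * float(resid @ resid) / (n - np.trace(H)) ** 2

def differential_evolution(obj, bounds, pop_size=20, F=0.7, CR=0.9,
                           generations=60, seed=0):
    """Minimal DE/rand/1/bin minimizer (illustrative, not tuned)."""
    rng = np.random.default_rng(seed)
    lo, hi = np.array(bounds, dtype=float).T
    dim = lo.size
    pop = rng.uniform(lo, hi, size=(pop_size, dim))
    cost = np.array([obj(p) for p in pop])
    for _ in range(generations):
        for i in range(pop_size):
            others = [j for j in range(pop_size) if j != i]
            a, b, c = pop[rng.choice(others, size=3, replace=False)]
            mutant = np.clip(a + F * (b - c), lo, hi)  # differential mutation
            cross = rng.random(dim) < CR               # binomial crossover mask
            cross[rng.integers(dim)] = True            # keep at least one mutant gene
            trial = np.where(cross, mutant, pop[i])
            trial_cost = obj(trial)
            if trial_cost <= cost[i]:                  # greedy one-to-one selection
                pop[i], cost[i] = trial, trial_cost
    best = int(np.argmin(cost))
    return pop[best], float(cost[best])

# Synthetic, centered data with two nearly collinear predictors (assumed example).
rng = np.random.default_rng(42)
n = 60
x1 = rng.normal(size=n)
x2 = x1 + 0.01 * rng.normal(size=n)   # nearly collinear with x1
x3 = rng.normal(size=n)
X = np.column_stack([x1, x2, x3])
X = X - X.mean(axis=0)
y = 2.0 * x1 + 1.0 * x3 + 0.1 * rng.normal(size=n)
y = y - y.mean()

bounds = [(-10.0, 10.0)] * X.shape[1]  # search range for log(k_j), an assumption
best_log_k, best_gcv = differential_evolution(lambda v: gcv_score(v, X, y), bounds)
print("ridge parameters:", np.exp(best_log_k), "GCV:", best_gcv)
```

Searching over log(k_j) rather than k_j keeps every ridge parameter positive and lets the population cover several orders of magnitude, which is one plausible design choice among many.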
Including the two methods of selecting optimal localized regression parameters, the original contributions to the field of learning from data described in this dissertation are: i) DE automated selection of localized ridge regression parameters (DEALRR) using an objective function selected as an estimator of prediction risk, ii) DE automated selection of a vector of bandwidth parameters for localized kernel regression (DEALKRD) using LOO-CV as an objective function, iii) a method of optimizing the full bandwidth matrix for localized kernel regression using DE (DEALKRF), iv) automatic selection of relevant input variables using DEALKRD, and v) a method of detecting local minima (ModelM) in multidimensional data.
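A LOO-CV objective for kernel regression with a vector of bandwidths can be sketched as below. This is an illustration under stated assumptions, not the DEALKRD code: it assumes a Nadaraya-Watson estimator with a Gaussian product kernel, and the function name `loo_cv_error` and the synthetic data are hypothetical. It also hints at why a bandwidth vector can act as a variable selector: a very large bandwidth on a variable makes the kernel nearly flat in that direction, so the variable effectively drops out of the model.

```python
import numpy as np

def loo_cv_error(h, X, y):
    """Leave-one-out mean squared error of Nadaraya-Watson regression
    with a Gaussian kernel and one bandwidth per input variable."""
    h = np.asarray(h, dtype=float)
    diff = (X[:, None, :] - X[None, :, :]) / h    # scaled pairwise differences
    W = np.exp(-0.5 * np.sum(diff ** 2, axis=2))  # kernel weight matrix
    np.fill_diagonal(W, 0.0)                      # exclude each point from its own fit
    denom = np.maximum(W.sum(axis=1), 1e-300)     # guard against all-zero weights
    pred = (W @ y) / denom
    return float(np.mean((y - pred) ** 2))

# Assumed example: the target depends on the first input only.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(150, 2))
y = np.sin(np.pi * X[:, 0]) + 0.1 * rng.normal(size=150)

err_iso = loo_cv_error([0.2, 0.2], X, y)      # same bandwidth on both inputs
err_aniso = loo_cv_error([0.2, 1e6], X, y)    # huge bandwidth switches off input 2
err_x1_only = loo_cv_error([0.2], X[:, :1], y)  # model fitted on input 1 alone
print(err_iso, err_aniso, err_x1_only)
```

A population-based optimizer such as DE could minimize `loo_cv_error` over the bandwidth vector; inputs whose selected bandwidths are very large relative to their range would then be candidates for removal.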
Case studies based on several practical examples using both artificial and real-world data demonstrate: i) the superior performance of DEALRR over ordinary least-squares and ordinary ridge regression for a) variable prediction and b) inferential sensing; ii) the superior performance of DEALKRD and DEALKRF over both global kernel regression and a hybrid conjugate-gradient plus line-search method for selecting a vector of LKR bandwidth parameters, for a) output target variable prediction, b) function approximation and c) inferential sensing; iii) the superior performance of DEALKRD for input variable selection; iv) the viability of optimally selecting the full bandwidth matrix for LKR using DEALKRF; and v) the ability to detect the presence of local minima in multidimensional data using ModelM.
Buckner, Mark A., "Learning from Data with Localized Regression and Differential Evolution." PhD diss., University of Tennessee, 2003.