OLS REGRESSION VS. NEURAL NETWORKS VS. MARS A COMPARISON R. J. Lievano E. Kyper University of Minnesota Duluth
Research Questions • Are new data mining regression techniques superior to classical regression? • Can data analysis methods implemented naively (through default automated routines) yield useful results consistently?
Assessment of a 3×2³ factorial experiment • Regression method (3): OLS forward stepwise regression, feedforward neural networks, Multivariate Adaptive Regression Splines (MARS). • Type of function (2): linear and nonlinear. • Noise size (2): small, large. • Sample size (2): small, large. The 24 treatment cells are enumerated below.
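As a concrete illustration, the full design can be generated directly. A minimal Python sketch, with factor labels paraphrased from the list above:

```python
# Enumerate the 3 x 2^3 = 24 cells of the factorial design.
from itertools import product

methods = ["OLS stepwise regression", "feedforward neural network", "MARS"]
functions = ["linear", "nonlinear"]
noise_sizes = ["small", "large"]
sample_sizes = ["small", "large"]

for cell in product(methods, functions, noise_sizes, sample_sizes):
    print(cell)  # each tuple is one experimental condition
```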
FORWARD STEPWISE REGRESSION • Given a set of responses Y and predictors X such that Y = F(X) + ε, where ε is an error (noise) structure: • Find a subset XR of X which satisfies a set of conditions, such as goodness-of-fit or simplicity. • Fit a set of successive models of the type Yi = Σj βjXij + εi. • Stop when a specified criterion has been achieved, e.g. maximum adjusted R² or no remaining significant predictors. A sketch of the procedure appears below.
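A minimal sketch of this procedure, assuming statsmodels and a pandas DataFrame X of candidate predictors; the stopping rule shown is the maximum-adjusted-R² criterion named above, and all names are illustrative:

```python
import numpy as np
import statsmodels.api as sm

def forward_stepwise(y, X):
    """Greedy forward selection that stops when adjusted R^2 stops improving."""
    selected, remaining = [], list(X.columns)
    best_adj_r2 = -np.inf
    while remaining:
        # Score each candidate by the adjusted R^2 of the enlarged model.
        scores = {
            col: sm.OLS(y, sm.add_constant(X[selected + [col]])).fit().rsquared_adj
            for col in remaining
        }
        best_col = max(scores, key=scores.get)
        if scores[best_col] <= best_adj_r2:
            break  # no candidate improves adjusted R^2: stop
        best_adj_r2 = scores[best_col]
        selected.append(best_col)
        remaining.remove(best_col)
    return selected, best_adj_r2
```

The alternative stopping rule mentioned above (no remaining significant predictors) would instead test each candidate coefficient's p-value before admitting it.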
MULTIVARIATE ADAPTIVE REGRESSION SPLINES (MARS) • Given a set of responses Y and predictors X such that Y = F(X) + ε, where ε is an error (noise) structure: • Find a set of basis functions Wj (spline transformations of the Xj) which describe intervals of varying relationships between the Xj and Y. • Fit these basis functions with a stepwise regression procedure to models of the type Y = α + Σj cjWj + ε until a stopping criterion has been achieved. A hand-built illustration of the basis functions follows below.
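The hinge basis functions at the heart of MARS are simple to construct by hand. The sketch below fixes the knots in advance purely for illustration (MARS itself selects knots and terms adaptively via a forward/backward stepwise search); the data, knots, and names are assumptions, not part of the original study:

```python
import numpy as np

def hinge_basis(x, knots):
    """Paired hinge functions max(0, x - t) and max(0, t - x), plus an intercept."""
    cols = [np.ones_like(x)]
    for t in knots:
        cols.append(np.maximum(0.0, x - t))  # active to the right of the knot
        cols.append(np.maximum(0.0, t - x))  # active to the left of the knot
    return np.column_stack(cols)

rng = np.random.default_rng(0)
x = rng.uniform(0.0, 10.0, 200)
y = np.log(1.0 + x) + rng.normal(scale=0.1, size=x.size)  # smooth nonlinear target

W = hinge_basis(x, knots=[2.5, 5.0, 7.5])     # the basis functions Wj
coef, *_ = np.linalg.lstsq(W, y, rcond=None)  # regress Y on the basis
y_hat = W @ coef                              # piecewise-linear approximation of F
```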
NEURAL NETWORK COMPONENTS [Figure: a single neuron. Inputs x1–x5, plus a bias input x0, enter with weights 0.3, 0.7, −0.2, 0.4, −0.5 and bias 0.8; the neuron computes the weighted sum I = 0.8 + 0.3x1 + 0.7x2 − 0.2x3 + 0.4x4 − 0.5x5, passes it through a sigmoidal activation (transfer) function, and sends the output to the next layer.]
[Figure: the overall network, with many nodes arranged in an input layer, a hidden layer, and an output layer; each hidden and output node receives its input from the nodes of the preceding layer.] The resulting model is just a flexible nonlinear regression of the response on a set of predictor variables, as the sketch below illustrates.
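A minimal sketch of the neuron pictured above, using its stated bias (0.8) and input weights (0.3, 0.7, −0.2, 0.4, −0.5) with a logistic sigmoid as the activation; stacking many such neurons into hidden and output layers yields the full feedforward network:

```python
import numpy as np

def sigmoid(z):
    """Logistic sigmoid, one common choice of sigmoidal activation."""
    return 1.0 / (1.0 + np.exp(-z))

bias = 0.8                                       # weight on the constant input x0
weights = np.array([0.3, 0.7, -0.2, 0.4, -0.5])  # weights on x1..x5

def neuron(x):
    I = bias + weights @ x   # I = 0.8 + .3x1 + .7x2 - .2x3 + .4x4 - .5x5
    return sigmoid(I)        # output y, passed on to the next layer

print(neuron(np.array([1.0, 0.5, 0.2, 0.8, 0.1])))
```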
Hypotheses • H1: The three methods are equivalent in accuracy (goodness-of-fit). • H2: The three methods are equivalent in ability to select valid predictors. • H2a: The three methods are equivalent in the degree of underfitting. • H2b: The three methods are equivalent in the degree of overfitting.
A SLICE OF Y = α + Σj βjXj + ε (Linear functional form modeled)
A SLICE OF Y = α + Σj LOGe(βjXj) + ε (Nonlinear functional form modeled)
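For concreteness, both data-generating processes can be simulated directly; α, the βj, the predictor range, and the noise scale below are illustrative assumptions, not the values used in the experiment:

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 500, 4
alpha = 2.0
beta = np.array([1.5, 0.8, 2.2, 0.5])
X = rng.uniform(0.1, 10.0, size=(n, p))  # positive predictors so log() is defined
eps = rng.normal(scale=1.0, size=n)      # "small" noise; raise scale for "large"

y_linear = alpha + X @ beta + eps                         # Y = α + Σj βjXj + ε
y_nonlinear = alpha + np.log(X * beta).sum(axis=1) + eps  # Y = α + Σj loge(βjXj) + ε
```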
Results/Conclusions • H1 can be rejected (the three methods are not equivalent in accuracy). • H2a cannot be rejected, although underfitting is more prevalent in nonlinear fits with large noise and smaller samples. • H2b can be rejected (the three methods are not equivalent in the degree of overfitting).
Results Cont. • Linear case, PMSE (prediction mean squared error): OLS regression. • Linear case, overspecification: MARS. • Nonlinear case, PMSE: neural networks. • Nonlinear case, overspecification: MARS. • Further study is needed to answer the research questions clearly.
Further Research Conducted • Kept the same three methods, using only large samples. • Kept function as a factor but changed from two to three functions (1 linear, 2 nonlinear). • Replaced noise with contamination (contaminated and uncontaminated data). • Found that OLS regression performed best in all linear cases. • Unlike the previous findings, MARS now performed best in all nonlinear cases, and underspecification is now significant.