160 likes | 294 Views
Data Mining in Finance. Andreas S. Weigend Leonard N. Stern School of Business, New York University. Nonlinear Models 8 February 1999. The seven steps of model building. 1. Task
E N D
Data Mining in Finance Andreas S. Weigend Leonard N. Stern School of Business, New York University Nonlinear Models 8 February 1999 RiskTeam/ Zürich, 6 July 1998 Andreas S. Weigend, Data Mining Group, Information Systems Department, Stern School of Business, NYU
The seven steps of model building • 1. Task • Predict distribution of portfolio returns, understand structure in yield curves, find profitable time scales, discover trade styles, … • 2. Data • Which data to use, and how to code/ preprocess/ represent them • 3. Architecture • 4. Objective/ Cost function (in-sample) • 5. Search/ Optimization/ Estimation • 6. Evaluation • 7. Analysis and Interpretation RiskTeam/ Zürich, 6 July 1998 Andreas S. Weigend, Data Mining Group, Information Systems Department, Stern School of Business, NYU
How to make predictions? • “Pattern” = Input + Output Pair • Keep all data • Nearest neighbor lookup • Local constant model • Local linear model • Throw away data, only keep model • Global linear model • Global nonlinear model • Neural network with hidden units • Sigmoids or hyperbolic tangents (tanh) • Radial basis functions • Keep only a few representative data point • Support vector machines RiskTeam/ Zürich, 6 July 1998 Andreas S. Weigend, Data Mining Group, Information Systems Department, Stern School of Business, NYU
Training data: Inputs and corresponding outputs output input2 input1 RiskTeam/ Zürich, 6 July 1998 Andreas S. Weigend, Data Mining Group, Information Systems Department, Stern School of Business, NYU
What is the prediction for a new input? output input2 input1 new input RiskTeam/ Zürich, 6 July 1998 Andreas S. Weigend, Data Mining Group, Information Systems Department, Stern School of Business, NYU
Nearest neighbor • Use output value of nearest neighbor in input space as prediction prediction nearest neighbor output input2 input1 new input RiskTeam/ Zürich, 6 July 1998 Andreas S. Weigend, Data Mining Group, Information Systems Department, Stern School of Business, NYU
Local constant model • Use average of the outputs of nearby points in input space output input2 input1 new input RiskTeam/ Zürich, 6 July 1998 Andreas S. Weigend, Data Mining Group, Information Systems Department, Stern School of Business, NYU
Local linear model • Find best-fitting plane (linear model) through nearby points in input space output input2 input1 new input RiskTeam/ Zürich, 6 July 1998 Andreas S. Weigend, Data Mining Group, Information Systems Department, Stern School of Business, NYU
Nonlinear regression surface • Minimize “energy” stored in the “springs” output input2 input1 RiskTeam/ Zürich, 6 July 1998 Andreas S. Weigend, Data Mining Group, Information Systems Department, Stern School of Business, NYU
Throw away the data… just keep the surface! output input2 input1 RiskTeam/ Zürich, 6 July 1998 Andreas S. Weigend, Data Mining Group, Information Systems Department, Stern School of Business, NYU
Modeling – an iterative process Step 1: Task/ Problem definition Step 2: Data and Representation Step 3: Architecture Step 4: Objective/ Cost function (in-sample) Step 5: Search/ Optimization/ Estimation Step 6: Evaluation (out-of-sample) Step 7: Analysis and Interpretation RiskTeam/ Zürich, 6 July 1998 Andreas S. Weigend, Data Mining Group, Information Systems Department, Stern School of Business, NYU
Modeling issues Step 1: Task and Problem definition Step 2: Data and Representation Step 3: Architecture • What are the “primitives” that make up the surface? Step 4: Objective/ Cost function (in-sample) • How flexible should the surface be? • Too rigid model: stiff board (global linear model) • Too flexible model: cellophane going through all points • Penalize too flexible models (regularization) Step 5: Search/ Optimization/ Estimation • How do we find the surface? Step 6: Evaluation (out-of-sample) Step 7: Analysis and Interpretation RiskTeam/ Zürich, 6 July 1998 Andreas S. Weigend, Data Mining Group, Information Systems Department, Stern School of Business, NYU
Step 3: Architecture – Example of neural networks • Project the input vector x onto a weight vector w • w * x • This projection is then be nonlinearly “squashed” to give a hidden unit activation • h = tanh (w * x) • Usually, a constant c in the argument allows the shifting of the location • h = tanh (w * x + c) • There are several such hidden units, responding to different projections of the input vectors • Their activations are combined with weights v to form the output (and another constant b can be added) • output = v * h + b RiskTeam/ Zürich, 6 July 1998 Andreas S. Weigend, Data Mining Group, Information Systems Department, Stern School of Business, NYU
Neural networks compared to standard statistics • Comparison between neural nets and standard statistics • Complexity • Statistics: Fix order of interactions • Neural nets: Fix number of features • Estimation • Statistics: Find exact solution • Neural nets: Focus on path • Dimensionality • Number of inputs: Curse of dimensionality • Points far away in input space • Number of parameters: Blessing of dimensionality • Many hidden units make it easier to find good local minimum • But need to control for model complexity RiskTeam/ Zürich, 6 July 1998 Andreas S. Weigend, Data Mining Group, Information Systems Department, Stern School of Business, NYU
Step 4: Cost function • Key problem: • Want to be good on new data... • ...but we only have data from the past • Always observation y = f(input) + noise • Assume • Large sudden variations in output are due to noise • Small variation (systematic) are signal, expressed as f(input) • Flexible models • Good news: can fit any signal • Bad news: can also fit any noise • Requires modeling decisions: • Assumptions about model complexity • Weight decay, weight elimination, smoothness • Assumptions about noise: error model or noise model RiskTeam/ Zürich, 6 July 1998 Andreas S. Weigend, Data Mining Group, Information Systems Department, Stern School of Business, NYU
Step 5: Determining the parameters • Search with gradient descent: iterative • Vice to virtue: path important • Guide network through solution space • Hints • Weight pruning • Early stopping • Weight-elimination • Pseudo-data • Add noise • … • Alternative approaches: • Model to match the local noise level of the data • Local error bars • Gated experts architecture with adaptive variances RiskTeam/ Zürich, 6 July 1998 Andreas S. Weigend, Data Mining Group, Information Systems Department, Stern School of Business, NYU