1 / 16

Nonlinear Models 8 February 1999

Data Mining in Finance. Andreas S. Weigend Leonard N. Stern School of Business, New York University. Nonlinear Models 8 February 1999. The seven steps of model building. 1. Task

betty_james
Download Presentation

Nonlinear Models 8 February 1999

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Data Mining in Finance Andreas S. Weigend Leonard N. Stern School of Business, New York University Nonlinear Models 8 February 1999 RiskTeam/ Zürich, 6 July 1998 Andreas S. Weigend, Data Mining Group, Information Systems Department, Stern School of Business, NYU

  2. The seven steps of model building • 1. Task • Predict distribution of portfolio returns, understand structure in yield curves, find profitable time scales, discover trade styles, … • 2. Data • Which data to use, and how to code/ preprocess/ represent them • 3. Architecture • 4. Objective/ Cost function (in-sample) • 5. Search/ Optimization/ Estimation • 6. Evaluation • 7. Analysis and Interpretation RiskTeam/ Zürich, 6 July 1998 Andreas S. Weigend, Data Mining Group, Information Systems Department, Stern School of Business, NYU

  3. How to make predictions? • “Pattern” = Input + Output Pair • Keep all data • Nearest neighbor lookup • Local constant model • Local linear model • Throw away data, only keep model • Global linear model • Global nonlinear model • Neural network with hidden units • Sigmoids or hyperbolic tangents (tanh) • Radial basis functions • Keep only a few representative data point • Support vector machines RiskTeam/ Zürich, 6 July 1998 Andreas S. Weigend, Data Mining Group, Information Systems Department, Stern School of Business, NYU

  4. Training data: Inputs and corresponding outputs output input2 input1 RiskTeam/ Zürich, 6 July 1998 Andreas S. Weigend, Data Mining Group, Information Systems Department, Stern School of Business, NYU

  5. What is the prediction for a new input? output input2 input1 new input RiskTeam/ Zürich, 6 July 1998 Andreas S. Weigend, Data Mining Group, Information Systems Department, Stern School of Business, NYU

  6. Nearest neighbor • Use output value of nearest neighbor in input space as prediction prediction nearest neighbor output input2 input1 new input RiskTeam/ Zürich, 6 July 1998 Andreas S. Weigend, Data Mining Group, Information Systems Department, Stern School of Business, NYU

  7. Local constant model • Use average of the outputs of nearby points in input space output input2 input1 new input RiskTeam/ Zürich, 6 July 1998 Andreas S. Weigend, Data Mining Group, Information Systems Department, Stern School of Business, NYU

  8. Local linear model • Find best-fitting plane (linear model) through nearby points in input space output input2 input1 new input RiskTeam/ Zürich, 6 July 1998 Andreas S. Weigend, Data Mining Group, Information Systems Department, Stern School of Business, NYU

  9. Nonlinear regression surface • Minimize “energy” stored in the “springs” output input2 input1 RiskTeam/ Zürich, 6 July 1998 Andreas S. Weigend, Data Mining Group, Information Systems Department, Stern School of Business, NYU

  10. Throw away the data… just keep the surface! output input2 input1 RiskTeam/ Zürich, 6 July 1998 Andreas S. Weigend, Data Mining Group, Information Systems Department, Stern School of Business, NYU

  11. Modeling – an iterative process  Step 1: Task/ Problem definition  Step 2: Data and Representation  Step 3: Architecture  Step 4: Objective/ Cost function (in-sample)  Step 5: Search/ Optimization/ Estimation  Step 6: Evaluation (out-of-sample)  Step 7: Analysis and Interpretation RiskTeam/ Zürich, 6 July 1998 Andreas S. Weigend, Data Mining Group, Information Systems Department, Stern School of Business, NYU

  12. Modeling issues  Step 1: Task and Problem definition  Step 2: Data and Representation  Step 3: Architecture • What are the “primitives” that make up the surface?  Step 4: Objective/ Cost function (in-sample) • How flexible should the surface be? • Too rigid model: stiff board (global linear model) • Too flexible model: cellophane going through all points • Penalize too flexible models (regularization)  Step 5: Search/ Optimization/ Estimation • How do we find the surface?  Step 6: Evaluation (out-of-sample)  Step 7: Analysis and Interpretation RiskTeam/ Zürich, 6 July 1998 Andreas S. Weigend, Data Mining Group, Information Systems Department, Stern School of Business, NYU

  13. Step 3: Architecture – Example of neural networks • Project the input vector x onto a weight vector w • w * x • This projection is then be nonlinearly “squashed” to give a hidden unit activation • h = tanh (w * x) • Usually, a constant c in the argument allows the shifting of the location • h = tanh (w * x + c) • There are several such hidden units, responding to different projections of the input vectors • Their activations are combined with weights v to form the output (and another constant b can be added) • output = v * h + b RiskTeam/ Zürich, 6 July 1998 Andreas S. Weigend, Data Mining Group, Information Systems Department, Stern School of Business, NYU

  14. Neural networks compared to standard statistics • Comparison between neural nets and standard statistics • Complexity • Statistics: Fix order of interactions • Neural nets: Fix number of features • Estimation • Statistics: Find exact solution • Neural nets: Focus on path • Dimensionality • Number of inputs: Curse of dimensionality • Points far away in input space • Number of parameters: Blessing of dimensionality • Many hidden units make it easier to find good local minimum • But need to control for model complexity RiskTeam/ Zürich, 6 July 1998 Andreas S. Weigend, Data Mining Group, Information Systems Department, Stern School of Business, NYU

  15. Step 4: Cost function • Key problem: • Want to be good on new data... • ...but we only have data from the past • Always observation y = f(input) + noise • Assume • Large sudden variations in output are due to noise • Small variation (systematic) are signal, expressed as f(input) • Flexible models • Good news: can fit any signal • Bad news: can also fit any noise • Requires modeling decisions: • Assumptions about model complexity • Weight decay, weight elimination, smoothness • Assumptions about noise: error model or noise model RiskTeam/ Zürich, 6 July 1998 Andreas S. Weigend, Data Mining Group, Information Systems Department, Stern School of Business, NYU

  16. Step 5: Determining the parameters • Search with gradient descent: iterative • Vice to virtue: path important • Guide network through solution space • Hints • Weight pruning • Early stopping • Weight-elimination • Pseudo-data • Add noise • … • Alternative approaches: • Model to match the local noise level of the data • Local error bars • Gated experts architecture with adaptive variances RiskTeam/ Zürich, 6 July 1998 Andreas S. Weigend, Data Mining Group, Information Systems Department, Stern School of Business, NYU

More Related