Commodities Futures Price Prediction: An Artificial Intelligence Approach • Thesis Defense
Commodities Markets • Commodity • A good that can be processed and resold • Examples – corn, rice, silver, coal • Spot Market • Futures Market
Futures Markets • Origin • Motivation • Hedgers • Producers • Consumers • Speculators • Size and scope • CBOT (2002) • 260 million contracts • 47 different products
Profit in the Futures Market • Information • Supply • Optimal production • Weather • Labor • Pest damage • Demand • Industrial • Consumer • Time series analysis
Time Series Analysis - Background • Time Series – examples • River flow and water levels • Electricity demand • Stock prices • Exchange rates • Commodities prices • Commodities futures prices • Patterns
Time Series Analysis - Methods • Linear regression • Non-linear regressions • Rule based systems • Artificial Neural Networks • Genetic Algorithms
Data • Daily price data for soybean futures • Chicago Board of Trade • Jan. 1, 1980 – Jan. 1, 1990 • Datastream • Normalized
Why use an Artificial Neural Network (ANN)? • Excellent pattern recognition • Other uses of ANNs in financial time series analysis • Estimating a generalized option pricing formula • Standard & Poor's 500 index futures day trading system • Standard & Poor's 500 futures options prices
ANN Implementation • Stuttgart Neural Network Simulator, version 4.2 • Resilient propagation (RPROP) • Improvement over standard backpropagation • Uses only the sign of the error derivative • Weight decay • Parameters • Number of inputs: 10 and 100 • Number of hidden nodes: 5, 10, 100 • Weight decay: 5, 10, 20 • Initial weight range: +/- 1.0, 0.5, 0.25, 0.125, 0.0625
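A minimal sketch of the sign-based RPROP update named on this slide. It is not the Stuttgart Neural Network Simulator code. The function name, the growth and shrink factors (1.2 and 0.5, commonly quoted defaults), and the step bounds are assumptions for illustration. The sketch uses the variant that zeroes the stored gradient after a sign change, and it leaves out weight decay.

```python
def rprop_step(weights, grads, prev_grads, steps,
               eta_plus=1.2, eta_minus=0.5, step_min=1e-6, step_max=50.0):
    """One RPROP update over flat lists.

    Returns (new_weights, gradients_to_remember, new_steps).
    All default values are illustrative assumptions.
    """
    new_w, new_g, new_s = [], [], []
    for w, g, pg, s in zip(weights, grads, prev_grads, steps):
        if g * pg > 0:
            # Same sign as last time: grow the step size.
            s = min(s * eta_plus, step_max)
        elif g * pg < 0:
            # Sign flipped: the last step overshot, so shrink the step
            # and skip the weight update for this iteration.
            s = max(s * eta_minus, step_min)
            g = 0.0
        # Only the sign of the gradient decides the direction;
        # its magnitude is ignored.
        if g > 0:
            w -= s
        elif g < 0:
            w += s
        new_w.append(w)
        new_g.append(g)
        new_s.append(s)
    return new_w, new_g, new_s


def minimize_quadratic(iterations=200):
    """Toy demo: minimize f(w) = (w - 3)^2 starting from w = 0."""
    w, prev_g, step = [0.0], [0.0], [0.1]
    for _ in range(iterations):
        g = [2.0 * (w[0] - 3.0)]  # derivative of (w - 3)^2
        w, prev_g, step = rprop_step(w, g, prev_g, step)
    return w[0]
```

Calling `minimize_quadratic()` drives the single weight toward 3. The step size grows while the gradient keeps its sign and is halved each time the sign flips.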
ANN Data Sets • Training set: Jan. 1, 1980 – May 2, 1983 • Testing set: May 3, 1983 – Aug. 29, 1986 • Validation set: Sept. 2, 1986 – Jan. 1, 1990
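The chronological split on this slide could be expressed as a date lookup, sketched below. The date boundaries come from the slide. The names `SPLITS` and `assign_split` are hypothetical, and the original work may have split the data differently in practice.

```python
from datetime import date

# Date ranges taken from the slide; both ends are inclusive.
SPLITS = {
    "training": (date(1980, 1, 1), date(1983, 5, 2)),
    "testing": (date(1983, 5, 3), date(1986, 8, 29)),
    "validation": (date(1986, 9, 2), date(1990, 1, 1)),
}


def assign_split(day):
    """Return the name of the data set a trading day belongs to, or None."""
    for name, (start, end) in SPLITS.items():
        if start <= day <= end:
            return name
    return None
```

Dates that fall in none of the ranges, such as Saturday, Aug. 30, 1986, return `None`. Keeping the three sets in time order means the network is never scored on prices that precede its training data.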
ANN Results • Mean error (cents per bushel) • 100 inputs: 12.00, 24.93 • 10 inputs: 10.62, 25.88
Why Evolve the Parameters of an ANN? • Selecting preferred parameters is a difficult, poorly understood task • Search space is different for each task • Trial and error is time-consuming • Evolutionary techniques provide powerful search capabilities for finding acceptable network parameters
Genetic Algorithm - Implementation • GAlib, version 4.5 (MIT) • Custom code to implement RPROP with weight decay • Real number representation • Number of input nodes (1 – 100) • Number of hidden nodes (1 – 100) • Initial weight range (0.0625 – 2.0) • Initial step size (0.0625 – 1.0) • Maximum step size (10 – 75) • Weight decay (0 – 20)
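One way to picture the real-number representation is a six-gene genome that is clipped to the ranges on this slide and decoded into network parameters. The sketch below is an illustration, not GAlib code. The names `PARAM_RANGES` and `decode` are hypothetical, and rounding only the two node counts to integers is an assumption.

```python
# (name, lower bound, upper bound, round to integer?) -- ranges from the slide.
PARAM_RANGES = [
    ("input_nodes", 1, 100, True),
    ("hidden_nodes", 1, 100, True),
    ("initial_weight_range", 0.0625, 2.0, False),
    ("initial_step_size", 0.0625, 1.0, False),
    ("max_step_size", 10, 75, False),
    ("weight_decay", 0, 20, False),
]


def decode(genome):
    """Map a list of six real-valued genes to a dict of network parameters."""
    params = {}
    for gene, (name, low, high, is_int) in zip(genome, PARAM_RANGES):
        value = min(max(gene, low), high)  # clip the gene into its legal range
        params[name] = int(round(value)) if is_int else float(value)
    return params
```

For example, `decode([12.7, 150, 0.01, 0.5, 50, 5])` yields 13 input nodes, 100 hidden nodes (clipped from 150), and an initial weight range of 0.0625 (clipped from 0.01).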
Genetic Algorithm – Implementation (continued) • Roulette wheel selection • Single point crossover • Gaussian random mutation • High mutation rate
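The three operators listed on this slide can be sketched in a few lines each. These are generic textbook versions written for illustration, not the GAlib implementations. The function names, the `rate` and `sigma` parameters, and the use of Python's `random.Random` are assumptions. Fitness values are assumed to be non-negative.

```python
import random


def roulette_select(population, fitnesses, rng):
    """Pick one individual with probability proportional to its fitness."""
    total = sum(fitnesses)
    pick = rng.uniform(0.0, total)
    running = 0.0
    for individual, fit in zip(population, fitnesses):
        running += fit
        if pick < running:
            return individual
    return population[-1]  # guard against floating-point round-off


def single_point_crossover(parent_a, parent_b, rng):
    """Swap the tails of two equal-length genomes at one random cut point."""
    point = rng.randint(1, len(parent_a) - 1)  # needs at least two genes
    child_1 = parent_a[:point] + parent_b[point:]
    child_2 = parent_b[:point] + parent_a[point:]
    return child_1, child_2


def gaussian_mutate(genome, rate, sigma, rng):
    """Add zero-mean Gaussian noise to each gene with probability `rate`."""
    return [g + rng.gauss(0.0, sigma) if rng.random() < rate else g
            for g in genome]
```

A typical generation would select two parents with `roulette_select`, recombine them with `single_point_crossover`, and pass each child through `gaussian_mutate`. A high value of `rate` corresponds to the high mutation rate the slide mentions.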
Evaluation Function • Decode the parameters and instantiate a network using them • Train the ANN for 1000 epochs • Report the lowest total sum of squared error for both training and testing data sets • Fitness equals the inverse of the total error reported.
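The last two bullets can be sketched as a small function. It assumes that "lowest total" means the smallest combined training-plus-testing sum of squared error seen over the training epochs. The name `fitness_from_history` and the small `eps` guard against division by zero are additions for illustration, and the per-epoch error history is assumed to come from whatever trainer is in use.

```python
def fitness_from_history(history, eps=1e-12):
    """Compute GA fitness from per-epoch errors.

    `history` is a list of (training_sse, testing_sse) pairs, one per epoch.
    Fitness is the inverse of the lowest combined error over all epochs.
    """
    best_total = min(train_sse + test_sse for train_sse, test_sse in history)
    return 1.0 / (best_total + eps)
```

With a history of `[(4.0, 6.0), (1.0, 3.0), (2.0, 2.5)]`, the lowest combined error is 4.0, so the fitness is approximately 0.25. Lower error therefore means higher fitness, which is what roulette wheel selection needs.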
Parameter Evolution - Results • GANN mean error: 10.82 • NN mean error: 10.62 • Conclusions: • GANN performance is close to that of the NN and outperforms the majority of networks generated via trial and error • Genotype / phenotype issue • Other, possibly better GA techniques • Multipoint crossover • Tournament selection
Evolving the Weights of an ANN • Avoid local minima • Avoid tedious trial-and-error search for learning parameters • Search a broad, poorly understood solution space and optimize the values of the function parameters
Weight Evolution - Implementation • GAlib, version 4.5 (MIT) • Custom written neural network code • Real number representation • Gaussian mutation • Two point crossover • Roulette wheel selection
Weight Evolution – Objective Function • Instantiate a neural network with the weight vector (i.e. the individual) • Feed one epoch of the training data through the network • Fitness equals the inverse of the sum of the squared network error returned
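A minimal sketch of this objective function, in which the individual is a flat weight vector. The architecture here (one hidden layer of tanh units feeding a single linear output) is an assumption, since the slides do not describe the custom network code. The names `forward` and `weight_fitness` and the `eps` guard are hypothetical.

```python
import math


def forward(weights, x, n_in, n_hidden):
    """Run one input vector through a network unpacked from a flat weight list.

    Layout (assumed): for each hidden unit, n_in input weights then a bias;
    after that, n_hidden output weights then an output bias.
    """
    idx = 0
    hidden = []
    for _ in range(n_hidden):
        total = weights[idx + n_in]  # hidden-unit bias
        for i in range(n_in):
            total += weights[idx + i] * x[i]
        hidden.append(math.tanh(total))
        idx += n_in + 1
    out = weights[idx + n_hidden]  # output bias
    for j in range(n_hidden):
        out += weights[idx + j] * hidden[j]
    return out


def weight_fitness(weights, samples, n_in, n_hidden, eps=1e-12):
    """Feed one epoch of (input, target) pairs and return 1 / SSE."""
    sse = sum((forward(weights, x, n_in, n_hidden) - y) ** 2
              for x, y in samples)
    return 1.0 / (sse + eps)
```

A network with `n_in` inputs and `n_hidden` hidden units needs a genome of `n_hidden * (n_in + 1) + n_hidden + 1` weights. No gradient is computed. The GA only ever sees the scalar fitness.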
Weight Evolution – keeping the best individual • Fitness function evaluates against training set only • Objective function evaluates against training set as well, but only for retention of candidate best network • Meta-fitness, or meta-elite individual
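The "meta-elite" idea can be sketched as bookkeeping that sits outside the GA's own selection: after every generation, each individual is scored with a separate function, and the best one seen so far is retained. The class name `EliteTracker` and the `score_fn` callback are hypothetical. Which data set `score_fn` evaluates against is left to the caller.

```python
class EliteTracker:
    """Remember the best individual seen across all generations."""

    def __init__(self):
        self.best = None
        self.best_score = float("-inf")

    def update(self, population, score_fn):
        """Score every individual and keep a copy of any new best."""
        for individual in population:
            score = score_fn(individual)
            if score > self.best_score:
                # Copy so later mutation of the population cannot alter it.
                self.best = list(individual)
                self.best_score = score
        return self.best
```

Because the tracker never feeds back into selection, it does not change how the population evolves. It only guarantees that a strong candidate network is not lost to later crossover or mutation.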
Weight Evolution - Results • Mean Error • GANN-Weight 10.67 • GANN 10.82 • NN 10.61 • Much faster • Fewer man hours
Summary • The pure ANN approach is very man-hour intensive, and expert experience is valuable • Evolving network parameters requires few man-hours but many hours of computation • Evolving the network weights provides most of the performance at a smaller cost in both human and computer time