
Repository Method to suit different investment strategies

Alma Lilia Garcia & Edward Tsang. Outline: Motivation; Repository Method; Receiver Operating Characteristic (ROC); Experimental design; Experimental results; Conclusions.


Presentation Transcript


  1. Repository Method to suit different investment strategies Alma Lilia Garcia & Edward Tsang

  2. Motivation
     • Many machine learning techniques have been applied to financial problems.
     • Genetic Programming (GP) has been used to predict financial opportunities.
     • However, when the number of profitable opportunities is extremely small, those cases are very difficult to detect.

  3. Confusion Matrix
     [Table: Predictions against Reality]

  4. The problem with few opportunities
     • Easy score on accuracy (no case is predicted +): Accuracy = 99%, Precision = ?, Recall = 0%
     • Random move from − to +: Accuracy = 98.02%, Precision = Recall = 1%
     • Moves from − to +: Accuracy = 98.2%, Precision = Recall = 10% (Accuracy dropped from 99%)
     • Ideal prediction: Accuracy = Precision = Recall = 100%
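The percentages on this slide can be reproduced with a short sketch. It assumes a hypothetical data set of 10,000 cases of which 1% are positive, and that exactly 100 predictions are moved from − to +. The slide states only the percentages, so these totals and the names `scenario` and `metrics` are assumptions chosen to match the 99% baseline.

```python
def metrics(tp, fp, fn, tn):
    """Return (accuracy, precision, recall); a ratio is None when undefined."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else None
    recall = tp / (tp + fn) if (tp + fn) else None
    return accuracy, precision, recall


# Assumed totals (not stated on the slide): 1% positive cases.
POSITIVES, NEGATIVES = 100, 9_900


def scenario(moved, hits):
    """Move `moved` predictions from - to +, of which `hits` are real positives."""
    tp = hits
    fp = moved - hits
    fn = POSITIVES - hits
    tn = NEGATIVES - fp
    return metrics(tp, fp, fn, tn)


print("easy score :", scenario(0, 0))      # nothing predicted positive
print("random move:", scenario(100, 1))    # 1 hit expected at a 1% base rate
print("better move:", scenario(100, 10))
print("ideal      :", scenario(100, 100))
```

Under these assumptions, moving 100 predictions costs at most about one point of accuracy, while precision and recall rise from 0% to 1% or 10%. That is why accuracy alone says little when positive cases are rare.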

  5. Repository Method
     The objective of the repository method is to mine the knowledge acquired by the evolutionary process in order to compile several rules that detect the rare cases in diverse ways.
     GP systems spend a lot of computational resources evolving complete populations for several generations.
     Since the number of positive examples is very small, it is important to gather all available information about them. However, the standard procedure is to choose just the best individual of the evolution as the optimal solution of the problem.
     [Figure: generations 1, 2, …, 100 and the rules collected from them]
     R1 = Var1 > 0.6 and Var2 > Var3
     R2 = Var2 > 0.6
     …
     Rn = …

  6. Repository Method
     In order to mine the knowledge acquired by the evolutionary process, the Repository Method evolves a GP to create a population of decision trees and then performs the following steps:
     1- Rule extraction: the rules R1, R2, …, Rn are extracted; the rule Rk is selected by precision
     2- Rule simplification: Rk is simplified to R’k
     3- New rule detection: R’k is compared to the rules in the repository (Rα … Rµ) by similarity (genotype)
     4- Add rule to the repository: if R’k is a novel rule, R’k is added to the rule repository
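The four steps can be sketched as a single update loop. This is a minimal illustration, not the authors' implementation. It assumes a rule is a conjunction of `(variable, operator, value-or-variable)` conditions, that simplification only drops repeated conditions, and that two rules are similar when their simplified condition sets are identical. The slides do not specify any of these details, and the data and rules below are made up.

```python
import operator

OPS = {">": operator.gt, "<": operator.lt}


def holds(cond, record):
    """Evaluate one condition such as ("Var1", ">", 0.6) or ("Var2", ">", "Var3")."""
    left, op, right = cond
    rhs = record[right] if isinstance(right, str) else right
    return OPS[op](record[left], rhs)


def fires(rule, record):
    """A rule fires when all of its conditions hold."""
    return all(holds(c, record) for c in rule)


def precision(rule, data):
    """data: list of (record, label), label True for a positive case."""
    fired = [label for record, label in data if fires(rule, record)]
    return sum(fired) / len(fired) if fired else 0.0


def simplify(rule):
    """Stand-in simplification (assumption): drop repeated conditions."""
    return frozenset(rule)


def update_repository(repository, candidate_rules, data, pt):
    for rule in candidate_rules:
        if precision(rule, data) < pt:      # step 1: selection by precision
            continue
        simplified = simplify(rule)         # step 2: simplification
        if simplified not in repository:    # step 3: novelty check on the genotype
            repository.add(simplified)      # step 4: add to the repository
    return repository


# Hypothetical data and candidate rules, for illustration only.
DATA = [
    ({"Var1": 0.7, "Var2": 0.5, "Var3": 0.2}, True),
    ({"Var1": 0.7, "Var2": 0.1, "Var3": 0.2}, False),
    ({"Var1": 0.2, "Var2": 0.9, "Var3": 0.95}, False),
    ({"Var1": 0.9, "Var2": 0.8, "Var3": 0.3}, True),
]
RULES = [
    [("Var1", ">", 0.6), ("Var2", ">", "Var3")],                      # R1
    [("Var2", ">", "Var3"), ("Var1", ">", 0.6), ("Var1", ">", 0.6)],  # R1 again, redundant form
    [("Var2", ">", 0.6)],                                             # R2
]

print(len(update_repository(set(), RULES, DATA, pt=0.6)))
```

In this toy data the redundant copy of R1 is rejected as already known, and R2 enters the repository only when the precision threshold is lowered to 0.5.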

  7. Repository Method
     An interesting question arises: which is the best precision threshold (PT) for selecting rules?
     We propose trying different precision thresholds in order to generate different classifications. Every repository produces a different classification:
     • PT = 1: R1, R2, …, Rt
     • PT = 0.90: R1, R2, …, Rn
     • …
     • PT = 0.05: R1, R2, …, Rs
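A sketch of how a sweep over precision thresholds could yield one classification per repository. Each rule is reduced here to a pair `(precision, indices of the cases it fires on)`, and a case is predicted positive when any rule in the repository fires. Both choices are assumptions for illustration, as are the rules and labels. The slides do not describe how a repository combines its rules.

```python
def roc_points(rules, labels, thresholds):
    """Return one (PT, false positive rate, recall) triple per threshold.

    rules:  list of (precision, set of case indices on which the rule fires)
    labels: list of bool, True for a positive case
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = []
    for pt in thresholds:
        predicted = set()
        for rule_precision, fired in rules:
            if rule_precision >= pt:        # rule qualifies for this repository
                predicted |= fired          # assumption: any firing rule predicts +
        tp = sum(1 for i in predicted if labels[i])
        fp = len(predicted) - tp
        points.append((pt, fp / negatives, tp / positives))
    return points


# Hypothetical example: 2 positive and 8 negative cases, 3 rules.
LABELS = [True, True] + [False] * 8
RULES = [
    (1.00, {0}),            # fires on one positive only
    (0.50, {1, 2}),         # one positive, one negative
    (0.25, {0, 3, 4, 5}),   # one positive, three negatives
]

for pt, fpr, recall in roc_points(RULES, LABELS, [1.0, 0.5, 0.05]):
    print(f"PT={pt:.2f}  FPR={fpr:.3f}  recall={recall:.3f}")
```

In this toy example, lowering PT admits less precise rules, so recall and the false positive rate can only stay equal or increase.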

  8. ROC space
     The Receiver Operating Characteristic (ROC) has been used extensively in Machine Learning to measure the performance of classifiers.
     A single classification produces a point in the ROC space. However, some classifiers are able to produce a range of classifications; in those cases a curve is produced, which moves from the liberal to the conservative area.

  9. ROC Curve
     The main advantages of using ROC are:
     • It is able to deal with imbalanced classifications
     • It is able to deal with classifiers that produce a range of classifications
     • It lets the user calculate the best trade-off between misclassifications and false alarms
     The Area Under the ROC Curve (AUC) has been widely used to measure and compare the performance of different classifiers.
     Slope = μ(1 − ρ)/(βρ), where ρ = the % of + cases
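As an illustration of how an AUC can be obtained from a handful of ROC points, the sketch below applies the trapezoidal rule. The slides do not say how the AUC was computed, so the method, the function name `auc` and the added end points (0, 0) and (1, 1) are assumptions.

```python
def auc(points):
    """Area under a ROC curve given (false positive rate, recall) points.

    The end points (0, 0) and (1, 1) are added so that the curve spans the
    whole ROC space; the area is then summed trapezoid by trapezoid.
    """
    pts = sorted(set(points) | {(0.0, 0.0), (1.0, 1.0)})
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical ROC points, for illustration only.
print(auc([(0.0, 0.5), (0.125, 1.0), (0.5, 1.0)]))
```

With no points at all the function returns 0.5, the area under the diagonal of a random classifier. A single point at (0, 1) gives the ideal value of 1.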

  10. Experimental design
     The aims of this work are:
     • 1) to show that RM is able to produce a range of solutions that suit the investor's requirements
     • 2) to analyze the influence of the evolutionary process on RM performance.
     For that purpose RM was tested with the following experiments:
     • Experiment 1: RM on random trees. RM collects rules from P0, a random population of decision trees. The performance of RM is expected to be low because the decision trees are random.
     • Experiment 2: RM on partially evolved trees. RM gathers rules from P10, a population that has been evolved for 10 generations.
     • Experiment 3: RM on trees from different generations. RM collects and accumulates rules from P10, P20, …, P100; that is, after every ten generations RM collects the rules generated so far and adds them to the repository.

  11. Experimental results
     [Figure: the ROC curves plotted by Experiments 1, 2 and 3]
     A standard GP result: recall = 14%, precision = 5% and accuracy = 89%. This result is plotted at (0.09, 0.14).

  12. Conclusions
     It has been shown that RM offers a range of solutions to suit the risk guidelines of investors. Thus the user can choose the best balance between misclassifications and false alarms according to his/her requirements. This makes RM a valuable tool for investors in balancing between not making mistakes and not missing opportunities.
     The benefit of RM being able to extract predictive rules even from the earliest stages of the evolutionary process is twofold: (a) RM can potentially shorten the time spent in evolutionary computation; and (b) effort in the early part of the search is not wasted.
     However, to create a wider range of solutions, it is advisable to evolve the solutions at least past the exploration phase, especially when the solution of the problem is complex.

  13. Questions?

  14. Data set description
     The data set of Barclays stock is composed of the prices from March 1998 to January 2005. The attributes of each record are indicators derived from financial technical analysis. Technical analysis has been used in financial markets to analyze the behaviour of stock prices; it is mainly based on historical prices and volume trends. The indicators were calculated on the basis of the daily closing price.

  15. Experimental results
     The result of the standard GP is: recall = 14%, precision = 5% and accuracy = 89%. This result is plotted at (0.09, 0.14) in the ROC graph, which describes a conservative prediction. Figure 4 displays the ROC curves plotted by RM in the following experiments:
     • Experiment 1: using P0, the AUC = 0.69. As can be observed from Figure 4, the majority of the points are clustered in the conservative part of the ROC curve because they did not classify any positive case. However, RM was able to generate an interesting choice for the investor when PT = 20%: recall = 38%, precision = 9% and accuracy = 87% (see Table V).
     • Experiment 2: using P10, the performance of RM increased considerably; the AUC increased from 0.69 to 0.74. In this experiment RM offers two valuable choices, when PT = 30% and PT = 20%. The latter option provides recall = 63% and accuracy = 81%. However, one of the choices is on the conservative side and the other on the liberal side of the ROC curve, as Table V shows.

  16. Confusion Matrix
                      Predictions
                      −            +
     Reality    −     TN = 400     FP = 50
                +     FN = 200     TP = 350
     True Positive Rate (recall) = TP/(TP+FN) = 350/(350+200) = 63.6%
     False Positive Rate = FP/(FP+TN) = 50/(50+400) = 11.1%
     Precision = TP/(TP+FP) = 350/(350+50) = 87.5%
