Pretests for genetic-programming evolved trading programs: "zero-intelligence" strategies and lottery trading
Nicolas NAVET, INRIA – AIECON NCCU, http://www.loria.fr/~nnavet
Joint work with Shu-Heng Chen, AIECON NCCU, http://www.aiecon.org/
ICONIP 2006 – Hong Kong – October 4, 2006
Conclusions are most often inconclusive regarding the efficiency of GP
• Is it because the market is efficient? (i.e., there is nothing to learn from the training data)
  • Then further efforts are meaningless!
• Or is the learning algorithm inefficient?
  • Then consider another ML algorithm, or improvements to the GP scheme: fitness function, search intensity, higher-level function sets, overfitting avoidance, better genetic operators, data-division scheme, etc.
• When results are not very convincing, pretesting may provide some evidence!
Genetic programming: a recap
• GP is the process of evolving a population of computer programs, which are candidate solutions, according to evolutionary principles (e.g., survival of the fittest):
  1. Generate a population of random programs
  2. Evaluate their quality ("fitness")
  3. Create better programs by applying genetic operators, e.g., mutation and combination ("crossover")
  4. Repeat from step 2
A rough sketch of this loop is given below.
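As an illustration only, the loop can be written as follows; the callables `random_program`, `fitness`, `crossover` and `mutate` are hypothetical placeholders, not code from this work:

```python
import random

def evolve(random_program, fitness, crossover, mutate,
           pop_size=500, generations=50):
    # 1. generate a population of random programs
    population = [random_program() for _ in range(pop_size)]
    for _ in range(generations):
        # 2. evaluate fitness and keep the fittest half
        survivors = sorted(population, key=fitness, reverse=True)[:pop_size // 2]
        # 3. create better programs with genetic operators
        offspring = []
        while len(survivors) + len(offspring) < pop_size:
            a, b = random.sample(survivors, 2)
            child = crossover(a, b)            # combination ("crossover")
            if random.random() < 0.1:          # occasional mutation
                child = mutate(child)
            offspring.append(child)
        population = survivors + offspring
    return max(population, key=fitness)
```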
In GP, programs are represented by trees (1/3)
• Trees are a very general representation form: internal nodes are functions, leaves are terminals
• Example: a trading rule that buys if the expression evaluates to "true" (see the sketch below)
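For instance, a rule such as "buy if today's price is above its 50-day moving average" could be encoded and evaluated like this; the node vocabulary is an illustrative assumption, not the talk's actual function/terminal set:

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    op: str                                  # function name or terminal value
    children: list = field(default_factory=list)

# buy if price > 50-day moving average: (> price (avg 50))
rule = Node(">", [Node("price"), Node("avg", [Node("50")])])

def evaluate(node, prices, t):
    """Evaluate the tree on price series `prices` at day `t`."""
    if node.op == "price":                   # terminal: today's price
        return prices[t]
    if node.op.isdigit():                    # terminal: numeric constant
        return int(node.op)
    if node.op == "avg":                     # function: moving average
        n = int(evaluate(node.children[0], prices, t))
        window = prices[max(0, t - n + 1):t + 1]
        return sum(window) / len(window)
    if node.op == ">":                       # function: comparison
        left, right = (evaluate(c, prices, t) for c in node.children)
        return left > right
    raise ValueError(f"unknown operator {node.op}")
```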
GP for financial trading
• Predicting price evolution (not discussed here)
• Inducing technical trading rules, over three data intervals (training / validation / out-of-sample):
  1. Creation of the trading rules using GP on the training interval
  2. Selection of the best resulting strategies, with further selection on unseen data (validation interval) – one strategy is chosen for out-of-sample
  3. Performance evaluation on the out-of-sample interval
A compact sketch of this scheme follows below.
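A minimal sketch of the three-interval scheme, assuming hypothetical callables `evolve` (returns a population of rules) and `net_return` (net return of a rule on a price interval); the 750/500-day split mirrors the 3-year training / 2-year validation setup used later in the talk:

```python
def select_rule(prices, evolve, net_return, shortlist_size=10):
    # split the series into training / validation / out-of-sample intervals
    train, valid, test = prices[:750], prices[750:1250], prices[1250:]
    # 1) create trading rules with GP on the training interval
    population = evolve(train)
    # 2) keep the best resulting strategies ...
    shortlist = sorted(population, key=lambda r: net_return(r, train),
                       reverse=True)[:shortlist_size]
    # ... then select one strategy on unseen (validation) data
    chosen = max(shortlist, key=lambda r: net_return(r, valid))
    # 3) evaluate performance out-of-sample
    return chosen, net_return(chosen, test)
```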
Why is GP an appealing technique for financial trading?
• Easy to implement / robust evolutionary technique
• Trading rules (TR) should adapt to a changing environment – GP may simulate this evolution
• Solutions are produced in a symbolic form that can be understood and analyzed
• GP may serve as a knowledge-discovery tool (e.g., about the evolution of the market)
But one may cast doubts on GP's efficiency...
• Highly heuristic – no theory! There are problems on which GP has been shown not to be significantly better than random search
• Few clear-cut successes reported in the financial literature
• GP embeds little domain-specific knowledge yet...
• Doubts about how efficiently GP uses the available computing time:
  • code bloat
  • bad at finding numerical constants
  • best solutions are sometimes found very early in the run...
• Variability of the results! E.g., returns: -0.160993, 0.0526153, 0.0526153, 0.0526153, 0.0526153, -0.0794787, 0.0526153, -0.0794787, 0.132354, 0.364311, -0.0990995, -0.0794787, -0.0855786, -0.094433, 0.0464288, -0.140719, 0.0526153, 0.0526153, -0.0746189, 0.418075, ...
Possible pretest: measuring the predictability of the financial time series
• The actual question: how predictable is the series, for a given horizon and with a given cost function?
• Candidate measures:
  • Serial correlation
  • Kolmogorov complexity
  • Lyapunov exponent
  • Unit-root analysis
  • Comparison with results on surrogate data: "shuffled" series (e.g., the Kaboudan statistic [Kab00])
  • ...
A toy version of the surrogate-data idea is sketched below.
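The sketch uses lag-1 autocorrelation of returns as the predictability proxy; both the proxy and the test design are illustrative assumptions, not the Kaboudan statistic itself:

```python
import numpy as np

def lag1_autocorr(x):
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    return np.dot(x[:-1], x[1:]) / np.dot(x, x)

def surrogate_test(returns, n_shuffles=1000, seed=0):
    """Fraction of shuffled series at least as 'predictable' as the original.
    A small value suggests structure beyond what random ordering produces."""
    rng = np.random.default_rng(seed)
    actual = abs(lag1_autocorr(returns))
    count = sum(abs(lag1_autocorr(rng.permutation(returns))) >= actual
                for _ in range(n_shuffles))
    return count / n_shuffles
```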
In practice, some predictability does not imply profitability...
• The prediction horizon must be large enough!
• Volatility may not be sufficient to cover round-trip transaction costs!
• The right trading instrument may not be at hand... typically, short selling is not available
Pretest methodology
• Compare GP with several variants of:
  • Random search algorithms – "Zero-Intelligence Strategies" (ZIS)
  • Random trading behaviors – "Lottery Trading" (LT)
• Issue: how to best constrain randomness?
• Statistical hypothesis testing:
  • Null: GP does not outperform ZIS
  • Null: GP does not outperform LT
Pretest 1: GP versus zero-intelligence strategies (= "equivalent search intensity" random search (ERS) with a validation stage)
• Null hypothesis H1,0: GP does not outperform equivalent random search – the alternative hypothesis is H1,1
Pretest 1: GP vs zero-intelligence strategies
• Same three-interval scheme (training / validation / out-of-sample), with the GP creation step replaced by ERS: random strategies are generated, the best resulting ones are selected, further selection is done on unseen data, and one strategy is evaluated out-of-sample
• If H1,0 cannot be rejected, the interpretation is: there is nothing to learn, or GP is not very effective
Pretest 4: GP vs lottery trading
• Lottery trading (LT) = random trading behavior according to the outcome of a random variable (e.g., a Bernoulli law)
• Issue 1: if LT tends to hold positions (short, long) for less time than GP, transaction costs may advantage GP...
• Issue 2: it might be an advantage or a disadvantage for LT to trade much less or much more than GP
  • E.g., a downward-oriented market with no short selling
Frequency and intensity of a trading strategy
• Frequency: average number of transactions per unit of time
• Intensity: proportion of time during which a position is held
• For pretest 4, we impose that the average frequency and intensity of LT equal those of GP
• Implementation: generate random trading sequences having the right characteristics, e.g.
  0,0,1,1,1,0,0,0,0,0,1,1,1,1,1,1,0,0,1,1,0,1,0,0,0,0,0,0,1,1,1,1,1,1,...
  (a sketch of such a generator is given below)
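One possible construction, a sketch under two assumptions: positions are encoded as 1 (held) / 0 (flat) days, and each holding block counts as one round trip, so the sketch requires at least one held day per block (round(frequency × days) ≤ round(intensity × days)):

```python
import numpy as np

def random_composition(total, parts, rng):
    """Split `total` into `parts` positive integers, uniformly at random."""
    cuts = np.sort(rng.choice(np.arange(1, total), size=parts - 1, replace=False))
    return np.diff(np.concatenate(([0], cuts, [total])))

def lottery_sequence(n_days, frequency, intensity, seed=None):
    rng = np.random.default_rng(seed)
    # treating one holding block as one round trip is a simplifying assumption
    n_blocks = max(1, round(frequency * n_days))   # number of holding periods
    held = round(intensity * n_days)               # total days in the market
    flat = n_days - held
    # positive lengths for holding blocks, non-negative lengths for gaps
    blocks = random_composition(held, n_blocks, rng)
    gaps = random_composition(flat + n_blocks + 1, n_blocks + 1, rng) - 1
    seq = []
    for gap, block in zip(gaps, blocks):
        seq += [0] * gap + [1] * block
    seq += [0] * gaps[-1]                          # trailing flat period
    return np.array(seq)
```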
Pretest 4: implementation
• Same three-interval scheme, but the GP trading rules are replaced by lottery-trading sequences (e.g., 0,0,1,1,1,0,0,0,0,0,1,...) whose frequency and intensity match those of the GP runs; performance is evaluated out-of-sample
Answering question 1: is there anything to learn from the training data at hand?
Question 1: pretests involved
• Starting point: if a set of search algorithms does not outperform LT, this gives evidence that there is nothing to learn...
• Pretest 4: GP vs lottery trading
  • Null hypothesis H4,0: GP does not outperform LT
• Pretest 5: equivalent random search (ERS) vs lottery trading
  • Null hypothesis H5,0: ERS does not outperform LT
Question 1: some answers...
• Notation: "not rejected" means the null hypothesis Hi,0 cannot be rejected; "rejected" means we should favor Hi,1
• Case 1 – H4,0 not rejected, H5,0 not rejected: there is nothing to learn
• Case 2 – H4,0 rejected, H5,0 rejected: there is something to learn
• Case 3 – H4,0 rejected, H5,0 not rejected: there may be something to learn – ERS might not be powerful enough
• Case 4 – H4,0 not rejected, H5,0 rejected: there may be something to learn – the GP evolution process is detrimental
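The case analysis written out as a small helper (the boolean arguments are True when the corresponding null hypothesis is rejected; the case numbering follows the slide):

```python
def interpret_pretests(gp_beats_lt, ers_beats_lt):
    """Map the outcomes of pretests 4 (H4,0) and 5 (H5,0) to the four cases."""
    if not gp_beats_lt and not ers_beats_lt:
        return "case 1: there is nothing to learn"
    if gp_beats_lt and ers_beats_lt:
        return "case 2: there is something to learn"
    if gp_beats_lt:
        return "case 3: maybe something to learn; ERS might not be powerful enough"
    return "case 4: maybe something to learn; the GP evolution process is detrimental"
```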
Question 2: some answers...
• Question 2 cannot be answered if there is nothing to learn (case 1)
• Case 4 provides us with a negative answer...
• In cases 2 and 3, run pretest 1: GP vs equivalent random search
  • Null hypothesis H1,0: GP does not outperform ERS
  • If one cannot reject H1,0, GP shows no evidence of efficiency...
Pretests at work
• Methodology: draw conclusions from the pretests using our own programs, and compare them with the results reported in the literature [ChKuHo06] on the same time series
Setup: statistics, data, trading scheme
• Hypothesis testing with Student's t-test at a 95% confidence level (see the sketch below)
• Pretests with samples made of 50 GP runs, 50 ERS runs and 100 LT runs
• Data: indexes of 3 stock exchanges – Canada, Taiwan and Japan
• Daily trading with short selling
• Training on 3 years – validation on 2 years
• Out-of-sample periods: 1999-2000, 2001-2002, 2003-2004
• Data normalized with a 250-day moving average
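A sketch of how such a test could be run, under the assumption of a one-sided two-sample (Welch) t-test on per-run net returns; the arrays below are synthetic placeholders, not the paper's data:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
gp_returns = rng.normal(0.02, 0.10, size=50)    # placeholder: 50 GP runs
lt_returns = rng.normal(0.00, 0.10, size=100)   # placeholder: 100 LT runs

# H4,0: GP does not outperform LT (one-sided test, 95% level)
t_stat, p_value = stats.ttest_ind(gp_returns, lt_returns,
                                  equal_var=False, alternative="greater")
print("reject H4,0" if p_value < 0.05 else "cannot reject H4,0")
```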
Results on actual data (1/2)
• Evidence that there is something to learn: 4 markets out of 9 (C3, J2, T1, T3)
  • Experiments in [ChKuHo06], with another GP implementation, show that GP performs very well on these 4 markets
• Evidence that there is nothing to learn: 3 markets (C1, J3, T2)
  • In [ChKuHo06], there is only one of these (C1) where GP has a positive return (but less than buy-and-hold)
Results on actual data (2/2)
• GP effective: 3 markets out of 6
  • In these 3 markets, GP outperforms buy-and-hold – the same outcome as in [ChKuHo06]
• Preliminary conclusion: one can rely on pretests...
  • When there is nothing to learn, no GP implementation did well (except in one case)
  • When there is something to learn, at least one implementation did well (always)
  • When our GP is effective, the GP in [ChKuHo06] is effective too (always)
Further conclusions
• Our GP implementation is more efficient than random search: there is no case where ERS outperforms LT while GP does not
• But it is only slightly more efficient... one would expect many more cases where GP does better than LT while ERS does not
• Our GP is actually able to take advantage of regularities in the data... but only of "simple" ones
• Ongoing work: study the correlation between predictability measures and GP performance
References
• [ChKuHo06] S.-H. Chen, T.-W. Kuo and K.-M. Hoi, "Genetic Programming and Financial Trading: How Much about What We Know", Handbook of Financial Engineering, Kluwer, 2006.
• [Kab00] M. Kaboudan, "Genetic Programming Prediction of Stock Prices", Computational Economics, vol. 16, 2000.
Conclusions are most often inconclusive regarding the efficiency of GP
• "The annual return with the GP-induced technical trading rules is x%"
  • If negative: is the market efficient, or is GP ineffective?
  • If positive: mere luck, or is GP effective?
  • Good or bad with respect to other search techniques?
  • Is it worth further improving/optimizing GP?
• Pretests provide some evidence of whether:
  • there is something to be learned from the data
  • GP is effective at this task
Equivalent search intensity
• Starting point: two search algorithms have similar search intensity if they create the same number of solutions over the course of their execution
• Problem: the same solutions tend to be rediscovered over time and are not re-evaluated – the rate of discovery strongly depends on the search technique / implementation
• Refined definition: similar search intensity means the same number of "truly" different solutions – here, "truly" means syntactically different (see the sketch below)
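A sketch of this bookkeeping, reusing the `Node` trees from the earlier sketch and assuming that a canonical prefix-notation string is a fair proxy for syntactic identity:

```python
def serialize(node):
    """Canonical prefix-notation string of a program tree."""
    if not node.children:
        return node.op
    return "(" + node.op + " " + " ".join(serialize(c) for c in node.children) + ")"

def count_distinct(programs):
    """Number of syntactically different programs visited by a run."""
    return len({serialize(p) for p in programs})
```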
Other zero-intelligence (but less meaningful) strategies one can think of
• Pretest 2: GP versus equivalent random search without a validation stage
  • May give some insight into the effectiveness of validation in fighting overfitting... but there is little overfitting with random search, so its usefulness is dubious
• Pretest 3: GP versus equivalent random search without training and validation: random trees applied directly out-of-sample
  • A bias in the randomness is induced by the GP language...
Pretest 1: GP vs zero-intelligence strategies – implementation
• Execute multiple GP runs – record the average number of syntactically different individuals
• Random search is implemented with the initial population of GP – the population size is adjusted to obtain "equivalent search intensity"
• If H1,0 cannot be rejected, the interpretation is: there is nothing to learn, or GP is not effective
• If H1,0 is rejected (favor H1,1), the interpretation is: there may be something to learn, and GP may be effective...