Pattern Discovery of Fuzzy Time Series for Financial Prediction • IEEE Transactions on Knowledge and Data Engineering • Presented by Hong Yancheng for COMP630P, Spring 2009
Outline • Introduction and target problem • Background knowledge and related work • Modeling the candlestick pattern • Candlestick pattern for financial prediction • Experiments and applications • Conclusion and Discussion
Problems with existing stock prediction tools • Many tools exist for predicting stock prices • Artificial neural networks, SVM, neuro-fuzzy systems, naïve Bayes, and so on • Three major problems with these tools • Training is nontrivial, and a trained model cannot be reused for other targets • Prediction results are incomprehensible • Parameters are hard for users to tune • A gap exists between the prediction result and the investment decision • Improving prediction accuracy vs. making buy/sell decisions
Target problem • Data preprocessing is needed before applying various techniques • Data mining, machine learning & pattern recognition • A good knowledge representation method can assist investors • Knowledge-based method to transform financial data into comprehensible rules and visual patterns
Japanese Candlestick Theory • Four general ways to represent stock price fluctuations • Original daily fluctuation • Single close price • Bar chart • Candlestick chart (conveys more visual information)
Fuzzy Time Series • Let U = {x_1, x_2, …, x_n} be the universe of discourse. A fuzzy set A_i of U is defined by A_i = µ_Ai(x_1)/x_1 + µ_Ai(x_2)/x_2 + … + µ_Ai(x_n)/x_n, where µ_Ai: U -> [0, 1] is the membership function of A_i and µ_Ai(x_k) is the degree of membership of x_k in A_i • In this notation "/" pairs each element with its membership degree and "+" denotes union; neither is arithmetic
Fuzzy candlestick pattern • A fuzzy candlestick pattern is composed of related fuzzy candlestick lines within a period • A fuzzy candlestick line has seven parts • Sequence, open style, close style, upper shadow, body, body color, and lower shadow • Sequence defines the position of the candlestick line within the pattern • Open and close styles model the relationship between consecutive candlestick lines
Candlestick line modeling • Modeling the length of the shadows and the body • Four linguistic variables EQUAL, SHORT, MIDDLE and LONG indicate the fuzzy sets of length • L_upper = (high - MAX(open, close)) / open * 100 • L_lower = (MIN(open, close) - low) / open * 100 • L_body = (MAX(open, close) - MIN(open, close)) / open * 100
Candlestick line modeling • The membership functions of the four fuzzy sets are given in a figure on the original slide (not reproduced here) • The range is set to (0, 14) because of the daily price limit on the Taiwan stock market
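The three length formulas above can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function name `candlestick_lengths` and the sample prices are assumptions, not part of the presented method.

```python
def candlestick_lengths(open_, high, low, close):
    """Return (L_upper, L_body, L_lower) as percentages of the open price."""
    top = max(open_, close)      # upper edge of the body
    bottom = min(open_, close)   # lower edge of the body
    l_upper = (high - top) / open_ * 100
    l_body = (top - bottom) / open_ * 100
    l_lower = (bottom - low) / open_ * 100
    return l_upper, l_body, l_lower


# Hypothetical daily prices, chosen only to exercise the formulas.
upper, body, lower = candlestick_lengths(open_=100.0, high=104.0, low=97.0, close=102.0)
print(upper, body, lower)
```

Each length would then be mapped to EQUAL, SHORT, MIDDLE or LONG through the membership functions, which are not recoverable from the slide text.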
Candlestick line modeling • The body color is defined by three terms BLACK, WHITE and CROSS • If open–close > 0 then body color is BLACK • If open–close < 0 then body color is WHITE • If open–close = 0 then body color is CROSS
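The body color rule is a crisp three-way comparison, so it can be written directly. A minimal sketch; the function name `body_color` is an assumption made for illustration.

```python
def body_color(open_, close):
    """Classify the candlestick body from the sign of open - close."""
    if open_ > close:
        return "BLACK"   # price fell during the period
    if open_ < close:
        return "WHITE"   # price rose during the period
    return "CROSS"       # open equals close


print(body_color(100.0, 98.5))
```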
Candlestick line modeling • The open/close style is another important feature • Five linguistic variables LOW, EQUAL_LOW, EQUAL, EQUAL_HIGH, HIGH indicate fuzzy sets of open/close style
Trend modeling • Two linguistic variables are used to model the trends before and after the candlestick pattern • The previous trend is represented by a weekly candlestick line • Six fuzzy sets are used to define the trend • CROSS, EQUAL, WEAK, NORMAL, STRONG, and EXTREME • BEARISH and BULLISH define the body color
Trend modeling • The following trend is derived from the variation of the close price: (Close_t+n - Close_t) / Close_t * 100 • Close_t+n and Close_t are the close prices on day t+n and day t, respectively • n is a user-defined parameter
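As a worked illustration of the following-trend formula, the sketch below computes the percentage change of the close price over n days. The function name and the price list are hypothetical.

```python
def following_trend(closes, t, n):
    """Percentage change of the close price from day t to day t + n."""
    return (closes[t + n] - closes[t]) / closes[t] * 100


# Hypothetical close prices for five consecutive trading days.
closes = [100.0, 101.0, 103.0, 102.0, 105.0]
print(following_trend(closes, t=0, n=4))
```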
Three major pattern recognition problems • Sensing problem • Measured values are open, close, high, low • Feature extraction problem • Fuzzy candlestick patterns • Pattern classification problem • Can be determined by user
Forecast procedure • Step 1 • Calculate the variation percentage between two close prices • Use the minimum increase I_min and the maximum increase I_max to define the universe of discourse • UoD = [I_min - D_1, I_max + D_2], where D_1 and D_2 are suitable positive numbers • E.g. I_min = -5.83, I_max = 7.66, then UoD = [-6, 8] • Step 2 • Partition the UoD into several intervals • E.g. partition [-6, 8] into seven intervals [-6, -4], [-4, -2], …, [6, 8]
Forecast procedure • Step 3 • Define fuzzy sets on the UoD associated with the intervals from step 2 • Step 4 • Fuzzify the values calculated in step 1 • If v ∈ u_x and A_y is the fuzzy set whose maximum membership occurs at u_x, then v is translated to A_y
Forecast procedure • Step 5 • Calculate all the candlestick patterns • Step 6 • Refine the extracted patterns and identify the important attributes • Step 7 • Select patterns for forecasting based on the probability P(A_x | P_y) • The statistic T = Count(P_y ∩ A_x) / Count(P_y) is compared against a threshold to select the patterns
Forecast procedure • Step 8 • Forecast the following trend • Rule 1: if the test pattern is not found, set the variation v to 0 • Rule 2: if the test pattern is found, set the variation v to the arithmetic average of the midpoints of the matched patterns • Forecast = close + close * v • Step 9 • Evaluate the forecast • MSE = Σ_i (Forecast_i - Actual_i)^2 / N
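Steps 1 and 2 can be sketched as follows, reproducing the example from the slide. Rounding the bounds outward to the nearest integers is an assumption made here to stand in for the choice of D1 and D2; the slide does not say how they are picked, and both function names are hypothetical.

```python
import math


def universe_of_discourse(variations):
    """Round the observed min/max outward to integers (assumed choice of D1, D2)."""
    return math.floor(min(variations)), math.ceil(max(variations))


def partition(lower, upper, width):
    """Split [lower, upper] into consecutive intervals of equal width."""
    count = round((upper - lower) / width)
    return [(lower + i * width, lower + (i + 1) * width) for i in range(count)]


# Only the extremes -5.83 and 7.66 come from the slide; 0.4 is a filler value.
uod = universe_of_discourse([-5.83, 0.4, 7.66])
intervals = partition(*uod, width=2)
print(uod, intervals)
```

With a width of 2 this yields the seven intervals listed in the example, from [-6, -4] to [6, 8].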
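Step 4 can be illustrated with a small lookup. The sketch assumes one fuzzy set A_y per interval u_y, with A_y reaching its maximum membership on u_y, so fuzzifying a value reduces to finding its interval. Assigning a shared boundary to the upper interval is also an assumption, and `fuzzify` is a hypothetical name.

```python
def fuzzify(v, intervals):
    """Return the 1-based index y of the fuzzy set A_y whose interval contains v."""
    for index, (lower, upper) in enumerate(intervals, start=1):
        is_last = index == len(intervals)
        # Half-open intervals, except that the last one includes its upper bound.
        if lower <= v < upper or (is_last and v == upper):
            return index
    raise ValueError(f"{v} lies outside the universe of discourse")


# The seven intervals of the [-6, 8] example.
example_intervals = [(-6, -4), (-4, -2), (-2, 0), (0, 2), (2, 4), (4, 6), (6, 8)]
print(fuzzify(-5.83, example_intervals), fuzzify(7.66, example_intervals))
```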
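The statistic in Step 7 is a simple frequency ratio. The sketch below assumes the training data has already been reduced to (pattern, following trend) pairs; that data layout, the labels, and the function name are illustrative assumptions.

```python
def pattern_confidence(observations, pattern, trend):
    """T = Count(P_y and A_x) / Count(P_y), an estimate of P(A_x | P_y)."""
    trends_after_pattern = [t for p, t in observations if p == pattern]
    if not trends_after_pattern:
        return 0.0  # pattern never observed, so there is no evidence for it
    hits = sum(1 for t in trends_after_pattern if t == trend)
    return hits / len(trends_after_pattern)


# Hypothetical training observations: (pattern id, fuzzified following trend).
observations = [("P1", "A5"), ("P1", "A5"), ("P1", "A3"), ("P2", "A5")]
print(pattern_confidence(observations, "P1", "A5"))
```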
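Steps 8 and 9 can be sketched as below. One assumption is made loudly: v is treated as a percentage, as in Step 1, so it is divided by 100 before being applied to the close price. The slide writes the formula without that factor, so the intended scaling is not confirmed. Function names and inputs are hypothetical.

```python
def forecast_close(close, matched_midpoints):
    """Apply Rule 1 (no match, v = 0) or Rule 2 (v = mean of interval midpoints)."""
    if not matched_midpoints:
        v = 0.0
    else:
        v = sum(matched_midpoints) / len(matched_midpoints)
    # Assumption: v is in percent, hence the division by 100.
    return close + close * v / 100


def mean_squared_error(forecasts, actuals):
    """MSE = sum of squared forecast errors divided by the number of forecasts."""
    return sum((f - a) ** 2 for f, a in zip(forecasts, actuals)) / len(forecasts)


print(forecast_close(100.0, [1.0, 3.0]))
print(mean_squared_error([102.0, 100.0], [101.0, 100.0]))
```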
Experiments and Applications • The experiments use the TAIEX index from 2004-01-02 to 2005-01-31 and stock 2330 (TSMC) from 1997-10-23 to 2002-12-25
Experiments and Applications • Experiment for TAIEX index
Experiments and Applications • Experiment results for TAIEX
Problems with existing stock prediction tools • Three major problems with these tools • Training is nontrivial, and a trained model cannot be reused for other targets • Prediction results are incomprehensible • Parameters are hard for users to tune • A gap exists between the prediction result and the investment decision • Improving prediction accuracy vs. making buy/sell decisions
Experiments and Applications • Experiment with 2330 (TSMC) • The goal is to find when to buy the stock • The rule is: IF T > 0.5 and the following trend is STRONG_INCREASE or EXTREME_INCREASE THEN select the pattern • The average 5-day return is 2.9%
Experiments and Applications • Fuzzy modifiers can be implemented to help users tune the parameters • ABOVE, BELOW, PLUS, VERY, EXTREMELY, MORE_OR_LESS, SOMEWHAT, and NOT • E.g. STRONG_BEARISH and EXTREME_BEARISH can be merged into ABOVE STRONG_BEARISH
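The buy rule above amounts to a filter over candidate patterns. A minimal sketch, assuming each candidate is a (name, T, following trend) tuple; the tuple layout, pattern names, and T values are made up for illustration.

```python
BUY_TRENDS = {"STRONG_INCREASE", "EXTREME_INCREASE"}


def select_buy_patterns(candidates, threshold=0.5):
    """Keep patterns with T above the threshold and a strongly rising following trend."""
    return [name for name, t, trend in candidates
            if t > threshold and trend in BUY_TRENDS]


# Hypothetical candidates: (pattern name, T statistic, following trend label).
candidates = [
    ("P1", 0.62, "STRONG_INCREASE"),
    ("P2", 0.80, "WEAK_INCREASE"),
    ("P3", 0.45, "EXTREME_INCREASE"),
    ("P4", 0.51, "EXTREME_INCREASE"),
]
print(select_buy_patterns(candidates))
```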
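For three of the listed modifiers, Zadeh's classical hedge definitions give a concrete form: VERY squares the membership degree, MORE_OR_LESS takes its square root, and NOT takes the complement. The sketch below uses those textbook definitions; whether the presented paper defines its modifiers the same way is not confirmed by the slide, and the other modifiers (ABOVE, BELOW, PLUS, EXTREMELY, SOMEWHAT) are left out because their definitions are not given here.

```python
import math


def very(mu):
    """Concentration: sharpens a fuzzy set by squaring the membership degree."""
    return mu ** 2


def more_or_less(mu):
    """Dilation: relaxes a fuzzy set by taking the square root of the degree."""
    return math.sqrt(mu)


def fuzzy_not(mu):
    """Complement of a membership degree."""
    return 1.0 - mu


# A hypothetical membership degree of 0.25 in some fuzzy set such as STRONG_BEARISH.
print(very(0.25), more_or_less(0.25), fuzzy_not(0.25))
```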
Conclusion and Discussion • Pros • Knowledge-based method to represent financial time series and to facilitate knowledge discovery • Comprehensible, computable, and visual • Can be used directly or as a data preprocessing step • Cons • Time complexity • Unclear how many candlestick lines a pattern should contain