Huynh Gia Huy (G1000176L) Le Hoang Thai (G1000293G)

FE8827 Quantitative Trading Strategies project: High Frequency Trading Using Regime Switching Strategy

Reference • “Developing High-Frequency Equities Trading Models”, Leandro Rafael Infantino, Savion Itzhaki, MBA thesis, MIT, June 2010.

Contents • Simple Guide • Model Building: motivation, trading ideas, potential problems and implementation. • Simulation Results: without/with transaction costs, development from the thesis. • Weaknesses of Trading Signals in the Thesis and Proposed Improvements: mean reverting signals and regime switching signals.

Simple Guide • To display performance statistics, run the rsperf.m file: rsperf(index, rf, lossthreshold) • Input: index selects the strategy (1 = regime switching with transaction costs; 2 = regime switching without transaction costs; 3 = mean reverting with transaction costs; 4 = mean reverting without transaction costs); rf is the risk-free rate; lossthreshold is the Omega loss threshold. • Output: Omega ratio, Sharpe ratio, Omega Sharpe ratio, MaxDD (maximum drawdown) and MaxDDD (maximum drawdown duration).
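The statistics listed above can be illustrated with a short Python sketch. This is not the project's rsperf.m code: the function names are hypothetical, and the definitions are assumptions (Omega as the ratio of summed gains to summed losses around the loss threshold, an unannualized per-period Sharpe ratio, and drawdown measured on a compounded equity curve). The Omega Sharpe ratio is omitted because the slides do not define it.

```python
import math


def omega_ratio(returns, loss_threshold=0.0):
    """Assumed definition: sum of gains above the threshold / sum of losses below it."""
    gains = sum(max(r - loss_threshold, 0.0) for r in returns)
    losses = sum(max(loss_threshold - r, 0.0) for r in returns)
    return math.inf if losses == 0 else gains / losses


def sharpe_ratio(returns, rf=0.0):
    """Assumed definition: mean excess return / sample standard deviation (not annualized)."""
    excess = [r - rf for r in returns]
    mean = sum(excess) / len(excess)
    variance = sum((x - mean) ** 2 for x in excess) / (len(excess) - 1)
    return mean / math.sqrt(variance)


def max_drawdown(returns):
    """Return (MaxDD, MaxDDD) on a compounded equity curve.

    MaxDD is the largest fractional drop from a running peak;
    MaxDDD is the longest run of periods spent below a previous peak.
    """
    equity = peak = 1.0
    max_dd, max_ddd, periods_below_peak = 0.0, 0, 0
    for r in returns:
        equity *= 1.0 + r
        if equity >= peak:
            peak = equity
            periods_below_peak = 0
        else:
            periods_below_peak += 1
            max_dd = max(max_dd, 1.0 - equity / peak)
            max_ddd = max(max_ddd, periods_below_peak)
    return max_dd, max_ddd


daily_returns = [0.10, -0.05, -0.05, 0.20]  # made-up numbers, for illustration only
print(omega_ratio(daily_returns), sharpe_ratio(daily_returns), max_drawdown(daily_returns))
```

With daily returns as input, the drawdown duration is counted in days, which matches the units reported in the result slides.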

Simple Guide • To start the trading strategies, run the regswitch.m file. • In this file, turn regime switching on or off by setting the regimeSwitching variable to 1 (on) or 0 (off): regimeSwitching = 1; % default value • Change line 32 in this file to run the simulation for a different period. The default is the whole year, from week 1 to week 52: for week=1:52

Model building - Motivation • In a high-frequency environment, high precision in predicting stock returns is not required. • Fundamental Law of Active Management: IR = IC × sqrt(Breadth), where IR is the Information Ratio, IC is the Information Coefficient (our skill) and Breadth is “the number of independent forecasts of exceptional return made per year”. • When Breadth is very large, IC can be relatively small for the same IR, so predictions can be less precise.
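As a small worked example of this trade-off, the Python sketch below evaluates the Fundamental Law, IR = IC × sqrt(Breadth), for a few breadth values. The function names and the numbers are illustrative assumptions, not figures from the project or the thesis.

```python
import math


def information_ratio(ic, breadth):
    """Fundamental Law of Active Management: IR = IC * sqrt(Breadth)."""
    return ic * math.sqrt(breadth)


def required_ic(target_ir, breadth):
    """Skill (IC) needed to reach a target IR at a given breadth."""
    return target_ir / math.sqrt(breadth)


# Illustrative numbers only: the IC needed for the same target IR shrinks
# as the number of independent forecasts per year grows.
for breadth in (100, 10_000, 1_000_000):
    print(breadth, required_ic(1.0, breadth))
```

Each hundredfold increase in breadth lowers the required IC by a factor of ten, which is the sense in which a high-frequency strategy can tolerate less precise forecasts.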

Model building - Trading ideas • Use Principal Component Analysis (PCA) as the basis for computing model cumulative returns. • Depending on the current strategy (mean-reversion or momentum), a trading signal is generated when the observed cumulative return differs from the model cumulative return by more than a threshold. • Predicted – observed > threshold => buy under mean-reversion (sell under momentum) • Predicted – observed < -threshold => sell under mean-reversion (buy under momentum)
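The threshold rule above can be sketched in Python as follows. The function name, the regime labels and the +1/-1/0 encoding are hypothetical. The sketch also assumes that the first action in each pair on the slide (buy, then sell) belongs to mean-reversion and the parenthesized one to momentum.

```python
def trading_signal(predicted, observed, threshold, regime="mean_reversion"):
    """Return +1 (buy), -1 (sell) or 0 (no trade) for one stock.

    predicted: model cumulative return (from the PCA-based model)
    observed:  observed cumulative return
    regime:    "mean_reversion" or "momentum" (hypothetical labels)
    """
    gap = predicted - observed
    if gap > threshold:
        direction = 1       # observed return lags the model
    elif gap < -threshold:
        direction = -1      # observed return is ahead of the model
    else:
        return 0            # inside the threshold band: no trade
    # Momentum takes the opposite side of the mean-reversion trade.
    return direction if regime == "mean_reversion" else -direction


print(trading_signal(0.0020, 0.0005, 0.001))              # buy under mean-reversion
print(trading_signal(0.0020, 0.0005, 0.001, "momentum"))  # sell under momentum
```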

Differences between this project and the thesis • Number of stocks used in the simulation: only the first 10 stocks (instead of 50) are chosen to represent the stock universe. • Data source: the primary data used in this project is obtained from the Thomson Reuters Tick History database, and from two major exchanges, NASDAQ and NYSE, whereas the prices used in the thesis are top-of-book bid-ask quotes. • Model parameters: several model parameters are not specified in the thesis, so simulation results can be affected by the parameter values chosen. • Transaction costs are included. • The signal in the thesis was found insufficient and has been modified to improve the stability and performance of returns.

Model building - Potential problems - data volume • Huge volume of data: more than 15 GB of tick data for one year to process (and that is for only 10 stocks). • Simulation time can therefore be very long. • Code optimization is important!

Implementation • Matlab is chosen because it provides many built-in mathematical functions suitable for rapid model development. • Pre-process the data before running the simulation: keep only one mid-price per second per stock; duplicate the previous value for seconds missing from the raw data; save the processed data in separate .csv files. • Run the simulation on the processed data. • All daily returns are saved in .csv files.
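The pre-processing step can be illustrated with a minimal Python sketch (the project itself does this in Matlab). It assumes ticks arrive as time-ordered (second, mid_price) pairs and that the last tick within a second is the one kept. Both are assumptions, and the function name is hypothetical.

```python
def to_one_price_per_second(ticks, start, end):
    """Build a gap-free series with exactly one mid-price per second.

    ticks: time-ordered list of (second, mid_price); several ticks may share a second.
    Returns a list of (second, mid_price) for every second in [start, end],
    duplicating the previous price for seconds with no tick. Seconds before
    the first tick are skipped because there is nothing to carry forward.
    """
    last_in_second = {}
    for second, price in ticks:
        last_in_second[second] = price  # later ticks overwrite earlier ones

    series = []
    last_price = None
    for second in range(start, end + 1):
        if second in last_in_second:
            last_price = last_in_second[second]
        if last_price is not None:
            series.append((second, last_price))
    return series


raw = [(0, 10.25), (0, 10.5), (3, 11.25)]  # made-up ticks
print(to_one_price_per_second(raw, 0, 4))
# [(0, 10.5), (1, 10.5), (2, 10.5), (3, 11.25), (4, 11.25)]
```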

Some code optimization techniques • Always pre-allocate memory for matrices and avoid resizing them repeatedly. • Avoid loops as much as possible and use vectorization (performance increased dramatically!). • Use the Matlab profiler to identify areas for improvement.

Simulation results - thesis • Mean-reversion strategy, without transaction cost.

Simulation results – this project • Mean-reversion strategy, without transaction cost.

Simulation results – this project • Mean-reversion strategy, without transaction cost. • Omega: 2.6741 (Loss threshold: 0) • Sharpe: 0.3507 • Omega Sharpe: 0.0022 • Max Drawdown: 0.1566 • Max Drawdown duration: 73 days.

Simulation results - thesis • Regime switching strategy, without transaction cost.

Simulation results – this project • Regime switching strategy, without transaction cost.

Simulation results – this project • Regime switching strategy, without transaction cost. • Omega: 0.7381 (Loss threshold: 0) • Sharpe: -0.0903 • Omega Sharpe: -0.0014 • Max Drawdown: 0.5569 • Max Drawdown duration: 226 days.

Comments • The differences between the simulation results of this project and those of the thesis can be due to: • Different stock universe • Different data source (therefore different mid-prices) • Parameters used (thresholds). • The thesis does not take transaction costs into account, although they are an important factor in a high-frequency trading model. • All profits can be erased by transaction costs. • Next step for this project: include transaction costs! • To be realistic, the Interactive Brokers transaction cost is used, that is, $0.005 / share / trade.

Transaction costs: Modifications to existing strategy • To factor in transaction costs, the trading model is modified. Two possible approaches: • Mark up the threshold by the transaction cost, i.e. new threshold = threshold + transaction cost • Lower the log returns and use the new log returns as input for Principal Component Analysis: log_return = log[(current_price - cost) / (previous_price + cost)] • This project uses the second approach.
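The second approach can be sketched in Python as below. The function name is hypothetical, and the default cost of $0.005 per share is the Interactive Brokers figure quoted in the Comments slide. The formula assumes the cost is paid on both legs: it raises the entry price and lowers the exit price.

```python
import math


def cost_adjusted_log_return(current_price, previous_price, cost=0.005):
    """log[(current_price - cost) / (previous_price + cost)], prices and cost in dollars per share."""
    return math.log((current_price - cost) / (previous_price + cost))


# Made-up prices, for illustration only.
raw = math.log(20.05 / 20.00)
adjusted = cost_adjusted_log_return(20.05, 20.00)
print(raw, adjusted)  # the adjusted return is always the smaller of the two
```

Because the adjustment is a fixed dollar amount, it weighs more heavily on low-priced stocks and on the very small price moves typical of one-second intervals.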

Simulation result – modified model • Mean-reversion strategy, with transaction cost.

Simulation results – this project • Mean-reversion strategy, with transaction cost. • Omega: 1.0131 (Loss threshold: 0) • Sharpe: 0.0192 • Omega Sharpe: 0.0002315 • Max Drawdown: 0.2373 • Max Drawdown duration: 68 days.

Simulation result – modified model • Modified regime switching strategy, with transaction cost.

Simulation results – this project • Regime switching strategy, with transaction cost. • Omega: 0.7805 (Loss threshold: 0) • Sharpe: -0.0708 • Omega Sharpe: -0.0010 • Max Drawdown: 0.4941 • Max Drawdown duration: 248 days.

Weaknesses of the trading signals in the thesis – Proposed Improvements • Mean reverting signal: • In the thesis, we buy when the mean reverting signal > 0 and sell when the signal < 0. Such a signal includes noise due to rounding or computational issues. • Solution: set a threshold to filter out the noise that creates false trades. • After trial and error, this threshold was set to 0.0001 (parameter 1).

Weaknesses of the trading signals in the thesis - Proposed Improvements • Regime Switching Signals: • The authors use the difference between two consecutive Euclidean distances to signal regime switching: • If EH(t) - EH(t-1) > 0: momentum regime • If EH(t) - EH(t-1) <= 0: mean reverting regime • These signals are noisy and produce false signals and, consequently, false trades.

Weaknesses of the trading signals in the thesis - Proposed Improvements • Regime Switching Signals – Noise • The diagram of the Euclidean distance difference (EH(t) - EH(t-1)) is shown above. • Under the authors' rule, the strategy keeps changing regimes as the signal swings around 0 from positive to negative.

Weaknesses of the trading signals in the thesis - Proposed Improvements • Regime Switching Signals – False Signals • The diagram of the Euclidean distance difference (EH(t) - EH(t-1)) is shown above and the Euclidean distance (EH(t)) below. • One strong pulse in the Euclidean distance should create one regime switching signal, but it produces one up pulse and one down pulse in the Euclidean distance difference. Under the authors' rule, this creates two signals that cause the system to switch back and forth (mean reverting -> momentum -> mean reverting) within less than 150 seconds.

Weaknesses of the trading signals in the thesis - Proposed Improvements • Regime Switching Signals – Noise – Solution • We set a threshold of 2 standard deviations to filter out noise (parameter 2). A magnitude that falls within ± 2 standard deviations is considered insignificant. • Instead of using the difference between 2 consecutive Euclidean distances (EH(t) - EH(t-1)), we use EH(t) – ewma5(EH(t)), where ewma5(EH(t)) is the equally weighted moving average of the Euclidean distance over the previous 5 seconds (parameter 3). • To remove false signals, ignore 2 consecutive regime switching signals that fall within a timespan of less than 250 seconds (parameter 4).
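The three filters above can be combined in a short Python sketch. It is an illustration under several assumptions, not the project's Matlab implementation. The standard deviation sigma is passed in as a known estimate. A significant positive deviation is read as momentum and a significant negative one as mean reverting, following the sign convention of the original rule. A switch is suppressed when it arrives less than 250 seconds after the last accepted switch. All names are hypothetical.

```python
from collections import deque

MOMENTUM, MEAN_REVERTING = "momentum", "mean_reverting"


def detect_regimes(eh, sigma, window=5, n_sigma=2.0, min_gap=250, start=MEAN_REVERTING):
    """Return the regime in force at each second of the Euclidean distance series eh.

    sigma:   assumed standard deviation of the deviation signal (parameter 2 scales it)
    window:  seconds in the equally weighted moving average (parameter 3)
    min_gap: minimum seconds between two accepted switches (parameter 4)
    """
    recent = deque(maxlen=window)  # the previous `window` values of eh
    regime, last_switch = start, None
    regimes = []
    for t, value in enumerate(eh):
        if len(recent) == window:
            deviation = value - sum(recent) / window
            if deviation > n_sigma * sigma:
                candidate = MOMENTUM
            elif deviation < -n_sigma * sigma:
                candidate = MEAN_REVERTING
            else:
                candidate = regime  # insignificant move: keep the current regime
            far_enough = last_switch is None or t - last_switch >= min_gap
            if candidate != regime and far_enough:
                regime, last_switch = candidate, t
        recent.append(value)
        regimes.append(regime)
    return regimes


# One strong pulse in the distance: a single switch instead of a back-and-forth pair.
series = [1.0] * 10 + [5.0] + [1.0] * 300
print(detect_regimes(series, sigma=0.3)[9:13])
```

Setting min_gap=0 in the same call reproduces the back-and-forth switching described in the previous slide, which makes the effect of parameter 4 easy to inspect in isolation.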

Conclusions • Simulation returns are very sensitive to the parameters used. • All 4 parameters can be improved by employing optimization. • Due to time constraints, we leave this part for future development.
