This project involves building classifiers to predict whether a stock will go up, down, or remain flat based on news articles. The study experimented with Naïve Bayes and MaxEnt classifiers using news articles from WSJ and Reuters from 1994-1996 and daily quotes from Yahoo Finance. The trading strategy was based on signals from news articles and compared to a baseline buy-and-hold strategy, showing improved returns with MaxEnt. The analysis prioritized signal precision over recall and suggested future work on building a generic model before stock-specific training and predicting real-value returns.
Stock Price Prediction with News Articles
Qicheng Ma
CS224n Final Project
What is built
• Build classifiers to predict {News Articles} => {Stock Up/Down/Flat on one day}
• In practice, the classifier maps {paragraph mentioning Stock} => {+/-/0}
• Tried Naïve Bayes and MaxEnt classifiers
• Data: WSJ/ReuterFF news 1994-1996, daily quotes from Y!Finance
• Trading based on Up/Down signals: fixed-value daily long/short, weighted by Up/Down signal
• More paragraphs with +/- signal for one stock => trade that stock more
• Distribute a fixed investment amount for one day proportional to |sum(signals)|
• Compared to a baseline buy-and-hold strategy: 5% vs 2% monthly
• Based on last year’s project (Timmons and Lee) and other research papers
• Some subtle differences; tried different things
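The allocation rule above can be sketched in a few lines. This is a minimal illustration, not the project's actual code: the function name `allocate`, the `budget` parameter, and the choice to net the +/- signals per stock (long if the net is positive, short if negative) are assumptions made for the example.

```python
from collections import defaultdict

def allocate(signals, budget=1000.0):
    """Split one day's fixed budget across stocks.

    signals: list of (ticker, signal) pairs, one per classified paragraph,
             with signal in {+1, -1, 0}.
    Returns {ticker: signed amount}; positive = long, negative = short.
    Assumption: signals for the same stock are summed before weighting.
    """
    totals = defaultdict(int)
    for ticker, signal in signals:
        totals[ticker] += signal

    # Total weight is the sum of |sum(signals)| over all stocks.
    gross = sum(abs(s) for s in totals.values())
    if gross == 0:
        return {}  # no usable signal today, so no trade

    return {t: budget * s / gross for t, s in totals.items() if s != 0}

# Two "+" paragraphs for IBM, one "-" for GE, one flat for KO.
positions = allocate([("IBM", 1), ("IBM", 1), ("GE", -1), ("KO", 0)], budget=900.0)
```

Here IBM receives twice the weight of GE because it was mentioned in two signalled paragraphs, and KO is skipped because its net signal is zero.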
Results
• Train on 29 months of data, test on 1 month. Repeat.
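One way to read "train on 29 months, test on 1 month, repeat" is as a sliding window over the monthly data. The sketch below assumes that reading; the slides do not say how the window moves or how many repetitions were run, and the helper name `rolling_splits` and the month labels are hypothetical.

```python
def rolling_splits(months, train_len=29, test_len=1):
    """Yield (train_months, test_months) pairs.

    Assumption: the window slides forward by test_len each time, so the
    test month always comes right after its training months.
    """
    last_start = len(months) - train_len - test_len
    for start in range(0, last_start + 1, test_len):
        train = months[start:start + train_len]
        test = months[start + train_len:start + train_len + test_len]
        yield train, test

# Hypothetical month labels covering 1994-1996 (36 months).
months = [f"{year}-{month:02d}" for year in (1994, 1995, 1996) for month in range(1, 13)]
splits = list(rolling_splits(months))
```

Under these assumptions, 36 months of data give seven train/test repetitions, each tested on a month the classifier has not seen.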
Results – monthly returns
• NB no better than baseline (same return, higher volatility)
• MaxEnt 5.2% vs baseline 2.6%, statistically significant at the 95% confidence level
• Precision is more important than recall => want most signals to be profitable; discovering every profitable opportunity matters less, given the risk of false positives
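To make the precision/recall trade-off concrete, the sketch below scores a list of Up/Down/Flat predictions against actual moves. It is an illustration under stated assumptions rather than the project's evaluation code: a "signal" is taken to be any non-flat prediction, a signal counts as correct only if its direction matches, and the function name and sample labels are made up.

```python
def signal_precision_recall(predicted, actual):
    """Labels are +1 (up), -1 (down), 0 (flat).

    precision = correct signals / signals fired
    recall    = correct signals / days that actually moved
    """
    fired = [(p, a) for p, a in zip(predicted, actual) if p != 0]
    correct = sum(1 for p, a in fired if p == a)
    opportunities = sum(1 for a in actual if a != 0)

    precision = correct / len(fired) if fired else 0.0
    recall = correct / opportunities if opportunities else 0.0
    return precision, recall

# Made-up labels: three signals fired, two of them in the right direction.
predicted = [1, 0, -1, 0, 1]
actual = [1, 1, -1, -1, -1]
precision, recall = signal_precision_recall(predicted, actual)
```

A cautious classifier that stays flat when unsure lowers recall but keeps precision high, which matches the stated preference: every fired signal triggers a trade, so a wrong signal costs money while a missed one does not.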
Future Work
• Build a generic model first (e.g. “STOCK earnings exceed expectations”), then do stock-specific training on features
• Predict real-valued returns instead of discrete classes
• Look at excess return (stock minus market); adjust for beta, inflation, etc.
• Use more frequent data: real-time stock quotes, news article streams (search engine quotes and feeds)
• Profit!
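As a rough illustration of the excess-return idea, the sketch below computes stock-minus-market returns, optionally scaling the market return by an estimated beta. The beta estimate here is the ordinary least-squares slope of stock returns on market returns, which is an assumption of this example (the slides do not specify a method), and the risk-free rate and inflation are ignored. All names and numbers are hypothetical.

```python
def estimate_beta(stock, market):
    """Slope of stock returns on market returns: cov(stock, market) / var(market)."""
    n = len(stock)
    mean_s = sum(stock) / n
    mean_m = sum(market) / n
    cov = sum((s - mean_s) * (m - mean_m) for s, m in zip(stock, market))
    var = sum((m - mean_m) ** 2 for m in market)
    return cov / var

def excess_returns(stock, market, beta_adjust=False):
    """Return stock - market, or stock - beta * market when beta_adjust is True."""
    b = estimate_beta(stock, market) if beta_adjust else 1.0
    return [s - b * m for s, m in zip(stock, market)]

# Made-up daily returns where the stock moves exactly twice as much as the market.
market = [0.01, -0.005, 0.015]
stock = [0.02, -0.01, 0.03]
simple = excess_returns(stock, market)
adjusted = excess_returns(stock, market, beta_adjust=True)
```

In this toy case the simple excess return looks positive on up days, but the beta-adjusted series is essentially zero: the stock is just a leveraged copy of the market, so there is nothing left for news signals to explain.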