70 likes | 236 Views
Stock Volatility Prediction using Earnings Calls Transcripts and their Summaries . Naveed Ahmad Aram Zinzalian. Setup – SVM Text Regression. Output : Future Log Return Volatility, where log returns = ln (P(t+1)/P(t)) Baseline:
E N D
Stock Volatility Prediction using Earnings Calls Transcripts and their Summaries Naveed Ahmad Aram Zinzalian
Setup – SVM Text Regression Output: Future Log Return Volatility, where log returns = ln(P(t+1)/P(t)) Baseline: Historical Volatility – i.e. volatility from previous quarter Predictors/Transcript + Summary Features: N-gram TF/IDF scores, POS tags, Target word frequency
Setup TF/IDF Scores: give weight to transcript specific words - need to ignore analyst names POS Tags: Is higher sentence adjective frequency indicative of risk/uncertainty? Target Words: Handpick words we give more weight to, e.g. risk, uncertain, variable, decline, etc. Summaries: Do we really need the whole text? Why not just extract most informative sentences
Experiments Vary: • Prediction time frame i.e. look-back and look-forward periods by 20, 40, and 60 days. • Train set size from 25 to 386 transcripts. • NLP Models between • Unigram with TF/IDF • Bigram and Unigram with TF/IDF • Unigram and Bigram with Sentence Summaries
Results ……………………………………………………….
Results • Bigram and Unigram Models Improve over Historical Baseline • POS Tags (JJ) and Target Words also slightly lower MSE • Regression on summarized text does not help but the generated summaries were informative
Future Work • More Data! • Different regression model (CART?) • Q/A Based Summarization • Feature Selection • Better heuristic for choosing bigrams • SO-PMI scores with target words