Wednesday 18 February 2009
The Languages of Emotion and Financial News
Ann Devitt, Khurshid Ahmad
Ann Devitt, Trinity College Dublin
Sentiment and the Markets
Specialised Language of Financial News
“Global chipmakers, battling slower technology demand, are betting size matters as they pin their hopes for future growth on small and easy to carry mobile devices such as netbooks and smartphones.” (Bloomberg.com, 18/2/09)
Sentiment and the Markets
Engle and Ng (1993): Asymmetry Curve
Outline
• Current psychological theory of emotion
• Evaluation of lexical “emotion” resources
• Corpus analysis of language of “emotion”
Cognitive Theory of Emotion: Categorical
• Ekman (1975)
Cognitive Theory of Emotion: Dimensions
• Osgood / Russell: Evaluation, Activity, Potency
• Mehrabian PAD: Pleasure, Activation, Dominance
Cognitive Theory of Emotion
• Watson and Tellegen (1985)
Outline
• Current psychological theory of emotion
• Evaluation of lexical “emotion” resources
• Corpus analysis of language of “emotion”
Lexical Resource Evaluation
[Diagram of the four lexica: SentiWordNet, Whissel, General Inquirer, WNA]
Lexical Resource Evaluation: SentiWordNet
  Word    PositiveVal   NegativeVal
  Happy   0.9           0.0
  Sad     0.0           0.9
• 39066 terms
• Evaluation dimension scale: 0 - 1
• Low average: Pos=0.18, Neg=0.23
• More extreme Neg values
• Error-prone: rude (pos 0.875), gladsome (neg 0.875)
Lexical Resource Evaluation: General Inquirer
  ECSTATIC    Pos   Pleasure
  SORROWFUL   Neg   Pain
• Hand-coded, content analysis basis
• 8641 terms
• 184 binary categories (including MAB dimensions)
• Negative > Positive
• Active > Passive
• Strong > Weak
Lexical Resource Evaluation: Whissel Dictionary of Affect
  Word         Eval     Activ    Imag
  great        2.6250   2.1250   1.0
  disastrous   1.4444   2.4000   2.0
• Corpus selection, hand-coded
• 8742 terms
• Dimensional representation: 1-3 scale
• Evaluation, Activation, Imagery
Lexical Resource Evaluation: WordNet Affect (WNA)
  Word         BinaryFeatures
  Loneliness   cognitive state, emotion
  Happiness    cognitive state, emotion
• 5432 terms
• Domains of emotional experience
• No Polarity
• Short-term: Mood, Manner
• Long-term: Attribute, Trait
Lexical Resource Evaluation: Lexical Overlap
• Are the lexica consistent?
• Are they mutually exclusive?
• Dice, Jaccard, Asymmetric coefficients
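The overlap coefficients named on this slide can be sketched on toy data. This is a minimal illustration, not the authors' implementation: the slides do not define the "asymmetric" coefficient, so the sketch assumes it is the share of one lexicon's terms that also occur in the other, and the two small word sets are hypothetical.

```python
def dice(a: set, b: set) -> float:
    """Dice coefficient: 2|A ∩ B| / (|A| + |B|)."""
    total = len(a) + len(b)
    return 2 * len(a & b) / total if total else 0.0


def jaccard(a: set, b: set) -> float:
    """Jaccard coefficient: |A ∩ B| / |A ∪ B|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def asymmetric(a: set, b: set) -> float:
    """Assumed definition: share of A's terms also found in B, |A ∩ B| / |A|."""
    return len(a & b) / len(a) if a else 0.0


# Hypothetical toy lexica, not taken from any of the four resources.
lexicon_a = {"happy", "sad", "great", "rude"}
lexicon_b = {"happy", "sad", "disastrous"}

print("Dice:      ", dice(lexicon_a, lexicon_b))
print("Jaccard:   ", jaccard(lexicon_a, lexicon_b))
print("A within B:", asymmetric(lexicon_a, lexicon_b))
print("B within A:", asymmetric(lexicon_b, lexicon_a))
```

The asymmetric measure is reported in both directions because a small lexicon can be almost fully contained in a large one while covering only a fraction of it.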
Lexical Resource Evaluation: Lexical Overlap
• Statistically significant agreement for polarity assignment (chi-square test)
• Very weak correlation for activation features
[Diagram labels: SentiWordNet, General Inquirer, Whissel, WNA]
Lexical Resource Evaluation: Lexical Overlap
1. Weak correlation of SentiWordNet (SWN) with Whissel evaluation
2. No correlation with Whissel activation dimension
3. SWN positive values negatively correlated with imageability
[Diagram labels: SentiWordNet, Whissel, General Inquirer, WNA]
Lexical Resource Evaluation: Lexical Overlap
• SWN tends to negative for short-term WNA features
• SWN tends to positive for long-term WNA features
[Diagram labels: SentiWordNet, WNA, Whissel, General Inquirer]
Lexical Resource Evaluation: Lexical Overlap
• WNA feature division:
  Short-term      Long-term
  Negative        Positive
  Physical        Cognitive
  More active     Less active
  Internal        External
  Less abstract   More concrete
Lexical Resource Evaluation: Some conclusions
• The lexica:
  • Are quite consistent
  • Can be used in combination
• SentiWordNet: Largely unexplored territory
Outline
• Current psychological theory of emotion
• Evaluation of lexical “emotion” resources
• Corpus analysis of language of “emotion”: General Language
Emotion in General Language: Corpus Study Aims
• Does “emotion” constitute a distinct sub-language?
• Is there a polarity bias in General Language? (the Pollyanna Hypothesis of Boucher and Osgood)
• What is the impact of using different lexica?
Corpus Analysis: The Data
• BNC (British National Corpus): 100 million words
• Balanced, broad corpus
Corpus Analysis: Methodology
Is emotion a distinct sub-language?
• Examine distribution type
• Examine distribution spread
• Bootstrap sampling distribution
Corpus Analysis: Distribution Type
• Zipfian: BNC = Emotion Lexica
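One common way to check whether a frequency list looks Zipfian is to fit a straight line to log frequency against log rank; a slope near -1 is the classic Zipf pattern. The slides do not say how the distribution type was assessed, so the function name `zipf_slope` and the synthetic frequencies below are assumptions for illustration only.

```python
import math


def zipf_slope(frequencies):
    """Least-squares slope of log(frequency) against log(rank)."""
    ranked = sorted((f for f in frequencies if f > 0), reverse=True)
    if len(ranked) < 2:
        raise ValueError("need at least two positive frequencies")
    xs = [math.log(rank) for rank in range(1, len(ranked) + 1)]
    ys = [math.log(freq) for freq in ranked]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Synthetic, perfectly Zipfian frequencies (frequency proportional to 1/rank).
synthetic = [1000.0 / rank for rank in range(1, 51)]
slope = zipf_slope(synthetic)
print(f"log-log slope: {slope:.3f}")
```

Running the same fit on the full corpus word list and on the subset of words found in an emotion lexicon gives two slopes that can be compared directly.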
Corpus Analysis: Distribution Shape
• Comparison of means: Student's t-test, BNC ≠ Emotion Lexica (p < 0.001)
• Different sample means
• Emotion lexicon terms are 5-30 times more frequent than general language
• Assumptions of test?
Corpus Analysis: Bootstrap Sampling Distribution
• Are sentiment-bearing terms a statistically distinct and highly frequent subset of English?
• 1000 random samples of terms from BNC
• Sample size = size of sentiment lexicon
• H0: Observed sample mean falls within 95% of the bootstrap random sampling distribution of means
Corpus Analysis: Bootstrap Sampling Distribution
• Are sentiment-bearing terms a statistically distinct and highly frequent subset of English?
• For all lexica:
  • Mean term frequency of lexicon well outside 95%
  • Sentiment lexica are not representative of BNC (p<0.05)
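The bootstrap procedure described on these two slides can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: it samples with replacement, reads "95%" as the central 95% of the sampled means, and uses synthetic frequencies in place of BNC counts. The names `bootstrap_interval`, `corpus_freqs` and `lexicon_freqs` are hypothetical.

```python
import random
import statistics


def bootstrap_interval(corpus_freqs, sample_size, n_samples=1000, seed=0):
    """Central 95% interval of the means of random samples of term frequencies."""
    rng = random.Random(seed)
    means = sorted(
        statistics.fmean(rng.choices(corpus_freqs, k=sample_size))
        for _ in range(n_samples)
    )
    lower = means[int(0.025 * n_samples)]
    upper = means[int(0.975 * n_samples) - 1]
    return lower, upper


# Synthetic stand-in for corpus term frequencies (Zipf-like, hypothetical).
corpus_freqs = [1000 // rank + 1 for rank in range(1, 5001)]
# Hypothetical "lexicon": the 200 most frequent terms of the toy corpus.
lexicon_freqs = corpus_freqs[:200]

observed_mean = statistics.fmean(lexicon_freqs)
lower, upper = bootstrap_interval(corpus_freqs, sample_size=len(lexicon_freqs))
reject_h0 = not (lower <= observed_mean <= upper)

print(f"bootstrap 95% interval of means: [{lower:.2f}, {upper:.2f}]")
print(f"observed lexicon mean: {observed_mean:.2f}, reject H0: {reject_h0}")
```

The sample size is tied to the lexicon size, as on the slide, so that the spread of the random means is comparable to the spread expected for a lexicon-sized set of ordinary terms.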
Corpus Analysis: Sentiment Features
• Is there a polarity bias in General Language?
• Positive polarity bias
• Statistically significant for all lexica (χ² test of independence)
Corpus Analysis: Sentiment Features
• Is there a polarity bias in General Language when you include intensity of polarity?
• Positive polarity bias
• Statistically significant for all lexica
• χ² = 158.5, df=1, p<0.0001 for General Inquirer
• χ² = 63.6, df=1, p<0.0001 for Whissel
Corpus Analysis: Some conclusions
• Sentiment-bearing terms are a distinct subset of English
• Positive polarity bias in BNC
• General Inquirer and Whissel:
  • Low coverage and high frequency
• SentiWordNet:
  • Wide coverage and much lower frequency
Outline
• Current psychological theory of emotion
• Evaluation of lexical “emotion” resources
• Corpus analysis of language of “emotion”: Comparative
Comparative Corpus Analysis: Aims
• Examine affective term use
• Identify statistically different distributions
• Is there a dominant feature/polarity?
Comparative Corpus Analysis: The Data
• Financial Language: 2 million words
  • On-line financial news: Reuters, CNN, Bloomberg
  • Newspapers
• General Language: BNC, 100 million words
  • Balanced, broad corpus
Comparative Corpus Analysis: The Data
• BNC sub-corpora
  • Imaginative written English: 16 million words
  • Informative written English: 70 million words
Comparative Corpus Analysis: Methodology
• Compare proportions of Sentiment Features
• χ² Test of Independence
• H0: π_FinCorpus = π_BNC
Comparative Corpus Analysis: Methodology
• Statistical significance of different proportion
  • χ² > 7.8794
  • p < 0.005
• Features:
  • 41 Lexicon Sentiment Features from 4 lexica
  • Frequency per million words
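The comparison of proportions can be sketched as a Pearson χ² test on a 2x2 table (corpus by feature / non-feature tokens), using the threshold 7.8794 from the slide, which is the critical value for df = 1 at the 0.005 level. The table layout, the absence of a continuity correction, and all counts below are assumptions for illustration, not figures from the study.

```python
CRITICAL_VALUE = 7.8794  # chi-square critical value, df = 1, alpha = 0.005


def per_million(count, corpus_size):
    """Frequency of a feature normalised to occurrences per million words."""
    return count * 1_000_000 / corpus_size


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square for the table [[a, b], [c, d]], no continuity correction."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    observed = [a, b, c, d]
    expected = [row1 * col1 / n, row1 * col2 / n, row2 * col1 / n, row2 * col2 / n]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical token counts for one sentiment feature in two toy corpora.
fin_feature, fin_total = 30, 1000
bnc_feature, bnc_total = 10, 1000

statistic = chi_square_2x2(
    fin_feature, fin_total - fin_feature,
    bnc_feature, bnc_total - bnc_feature,
)
significant = statistic > CRITICAL_VALUE

print(f"financial: {per_million(fin_feature, fin_total):.0f} per million")
print(f"general:   {per_million(bnc_feature, bnc_total):.0f} per million")
print(f"chi-square = {statistic:.4f}, significant at 0.005: {significant}")
```

The statistic is computed from raw counts here; per-million frequencies are shown only to make corpora of different sizes comparable at a glance.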