Wednesday 18 February 2009

Wednesday 18 February 2009. The Languages of Emotion and Financial News. Ann Devitt, Khurshid Ahmad. Sentiment and the Markets. Specialised Language of Financial News.

Presentation Transcript


  1. Wednesday 18 February 2009. The Languages of Emotion and Financial News. Ann Devitt, Khurshid Ahmad. Ann Devitt, Trinity College Dublin

  2. Sentiment and the Markets

  4. Specialised Language of Financial News: "Global chipmakers, battling slower technology demand, are betting size matters as they pin their hopes for future growth on small and easy to carry mobile devices such as netbooks and smartphones." (Bloomberg.com, 18/2/09)

  8. Sentiment and the Markets

  10. Engle and Ng (1993) Asymmetry Curve

  11. Outline
      • Current psychological theory of emotion
      • Evaluation of lexical “emotion” resources
      • Corpus analysis of language of “emotion”

  13. Cognitive Theory of Emotion: Categorical. Ekman (1975)

  14. Cognitive Theory of Emotion: Dimensions
      • Osgood / Russell: Evaluation, Activity, Potency
      • Mehrabian PAD: Pleasure, Activation, Dominance

  15. Cognitive Theory of Emotion: Watson and Tellegen (1985)

  16. Outline
      • Current psychological theory of emotion
      • Evaluation of lexical “emotion” resources
      • Corpus analysis of language of “emotion”

  17. Lexical Resource Evaluation [diagram: SentiWordNet, Whissell, General Inquirer, WNA]

  18. Lexical Resource Evaluation: SentiWordNet [diagram: SentiWordNet, Whissell, General Inquirer, WNA]

  19. Lexical Resource Evaluation: SentiWordNet
      Word    PositiveVal   NegativeVal
      Happy   0.9           0.0
      Sad     0.0           0.9
      • 39066 terms
      • Evaluation dimension scale: 0 - 1
      • Low average: Pos=0.18, Neg=0.23
      • More extreme Neg values
      • Error-prone: rude (pos 0.875), gladsome (neg 0.875)

  20. Lexical Resource Evaluation: General Inquirer [diagram: SentiWordNet, Whissell, General Inquirer, WNA]

  21. Lexical Resource Evaluation: General Inquirer
      Word        Categories
      ECSTATIC    Pos, Pleasure
      SORROWFUL   Neg, Pain
      • Hand-coded, content analysis basis
      • 8641 terms
      • 184 binary categories (including MAB dimensions)
      • Negative > Positive
      • Active > Passive
      • Strong > Weak

  22. Lexical Resource Evaluation: Whissell Dictionary of Affect [diagram: SentiWordNet, Whissell, General Inquirer, WNA]

  23. Lexical Resource Evaluation: Whissell Dictionary of Affect
      Word         Eval     Activ    Imag
      great        2.6250   2.1250   1.0
      disastrous   1.4444   2.4000   2.0
      • Corpus selection, hand-coded
      • 8742 terms
      • Dimensional representation: 1-3 scale
      • Evaluation, Activation, Imagery

  24. Lexical Resource Evaluation: WordNet Affect [diagram: SentiWordNet, Whissell, General Inquirer, WNA]

  25. Lexical Resource Evaluation: WordNet Affect
      Word         BinaryFeatures
      Loneliness   cognitive state, emotion
      Happiness    cognitive state, emotion
      • 5432 terms
      • Domains of emotional experience
      • No Polarity
      • Short-term: Mood, Manner
      • Long-term: Attribute, Trait

  26. Lexical Resource Evaluation: Lexical Overlap
      • Are the lexica consistent?
      • Are they mutually exclusive?
      • Dice, Jaccard, Asymmetric coefficients
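The overlap measures named on this slide can be illustrated on term sets. The sketch below uses the standard set-based definitions of the Dice and Jaccard coefficients; the "asymmetric" coefficient is assumed here to mean the share of one lexicon's terms that also appear in the other, which the slides do not spell out. The two small lexica are invented for illustration.

```python
def dice(a, b):
    """Dice coefficient: 2|A ∩ B| / (|A| + |B|)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return 2 * len(a & b) / (len(a) + len(b))


def jaccard(a, b):
    """Jaccard coefficient: |A ∩ B| / |A ∪ B|."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def asymmetric(a, b):
    """Assumed definition: fraction of A's terms also found in B, |A ∩ B| / |A|."""
    a, b = set(a), set(b)
    return len(a & b) / len(a) if a else 0.0


# Invented toy lexica (not the real resources).
lexicon_a = {"happy", "sad", "great", "rude"}
lexicon_b = {"happy", "sad", "great", "disastrous", "ecstatic", "sorrowful"}

print("Dice:", dice(lexicon_a, lexicon_b))
print("Jaccard:", jaccard(lexicon_a, lexicon_b))
print("A in B:", asymmetric(lexicon_a, lexicon_b))
print("B in A:", asymmetric(lexicon_b, lexicon_a))
```

The asymmetric measure is useful when lexica differ greatly in size, since a small lexicon can be almost fully contained in a large one while the symmetric coefficients stay low.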

  27. Lexical Resource Evaluation: Lexical Overlap [diagram: SentiWordNet, Whissell, General Inquirer, WNA]

  28. Lexical Resource Evaluation: Lexical Overlap [diagram: SentiWordNet, General Inquirer, Whissell, WNA]
      • Statistically significant agreement for polarity assignment (chi-square test)
      • Very weak correlation for activation features

  29. Lexical Resource Evaluation: Lexical Overlap [diagram: SentiWordNet, Whissell, General Inquirer, WNA]
      1. Weak correlation of SWN with Whissell evaluation
      2. No correlation with Whissell activation dimension
      3. SWN positive values negatively correlated with imageability
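A correlation between two lexica can only be computed over the terms they share. The sketch below assumes a Pearson correlation (the slides do not say which coefficient was used) and assumes each lexicon is a plain mapping from term to a single score. The function name and all scores are invented for illustration.

```python
import math


def pearson_on_shared_terms(lexicon_a, lexicon_b):
    """Pearson correlation of two term->score mappings over their shared terms."""
    shared = sorted(set(lexicon_a) & set(lexicon_b))
    if len(shared) < 2:
        raise ValueError("need at least two shared terms")
    xs = [lexicon_a[t] for t in shared]
    ys = [lexicon_b[t] for t in shared]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("scores are constant; correlation is undefined")
    return sxy / math.sqrt(sxx * syy)


# Invented scores: a net polarity in [-1, 1] and an evaluation score on a 1-3 scale.
net_polarity = {"great": 0.75, "happy": 0.9, "sad": -0.9, "disastrous": -0.6, "rude": 0.875}
evaluation = {"great": 2.6, "happy": 2.8, "sad": 1.2, "disastrous": 1.4, "table": 2.0}

r = pearson_on_shared_terms(net_polarity, evaluation)
print("shared-term correlation:", round(r, 3))
```

Terms present in only one lexicon ("rude" and "table" above) are ignored, so the coefficient describes agreement on the overlap only, not coverage.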

  30. Lexical Resource Evaluation: Lexical Overlap [diagram: SentiWordNet, WNA, Whissell, General Inquirer]
      • SWN tends towards negative for short-term WNA features
      • SWN tends towards positive for long-term WNA features

  31. Lexical Resource Evaluation: Lexical Overlap [diagram: SentiWordNet, Whissell, General Inquirer, WNA]

  32. Lexical Resource Evaluation: Lexical Overlap
      • WNA feature division:
      Short-term      Long-term
      Negative        Positive
      Physical        Cognitive
      More active     Less active
      Internal        External
      Less abstract   More concrete

  33. Lexical Resource Evaluation: Some conclusions
      • The lexica:
          • Are quite consistent
          • Can be used in combination
      • SentiWN: Largely unexplored territory

  34. Outline
      • Current psychological theory of emotion
      • Evaluation of lexical “emotion” resources
      • Corpus analysis of language of “emotion”: General Language

  35. Emotion in General Language: Corpus Study Aims
      • Does “emotion” constitute a distinct sub-language?
      • Is there a polarity bias in General Language? (the Pollyanna Hypothesis of Boucher and Osgood)
      • What is the impact of using different lexica?

  36. Corpus Analysis: The Data
      • BNC: 100 million words
      • Balanced, broad corpus

  37. Corpus Analysis: Methodology. Is emotion a distinct sub-language?
      • Examine distribution type
      • Examine distribution spread
      • Bootstrap sampling distribution

  38. Corpus Analysis: Distribution Type
      • Zipfian: BNC = Emotion Lexica
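One common way to check whether a frequency list looks Zipfian is to fit a straight line to log(frequency) against log(rank); a slope near -1 is the classic Zipf pattern. This is a minimal sketch of that check, not necessarily the procedure used in the talk. The function name and the idealised frequency list are assumptions for illustration.

```python
import math


def zipf_slope(frequencies):
    """Least-squares slope of log(frequency) against log(rank)."""
    freqs = sorted((f for f in frequencies if f > 0), reverse=True)
    if len(freqs) < 2:
        raise ValueError("need at least two positive frequencies")
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Idealised Zipfian frequencies (frequency proportional to 1 / rank), invented data.
ideal = [1000.0 / rank for rank in range(1, 201)]
slope = zipf_slope(ideal)
print("fitted slope:", round(slope, 3))
```

Running the same fit on the whole-corpus frequency list and on the frequencies of the lexicon terms gives two slopes that can be compared directly.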

  39. Corpus Analysis: Distribution shape
      • Comparison of means: Student's t-test, BNC ≠ Emotion Lexica (p < 0.001)
      • Different sample means
      • 5-30 times more frequent than general language
      • Assumptions of test?

  40. Corpus Analysis: Bootstrap Sampling Distribution
      • Are sentiment-bearing terms a statistically distinct and highly frequent subset of English?
      • 1000 random samples of terms from BNC
      • Sample size = size of sentiment lexicon
      • H0: Observed sample falls within the central 95% of the bootstrap random sampling distribution of means
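The procedure on this slide can be sketched as follows: draw 1000 random term samples of the same size as the lexicon, record each sample's mean frequency, and check whether the lexicon's own mean falls inside the central 95% of those means. Several details are assumptions: terms are sampled without replacement, the interval is taken from the sorted sample means, and the frequency data are invented Zipf-like counts rather than BNC figures.

```python
import random


def sampling_interval(population, sample_size, n_samples=1000, seed=0):
    """Central 95% interval of the mean frequency over random term samples."""
    rng = random.Random(seed)
    means = sorted(
        sum(rng.sample(population, sample_size)) / sample_size
        for _ in range(n_samples)
    )
    cut = (n_samples * 25) // 1000  # 2.5% of the samples in each tail
    return means[cut], means[n_samples - cut - 1]


# Invented Zipf-like frequencies standing in for corpus term counts.
population = [1_000_000 // rank for rank in range(1, 5001)]

# Hypothetical lexicon: here simply the 200 most frequent terms.
lexicon = population[:200]
lexicon_mean = sum(lexicon) / len(lexicon)

lower, upper = sampling_interval(population, len(lexicon))
reject_h0 = not (lower <= lexicon_mean <= upper)
print("95% interval of sample means:", (lower, upper))
print("lexicon mean:", lexicon_mean, "reject H0:", reject_h0)
```

Because no distributional form is assumed, this check sidesteps the normality question raised for the t-test on heavily skewed frequency data.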

  41. Corpus Analysis: Bootstrap Sampling Distribution
      • Are sentiment-bearing terms a statistically distinct and highly frequent subset of English?
      • For all lexica:
          • Mean term frequency of lexicon well outside 95%
          • Sentiment lexica are not representative of BNC (p<0.05)

  42. Corpus Analysis: Sentiment Features
      • Is there a polarity bias in General Language?
      • Positive polarity bias
      • Statistically significant for all lexica (χ² test of independence)

  43. Corpus Analysis: Sentiment Features
      • Is there a polarity bias in General Language when you include intensity of polarity?
      • Positive polarity bias
      • Statistically significant for all lexica
          • χ² = 158.5, df=1, p<0.0001 for General Inquirer
          • χ² = 63.6, df=1, p<0.0001 for Whissell

  44. Corpus Analysis: Some conclusions
      • Sentiment-bearing terms are a distinct subset of English
      • Positive polarity bias in BNC
      • General Inquirer and Whissell:
          • Low coverage and high frequency
      • SentiWordNet:
          • Wide coverage and much lower frequency

  45. Outline
      • Current psychological theory of emotion
      • Evaluation of lexical “emotion” resources
      • Corpus analysis of language of “emotion”: Comparative

  46. Comparative Corpus Analysis: Aims
      • Examine affective term use
      • Identify statistically different distributions
      • Is there a dominant feature/polarity?

  47. Comparative Corpus Analysis: The Data
      • Financial Language: 2 million words
          • On-line financial news: Reuters, CNN, Bloomberg
          • Newspapers
      • General Language: BNC, 100 million words
          • Balanced, broad corpus

  48. Comparative Corpus Analysis: The Data
      • BNC sub-corpora
          • Imaginative written English: 16 million words
          • Informative written English: 70 million words

  49. Comparative Corpus Analysis: Methodology
      • Compare proportions of Sentiment Features
      • χ² Test of Independence
      • H0: π_FinCorpus = π_BNC

  50. Comparative Corpus Analysis: Methodology
      • Statistical significance of different proportion
          • χ² > 7.8794
          • p < 0.005
      • Features:
          • 41 Lexicon Sentiment Features from 4 lexica
          • Frequency per million words
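The comparison of proportions can be sketched as a Pearson chi-square test on a 2x2 table (feature tokens versus other tokens, in each of two corpora). With one degree of freedom, the threshold 7.8794 given on the slide corresponds to a significance level of 0.005. The function names and the token counts below are invented for illustration; only the corpus sizes and the threshold come from the slides.

```python
import math

CRITICAL_CHI2 = 7.8794  # df = 1, significance level 0.005 (threshold from the slide)


def per_million(count, corpus_size):
    """Normalise a raw count to a frequency per million words."""
    return count * 1_000_000 / corpus_size


def chi2_p_value_df1(chi2):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))


def chi2_two_proportions(count_a, size_a, count_b, size_b):
    """Pearson chi-square for a 2x2 table: feature vs. other tokens, corpus A vs. B."""
    observed = [[count_a, size_a - count_a], [count_b, size_b - count_b]]
    row_totals = [size_a, size_b]
    col_totals = [count_a + count_b, size_a + size_b - count_a - count_b]
    total = size_a + size_b
    if 0 in row_totals or 0 in col_totals:
        raise ValueError("a row or column of the table is empty")
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed[i][j] - expected) ** 2 / expected
    return chi2, chi2_p_value_df1(chi2)


# Invented counts for one sentiment feature; corpus sizes follow the slides.
fin_count, fin_size = 30_000, 2_000_000
bnc_count, bnc_size = 1_200_000, 100_000_000

chi2, p = chi2_two_proportions(fin_count, fin_size, bnc_count, bnc_size)
print("per million:", per_million(fin_count, fin_size), per_million(bnc_count, bnc_size))
print("chi2:", round(chi2, 1), "significant:", chi2 > CRITICAL_CHI2)
```

With 41 features tested, a strict threshold such as 0.005 limits the number of differences flagged purely by chance.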
