
LIWC

LIWC: Linguistic Inquiry & Word Count. Jeff Spicer & Matthew Egizii. The Pennebaker Dictionary. LIWC uses Dictionaries of Categories to define its search terms. The Pennebaker Dictionary is built in, but others can be imported. The Pennebaker Dictionary (2001)

tmaskell

Presentation Transcript


  1. LIWC: Linguistic Inquiry & Word Count. Jeff Spicer & Matthew Egizii

  2. The Pennebaker Dictionary • LIWC uses Dictionaries of Categories to define its search terms. • The Pennebaker Dictionary is built in, but others can be imported. • The Pennebaker Dictionary (2001) • LIWC's default set of psychologically meaningful categories • 74 subdictionaries (categories) • (80 in LIWC 2007) • Each subdictionary is composed of words chosen and assessed by a set of judges, who agreed on the subdictionary scales 93%-100% of the time. • Many of these words are in multiple categories.
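As a rough illustration of how a dictionary of categories becomes a set of scores, the sketch below counts category hits as a percentage of all words. The two small categories and their word lists are invented for this example and are not entries from the Pennebaker Dictionary, and the tokenizer is a simplification rather than LIWC's own. Several words are deliberately placed in both categories.

```python
import re
from collections import Counter

# Hypothetical mini-dictionary: category name -> set of words.
# These are NOT real Pennebaker Dictionary entries.
TOY_DICTIONARY = {
    "posemo": {"happy", "love", "good"},
    "affect": {"happy", "love", "good", "sad", "hate"},
}

def category_percentages(text, dictionary):
    """Return each category's hits as a percentage of total words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for token in tokens:
        # A word may belong to several categories, so check all of them.
        for category, words in dictionary.items():
            if token in words:
                counts[category] += 1
    total = len(tokens)
    return {
        category: (100.0 * counts[category] / total if total else 0.0)
        for category in dictionary
    }

scores = category_percentages("I love good food but hate sad songs", TOY_DICTIONARY)
print(scores)
```

Because "love" and "good" appear in both toy categories, each one raises two scores at once, which is the overlap the slide describes.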

  3. The Pennebaker Dictionary • If you are able, use the Pennebaker 2007 dictionary rather than the 2001 one: • It removes several categories that had “consistently low base rates and were rarely used: Optimism, Positive Feelings, Communication Verbs, Other References, Metaphysical, Sleeping, Grooming, School, Sports, Television, Up, and Down. The category of unique Words (also known as Type/Token ratio) has also been removed.” • It adds the categories of Conjunctions, Adverbs, Quantifiers, Auxiliary Verbs, Commonly-used Verbs, Impersonal Pronouns, Total Function Words, and Total Relativity Words. • Also, the categories themselves are much more fleshed out: • Religion is no longer strictly “Catholicism,” as it was before (which seemed a tad biased).

  4. The Pennebaker Dictionary • The LIWC website has a page with comparisons between the scores of each dictionary based on its library. • Means, SDs, Correlations • Comparing LIWC2007 with LIWC2001 Dictionaries

  5. Preparing Text • LIWC uses .txt or ASCII files for analysis. • Files should be checked for: • Correct U.S. Spelling • Spelled-out meaningful abbreviations • Removal of “Non-Fluency” words
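One of the cleanup steps above, removing non-fluency words, can be sketched as a simple filter. The filler list here is an assumption chosen for illustration; the slides do not say which words were treated as non-fluencies, and spelling correction and abbreviation expansion are not attempted.

```python
import re

# Illustrative filler list (an assumption, not taken from LIWC).
FILLERS = {"um", "uh", "er", "hmm"}

def strip_fillers(text, fillers=FILLERS):
    """Drop whitespace-separated tokens that are fillers once punctuation is ignored."""
    kept = [
        word for word in text.split()
        if re.sub(r"\W", "", word).lower() not in fillers
    ]
    return " ".join(kept)

cleaned = strip_fillers("Well, um, the ship, uh, sailed.")
print(cleaned)
```

Punctuation attached to the surviving words is left alone, so a manual read-through of the cleaned text is still worthwhile.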

  6. Reading the Results • Results are given as a % of the total text. • Except for: • Word Count • Words Per Sentence • Sentences Ending with a Question Mark (?) • Results are placed in a .xls file (Spreadsheet) • The file is “Tab-Delimited,” meaning that importing it into an SPSS data file is quite simple.
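Because the results file is tab-delimited, it can also be read with any tool that handles tab-separated text. The sketch below uses Python's standard `csv` module on an in-memory stand-in for the file. The column names (`Filename`, `WC`, `WPS`) are assumptions for the example and may not match the headers LIWC actually writes; the numbers are the word counts and words-per-sentence figures reported later in this deck.

```python
import csv
import io

# In-memory stand-in for a tab-delimited results file.
# Column names are assumed for illustration only.
SAMPLE = (
    "Filename\tWC\tWPS\n"
    "odyssey.txt\t117643\t37.43\n"
    "beowulf.txt\t23726\t25.19\n"
)

def read_results(handle):
    """Parse tab-delimited rows into a list of dictionaries keyed by column name."""
    reader = csv.DictReader(handle, delimiter="\t")
    return list(reader)

rows = read_results(io.StringIO(SAMPLE))
word_counts = {row["Filename"]: int(row["WC"]) for row in rows}
print(word_counts)
```

For a real file, the same function can be given an open file handle in place of the `io.StringIO` object.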

  7. Opening & Processing Files • Opening: Allows you to read/edit the text within LIWC • Processing: Runs the text analysis

  8. Setting Dictionaries & Categories • Each of the categories can be turned on/off with a checkbox.

  9. Analyze Function • Segmenting the File • Segmenting the Selection allows you to divide the text into multiple parts for analysis.
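To make the idea of segmenting concrete, here is a minimal sketch that splits a text into a fixed number of roughly equal parts by word count. This is one plausible scheme chosen for illustration; the slides do not describe which segmentation rules LIWC itself offers, and the function name is hypothetical.

```python
def segment_by_words(text, n_segments):
    """Split text into n_segments parts whose word counts differ by at most one."""
    if n_segments < 1:
        raise ValueError("n_segments must be at least 1")
    words = text.split()
    base, extra = divmod(len(words), n_segments)
    segments = []
    start = 0
    for index in range(n_segments):
        # The first `extra` segments take one additional word each.
        size = base + (1 if index < extra else 0)
        segments.append(" ".join(words[start:start + size]))
        start += size
    return segments

parts = segment_by_words("one two three four five six seven", 3)
print(parts)
```

Each part could then be analyzed separately, for example to compare the opening of a text with its ending.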

  10. Analysis of Epic Texts • We decided to use the power of CATA on several huge literary blocks of text: • The Odyssey • The Aeneid • Beowulf

  11. Analysis of Epic Texts • Textualization of Oral Epic Tradition • Attempt to capture the Ekphrasis of the original medium. • Some elements are lost in translation. • Question: Which elements are both difficult to describe and also necessary to pass on to a culture?

  12. Analysis of Epic Texts • Primarily we were interested in references to Gods, Religious Tradition and Worship • We chose the (now defunct) category “Metaphysical” rather than “Religion.” • Its word choices are more in line with spirituality than with modern, formalized religion. • We also used the Standard Information category • Word Count, Words/Sentence, Sentences ending with ?, LIWC dictionary words, Unique words, Words longer than 6 characters
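Several of the Standard Information measures listed above are simple enough to approximate by hand. The sketch below computes word count, words per sentence, percentage of unique words, and percentage of words longer than six characters. The sentence and word splitting rules are simplifying assumptions and will not reproduce LIWC's figures exactly.

```python
import re

def standard_info(text):
    """Approximate a few Standard Information measures for a text."""
    # Treat runs of ., ! or ? as sentence boundaries (a simplification).
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text.lower())
    word_count = len(words)
    long_words = sum(len(word) > 6 for word in words)
    return {
        "word_count": word_count,
        "words_per_sentence": word_count / len(sentences) if sentences else 0.0,
        "unique_pct": 100.0 * len(set(words)) / word_count if word_count else 0.0,
        "sixltr_pct": 100.0 * long_words / word_count if word_count else 0.0,
    }

stats = standard_info("The hero sailed home. The goddess watched!")
print(stats)
```

In this sample, "the" occurs twice, so six of the seven words are unique, and only "goddess" and "watched" are longer than six characters.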

  13. Source of Error? • We had some trouble with the Psychological Processes group. • Several categories wouldn’t shut off, even after de-selecting them. • ??? • So we decided to run them too!

  14. Processed Files • After you hit Process and choose where to save the .xls file, the results open as plain text within LIWC.

  15. Results & Graphs • Total Word Counts: • Odyssey: 117643 • Aeneid: 101370 • Beowulf: 23726

  16. Results & Graphs • Unique Words (%) • Odyssey: 5.76 • Aeneid: 8.54 • Beowulf: 15.76 • Words/Sentence • Odyssey: 37.43 • Aeneid: 32.74 • Beowulf: 25.19
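Since Unique Words is reported as a percentage, it can be multiplied back against the total word counts to estimate how many distinct words each text contains. The sketch below does that arithmetic with the figures reported in this deck, assuming the percentage is distinct words divided by total words (the Type/Token ratio mentioned earlier). The results are estimates, because the published percentages are rounded.

```python
# (total word count, unique words as % of total), as reported in the slides.
TEXTS = {
    "Odyssey": (117643, 5.76),
    "Aeneid": (101370, 8.54),
    "Beowulf": (23726, 15.76),
}

# Estimated number of distinct words = total words * percentage / 100.
approx_unique = {
    name: round(word_count * pct / 100)
    for name, (word_count, pct) in TEXTS.items()
}
print(approx_unique)
```

This works out to roughly 6,800 distinct words for the Odyssey, 8,700 for the Aeneid, and 3,700 for Beowulf. Type/Token ratios generally fall as a text gets longer, so Beowulf's high percentage partly reflects its much shorter length.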

  17. Results & Graphs • Question Marks (% of Sentences) • Odyssey: 0.23 • Aeneid: 0.41 • Beowulf: 0.03 • Exclamation Marks (% of Sentences) • Odyssey: 0.01 • Aeneid: 0.1 • Beowulf: 0.72 • Metaphysical (% of Words) • Odyssey: 0.97 • Aeneid: 1.37 • Beowulf: 1.42

  18. Findings • The Odyssey: • Is MASSIVE • Has the longest sentences • Has the lowest % of unique words • Has the lowest % of exclamations • Is the least interested in the Metaphysical.

  19. Findings • The Aeneid: • Is also pretty big • Has a larger share of Metaphysical text than the Odyssey • Also isn’t interested in exclamations • Asks the most questions.

  20. Findings • Beowulf: • Is pretty short (comparatively) • Has much shorter sentences • Is filled with many unique words • Asks few questions • Is the most interested in the Metaphysical • Is very excitable!!!!!
