
Peter Grzybek & Ernst Stadlober

Peter Grzybek & Ernst Stadlober. Quantitative Text Typology. http://www-gewi.uni-graz.at/quanta http://quanta.uni-graz.at Austrian Research Fund, Project #15485.



  1. Peter Grzybek & Ernst Stadlober: Quantitative Text Typology. http://www-gewi.uni-graz.at/quanta http://quanta.uni-graz.at Austrian Research Fund, Project #15485

  2. Let's suppose there is … A Universe of Texts

  3. Is the Universe Structured? Or Can We Structure It? How Can the Text Universe Be Structured?

  4. Corpus Analysis vs. Text Analysis • (Re-)Construction • of a norm • of a standard • of "language" Text As a Homogeneous Entity "Text Mixture" Self-regulating System ("Quasi Text") Complete Text

  5. What Is a Text? • Complete novel, composed of books? • Complete book of a novel, consisting of several chapters? • Individual chapters? • Dialogical vs. narrative sequences within a text? • Two Major Problems: • Data Homogeneity • Definition of Basic Analytical Units

  6. Both problems are relevant for quantitative approaches. WHY QUANTITATIVE APPROACHES? • ASSUMPTION: • If a 'text' is governed by synergetic processes, these processes can and must be described quantitatively. • The descriptive models obtained for each 'text' can be compared to each other, possibly resulting in one or more general model(s). • Thus, a quantitative typology of texts can be obtained.

  7. WHY WORD LENGTH? Synergetics In a Nutshell – Frequencies and Dependencies. Word Length: Graphemes, Phonemes, Syllables, Morphemes, …

  8. TYPES OF TEXT TYPOLOGIES • I. Qualitative • II. Quantitative-Qualitative • Tabula Rasa Principle (Clustering Methods) • A-priori  A-posteriori Principle (Discrimination Methods)

  9. Structuring the Text Universe (Ia): Text Sorts

  10. Structuring the Text Universe (Ib): Functional Styles

  11. In a qualitative approach, the text universe is structured with regard to external (pragmatic) factors ("with reference to the world") • general communicative functions of language (functional styles) • specific situational functions (text sorts)

  12. Top-Down Bottom-Up

  13. Bottom-Up Top-Down First and Second Order Cross Comparisons

  14. Intended Emphasis on Letters • 'Letter' as a Prototype of Language • Located between Oral and Written Communication • Result of One Homogeneous Process of Text Generation

  15. Text Basis (398 Slovenian Texts)

  16. A Small World of Texts: Word Length Frequencies (in %) of Four Texts: Literary Prose Text (#256), Versified Poetic Text (#359), Journalistic Comment (#324), Private Letter (#1)
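A word length frequency profile of the kind shown on this slide can be sketched as follows. This is a minimal illustration, not the authors' procedure: it measures word length in letters (graphemes), one of the units listed on slide 7, because counting syllables would need language-specific rules that the slides do not give. The function name `word_length_frequencies` is hypothetical.

```python
import re
from collections import Counter


def word_length_frequencies(text):
    """Return {word length in letters: share of all words in %}.

    Assumption: a "word" is any unbroken run of Unicode letters, so
    digits and punctuation are ignored.
    """
    words = re.findall(r"[^\W\d_]+", text)
    counts = Counter(len(word) for word in words)
    total = sum(counts.values())
    return {length: 100.0 * n / total for length, n in sorted(counts.items())}
```

Calling the function on each text separately gives one profile per text, which is the form in which the four texts on the slide are compared.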

  17. Post-Hoc Tests (Text Sorts): Groups without significant differences form "homogeneous subgroups" • Homogeneous subgroups do exist • All four letter types in different subgroups!

  18. Post-Hoc Analyses: Homogeneous Subgroups. Discriminant Analyses: Cases are attributed to groups on the basis of specific predictor variables. The variables are submitted to linear transformations in order to arrive at an optimal discrimination of the individual cases
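The idea of a linear transformation that separates groups can be illustrated with a two-group Fisher discriminant on two predictor variables. This is only a sketch of the general technique under simplifying assumptions (two groups, two variables, equal treatment of both groups); the slides do not say which software or variant the authors used. The function names and the toy values are hypothetical and are not data from the study.

```python
def fisher_lda(group_a, group_b):
    """Fit a two-group linear discriminant on 2-D feature vectors.

    Returns (w, c): a case x is attributed to group_b when
    w[0]*x[0] + w[1]*x[1] > c, and to group_a otherwise.
    """
    def centroid(rows):
        n = len(rows)
        return [sum(r[0] for r in rows) / n, sum(r[1] for r in rows) / n]

    ma, mb = centroid(group_a), centroid(group_b)

    # Pooled within-group scatter matrix [[s00, s01], [s01, s11]].
    s00 = s01 = s11 = 0.0
    for rows, m in ((group_a, ma), (group_b, mb)):
        for x0, x1 in rows:
            d0, d1 = x0 - m[0], x1 - m[1]
            s00 += d0 * d0
            s01 += d0 * d1
            s11 += d1 * d1

    det = s00 * s11 - s01 * s01
    if det == 0:
        raise ValueError("singular scatter matrix")

    # w = S^-1 (mb - ma), using the explicit inverse of a 2x2 matrix.
    diff0, diff1 = mb[0] - ma[0], mb[1] - ma[1]
    w = [(s11 * diff0 - s01 * diff1) / det,
         (-s01 * diff0 + s00 * diff1) / det]

    # Cut-off: projection of the midpoint between the two centroids.
    c = w[0] * (ma[0] + mb[0]) / 2 + w[1] * (ma[1] + mb[1]) / 2
    return w, c


def hit_rate(group_a, group_b, w, c):
    """Percentage of cases attributed to their own group."""
    hits = sum(1 for x in group_a if w[0] * x[0] + w[1] * x[1] <= c)
    hits += sum(1 for x in group_b if w[0] * x[0] + w[1] * x[1] > c)
    return 100.0 * hits / (len(group_a) + len(group_b))


# Toy values for illustration only, NOT taken from the study.
toy_a = [(1.80, 0.30), (1.90, 0.32), (2.00, 0.31), (1.85, 0.33)]
toy_b = [(2.30, 0.26), (2.40, 0.25), (2.20, 0.27), (2.35, 0.24)]
w, c = fisher_lda(toy_a, toy_b)
print(hit_rate(toy_a, toy_b, w, c))
```

The hit rate here is computed on the same cases used for fitting, so it tends to be optimistic; whether the percentages on the following slides were obtained this way or by cross-validation is not stated in the transcript.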

  19. Discriminant Analysis: Eight Text Sorts. Discrimination variables: m1, m2, v, p1 (56.30%)
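The transcript does not define the discrimination variables. The sketch below assumes one plausible reading: m1 is the mean word length, m2 the variance, v the coefficient of variation (standard deviation divided by the mean), and p1 and p2 the proportions of words of length 1 and 2. These definitions and the function name `length_features` are assumptions made for illustration and should be checked against the authors' publications.

```python
from statistics import mean, pvariance


def length_features(lengths):
    """Summarise a list of word lengths as candidate predictor variables.

    All definitions below are assumed, not taken from the slides.
    """
    xs = [float(x) for x in lengths]
    n = len(xs)
    m1 = mean(xs)                    # assumed: mean word length
    m2 = pvariance(xs, mu=m1)        # assumed: variance of word length
    v = m2 ** 0.5 / m1               # assumed: coefficient of variation
    p1 = sum(1 for x in xs if x == 1) / n  # assumed: share of length-1 words
    p2 = sum(1 for x in xs if x == 2) / n  # assumed: share of length-2 words
    return {"m1": m1, "m2": m2, "v": v, "p1": p1, "p2": p2}
```

Each text then contributes one row of such values, and a discriminant analysis is run on a chosen subset of the columns.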

  20. Discriminant Analysis: Four Letter Types (n=213): {Private L.} {Ep. Novel} {Readers' L.} {Open L.}. Discrimination variables: m1, v (70.40%)

  21. Discriminant Analysis: Three Letter Types (n=213): {Private L., Ep. Novel} {Readers' L.} {Open L.}. Discrimination variables: m1, p2 (86.90%). Distinction of Literary Letters Irrelevant?

  22. Discriminant Analysis: Private vs. Public Letters (n=213): {Private L., Ep. Novel}, {Readers' & Open L.}. Discrimination variables: m1, p2 (92.00%). Distinction of Private vs. Public Styles?

  23. Discriminant Analysis: Private vs. Public Texts (n=248): {Private L., Ep. Novel}, {Readers' & Open L., Comments}. Discrimination variables: m1, p2 (91.10%). Public vs. Private Styles?

  24. Discriminant Analysis: Private/Oral vs. Public/Written Texts (n=290): {Private L., Ep. Novel, Drama}, {Readers' & Open L., Comments}. Discrimination variables: m1, p2 (92.40%). Oral vs. Written Styles?

  25. Discriminant Analysis: Three Text Types (n=330): {Private / Oral} {Public / Written} {Verse}. Discrimination variables: m1, p2, v (91.20%). Towards a New Typology?

  26. Discriminant Analysis: Four Text Types (n=398): {Private / Oral} {Public / Written} {Prose} {Verse}. Discrimination variables: m1, p2, v (79.90%)

  27. Discriminant Analysis: Three Text Types (n=398): {Private / Oral} {Public / Written / Prose} {Verse}. Discrimination variables: m1, p2, v (92.70%)

  28. This is the End …
