Peter Grzybek. Estonian Proverbs: Searching for regularities. www.peter-grzybek.eu. How long is a proverb? How long are words in proverbs? Does word length depend on proverb length? Is word length independent of within-text position?
Peter Grzybek
Estonian Proverbs: Searching for regularities
www.peter-grzybek.eu
Kriq 75, August 18/19, 2014
• How long is a proverb?
• How long are words in proverbs?
• Does word length depend on proverb length?
• Is word length independent of within-text position?
How to measure the length of linguistic units and entities?
• Memo: "There are no positive facts in language." (Saussure)
• There is always more than one definition.
• Define the entity you want to measure.
  • If you want to measure sentence length, define 'sentence'.
  • If you want to measure word length, define 'word'.
• Determine the measuring units in which you want to measure.
  • E.g., sentence length: number of clauses, phrases, words, syllables, morphemes, …?
  • E.g., word length: number of syllables, morphemes, letters, graphemes, phonemes, …?
• Define the measuring units.
  • Define 'clause', 'phrase', 'syllable', 'morpheme', 'phoneme', 'grapheme', 'letter', …
Rule in Quantitative Linguistics: take direct constituents as measuring units.
How long are proverbs?
Sentence length: one proverb, one sentence
Orthographic problems:
• Mother-in-law. Isn't that a problem?
• В этом доме ("in this house"); в кратцу / вкратце ("in brief", written as two words or as one)
• Phonological word (tact group): Námostu.
In agglutinative languages …
• … stems do not change,
• … affixes do not fuse with other affixes,
• … affixes do not change form conditioned by other affixes.
How long are words?
Estonian phonemes: three degrees of phonemic length (consonants and vowels)
• [o] (short o): koli = "rubbish" (German gloss: „Müll“)
• [oˈ] (long o): kooli = "school" (German gloss: „Schule“)
• [oː] (extra long o): kooli = "to school" (German gloss: „schulen“)
Decisions / Definitions (in accordance with Kriq 1967)
Üks riisub rihaga, teine pühib luuaga. (EV 15016)
[One rakes with the rake, the other sweeps with the broom.]
Wo: 6 – St: 6 – Sy: 13
Üks rii-sub ri-ha-ga, tei-ne pü-hib luu-a-ga.

Isi puu, isi puuke. (EV 2245)
[One is the tree, the other is the little tree.]
Wo: 4 – St: 4 – Sy: 7
I-si puu, i-si puu-ke.
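The word and syllable counts for the two examples can be reproduced with a short script. This is only a sketch: it assumes the syllabified lines are given with word boundaries restored and with hyphens marking syllable boundaries only, and the function name `count_units` is made up for illustration. It does not attempt the stem count (St), which depends on the definitions adopted for the study.

```python
def count_units(syllabified: str) -> dict:
    """Count orthographic words and syllables in a hyphen-syllabified proverb.

    Assumption: whitespace separates words, and every hyphen inside a word
    marks a syllable boundary (so hyphenated compounds would be miscounted).
    """
    # Strip sentence punctuation from each whitespace-separated token.
    words = [w.strip(".,;:!?") for w in syllabified.split()]
    words = [w for w in words if w]
    # A word with h hyphens has h + 1 syllables.
    syllables = sum(len(w.split("-")) for w in words)
    return {"words": len(words), "syllables": syllables}


EV_15016 = "Üks rii-sub ri-ha-ga, tei-ne pü-hib luu-a-ga."
EV_2245 = "I-si puu, i-si puu-ke."

print(count_units(EV_15016))  # {'words': 6, 'syllables': 13}
print(count_units(EV_2245))   # {'words': 4, 'syllables': 7}
```

Mean word length in syllables per word then follows as syllables divided by words, for example 13 / 6 for EV 15016.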
Erna Normann (1955): Valimik eesti vanasõnu
3576 proverbs
Ca. end of the 19th, early 20th century
Comparisons: old (17th/18th century) and contemporary
Bimodal distributions: additional peaks (6 / 8)
Question: Does the word-stem distinction explain the bimodality?
Eesti vanasõnad (12921 proverbs)
Words per proverb
Stems per proverb
Words and stems: linear relation!
Concentration on words
Some in-between conclusions
• Bimodality seems to originate in the characteristics of the proverb material; this phenomenon needs more detailed study.
• It seems reasonable to assume that the overall picture results from differences between syntactically different proverbs: e.g., "simple" (unipartite proverbs without hypotaxis) vs. "complex" (n-partite proverbs with hypotaxis).
• As long as the relevant data are not available, data pooling seems to be an appropriate procedure, to make the forest visible before the trees.
Pooling data: intervals 2-3, 4-5, 6-7, …
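The pooling into intervals 2-3, 4-5, 6-7, … can be sketched as follows. The function name `pool_pairs` and the frequency table are hypothetical and serve only to show how pairwise pooling can smooth out secondary peaks; they are not the proverb data.

```python
from collections import Counter


def pool_pairs(freq: dict, start: int = 2) -> dict:
    """Pool a length-frequency table into intervals of width two:
    (start, start+1), (start+2, start+3), ...

    Lengths below `start` are kept as single-value classes.
    """
    pooled = Counter()
    for length, count in freq.items():
        if length < start:
            pooled[(length, length)] += count
        else:
            lower = start + 2 * ((length - start) // 2)
            pooled[(lower, lower + 1)] += count
    return dict(sorted(pooled.items()))


# Hypothetical counts with extra peaks at 6 and 8 (illustration only).
toy = {2: 5, 3: 40, 4: 90, 5: 70, 6: 85, 7: 40, 8: 45, 9: 10}
print(pool_pairs(toy))
# {(2, 3): 45, (4, 5): 160, (6, 7): 125, (8, 9): 55}
```

In this toy table the unpooled counts peak at 4, 6 and 8, while the pooled classes show a single peak.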
Is there a way to find a theoretical model for sentence length frequencies?
Assumptions:
• The distribution of length is organized in a law-like manner.
• It is sufficient to make assumptions about the difference D of two neighboring frequencies (probabilities).
Which factors influence D?
a. language-specific factors
b. production-specific factors
c. norming forces
d. level-specific factors (words vs. phrases)
Hyperpascal distribution (Beta-binomial d.)
Eesti vanasõnad: testing the hyperpascal distribution
k = 1.21, m = 0.07, q = 0.39, C = X²/N = 0.0193
Length of Estonian proverbs is regularly organized.
The well-known hyperpascal distribution is a good model.
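To see what a hyperpascal distribution with these parameters looks like, the probabilities can be generated from the recurrence P_x = (k + x - 1) / (m + x - 1) · q · P_{x-1}. This is a sketch under assumptions: it uses the usual textbook form of the distribution with support x = 0, 1, 2, …, normalises numerically over a truncated range, and the function name is made up. The slide does not say whether the fitted model was displaced (for example, starting at one or two words), so any shift of the support is left open here.

```python
def hyperpascal_pmf(k: float, m: float, q: float, n_terms: int = 200) -> list:
    """Return hyperpascal probabilities P_0 .. P_{n_terms-1}.

    Built from the recurrence P_x = (k + x - 1) / (m + x - 1) * q * P_{x-1}
    and normalised over the first n_terms values (the tail beyond that is
    negligible for q well below 1).
    """
    weights = [1.0]
    for x in range(1, n_terms):
        weights.append(weights[-1] * (k + x - 1) / (m + x - 1) * q)
    total = sum(weights)
    return [w / total for w in weights]


# Parameter values as reported on the slide.
probs = hyperpascal_pmf(k=1.21, m=0.07, q=0.39)
for x, p in enumerate(probs[:10]):
    print(x, round(p, 4))
```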
Is there a regularity of word length in Estonian proverbs?
Normann (21038 words)
In search of a word length model
• Poisson distribution
• 1-displaced Poisson distribution ("Fucks distribution")
C = X²/N = 0.08
No good model!
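The test reported here can be sketched in a few lines: fit a 1-displaced Poisson distribution and compute the discrepancy coefficient C = X²/N, where X² is the Pearson chi-square statistic and N the sample size. The frequency table below is hypothetical, the parameter is estimated by the method of moments (an assumption, since the slide does not name the estimation method), and no tail pooling is done.

```python
import math


def displaced_poisson_pmf(lam: float, x: int) -> float:
    """1-displaced Poisson: P(x) = exp(-lam) * lam**(x-1) / (x-1)!, x = 1, 2, ..."""
    return math.exp(-lam) * lam ** (x - 1) / math.factorial(x - 1)


def discrepancy_coefficient(observed: dict, pmf) -> float:
    """C = X^2 / N, with X^2 summed over the observed classes only."""
    n = sum(observed.values())
    chi2 = 0.0
    for x, f_obs in observed.items():
        f_exp = n * pmf(x)
        chi2 += (f_obs - f_exp) ** 2 / f_exp
    return chi2 / n


# Hypothetical word-length frequencies (syllables per word), illustration only.
toy = {1: 330, 2: 490, 3: 140, 4: 35, 5: 5}
n = sum(toy.values())
mean = sum(x * f for x, f in toy.items()) / n
lam = mean - 1  # moment estimate for the 1-displaced Poisson

c = discrepancy_coefficient(toy, lambda x: displaced_poisson_pmf(lam, x))
print(round(c, 4))
```

Because X² grows with sample size, dividing by N makes fits on corpora of different sizes comparable; in quantitative linguistics, values of C below roughly 0.02 are conventionally read as a good fit.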
An alternative model for word length in Estonian (proverbs)
• Geometric distribution
• 1-displaced geometric distribution
• 1-displaced Shenton-Skees geometric distribution
Word stems: p = 0.85, a = 3.49, C = 0.0062
Orthographic words: p = 0.88, a = 4.71, C = 0.0023
Word length in Eesti vanasõnad (88296 words)
p = 0.84, a = 3.30, C = 0.0074
Proverb length and word length (Normann)
Menzerath-Altmann law (Altmann 1980)
"The longer (more complex) a linguistic construct, the shorter (less complex) its constituents."
Example: The longer a sentence, the shorter the clauses constituting the sentence.
NB: The law holds for direct relations (in the classical structuralist paradigm) only, i.e., the relation of a construct to its immediate constituents. The relation between entities from indirectly related levels (e.g., between sentences and words, leapfrogging the intermediate level of sub-sentential constructs like clauses or phrases) is expected to show different (more complex) tendencies.
Basic form: $y = K x^{a}$
• y: constituent size (dependent variable)
• x: construct size (independent variable)
• K: integration constant
• a: parameter determining the steepness of the decrease (for a < 0)
Full form
Extended form (Wimmer-Altmann law)
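Assuming the basic form is the power law y = K · x^a, as the parameter description suggests, its two parameters can be estimated by a least-squares line on the log-log scale. The sketch below uses synthetic, noise-free data generated from known parameters, so it only demonstrates the mechanics; the function name is made up, and the fits reported on the slides may have used a different procedure (for example, nonlinear regression).

```python
import math


def fit_power_law(xs, ys):
    """Least-squares fit of y = K * x**a on the log-log scale; returns (K, a)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mx, my = sum(lx) / n, sum(ly) / n
    sxx = sum((u - mx) ** 2 for u in lx)
    sxy = sum((u - mx) * (v - my) for u, v in zip(lx, ly))
    a = sxy / sxx                 # slope of the log-log line
    K = math.exp(my - a * mx)     # intercept, transformed back
    return K, a


# Synthetic data: construct length x, mean constituent length y (hypothetical).
xs = [2, 3, 4, 5, 6, 7, 8]
ys = [2.5 * x ** -0.2 for x in xs]

K, a = fit_power_law(xs, ys)
print(round(K, 3), round(a, 3))  # 2.5 -0.2
```

A negative a corresponds to the decrease the law predicts: the longer the construct, the shorter its constituents.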
Proverb length and word length
• Normann: K = 1.68, c = –0.84, R² = 0.90
• Eesti vanasõnad: K = 1.71, a = 0.18, c = –1.05, R² = 0.98
Word length and syllable length
• Eesti vanasõnad: K = 2.02, c = 0.42, R² = 0.96
Positional aspects of word length
Fourier series: R² = 0.99
In the two approaches discussed above, the analyses concerned:
• the dependence of word length on sentence length, with no attention to within-sentence position,
• the dependence of word length on within-proverb position, ignoring the specific proverb length.
Unipartite proverbs with length T3–T5:
• Decrease, then increase
• Minimum at 2nd position
• Maximum at last position
Bipartite proverbs with length T6–T10:
• Cycle I: unipartite proverbs (T6)
• Cycle II: T7, T9, and T10
• T6, T8 unipartite proverbs = monotonic increase
What causes proverbs to be long(er) or short(er)?
From internal synergetic to external factors
... Thank you for your patience and attention ...
Familiarity and frequency
• German data
• American data
Sentence length and familiarity (German data: N = 11,355; excluding zero-familiarity, f > 100)
$\mathrm{SeL} = 8.40 \cdot \mathrm{Frq}^{-0.09}$, R² = 0.89
Desiderata for Estonian Paremiology
• Variants vs. types
• Frequency
• Familiarity
• Linguistic forms of variants
• Frequency
  • of variants
  • of types
• Familiarity
  • of variants
  • of types
"It seems preposterous even to ask where the 'variants of one proverb' end and the 'variants of another proverb' begin, or how many 'different proverbs' could be found within such a thicket."
Frequency distribution of 'variants' (unreliable data for f > 10)
• Zipf distribution: a = 2.08, C = X²/N = 0.06
• Right-truncated Zipf distribution: a = 1.91, R = 9, C = X²/N = 0.0032
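A right-truncated Zipf (zeta) distribution assigns probabilities proportional to x^(-a) on x = 1, …, R and normalises over that finite range. The sketch below evaluates it for a = 1.91 and R = 9; pairing these two values with the truncated model is an assumption based on R being a truncation point, and the function name is made up for illustration.

```python
def right_truncated_zipf(a: float, R: int) -> dict:
    """P(x) = x**(-a) / sum_{j=1..R} j**(-a), for x = 1, ..., R."""
    weights = {x: x ** -a for x in range(1, R + 1)}
    total = sum(weights.values())
    return {x: w / total for x, w in weights.items()}


probs = right_truncated_zipf(a=1.91, R=9)
for x, p in probs.items():
    print(x, round(p, 4))
```

Truncation removes the long tail from the model, which fits the remark that the data above f = 10 are unreliable.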
K = 6.52, c = 0.07, R² = 0.96
July 21, 1939: Arvo Arnol'dovič Krikmann
• Belgian National Day
• Village Pudivere (German: Poidifer)
• Estonian writer Eduard Vilde (1865-1933)
• Simuna Parish
• Important point in F.G.W. Struve's Geodetic Arc, a chain of triangulations (1827)
July 21, 1940: President Konstantin Päts affirmed the government of Johannes Vares (appointed by Andrej Ždanov), accompanied by the arrival of Soviet demonstrators and Red Army troops, the replacement of the flag of Estonia by the red flag on Pikk Hermann, and the meeting of the newly elected parliament (Riigikogu) on July 21.
July 21, 1944: Graf Claus von Stauffenberg and his fellow conspirators were executed in Berlin for the plot to assassinate Adolf Hitler.
July 21, 1949: The United States Senate ratifies the North Atlantic Treaty.