Brief introduction to morphology • Morphology is the study of word structure and word formation processes in language.
Terminology A morpheme is the smallest meaning-bearing unit in a language. • Free morphemes are independent words. • Bound morphemes, called affixes, cannot stand on their own. • Types of affixes: • prefix attaches at front of word • suffix attaches at end of word • circumfix attaches around word • infix attaches inside word
References for examples Examples are either from our textbook or from Linguistics, second edition, by Akmajian, Demers and Harnish, MIT Press.
Morphological processes • There are different kinds of morphological processes, in particular: • inflectional morphology • derivational morphology
English inflectional morphology • nouns • plural marker: -s (dog + s = dogs) • possessive marker: -’s (dog + ’s = dog’s) • verbs • 3rd person present singular: -s (walk + s = walks) • past tense: -ed (walk + ed = walked) • progressive: -ing (walk + ing = walking) • past participle: -en or -ed (eat + en = eaten) • adjectives • comparative: -er (fast + er = faster) • superlative: -est (fast + est = fastest)
Properties of (English) inflectional morphology • Inflectional morphology does not change grammatical category • In English, all inflectional affixes are suffixes (they attach to the end of a word) • Inflectional affixes are attached after any derivational affixes: • modern + ize + s = modernizes (OK) • modern + s + ize (NOT OK) • modern + ize + s + able (NOT OK) • Inflectional morphology carries a regular meaning transformation.
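The ordering constraint on this slide can be sketched as a small check over a sequence of suffixes. This is only an illustration: the two affix sets and the function name `well_ordered` are hypothetical and cover just the suffixes used in the examples here.

```python
# Hypothetical toy affix inventories, limited to the suffixes used in the slides.
DERIVATIONAL = {"ize", "able"}
INFLECTIONAL = {"s", "ed", "ing", "en", "er", "est"}


def well_ordered(suffixes):
    """Return True if no derivational suffix follows an inflectional one."""
    seen_inflection = False
    for suffix in suffixes:
        if suffix in INFLECTIONAL:
            seen_inflection = True
        elif suffix in DERIVATIONAL and seen_inflection:
            # A derivational suffix after an inflectional one is ruled out.
            return False
    return True


print(well_ordered(["ize", "s"]))          # modern + ize + s
print(well_ordered(["s", "ize"]))          # modern + s + ize
print(well_ordered(["ize", "s", "able"]))  # modern + ize + s + able
```

Only the first sequence passes the check, matching the OK / NOT OK judgments above.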
English derivational morphology • Derivational morphology can (but need not) change grammatical category. • un + do = undo (both verbs) • program + able = programmable (verb, adjective) • Derivational morphology does not always induce a regular/predictable meaning change: there is “drift” • fix + able = fixable (able to be fixed) • read + able = readable (more than just “able to be read”) • wash + able = washable (more than just “able to be washed”)
Concatenative vs. nonconcatenative morphology • Concatenative morphology combines morphemes by concatenation (prefixes and suffixes demonstrate this) • Nonconcatenative morphology combines morphemes in other ways: • circumfixes and infixes • templatic morphology
Circumfix example German: • sagen (“to say”) • ge-t (past participle circumfix) • sagen + ge-t → gesagt (“said”)
Infix example Bontoc Igorot (Philippine language) • kayu (“wood”) • -in- (“product of a completed action”) • kayu + -in- → kinayu (“gathered wood”) English: abso-bloody-lutely (emphasis)
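The German circumfix and the Bontoc infix can both be sketched as simple string operations. This is a rough illustration under stated assumptions, not a general analysis: the function names are hypothetical, the German rule assumes a regular verb whose infinitive ends in -en, and the infix rule assumes the infix goes right after the first consonant of the stem.

```python
def german_participle(infinitive):
    """Wrap the verb stem in the circumfix ge-...-t (regular verbs only)."""
    # Assumption: the stem is the infinitive minus its final -en.
    stem = infinitive[:-2] if infinitive.endswith("en") else infinitive
    return "ge" + stem + "t"


def add_infix(stem, infix):
    """Insert the infix after the first segment of the stem."""
    # Assumption: the stem begins with a single consonant, as in kayu.
    return stem[0] + infix + stem[1:]


print(german_participle("sagen"))  # gesagt
print(add_infix("kayu", "in"))     # kinayu
```

In both cases the affix material is not simply added at one edge of the word, which is what makes these processes nonconcatenative.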
Templatic morphology • Semitic languages (Arabic, Hebrew) • stem (root), e.g. ktb (write) • consonant-vowel (CV) template: CVCCVC (causative) • vocalization: ui (perfect passive) • Combination: consonants in stem map onto Cs in template, vowels in vocalization map onto Vs to yield surface form: kuttib (“will have been written”)
Example detail • stem consonants: k t b • CV template: C V C C V C • vocalization: u i • surface form: kuttib
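The mapping of consonants and vowels onto the template can be sketched in a few lines of code. This is a minimal sketch, not a full account of Semitic morphology: the function name `fill_template` is hypothetical, and the doubling rule (when the template has one more C slot than the stem has consonants, the second consonant fills two slots) is an assumption chosen to match the kuttib example.

```python
def fill_template(root, template, vocalization):
    """Map root consonants onto C slots and vocalization vowels onto V slots."""
    if set(template) - {"C", "V"}:
        raise ValueError("template may contain only C and V")
    consonants = list(root)
    # Assumption: with one extra C slot, the second root consonant is doubled.
    if template.count("C") == len(consonants) + 1:
        consonants.insert(2, consonants[1])
    if template.count("C") != len(consonants):
        raise ValueError("root does not fit the C slots of the template")
    if template.count("V") != len(vocalization):
        raise ValueError("vocalization does not fit the V slots of the template")
    c_iter, v_iter = iter(consonants), iter(vocalization)
    return "".join(next(c_iter) if slot == "C" else next(v_iter)
                   for slot in template)


print(fill_template("ktb", "CVCCVC", "ui"))  # kuttib
print(fill_template("ktb", "CVCVC", "ui"))   # kutib
```

The same root and vocalization yield different surface forms depending on the template, which is the core idea of templatic morphology.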
Parsing • refers to the recovery of structure from analysis of input • often refers to the processing of sentences • can also refer to the processing of words • Stemming refers to the recovery of a word stem given a surface form of the word: uncharacteristically = un + character + istic + ally
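Recovering the morphemes of a word can be sketched as recursive affix stripping against a stem list. This is a toy illustration: the prefix, suffix, and stem inventories below are hypothetical and contain just enough entries for the example, and the function name `segment` is not from the text.

```python
# Hypothetical toy inventories, just large enough for the example word.
PREFIXES = ("un",)
SUFFIXES = ("ally", "istic")
STEMS = {"character"}


def segment(word):
    """Return a list of morphemes for word, or None if no analysis is found."""
    if word in STEMS:
        return [word]
    # Try stripping a prefix, then analyze what remains.
    for prefix in PREFIXES:
        if word.startswith(prefix):
            rest = segment(word[len(prefix):])
            if rest is not None:
                return [prefix] + rest
    # Try stripping a suffix, then analyze what remains.
    for suffix in SUFFIXES:
        if word.endswith(suffix):
            rest = segment(word[:-len(suffix)])
            if rest is not None:
                return rest + [suffix]
    return None


print(" + ".join(segment("uncharacteristically")))
# un + character + istic + ally
```

A real stemmer also has to deal with spelling changes at morpheme boundaries, which this sketch ignores.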
Lexicons • One approach: list all words • difficult in English because some morphology is productive (applies to new words of the language too): (table adapted from page 62 of text)
Issues • Not only do some affixes attach to large numbers of stems, • they also attach to new words in the language • spam, spams, spamming, spammed, spammer • Idea: encode morphological rules to generate all forms of words from a minimal set of word stems.
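The idea of generating word forms from stems by rule can be sketched as follows. This is a simplified illustration only: the function names are hypothetical, and the consonant-doubling rule is a rough spelling heuristic that ignores stress, so it would wrongly double the final consonant of a stem like "visit".

```python
VOWELS = set("aeiou")


def needs_doubling(stem):
    """Rough heuristic: stem ends in consonant + vowel + consonant."""
    if len(stem) < 3:
        return False
    a, b, c = stem[-3], stem[-2], stem[-1]
    return (a not in VOWELS and b in VOWELS
            and c not in VOWELS and c not in "wxy")


def word_forms(stem):
    """Generate the stem plus its -s, -ing, -ed and -er forms."""
    base = stem + stem[-1] if needs_doubling(stem) else stem
    return [stem, stem + "s", base + "ing", base + "ed", base + "er"]


print(word_forms("spam"))
# ['spam', 'spams', 'spamming', 'spammed', 'spammer']
print(word_forms("walk"))
# ['walk', 'walks', 'walking', 'walked', 'walker']
```

Because the rules apply to any stem, a newly coined word such as "spam" gets all of its forms without being listed individually.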
Applications • lexicons • stemming • generating correct surface word forms
Try it out! http://www.xrce.xerox.com/competencies/content-analysis/arabic/input/keyboard_input.html Try: kutib