
Applying Embodied Construction Grammar: a description of some Afrikaans morphological constructions



Presentation Transcript


  1. Applying Embodied Construction Grammar: a description of some Afrikaans morphological constructions
     Gerhard B van Huyssteen, Potchefstroom University for CHE, South Africa
     Acknowledgement: Sulené Pilon
     ICLC 2003

  2. Overview
     • HLT and CL in South Africa
     • Project: Automatic Morphological Analysis of Afrikaans
     • Requirements of a Formalism
     • Two Afrikaans Constructions
       • Plural Construction
       • Nominalising Construction
     • Concluding remarks

  3. HLT in South Africa
     • CL and NLP:
       • well-established research fields in the USA, Europe, and other parts of the world
       • unexplored territory in South Africa
       • no catholic HLT projects for many years
     • Since 2000:
       • awareness of the importance of HLT
       • governmental level: advisory committee of DACST (2002)
       • academic level: new projects and programmes

  4. CL at the PUCHE
     • Since 2001: prioritised CL as strategically important
       • establish research focus area “Language and Technology”
       • establish first complete graduate study programme in CL in South Africa
       • set up dedicated HLT laboratory
       • acquire text and speech corpora for:
         • Afrikaans
         • South African English
         • Setswana
     • Two related Afrikaans projects:
       • Spelling Checker project (funded by the University)
       • Automatic Morphological Analysis of Afrikaans project (funded by the NRF)

  5. AMAA project
     • Aim: to develop efficient, reusable modules for the automatic morphological analysis of Afrikaans
       • tokeniser
       • word segmenter
       • compound analyser
       • hyphenator
       • POS tagger
       • stemmer
     • Project team includes 4 linguists, 1 computational linguist (from the University of Tilburg, Netherlands), 2 computer scientists
     • Problem: communication between:
       • different disciplines
       • different languages

  6. In Search of a Formalism
     • A formalism is a set of features used to precisely and rigorously interpret linguistic analysis (i.e. rules, principles, conditions, etc.) in logical or mathematical terms, in order to develop a calculus (cf. Crystal, 1997: 156)
     • Looking for:
       • a formal rule system (i.e. formal grammar or formalism)
       • for declarative purposes
       • not for more procedural purposes (like parsing and generation)
       • to represent Afrikaans morphological structure
       • not particularly interested in syntax, semantics, pragmatics

  7. Requirements: Formalisms
     • Accessibility
       • Transparent
       • Supported by literature
     • Efficiency
       • Linguistically efficient
         • Must be able to capture all linguistic phenomena accurately
       • Computationally efficient
         • To be implemented in a computer environment
     • Flexibility
       • Describe language structure with ease
       • Represent the underlying linguistic theory
     • Reusability
       • Apply in different environments and applications

  8. Some specific requirements
     • Must represent regexps
       • developing a rule-based stemmer, using Perl
     • Must rank the rules
       • exceptions (i.e. low-level instantiations) are ranked higher than rules (i.e. schemas)
       • “longer” rules are ranked higher than “shorter” rules
       • DIM construction: -tjie is removed before -jie (examples: paaltjie, hondjie)
     • Must be compatible with CG
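The two ranking requirements on this slide can be illustrated with a small sketch. It is written in Python rather than the Perl the project used, and the `Rule` class, the `RULES` table and the `stem` function are hypothetical names introduced only for illustration. Following the ranking values on the PLURAL slides (1 for listed exceptions, 13 and 16 for the regular patterns), a lower number is assumed to mean that a rule is tried earlier.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    name: str
    rank: int         # assumption: a lower rank is tried earlier
    pattern: str      # regular expression searched for in the word
    replacement: str  # text that replaces the matched part


# Hypothetical rule table for the DIM (diminutive) construction.
# The "longer" suffix -tjie gets a better rank than the "shorter" -jie.
RULES = [
    Rule("DIM-jie", rank=20, pattern=r"jie$", replacement=""),
    Rule("DIM-tjie", rank=10, pattern=r"tjie$", replacement=""),
]


def stem(word, rules=RULES):
    """Apply the best-ranked rule whose pattern matches; else return word."""
    for rule in sorted(rules, key=lambda r: r.rank):
        if re.search(rule.pattern, word):
            return re.sub(rule.pattern, rule.replacement, word)
    return word


print(stem("paaltjie"))  # paal
print(stem("hondjie"))   # hond
```

If the -jie rule were tried first, paaltjie would be cut to "paalt", which is why the longer rule needs the better rank.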

  9. Procedure
     • Identify main morphological processes
       • Inflection
       • Derivation
       • Compounding
     • Identify constructions
       • PLURAL construction
       • PAST construction
       • NOMINALISING construction
       • REDUPLICATION construction
     • Draw categorisation networks
     • Translate into ECG
     • Implement in stemmer

  10. Afrikaans Plural Construction
      • Inflectional process, realised by means of suffixation
      • 2 prototypical constructions:
        • -e: hond – honde [dogs]; bal – balle [balls]
        • -s: venster – vensters [windows]; tafel – tafels [tables]
      • Elaborations of the general schema
        • ’e: 3 – 3’e [3’s]
        • ’s: ma – ma’s [mothers]
      • Extensions of the general schema
        • -a: datum – data

  11. Categorisation Network (figure; GB van Huyssteen, PUCHE)

  12. PLURAL construction I
      construction SUFFIXATION
        subclass of AFFIXATION
        constructional
          constituents
            root
            suffix
          constraints
            constituency : [root_m/root_f] ← [[suffix_m/suffix_f]]
        form
          constraints
            root_f meets suffix_f
            suffix_f.dependency ← dependent
            root_f.dependency ← autonomous | dependent
        meaning
          constraints
            profile-det ← suffix

  13. PLURAL construction II
      construction PLURAL
        subclass of SUFFIXATION
        constructional
          evokes INFLECTION
          constituents
            root : NOUN-SG; LET; NUM; ABBR
            suffix : PLURAL-SUF
          constraints
            root_m.scope-of-pred ← BOUNDED-REGION
            suffix_m.scope-of-pred ← UNBOUNDED-REGION
        form
        meaning
          constraints
            scope-of-pred ← UNBOUNDED-REGION

  14. PLURAL construction III
      construction PLURAL-s
        subclass of PLURAL
        constructional
          constituents
            root : NOUN-SG-CN
            suffix : s
          constraints
            root_f : /^($C)?$V($C)$V[a-z]*$/
            suffix_f : /s/
            root_m.profile ← THING
            ranking : 16
        form
          constraints
            s/^($C)?$V($C)$V[a-z]*$/^($C)?$V($C)$V[a-z]*s$/
        meaning
          constraints
            profile ← THING
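The root form constraint of PLURAL-s can be tried out with a short sketch, written in Python rather than Perl. The slides do not define `$V` and `$C`, so the two character classes below are assumptions: `V` is taken to be a single vowel letter and `C` a cluster of one or more consonant letters. The function name `plural_s` is hypothetical.

```python
import re

# Assumptions (not defined on the slides):
#   V = one vowel letter, C = one or more consonant letters.
V = "[aeiouy]"
C = "[b-df-hj-np-tv-xz]+"

# Python rendering of the slide's /^($C)?$V($C)$V[a-z]*$/
ROOT_FORM = re.compile(rf"^({C})?{V}({C}){V}[a-z]*$")


def plural_s(noun):
    """Return noun + 's' if the noun satisfies the root form constraint,
    otherwise None (some other PLURAL construction would have to apply)."""
    if ROOT_FORM.match(noun):
        return noun + "s"
    return None


for noun in ["tafel", "venster", "hond", "bal"]:
    print(noun, plural_s(noun))
```

Under these assumptions the pattern accepts the -s examples from the plural overview slide (tafel, venster) and rejects the -e examples (hond, bal), because the latter have no second vowel after the first consonant cluster.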

  15. PLURAL construction IV
      construction PLURAL-’s
        subclass of PLURAL-s
        constructional
          constituents
            root : NOUN-SG-PROPER; NOUN-SG-CN; LETT; NUM; ABBR
            suffix : ’s
          constraints
            root_f : /%PROPN($V)$/
                     /%CN([iouá])$/
                     /^([a-z][^lmnrsxz])$/
                     /^([1-9]+[^123456])$/
                     /^%ABBR($V)$/
            root_m.profile ← THING | SAR
            suffix_f : /’s/
            ranking : 13
        form
          constraints
            s /%PROPN($V)$/%PROPN($V)’s$/
            s /%CN($V)$/%CN($V)’s$/
            s /^(/[a-z][^lmnrsxz]/)$/^([a-z][^lmnrsxz]’s)$/
            s /^([1-9]+[^123456])$/^([1-9]+[^123456])’s$/
            s /%ABBR($V)$/%ABBR($V)’s$/
        meaning
          constraints
            profile ← THING

  16. PLURAL construction V
      construction PLURAL-specified
        subclass of PLURAL
        constructional
          constituents
            root : pad sambreel hemp seun bod Aardklop (l|spr)eeu man (m)?eeu vrou voël kasteel bal oom
            suffix : PLURAL-SUF
          constraints
            ranking : 1
        form
          constraints
            s/pad/paaie/
            s/sambreel/sambrele/
            s/hemp/hemde/
            s/seun/seuns/
            s/bod/botte/
            s/Aardklop/(Aardkloppe|Aardklops)
            s/(l|spr)eeu/(l|spr)eeus/
            s/man/(manne|mans)
            s/(m)?eeu/(m)?eeue/
            s/vrou/(vroue|vrouens)
            s/voël/(voëls|voële)
            s/kasteel/kastele/
            s/bal/(balle|ballas)
            s/oom/ooms/
        meaning
          constraints
            profile ← THING
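Because PLURAL-specified has ranking 1, a stemmer would consult these listed forms before any general pattern. A minimal sketch of that lookup, in Python rather than Perl, is given below. The table holds only a subset of the slide's entries, and the names `SPECIFIED_PLURALS`, `PLURAL_TO_ROOT` and `stem_plural`, as well as the optional `fallback` parameter, are hypothetical.

```python
# Subset of the root/plural pairs listed on the PLURAL-specified slide.
SPECIFIED_PLURALS = {
    "pad": ["paaie"],
    "sambreel": ["sambrele"],
    "hemp": ["hemde"],
    "seun": ["seuns"],
    "bod": ["botte"],
    "man": ["manne", "mans"],
    "vrou": ["vroue", "vrouens"],
    "kasteel": ["kastele"],
    "oom": ["ooms"],
}

# Reverse index: each listed plural form points back to its root.
PLURAL_TO_ROOT = {
    plural: root
    for root, plurals in SPECIFIED_PLURALS.items()
    for plural in plurals
}


def stem_plural(word, fallback=None):
    """Look the word up among the listed exceptions first; only if it is
    not listed, hand it to the (hypothetical) lower-ranked fallback."""
    if word in PLURAL_TO_ROOT:
        return PLURAL_TO_ROOT[word]
    return fallback(word) if fallback is not None else word


print(stem_plural("paaie"))  # pad
print(stem_plural("mans"))   # man
```

Keeping the exceptions in a plain table, separate from the regular-expression rules, is one way to respect the requirement that low-level instantiations outrank schemas.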

  17. Categorisation Network (figure; GB van Huyssteen, PUCHE)

  18. NOMINALISING construction I
      construction NOMINALISING
        subclass of AFFIXATION
        constructional
          evokes DERIVATION
          constituents
            root : VERB|ADJ|ADV
            affix : NOM-PREFIX|NOM-SUFFIX|NOM-CIRCUMFIX
          constraints
            root_m.profile ← PROCESS|SAR|CAR
            affix_m.profile ← THING
        form
        meaning
          constraints
            profile ← THING

  19. NOMINALISING construction II
      construction NOMINALISING-ge()[+$C]ery
        subclass of NOMINALISING-ge()ery
        constructional
          constituents
            root : VERB
            circumfix : ge()ery
          constraints
            root_f : /%VERB([áéíóú]$C$/
            root_m.profile ← PROCESS
            circumfix_f : /ge()[+$C]ery/
            ranking : 1
        form
          constraints
            s/%VERB([áéíóú]$C$/ge(%VERB)([áéíóú]$C$Cery$/
        meaning
          constraints

  20. NOMINALISING construction III
      construction NOMINALISING-[-$V]$Cing
        subclass of NOMINALISING-ing
        constructional
          constituents
            root : VERB
            suffix : ing
          constraints
            root_f : /%VERB($V$V$C)/
            root_m.profile ← PROCESS
            suffix_f : /[-$V]$Cing/
            ranking : 10
        form
          constraints
            s/%VERB($V$V$C)/%VERB($V$C)ing/
        meaning
          constraints
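The form constraint on this slide drops one vowel letter before -ing is attached. A small Python sketch of that substitution follows (Python is used here in place of Perl). Two things are assumptions rather than part of the slides: the slide's `$V$V` is narrowed to two identical vowel letters by means of a backreference, and the example verbs betaal and verklaar are added for illustration. The function name `nominalise_ing` is hypothetical.

```python
import re

# Assumption: the slide's $V$V is read as a doubled (identical) vowel
# letter, followed by exactly one word-final consonant letter ($C).
LONG_VOWEL_ROOT = re.compile(r"([aeiou])\1([b-df-hj-np-tv-xz])$")


def nominalise_ing(verb):
    """Return the -ing nominalisation if the verb ends in a doubled vowel
    plus one consonant, with the vowel written once; otherwise None."""
    if LONG_VOWEL_ROOT.search(verb):
        # \1 = the vowel (kept once), \2 = the final consonant
        return LONG_VOWEL_ROOT.sub(r"\1\2ing", verb)
    return None


print(nominalise_ing("betaal"))    # betaling
print(nominalise_ing("verklaar"))  # verklaring
```

Verbs that do not end in this shape return `None` here, on the assumption that a more general NOMINALISING-ing construction with a weaker ranking would handle them.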

  21. NOMINALISING construction IV
      construction NOMINALISING-er
        subclass of NOMINALISING-SUF
        constructional
          constituents
            root : VERB
            suffix : er
          constraints
            root_f : /^(%VERB)$/
            root_m.profile ← PROCESS
            suffix_f : /er/
            ranking : 12
        form
          constraints
            s/^(%VERB)$/^(%VERB)er$/
        meaning
          constraints
            attr ← +HUMAN

  22. Summary of adaptations
      • Our adaptations provided for our needs
        • added regexps as form constraints
        • added ranking as constructional constraints
        • added attributes as meaning constraints
        • added more CG concepts/constructs:
          • profile
          • valence factors:
            • profile determinacy
            • conceptual and phonological autonomy and dependency
            • constituency
            • correspondence?
      • These adaptations therefore make ECG more accessible for us

  23. Evaluation: ECG as a Declarative Formalism
      • Accessible?
        • very little ECG material (specifically on morphology) available
        • isolated: “do whatever we want to do…”
      • Efficient
        • Linguistically efficient?
          • handled our data beautifully
        • Computationally efficient?
          • not our primary concern
          • improved communication with computational linguist and computer scientists
      • Flexibility
        • represents essence of Cognitive Linguistics beautifully
        • easy to add features/adaptations
      • Reusable?
        • not our primary concern
      • Main Advantage:
        • compatibility with Cognitive Grammar

  24. Conclusion
      • Your conclusion:
        • What are we doing wrong?
        • What are we missing?
        • Are we “abusing” ECG?
