
Towards Parsing Croatian Complex Sentences: Dependent Noun Clauses




Presentation Transcript


  1. Towards Parsing Croatian Complex Sentences: Dependent Noun Clauses
     Vanja Štefanec, Kristina Vučković, Zdravko Dovedan
     University of Zagreb, Faculty of Humanities and Social Sciences
     {vstefane, kvuckovi, zdovedan}@ffzg.hr
     NooJ 2010, Komotini

  2. Our goal
     • to determine the boundaries of dependent clauses within the complex sentence
       • focusing the parser
       • performing disambiguation of chunks
       • improving the chunker
     • to test the adequacy of this model as a pre-parsing method for complex sentences

  3. Overview of the work
     • a grammar that can recognize the dependent noun clause (object clause) in a complex sentence
       • both simple object clauses and coordinated object clauses
       • by defining the co-text in which an object clause can occur
       • NOT by describing its structure
     • relying on
       • the output of the chunker
       • conjunctions, complementizers, punctuation, ...

  4. Object clauses in Croatian
     • very frequent
     • function as the direct object of the predicate of their superordinate clause
     • three types (according to grammars)
       • relative (odnosne)
       • interrogative (zavisnoupitne)
       • declarative (izrične)

  5. Relative object clauses
     • introduced by relative pronouns and adjectives
     Jeste li našli [što ste tražili]?
     Have you found [what you’ve been looking for]?
     Kupit ću [kakvog nađem].
     *I will buy [of the kind I’ll find].

  6. Interrogative object clauses
     • general (općeupitne)
       • introduced by the interrogative conjunctions ‘li’, ‘da li’ or by interrogative pronouns (‘tko’, ‘koji’, ‘čiji’, ‘što’, …)
     Još ne shvaćaš [što se dogodilo].
     You still don’t understand [what happened].
     Zaboravio sam [koji je danas dan].
     I forgot [which day it is].

  7. Interrogative object clauses
     • of place (mjesne)
       • introduced by interrogative adverbs of place
     Recite [kamo ste se zaputili].
     Tell us [where you are headed].
     • of time (vremenske)
       • introduced by interrogative adverbs of time
     Nisu rekli [kad će doći].
     They didn’t say [when they’ll be coming].

  8. Interrogative object clauses
     • of manner (načinske)
       • introduced by the interrogative adverb ‘kako’
     Još nismo saznali [kako se to dogodilo].
     We still haven’t found out [how that happened].
     • qualitative (kvalitativne)
       • introduced by the interrogative adjectives ‘kakav’, ‘kakva’, ‘kakvo’
     Ne znam [kakav si ti to čovjek].
     I don’t know [what kind of a person you are].

  9. Interrogative object clauses
     • of amount (količinske)
       • introduced by the interrogative adverb ‘koliko’
     Znaš li [koliko si već popio]?
     Do you know [how much you have already drunk]?
     • of cause (uzročne)
       • introduced by interrogative adverbs of cause or prepositional expressions (‘zašto’, ‘zbog čega’, …)
     Ne razumijem [zašto si zakasnio].
     I don’t understand [why you are late].

  10. Declarative object clauses
      • introduced by conjunctions
        • ‘da’ (most common)
        • ‘kako’ (less frequent; stylistic variant of ‘da’)
        • ‘gdje’ (extremely rare; very stylistically marked)
      Obećao si [da ćeš doći].
      You promised [that you’ll come].
      Rekli su [kako ga nije briga].
      They said [that he doesn't care].
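
The introducers listed on slides 5 to 10 can be collected into a small lookup table. The sketch below is only an illustration of that inventory: the type labels and the table layout are assumptions, and the entries for relative, place and time clauses are taken from the example sentences because the slides name no specific words for them. It mainly shows that an introducer alone does not determine the clause type, since ‘kako’ and ‘što’ each appear under two headings.

```python
# Introducers named on the slides, grouped by the clause type they can signal.
# The labels and this table layout are illustrative assumptions, not part of
# the original grammar; "kakvog", "kamo" and "kad" come from the examples only.
INTRODUCERS = {
    "relative": {"što", "kakvog"},
    "interrogative, general": {"li", "da li", "tko", "koji", "čiji", "što"},
    "interrogative, place": {"kamo"},
    "interrogative, time": {"kad"},
    "interrogative, manner": {"kako"},
    "interrogative, qualitative": {"kakav", "kakva", "kakvo"},
    "interrogative, amount": {"koliko"},
    "interrogative, cause": {"zašto", "zbog čega"},
    "declarative": {"da", "kako", "gdje"},
}


def possible_types(introducer: str) -> list[str]:
    """Return every clause type (sorted) that the given introducer can open."""
    word = introducer.lower()
    return sorted(t for t, words in INTRODUCERS.items() if word in words)


print(possible_types("kako"))  # two readings: declarative and manner
print(possible_types("što"))   # two readings: general interrogative and relative
```

Because several introducers are ambiguous in this way, a lookup like this can only propose candidates; the surrounding context has to decide between them.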

  11. Object clauses in Croatian
      • have to be preceded by a transitive verb in an active voice form
      • their function cannot be predicted from their structure alone
      (Vidio sam)PRED([da se igra])OBJ.
      I saw that he’s playing. (object clause)
      (Vidio sam)PRED(ga)OBJ([da se igra])ATTR.
      I saw him playing. (adjective clause)
      (Izišao je)PRED(van)ADV([da se igra])ADV.
      He went out to play. (purpose clause)

  12. Object clauses in Croatian
      • can be easily confused with subject clauses
      • subject clauses depend either on a nominal predicate or on a verbal predicate in a passive voice form
      (Poznato je)PRED([da pušenje uzrokuje rak])SUBJ.
      It is well known that smoking causes cancer.
      (Kaže se)PRED([da je bolje spriječiti nego liječiti])SUBJ.
      It is said that it is better to be safe than sorry.
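
The five annotated examples on slides 11 and 12 all contain a ‘da’ clause whose function depends on the predicate that precedes it. The toy function below merely restates those five cases as a decision rule. The predicate labels and the `has_direct_object` flag are hypothetical names chosen for this sketch, and the rule is not claimed to hold beyond the examples shown.

```python
def clause_function(predicate: str, has_direct_object: bool = False) -> str:
    """Guess the function of a following 'da' clause from its left context.

    `predicate` is one of the hypothetical labels "transitive_active",
    "intransitive", "nominal" or "passive". The rule only mirrors the
    five slide examples and is not a general account of Croatian syntax.
    """
    if predicate in ("nominal", "passive"):
        return "subject clause"        # Poznato je [da ...] / Kaže se [da ...]
    if predicate == "transitive_active":
        if has_direct_object:
            return "adjective clause"  # Vidio sam ga [da se igra]
        return "object clause"         # Vidio sam [da se igra]
    if predicate == "intransitive":
        return "purpose clause"        # Izišao je van [da se igra]
    raise ValueError(f"unknown predicate label: {predicate}")


print(clause_function("transitive_active"))
print(clause_function("transitive_active", has_direct_object=True))
print(clause_function("passive"))
```

The point of the sketch is that the same clause body receives three or four different labels, which is why the model describes the co-text rather than the clause itself.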

  13. The model
      • can be divided into four parts
        1. the predicate
        2. what can appear between the predicate and the object clause
        3. the object clause
        4. what can appear after the object clause

  14. Part 1: the predicate

  15. Part 2: between the predicate and the clause

  16. Part 3: the object clause (conjunctions)

  17. Part 3: the object clause (body)

  18. Part 4: after the object clause
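
The four-part model can be sketched as a scan over chunker output. This is a minimal illustration in Python, not the NooJ grammar from the presentation: the `Chunk` class, the tag names, and the contents of `SKIPPABLE` and `CLOSERS` are all hypothetical placeholders, and only the declarative conjunctions from slide 10 are used as introducers.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    text: str
    tag: str  # hypothetical tags: "VP_TRANS", "VP", "NP", "ADV", "CONJ", "END"


INTRODUCERS = {"da", "kako", "gdje"}  # declarative conjunctions (slide 10)
SKIPPABLE = {"ADV"}                   # assumed stand-in for part 2
CLOSERS = {"END"}                     # assumed stand-in for part 4


def find_object_clauses(chunks):
    """Return (start, end) chunk index pairs of candidate object clauses."""
    spans = []
    for i, chunk in enumerate(chunks):
        # Part 1: a predicate chunk marked as a transitive verb phrase.
        if chunk.tag != "VP_TRANS":
            continue
        # Part 2: skip material allowed between predicate and clause.
        j = i + 1
        while j < len(chunks) and chunks[j].tag in SKIPPABLE:
            j += 1
        # Part 3a: the clause must open with a known introducer.
        if j == len(chunks) or chunks[j].text.lower() not in INTRODUCERS:
            continue
        # Part 3b and part 4: the body runs until a closing context is met.
        k = j + 1
        while k < len(chunks) and chunks[k].tag not in CLOSERS:
            k += 1
        spans.append((j, k))
    return spans


sentence = [
    Chunk("Obećao si", "VP_TRANS"),
    Chunk("da", "CONJ"),
    Chunk("ćeš doći", "VP"),
    Chunk(".", "END"),
]
for start, end in find_object_clauses(sentence):
    print(" ".join(c.text for c in sentence[start:end]))
```

Note that the body of the clause is never analysed: the span is fixed purely by what precedes and follows it, which matches the stated design choice of describing the co-text and not the clause structure.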

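  19. Examples
      Dodao je ([da približavanje Hrvatske EU ima dvije faze]).
      He added ([that Croatia’s approach to the EU has two phases]).
      Pretpostavimo ([da imate visoke demokratske standarde], [da manjine imaju puna prava], [da su medijske slobode savršene])...
      Let us suppose ([that you have high democratic standards], [that minorities have full rights], [that media freedoms are perfect])...
      Zato savjetuje svima koji namjeravaju podići kredite ([da malo pričekaju, ako to mogu]).
      That is why he advises everyone who intends to take out a loan ([to wait a little, if they can]).
      Odgovarajući na pitanje hoće li na dogovore iz Mokrica djelovati skorašnji slovenski lokalni izbori, Maštruko je rekao ([kako u to ne vjeruje] te [da bi u slučaju kad bi države svaki put čekale ([da prođu izbori]), pregovaranje bilo nemoguće]).
      Answering the question of whether the upcoming Slovenian local elections would affect the Mokrice agreements, Maštruko said ([that he does not believe so] and [that, if states waited every time ([for elections to pass]), negotiating would be impossible]).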

  20. Problems
      • the chunker cannot identify the whole VP
      • undisambiguated chunks
      • subject clauses
      • some verbs can take two arguments in the accusative case
        • ‘pitati’ (to ask), ‘učiti’ (to teach), ...
      • adjective clauses, purpose clauses
      • identifying the level of subordination
      • often a problem beyond syntax
        • rules of orthography
        • proper use of punctuation marks (comma, dash)

  21. Evaluation
      • performed in ideal circumstances
        • the predicate is correctly identified (i.e. chunked)
        • information about verb valency is present
      • the corpus consists of 174 sentences with 215 object clauses

  22. Evaluation
      • low precision
        • BUT correct identification in 91% of the cases
        • the average number of results per clause is 2.15
        • disambiguation!
      • high recall
        • confirms the adequacy of the model
        • AND we have identified the critical cases, so improvements can also be expected
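
The reported figures can be related to each other with some rough arithmetic. The sketch below follows one possible reading, which the slides do not confirm: "correct identification in 91% of the cases" is taken to mean that the correct span is among the results for 91% of the 215 clauses, and each clause is assumed to contribute at most one correct result. Under those assumptions, the excess of results over clauses is what pulls precision down.

```python
# Figures reported on the evaluation slides.
CLAUSES = 215              # object clauses in the test corpus
RESULTS_PER_CLAUSE = 2.15  # average number of results per clause
CORRECT_RATE = 0.91        # share of clauses identified correctly


def estimated_counts(clauses, results_per_clause, correct_rate):
    """Estimate total results, correct results and precision.

    Assumption (not stated in the source): at most one result per clause
    is correct, so correct results = clauses * correct_rate.
    """
    total_results = clauses * results_per_clause
    correct_results = clauses * correct_rate
    precision = correct_results / total_results
    return total_results, correct_results, precision


total, correct, precision = estimated_counts(CLAUSES, RESULTS_PER_CLAUSE, CORRECT_RATE)
print(f"results: about {round(total)}")
print(f"correct: about {round(correct)}")
print(f"precision under this reading: about {precision:.2f}")
```

On this reading the estimate simplifies to the correct rate divided by the results per clause, which is why the slide points to disambiguation of the competing results as the next step.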
