
Hindi Analysis System



Presentation Transcript


  1. Hindi Analysis System. Sunil Kumar Dubey, Indian Institute of Technology Bombay

  2. Format of Discussion • Enconversion Overview • Working of Enconverter • Examples • Ambiguity resolution: Morphological, Syntactic, Semantic

  3. Enconversion Overview • Enconverter Engine • Hindi Analysis Rules • Dictionary

  4. Morphological Analysis: the study of word transformations, used to extract information such as tense, mood and gender. • Noun Morphology • Verb Morphology • Adjective Morphology

  5. Engine Algorithm 1) Start scanning the sentence from the left. 2) Pick all matching morphemes from the dictionary. 3) Choose a rule according to the candidate word. 4) Apply the analysis rule; the action performed depends on the type of rule. 5) The process ends when only the predicate remains. The output is in UNL format.
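The five steps above can be sketched as a small reduction loop. This is only a toy under stated assumptions, not the Enconverter itself: the dictionary entries, the two rules, the `@present` attribute and the helper names (`TOY_DICTIONARY`, `TOY_RULES`, `enconvert`) are all hypothetical, and words are looked up whole, with no morphological analysis.

```python
# Hypothetical toy entries; real entries also carry morphemes and flags.
TOY_DICTIONARY = {
    "Syaama": {"uw": "Shyam(icl>person)", "attrs": {"N", "ANI"}},
    "jaata": {"uw": "go(icl>do)", "attrs": {"V"}},
    "hO": {"uw": "", "attrs": {"AUX"}},
}

# Hypothetical rules: (left condition, right condition, action kind, value).
TOY_RULES = [
    ({"N", "ANI"}, {"V"}, "relate", "agt"),    # noun becomes agent of the verb
    ({"V"}, {"AUX"}, "attach", "@present"),    # auxiliary folds into the verb
]


def format_node(node):
    """Render a node as its UW followed by its UNL attributes."""
    return node["uw"] + "".join("." + a for a in node["unl_attrs"])


def enconvert(sentence):
    # Steps 1-2: scan from the left and fetch a dictionary entry per word.
    nodes = [
        {"uw": TOY_DICTIONARY[w]["uw"],
         "attrs": set(TOY_DICTIONARY[w]["attrs"]),
         "unl_attrs": []}
        for w in sentence.rstrip(".").split()
    ]
    relations = []
    # Step 5: stop when a single node (the predicate) remains.
    while len(nodes) > 1:
        for i in range(len(nodes) - 1):
            left, right = nodes[i], nodes[i + 1]
            # Step 3: choose the first rule whose conditions both hold.
            rule = next((r for r in TOY_RULES
                         if r[0] <= left["attrs"] and r[1] <= right["attrs"]),
                        None)
            if rule is None:
                continue
            _, _, kind, value = rule
            # Step 4: the action depends on the rule type.
            if kind == "relate":
                relations.append((value, right, left))
                del nodes[i]          # the left node is absorbed
            else:
                left["unl_attrs"].append(value)
                del nodes[i + 1]      # the right node is absorbed
            break
        else:
            raise ValueError("no analysis rule applies")
    nodes[0]["unl_attrs"].insert(0, "@entry")
    return [f"{label}({format_node(head)}, {format_node(dep)})"
            for label, head, dep in relations]


print(enconvert("Syaama jaata hO."))
```

Each rule application removes one node, so the loop either shrinks the node list to the predicate or stops with an error when no rule fits.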

  6. Working of Enconverter [Figure: diagram showing the Enconverter with the Analysis Rules and the Dictionary, a Node List (ni-1, ni, ni+1, ni+2, ni+3) and a Node-net.]

  7. Working of Enconverter Contd… • Condition window: checks two neighboring nodes on both sides of the analysis window to judge whether an analysis rule is applicable. • Analysis window: where one of the analysis rules is applied.

  8. Dictionary. Entry format: [Headword] {} "Universal word" (Attribute list) <Flags>;
     [QaIro] {} "slow(icl>how)" (ADV,MAN) <H,0,0>;
     [AcC] {} "good(aoj>thing)" (ADJ,AdjA,QUAL) <H,0,0>;
     [Ka] {} "eat(icl>do)" (V,VINT,VA) <H,0,0>;
     [jaapana ] {} "Japan(icl>place)" (N,P,PLACE,INANI,3SG) <H,0,0>;
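A dictionary entry of the shape shown on this slide can be split into its parts with a regular expression. The sketch below is a minimal illustration that assumes straight double quotes around the universal word and one entry per string; the function name `parse_entry` and the field names are hypothetical, not part of the system.

```python
import re

# Pattern for: [headword] {id} "universal word" (attributes) <flags>;
ENTRY_RE = re.compile(
    r'^\s*\[(?P<headword>[^\]]*)\]\s*'
    r'\{(?P<id>[^}]*)\}\s*'
    r'"(?P<uw>[^"]*)"\s*'
    r'\((?P<attrs>[^)]*)\)\s*'
    r'<(?P<flags>[^>]*)>;\s*$'
)


def parse_entry(line):
    """Split one dictionary entry into headword, UW, attributes and flags."""
    match = ENTRY_RE.match(line)
    if match is None:
        raise ValueError(f"not a dictionary entry: {line!r}")
    return {
        "headword": match["headword"].strip(),
        "uw": match["uw"],
        "attrs": [a.strip() for a in match["attrs"].split(",") if a.strip()],
        "flags": [f.strip() for f in match["flags"].split(",") if f.strip()],
    }


entry = parse_entry('[Ka] {} "eat(icl>do)" (V,VINT,VA) <H,0,0>;')
print(entry["headword"], entry["uw"], entry["attrs"], entry["flags"])
```

The universal word is matched inside the quotes, so the `>` in `icl>do` does not collide with the angle brackets around the flags.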

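  9. What is a rule? A semantic relation can be generated by this rule:
     > {N,ANI : : agt :}{V,^AGTRES :+AGTRES : :} (STAIL)P20
     Parts of the rule: rule type (>), left analysis window ({N,ANI : : agt :}), right analysis window ({V,^AGTRES :+AGTRES : :}), condition window ((STAIL)), priority (P20).
     For example: Syaama jaata hO. (श्याम जाता है; "Shyam goes.")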
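The rule on this slide can be taken apart mechanically. The sketch below assumes, from the example alone, that each analysis window holds four colon-separated fields, named here condition, action, relation and role, and that a leading `^` on a condition means the attribute must be absent. Those field names and the helpers `parse_rule` and `matches` are illustrative assumptions, not the Enconverter's documented interface.

```python
import re

RULE_RE = re.compile(
    r"^\s*(?P<type>\S)\s*"
    r"\{(?P<left>[^}]*)\}\s*"
    r"\{(?P<right>[^}]*)\}\s*"
    r"\((?P<cond>[^)]*)\)\s*"
    r"P(?P<priority>\d+)\s*$"
)


def parse_window(text):
    """Split one analysis window into its four colon-separated fields."""
    parts = [p.strip() for p in text.split(":")]
    parts += [""] * (4 - len(parts))
    condition, action, relation, role = parts[:4]
    return {
        "condition": [c.strip() for c in condition.split(",") if c.strip()],
        "action": action,
        "relation": relation,
        "role": role,
    }


def parse_rule(text):
    match = RULE_RE.match(text)
    if match is None:
        raise ValueError(f"not a rule: {text!r}")
    return {
        "type": match["type"],
        "left": parse_window(match["left"]),
        "right": parse_window(match["right"]),
        "condition_window": [c.strip() for c in match["cond"].split(",") if c.strip()],
        "priority": int(match["priority"]),
    }


def matches(conditions, attrs):
    """True if every plain condition is present and every ^condition is absent."""
    for cond in conditions:
        if cond.startswith("^"):
            if cond[1:] in attrs:
                return False
        elif cond not in attrs:
            return False
    return True


rule = parse_rule("> {N,ANI : : agt :}{V,^AGTRES :+AGTRES : :} (STAIL)P20")
print(rule["left"]["relation"], rule["right"]["action"], rule["priority"])
print(matches(rule["right"]["condition"], {"V"}))
```

Read this way, the rule fires on an animate noun followed by a verb that has not yet been marked `AGTRES`, produces an `agt` relation, and adds `AGTRES` to the verb so the rule cannot fire on it twice.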

  10. Simple Sentence: maaohna maOdana maoM Syaama ko saaqa fuTbaa^la Kola rha hO. (मोहन मैदान में श्याम के साथ फुटबॉल खेल रहा है; "Mohan is playing football with Shyam in the field.")
      agt(play(icl>do).@entry.@present.@progress, Mohan(icl>person))
      cag(play(icl>do).@entry.@present.@progress, Shyam(icl>person))
      obj(play(icl>do).@entry.@present.@progress, football)
      plc(play(icl>do).@entry.@present.@progress, field(icl>ground))
      [Figure: UNL graph with entry node play linked by agt, cag, obj and plc to Mohan, Shyam, football and field.]
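The UNL expressions on this slide all have the shape relation(UW1, UW2), where each universal word may carry attributes such as .@entry. A minimal sketch of reading one expression back into its parts follows; the function names `parse_relation` and `split_attributes` are hypothetical, and the optional scope suffix on the label (as in obj:01) is accepted as well.

```python
def split_attributes(term):
    """Separate a universal word from its trailing .@attributes."""
    uw, *attrs = term.split(".@")
    return uw, ["@" + a for a in attrs]


def parse_relation(text):
    """Parse 'label[:scope](uw1, uw2)' into its components."""
    text = text.strip()
    if "(" not in text or not text.endswith(")"):
        raise ValueError(f"not a UNL relation: {text!r}")
    open_idx = text.index("(")
    head, body = text[:open_idx], text[open_idx + 1:-1]
    label, _, scope = head.partition(":")
    depth = 0
    # Split at the first comma that is not nested inside parentheses,
    # since UWs such as country(syn>nation,equ>team) contain commas too.
    for i, ch in enumerate(body):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "," and depth == 0:
            return {
                "label": label,
                "scope": scope or None,
                "first": split_attributes(body[:i].strip()),
                "second": split_attributes(body[i + 1:].strip()),
            }
    raise ValueError(f"expected two arguments: {text!r}")


rel = parse_relation(
    "agt(play(icl>do).@entry.@present.@progress, Mohan(icl>person))"
)
print(rel["label"], rel["first"], rel["second"])
```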

  11. Clausal Sentence (Noun Clause): maOMnao doKa ik maaohna iktaba pZ, rha hO. (मैंने देखा कि मोहन किताब पढ़ रहा है; "I saw that Mohan is reading a book.")
      agt(see(icl>event).@entry.@past, I(icl>person))
      obj(see(icl>event).@entry.@past, :01)
      agt:01(read(icl>do).@entry.@present.@progress, Mohan(icl>person))
      obj:01(read(icl>do).@entry.@present.@progress, book)
      [Figure: UNL graph with entry node see, agt I and obj :01; scope :01 contains entry node read with agt Mohan and obj book.]

  12. Long Sentence: [sa ]_oSya ko ilae‚ Aa[- TI yaU ek bahupxaIya gaaoYzI p`dana krtI hO jahaÐ sarkarI AaOr gaOr–sarkarI saMsqaaeÐ AapsaI ihtaoM ko xao~aoM mao samaJaaOtaoM pr baatcaIt krnao ko ilae imala sakoM AaOr eosao maanadNDaoM kao gaZ, sako jaao dUrsaMcaar saMsqaanaaoM ko inaiva-Qna pircaalana kao sauinaiScat kroM AaOr saBaI doSaaoM maoM [nakI phuÐca kao baZ,avaa do sakoM.
      obj(provide(icl>do).@entry.@present, forum(icl>seminar))
      pur(provide(icl>do).@entry.@present, purpose(icl>intention))
      aoj(provide(icl>do).@entry.@present, ITU(icl>International Telecommunication Union))
      mod(purpose(icl>intention), this:00)
      scn(forge.@past.@ability, forum(icl>seminar))
      qua(forum(icl>seminar), one)

  13. Long Sentence Contd…
      aoj(multilateral, forum(icl>seminar))
      obj(forge.@past.@ability, standard(icl>measure).@pl)
      obj(forge.@past.@ability, meet(icl>event).@past.@ability)
      aoj(meet(icl>event).@past.@ability, institute(icl>facilities))
      pur(meet(icl>event).@past.@ability, discuss(icl>talk))
      obj(discuss(icl>talk), agreement(icl>pact).@pl)
      scn(agreement(icl>pact).@pl, field(icl>category).@pl)
      mod(field(icl>category).@pl, benefit(icl>advantage).@pl)
      mod(benefit(icl>advantage).@pl, mutual(icl>))
      mod(institute(icl>facilities), government)
      and(private, government)
      aoj(ensure, standard(icl>measure).@pl)
      mod(standard(icl>measure).@pl, such)
      and(promote(icl>do).@past.@ability, ensure)

  14. Long Sentence Contd…
      obj(ensure, operation(icl>action))
      mod(operation(icl>action), resource(icl>abstract thing).@pl)
      aoj(smooth, operation(icl>action))
      mod(resource(icl>abstract thing).@pl, telecommunication(icl>communication))
      obj(promote(icl>do).@past.@ability, access(icl>))
      scn(access(icl>), country(syn>nation,equ>team).@pl)
      mod(access(icl>), these)
      aoj(all(icl>quantity), country(syn>nation,equ>team).@pl)

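  15. Inclusion of Tags • To clarify the syntactic structure of a sentence, e.g. Syaama nao Kato hue baccao kao doKa. (श्याम ने खाते हुए बच्चे को देखा; "Shyam saw the child eating.") • To clarify the role of a component of a sentence, e.g. Aapkao imaza[- iKlaanaI pD,ogaI. (आपको मिठाई खिलानी पड़ेगी; "You will have to feed sweets.")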

  16. Syntax Structure Tags • <s> </s>: sentence start and sentence end • <p> </p>: phrase start and phrase end • <c> </c>: conjunction start and conjunction end

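  17. Phrase Tag • Untagged: Syaama nao Kato hue baccao kao doKa. • Tagged: Syaama nao <p> Kato hue baccao kao </p> doKa. [Figure: two UNL graphs for the sentence, each with entry node see, agt Shyam and obj child; one attaches eat through a coo relation, the other through an agt relation.]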

  18. Role Component Tags • <&[part of speech]>: specify the part of speech • <#[UW][.attribute]>: specify the UW and/or an attribute • <-[relation]>: specify the relation

  19. Relation Tag • Aapkao <-ben> imaza[- iKlaanaI pD,ogaI. • Aapkao <-agt> imaza[- iKlaanaI pD,ogaI. [Figure: two UNL graphs with entry node give and obj sweet; you is linked by ben in one graph and by agt in the other.]
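The structure tags (<s>, <p>, <c>) and role tags (<&...>, <#...>, <-...>) from the preceding slides can be separated from the words of an annotated sentence before analysis. The following is a rough sketch; the function names `classify_tag` and `split_annotated` and the returned labels are hypothetical, and how the Enconverter itself consumes the tags is not shown in the slides.

```python
import re

STRUCTURE_TAGS = {"s": "sentence", "p": "phrase", "c": "conjunction"}
ROLE_PREFIXES = {"&": "part_of_speech", "#": "uw", "-": "relation"}


def classify_tag(tag):
    """Map a tag such as '<p>', '</p>' or '<-ben>' to a (kind, value) pair."""
    inner = tag[1:-1]
    name = inner.lstrip("/")
    if name in STRUCTURE_TAGS:
        return (STRUCTURE_TAGS[name], "end" if inner.startswith("/") else "start")
    if inner and inner[0] in ROLE_PREFIXES:
        return (ROLE_PREFIXES[inner[0]], inner[1:])
    raise ValueError(f"unknown tag: {tag!r}")


def split_annotated(sentence):
    """Return the sentence as a list of ('word', text) and tag pairs, in order."""
    tokens = re.findall(r"<[^<>]+>|[^\s<>]+", sentence)
    return [classify_tag(t) if t.startswith("<") else ("word", t)
            for t in tokens]


for item in split_annotated("Aapkao <-ben> imaza[- iKlaanaI pD,ogaI."):
    print(item)
```

Keeping the tags in sequence with the words preserves which word each role tag follows and which words a phrase tag encloses.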

  20. Conclusion • Handles all the relation labels in the UNL specification. • Can deal with simple, clausal and interrogative sentences. • Different corpora have been handled, e.g. the Agriculture corpus and the ITU corpus. • There are around 6000 rules in the rule file.
