1 / 25

CS5545: Realisation

CS5545: Realisation. Background Reading: Building Natural Language Generation Systems, chap 6. Realisation. Third (last) NLG stage Generate actual text Take care of details of language Syntactic details Eg Agreement (the dog runs vs the dogs run ) Morphological details

otto-burns
Download Presentation

CS5545: Realisation

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. CS5545: Realisation Background Reading: Building Natural Language Generation Systems, chap 6

  2. Realisation • Third (last) NLG stage • Generate actual text • Take care of details of language • Syntactic details • Eg Agreement (the dog runs vs the dogs run) • Morphological details • Eg, plurals (dog/dogs vs box/boxes) • Presentation details • Eg, fit to 80 column width

  3. Realisation • Problem: There are lots of finicky details of language which most people developing NLG systems don’t want to worry about • Solution: Automate this using a realiser

  4. Syntax • Sentences must obey the rules of English grammar • Specifies which order words should appear in, extra function words, word forms • Many aspects of grammar are somewhat bizarre

  5. Syntactic Details: Verb Group • Verb group is the main verb plus helping words (auxiliaries). • Encodes information in fairly bizarre ways, eg tense • John will watch TV (future – add will) • John watches TV (present - +s form of verb for third-person singular subjects) • John is watching TV (progressive – form of BE verb, plus +ing form of verb) • John watched TV (past – use +ed form of verb)

  6. Verb group • Negation • Usually add “not” after first word of verb group • John will not watch TV • Exception: add “do not” before 1-word VG • Inflections on do, use infinitive form of main verb • John did not watch TV • Exception to exception: use first strategy if verb is form of BE • John is not happy

  7. Realiser • Just tell realiser verb, tense, whether negated, and it will figure out the VG • (watch, future) -> will watch • (watch, past, negated) -> did not watch • Etc • Similarly automate other “obscure” encodings of information

  8. Morphology • Words have different forms • Nouns have plural • Dog, dogs • Verbs have 3ps form, participles • Watch, watches, watching, watched • Adjectives have comparative, superlative • Big, bigger, biggest

  9. Formation of variants • Example: plural • Usually add “s” (dogs) • But add “es” if base noun ends in certain letters (boxes, guesses) • Also change final “y” to “i” (tries) • Many special cases • children (vs childs), people (vs persons), etc

  10. Realiser • Calculates variants automatically • (dog, plural) -> dogs • (box, plural) -> boxes • (child, plural) -> children • etc

  11. Punctuation • Rules for structures • Sentences have first word capitalised, end in a full stop • My dog ate the bone, the meat, etc.. • Lists have conjunction (eg, and) between last two elements, comma between others • I saw Tom, Sue, Zoe, and Ciaran at the meeting. • Etc • Realiser can automatically insert appropriate punctuation for a structure

  12. Punctuation • Rules on combinations of punc • Don’t end full stop if sentence already ends in a full stop • He lives in Washington D.C. • He lives in Washington D.C.. • Brackets absorb some full stops • John lives in Aberdeen (he used to live in Edinburgh). • John lives in Aberdeen (he used to live in Edinburgh.). • Again realiser can automate

  13. Pouring • Usually we insert spaces between tokens, but not always • My dog • Mydog • I saw John, and said hello. • I saw John , and said hello • Automated by realiser

  14. Pouring • Often want to insert line breaks to make text fit into a page of given width • Breaks should go between words if possible • Breaks should go between words if poss ible • If not possible, break between syllables and add a hyphen • Realiser automates

  15. Output formatting • Many possible output formats • Simple text • HTML • RTF (MS Word) • Realiser can automatically add appropriate markups for this

  16. Summary • Realiser automates the finicky details of language • So NLG developer doesn’t have to worry about these • One of the advantages of NLG

  17. simplenlg classes • Realiser – realise objects • SyntaxPhraseSpec • SPhraseSpec – represents a sentence as verb, subjects, complements • NPPhraseSpec – represents a noun phrase as determiner, modifiers, noun • can be subject or complement in SPhraseSpec • PPPhraseSpec – represents prepositional phrase • Can be modifier in SPhraseSpec, NPPhraseSpec

  18. Realiser API class Realiser { public void Realiser (); // default realiser (text) public void Realiser (boolean HTML); // HTML output ? public String realiser(Object spec); } Usage // create text spec ts Realiser r = new Realiser(); String output = r.realise(ts);

  19. SPhraseSpec example SPhraseSpec p = new SPhraseSpec(); p.addSubject("my dog"); p.addSubject("your cat"); p.addVerb("like"); p.addComplement("balls"); p.setTense(SPhraseSpec.Tense.FUTURE); p.setNegated(true); Results in “My dog and your cat will not like balls.”

  20. NPPhraseSpec example NPPhraseSpec subject = new NPPhraseSpec("dog"); subject.addModifier("big"); subject.addDeterminer("the"); Results in the big dog

  21. PPhraseSpec Example PPPhraseSpec pp = new PPPhraseSpec("in", "the park")); p = new SPhraseSpec("I", "be"); p.addModifier(pp); Results in I am in the park

  22. Lexicon • Lexicon class allows you to directly do morphology • Usually don’t need to call explicitly

  23. Lexicon example Lexicon lex = new Lexicon(); System.out.println(lex.getPlural("child")); System.out.println(lex.getPast("eat")); System.out.println(lex.getPastParticiple("eat")); System.out.println(lex.getSuperlative("happy")); This will print out children ate eaten happiest

  24. Other feature: Lexicaliser • Lexicaliser (different from lexicon) allows you to specify templates which produce SPhraseSpec from Protégé instances. • Won’t discuss here, see doc

  25. Simplenlg • Being developed, currently working in • Referring expressions • Better documentation • Let me know if you have any comments or suggestions!

More Related