
Robert's Drawers (and other variations on GRE shared tasks) Gatt, Belz, Reiter, Viethen


Presentation Transcript


  1. Robert's Drawers (and other variations on GRE shared tasks). Gatt, Belz, Reiter, Viethen

  2. Available resources
     • TUNA Corpus (Gatt et al.; ca. 2500 refs)
       • one-shot references
       • balanced
       • 2500 refs to furniture or people
     • Robert's drawers (Viethen and Dale; ca. 140 refs)
       • one-shot references
       • not yet balanced
     • GREC (“GRE in Context”) (Belz and Varges)
       • 2000 introductory passages from Wikipedia
       • 1000 annotated, rest in progress
       • annotated for reference to the main subject (“topic”)
       • different NP types: subjects, objects, possessives
     • COCONUT (Jordan)
       • goes beyond just identification
     • (possibly another corpus of newspaper texts)

  3. Short-term additions to resources
     • Add comprehension data:
       • Carry out experiments to get people to identify referents and pair results with corpus descriptions. Data include:
         • reaction time
         • error rate
         • self-paced reading for GREC-type corpora

  4. Long-term additions to resources
     • Eye-tracking data
     • Situated reference in virtual environments (Koller et al., this Workshop)
     • In progress: small multimodal corpus (Bangerter, van der Sluis, Gatt)

  5. Task definition
     • Task structure:
       • provide a data source
       • have a small set of clearly defined tasks but ALSO:
       • have an open category
     • Evaluation:
       • default metric
       • call for proposals for evaluation metrics
       • correlate metrics with human judgments/performance
     • Scope for variation:
       • Task: content determination, realisation, lexical choice
       • Type of reference: full definite, anaphoric, singular/plural
       • Goal: model production or enhance comprehension

  6. (Sub-)communities
     • GRE people (the usual suspects)
     • CoNLL/EMNLP community
     • Psycholinguists:
       • advice/expertise
       • computational psycholinguistic modelling

  7. Aims
     • “Community” aims:
       • Have fun!
       • Get people working together, consolidate the community
       • Broaden the community
     • Broader aims:
       • Have a test-bed to see if NLG STECs actually work
       • GRE is probably the best initial candidate
     • Scientific aims:
       • Hothouse effect
       • Evaluation:
         • Use different methods
         • Evaluate the methods

  8. Execution: Logistics
     • Dry run to pilot the idea
       • Possibly at UCNLG (September)
       • Shared competitive task: Content Determination
         • singular definites, furniture
       • Production evaluation, using TUNA
       • Include a call for evaluation metrics
       • Also include open track
     • Main event (larger scale & wider scope)
       • Co-located with INLG?
       • Several shared tasks + open category
       • Evaluation:
         • Production: match between algorithm & human
         • Comprehension: ease of identification, etc.

  9. Evaluation: £££
     • Sources of expense:
       • Human evaluations
       • Adding comprehension data to the corpora
       • Organisational costs (web site, etc.)
     • Who's paying?
       • Community effort
       • Aberdeen platform grant
       • Brighton Prodigy project funds
       • No special funding (yet)
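
Slides 5 and 8 mention a production evaluation ("match between algorithm & human") and correlating metrics with human judgments/performance. The slides do not say which metric would be the default, so the following is only a minimal sketch under stated assumptions: it uses the Dice coefficient over attribute sets as one possible match measure, and a hand-written Pearson correlation to relate those scores to a comprehension measure. The attribute names, the toy descriptions, and the identification times are all hypothetical and are not taken from TUNA or any other corpus.

```python
from math import sqrt


def dice(human_attrs, system_attrs):
    """Dice coefficient between two attribute sets (1.0 means identical)."""
    a, b = set(human_attrs), set(system_attrs)
    if not a and not b:
        return 1.0  # two empty descriptions are treated as a perfect match
    return 2 * len(a & b) / (len(a) + len(b))


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 items")
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sd_x * sd_y)


# Hypothetical toy data: attribute sets chosen by a human author and by an
# algorithm for the same three target referents.
human = [
    {"type", "colour"},
    {"type", "size"},
    {"type", "colour", "orientation"},
]
system = [
    {"type", "colour"},
    {"type", "colour"},
    {"type"},
]

# Production evaluation: per-item match between algorithm and human.
match_scores = [dice(h, s) for h, s in zip(human, system)]
mean_match = sum(match_scores) / len(match_scores)

# Hypothetical comprehension data: seconds readers needed to identify the
# referent from each system description (invented numbers).
identification_times = [1.2, 2.0, 2.4]

# Relate the automatic metric to the human performance measure.
r = pearson(match_scores, identification_times)

print("per-item Dice:", match_scores)
print("mean Dice:", round(mean_match, 3))
print("Pearson r (Dice vs. identification time):", round(r, 3))
```

With real data, a strongly negative `r` here would suggest that descriptions closer to the human ones are also identified faster. Three invented items cannot show that, and any other match metric proposed under the call for evaluation metrics could be substituted for `dice`.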
