
Collaborative Annotation of the AMI Meeting Corpus




  1. Collaborative Annotation of the AMI Meeting Corpus Jean Carletta University of Edinburgh

  2. AMI Partners

  3. NXT Major Development Sites

  4. AMI's aim
     • aim: to develop technologies for browsing meetings and to assist people during meetings
     • interdisciplinary: signal processing, language engineering, theoretical linguistics, human-computer interfaces, organizational psychology, ...

  5. Why annotation?
     • For basic scientific understanding, e.g.:
        • How do people choose a next speaker?
        • What is the relationship between speech and gesture during deixis?
     • For machine learning
        • Hand-code e.g. statement vs. question
        • Identify features for each, like word sequences and prosody
        • Use the data to fit a statistical classifier that codes new data automatically
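The machine-learning route on this slide (hand-code, extract features, fit a classifier) can be sketched in a few lines. This is a minimal illustration only, assuming word unigrams as the sole feature and a Naive Bayes model with add-one smoothing; the slides do not say which classifier or features AMI used, and the six hand-coded utterances below are invented for the example. Prosodic features are left out.

```python
import math
from collections import Counter, defaultdict


def train(examples):
    """Count labels and per-label word frequencies from hand-coded data."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in examples:
        label_counts[label] += 1
        for tok in tokens:
            word_counts[label][tok] += 1
            vocab.add(tok)
    return {"labels": label_counts, "words": word_counts, "vocab": vocab}


def classify(model, tokens):
    """Return the label with the highest Naive Bayes log-score."""
    total = sum(model["labels"].values())
    vocab_size = len(model["vocab"])
    best_label, best_score = None, -math.inf
    for label, n in model["labels"].items():
        score = math.log(n / total)  # log prior
        denom = sum(model["words"][label].values()) + vocab_size
        for tok in tokens:
            # add-one smoothing so unseen words do not zero out the score
            score += math.log((model["words"][label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Invented hand-coded examples (hypothetical, not from the AMI corpus).
HAND_CODED = [
    ("do you want a remote".split(), "question"),
    ("what do we need".split(), "question"),
    ("is it expensive".split(), "question"),
    ("we need a remote".split(), "statement"),
    ("it is expensive".split(), "statement"),
    ("i want a new design".split(), "statement"),
]

model = train(HAND_CODED)
print(classify(model, "what do you want".split()))
print(classify(model, "i need a design".split()))
```

With real data the hand-coded labels would come from the annotation layer and the features from the transcription and signal layers.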

  6. AMI Meeting Rooms
     4 close- and 2 wide-view cameras, 4 head-set and 8 array microphones, presentation screen capture, whiteboard capture, pen devices, plus extra site-dependent devices
     Sites: TNO, Edinburgh, IDIAP

  7. IS1004d, 3:07 - 4:11

  8. Corpus Overview
     • 100 hrs of well-recorded meetings
     • orthographically transcribed with word timings by forced alignment
     • ASR output
     • heavily annotated by hand for communicative behaviours
     • Creative Commons Share-Alike licensing, with demo DVD

  9. Hand Annotations
     • transcription with word-level timings from forced alignment (100%)
     • timestamping against signal (10-30%)
        • head gestures; hand gestures for addressing and interactions with objects; location in room; gaze; emotion?
     • discourse structure (70%)
        • dialogue acts (some w/ addressing), named entities, topic segments, linked extractive and abstractive summaries

  10. Costs in person-hrs/hr

  11. Core Problems
     • How do we represent all of these kinds of annotation on the same base data, including both structural relationships and timing?
     • How do we allow for multiple (human and machine) annotations of the same property, so that we can compare them?

  12. NITE XML Toolkit
     • Mature toolkit for handling annotations with temporal ordering and full structural relations
     • Data storage format designed to support distributed corpus development
     • Libraries for data handling, query, and writing graphical user interfaces
     • End user annotation tools for common tasks
     • Command line utilities for analysis, feature extraction
     • Open source

  13. NXT corpus design
     • data model is multi-rooted tree with arbitrary graph structure over the top
     • each node has one set of children, multiple parents
     • annotations often naturally map to a tree
     • corpus design to decide where trees intersect
     • NXT can represent arbitrary graphs but the more the data has this character, the less useful the query language is
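A minimal sketch of the "one set of children, multiple parents" idea, assuming a toy in-memory model rather than NXT's own Java classes. The `Node` class, the `ancestors` helper, and the node kinds are all hypothetical names chosen for illustration: two annotation trees (a dialogue act and a named entity) intersect at shared word nodes.

```python
class Node:
    """One annotation element: a single ordered child list, any number of parents."""

    def __init__(self, kind, node_id):
        self.kind = kind
        self.node_id = node_id
        self.children = []  # exactly one ordered set of children
        self.parents = []   # one parent per tree that intersects here

    def add_child(self, child):
        self.children.append(child)
        child.parents.append(self)


def ancestors(node):
    """Collect every node reachable by following parent links upward."""
    seen, stack, found = set(), list(node.parents), []
    while stack:
        parent = stack.pop()
        if id(parent) not in seen:
            seen.add(id(parent))
            found.append(parent)
            stack.extend(parent.parents)
    return found


# Shared base layer: three words.
w1, w2, w3 = (Node("word", f"w{i}") for i in (1, 2, 3))

# Tree 1: a dialogue act spanning all three words.
act = Node("dialogue-act", "da1")
for w in (w1, w2, w3):
    act.add_child(w)

# Tree 2: a named entity spanning only the last two words.
entity = Node("named-entity", "ne1")
for w in (w2, w3):
    entity.add_child(w)

print(sorted(a.kind for a in ancestors(w2)))  # both trees reach w2
print(sorted(a.kind for a in ancestors(w1)))  # only the dialogue act reaches w1
```

The corpus design decision named on the slide corresponds here to choosing which layer (words, in this toy case) the separate trees share.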

  14. Stand-off XML
     extract from Bdb001.A.speech-quality.xml:
       <speechquality nite:id="Bdb001.emphasis.16" type="emphasis">
         <nite:child href="Bdb001.A.words.xml#id(Bdb001.w.1,342)..id(Bdb001.w.1,344)" />
       </speechquality>
     extract from Bdb001.A.words.xml:
       <w nite:id="Bdb001.w.1,342" starttime="356.39" endtime="" c="W">time</w>
       <w nite:id="Bdb001.w.1,343" starttime="" endtime="" c="HYPH">-</w>
       <w nite:id="Bdb001.w.1,344" starttime="" endtime="356.59" c="W">line</w>
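To make the stand-off pointer on this slide concrete, here is a rough Python sketch that resolves the `href` range into the words it covers. It is not NXT's own loader. Assumptions: the two extracts are held as inline strings instead of files, a `nite:root` wrapper element is added so the words parse as one document, and the `nite` prefix is bound to `http://nite.sourceforge.net/`; none of these appear on the slide. The range is taken to run in document order from the first id to the last, inclusive.

```python
import re
import xml.etree.ElementTree as ET

NITE = "http://nite.sourceforge.net/"  # assumed namespace URI for the nite: prefix

# Wrapper element and namespace declarations are added for the sketch.
WORDS_XML = f"""<nite:root xmlns:nite="{NITE}">
  <w nite:id="Bdb001.w.1,342" starttime="356.39" endtime="" c="W">time</w>
  <w nite:id="Bdb001.w.1,343" starttime="" endtime="" c="HYPH">-</w>
  <w nite:id="Bdb001.w.1,344" starttime="" endtime="356.59" c="W">line</w>
</nite:root>"""

SPEECH_XML = f"""<speechquality xmlns:nite="{NITE}"
    nite:id="Bdb001.emphasis.16" type="emphasis">
  <nite:child href="Bdb001.A.words.xml#id(Bdb001.w.1,342)..id(Bdb001.w.1,344)" />
</speechquality>"""

# file#id(START) optionally followed by ..id(END)
HREF_RE = re.compile(
    r"^(?P<file>[^#]+)#id\((?P<start>[^)]*)\)(?:\.\.id\((?P<end>[^)]*)\))?$"
)


def parse_href(href):
    """Split a stand-off pointer into (file name, first id, last id)."""
    m = HREF_RE.match(href)
    if m is None:
        raise ValueError(f"unrecognised pointer: {href!r}")
    return m["file"], m["start"], m["end"] or m["start"]


def resolve_range(root, start_id, end_id):
    """Return elements from start_id to end_id inclusive, in document order."""
    id_attr = f"{{{NITE}}}id"
    selected, inside = [], False
    for elem in root.iter():
        if elem.get(id_attr) == start_id:
            inside = True
        if inside:
            selected.append(elem)
            if elem.get(id_attr) == end_id:
                break
    return selected


annotation = ET.fromstring(SPEECH_XML)
href = annotation.find(f"{{{NITE}}}child").get("href")
target_file, first_id, last_id = parse_href(href)

words = resolve_range(ET.fromstring(WORDS_XML), first_id, last_id)
text = [w.text for w in words]
# The annotation's time span comes from the outermost words it covers.
span = (float(words[0].get("starttime")), float(words[-1].get("endtime")))
print(target_file, text, span)
```

Because the emphasis annotation only points at the words, several annotators can add layers over the same word file without editing it, which is what makes the format suit distributed corpus development.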

  15. Metadata file
     Like set of DTDs for the XML files plus:
     • connections between the files
     • list of "observations" (coded dialogues/group discussions/texts)
     • catalog for finding signals and data on disk

  16. Simple example query
     ($w word)($r reference): ($w@POS = "NN") && ($r ^ $w)
     Return list of 2-tuples of words and referring expressions where the word's part of speech is NN and the word is in the referring expression.
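The meaning of this query can be mimicked in plain Python over toy data. This is only an illustration of the semantics described on the slide, not how NXT evaluates queries: the word and reference records are invented, and `dominates` here checks direct containment only, which is a simplification if the real dominance relation also holds through intermediate nodes.

```python
from itertools import product

# Invented toy data standing in for a loaded corpus.
words = [
    {"id": "w1", "text": "the", "POS": "DT"},
    {"id": "w2", "text": "remote", "POS": "NN"},
    {"id": "w3", "text": "needs", "POS": "VBZ"},
    {"id": "w4", "text": "a", "POS": "DT"},
    {"id": "w5", "text": "battery", "POS": "NN"},
    {"id": "w6", "text": "design", "POS": "NN"},  # a noun outside any reference
]
references = [
    {"id": "r1", "children": ["w1", "w2"]},
    {"id": "r2", "children": ["w4", "w5"]},
]


def dominates(ref, word):
    """Stand-in for `$r ^ $w`: the reference contains the word."""
    return word["id"] in ref["children"]


def run_query(words, references):
    """Try every ($w, $r) pairing and keep those passing both tests."""
    return [
        (w["id"], r["id"])
        for w, r in product(words, references)
        if w["POS"] == "NN" and dominates(r, w)
    ]


result = run_query(words, references)
print(result)  # [('w2', 'r1'), ('w5', 'r2')]
```

Each variable declaration in the query corresponds to one loop over candidates, and the condition after the colon filters the resulting tuples.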

  17. General features of the language
     • Match variable by no type, single type, or disjunctive type
     • Attribute and content tests for existence, ordering, equality, match to regexp
     • The usual boolean combinators
     • Quantifiers forall and exists
     • Filtering by passing results to another query to create a result tree (not list)

  18. Uses for queries
     • Exploring the data in a browser
     • Basic frequency counts
     • Verifying data quality
     • Indexing complexes for further use
     • Finding things for screen rendering in GUI

  19. Only configuration needed to:
     • search/index data in NXT format
     • display data in a standardized (ugly) way
     • set up annotation tools for some common tasks
        • dialogue act
        • named entity
        • time-stamped labelling

  20. [named entity demo]

  21. Programming tailored interfaces
     • development time is 1.5 days - 2 weeks depending on
        • how clear the spec is
        • complexity of the interface and whether our "transcription view" middleware fits
        • familiarity with Swing

  22. Named entity coder

  23. Summary
     • NXT provides infrastructure for collaborative annotation that
        • is distributed
        • provides structural relationships
        • provides timing w.r.t. signals
        • works for large-scale projects
     • NXT's best current demonstration is in the AMI Meeting Corpus
