DARPA Communicator: The Development of Advanced Dialog Systems Using Open Source Software
Bryan George, Samuel Bayer
Presented July 27, 2001
DARPA Communicator Program Vision
W: I need an early flight to send new computers to Bosnia
C: Where from?
W: Washington DC
C: OK, there’s a Tuesday evening flight out of Andrews arriving 8:38 AM on Wednesday in Frankfurt Germany
W: No, I prefer [a flight from Andrews into] Ramstein Germany.
C: How about MAC Flight #1296 arriving Ramstein at 10:45 AM on Wednesday?
W: Is that a C-141 aircraft?
C: No, it’s a C-5.
W: OK, arrange for transportation on that flight
• Remote access to information via spoken mixed-initiative dialogue with context tracking, clarifications and confirmations
• Technology focus: dialogue, presentation
• Application focus: mobile, military
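The context tracking and clarification behavior in the dialogue above can be illustrated with a minimal sketch. It assumes the system keeps a simple slot dictionary per session; the slot names and both functions are hypothetical and are not taken from the GCSI or from any Communicator system.

```python
from typing import Dict, Optional

# Hypothetical slots the system needs before it can search for flights.
REQUIRED_SLOTS = ["origin", "destination"]


def update_context(context: Dict[str, str], new_info: Dict[str, str]) -> Dict[str, str]:
    """Merge the newest utterance into the running context; newer values win."""
    merged = dict(context)      # copy, so the previous context is left unchanged
    merged.update(new_info)
    return merged


def next_clarification(context: Dict[str, str]) -> Optional[str]:
    """Return the first required slot that is still missing, or None."""
    for slot in REQUIRED_SLOTS:
        if slot not in context:
            return slot
    return None


# "I need an early flight to send new computers to Bosnia"
context = update_context({}, {"destination": "Bosnia"})
missing = next_clarification(context)          # the system asks "Where from?"

# "Washington DC"
context = update_context(context, {"origin": "Washington DC"})

# "No, I prefer ... Ramstein Germany": a correction replaces one slot only.
context = update_context(context, {"destination": "Ramstein Germany"})
```

The point of the sketch is that a correction such as "No, I prefer Ramstein" overrides only the slot it mentions, while everything else said earlier in the session is retained.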
DARPA Communicator and the Galaxy Communicator Software Infrastructure (GCSI)
[Figure: a central Hub surrounded by Language Generation, Dialogue Management, Text-to-Speech, Audio, Application Backend, Context Tracking, Speech Recognition and Frame Construction]
The GCSI, originally implemented by MIT and now maintained, extended and distributed by MITRE, underlies the dialogue systems being developed by Communicator participants.
GCSI Design Requirements
Flexibility: the infrastructure should be flexible enough to encompass the range of interaction strategies that the various Communicator sites might experiment with
Embeddability: the infrastructure should be easy to embed into other software programs
Maintenance: the infrastructure should be supported and maintained for the Communicator program
Obtainability: the infrastructure should be easy to get and to install
Leverage: the infrastructure should support longer-term program and research goals for distributed dialogue systems
Learnability: the infrastructure should be easy to learn to use
GCSI Flexibility: Background
• Hub and spoke infrastructure
• Hub supports scripting, logging
• Distributed
• Message-based
[Figure: the Hub between the Audio server (audio.mitre.org) and the Speech Recognition server (rec.mitre.org), with the message “rec: audio available” on each link]
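A toy sketch of the hub-and-spoke, message-based arrangement described on this slide follows. It is not the GCSI API: the class `ToyHub`, its methods, and the message name `rec:audio_available` are all hypothetical, and servers are modeled as in-process functions rather than networked processes. The direction of the message (from audio server to recognizer) is one reading of the figure.

```python
from typing import Callable, Dict, List

Message = Dict[str, object]
# A "server" here is just a function that takes a message and
# returns zero or more follow-up messages for the hub to route.
Handler = Callable[[Message], List[Message]]


class ToyHub:
    """Routes messages by name only; it knows nothing else about its servers."""

    def __init__(self) -> None:
        self.servers: Dict[str, Handler] = {}
        self.routes: Dict[str, str] = {}    # message name -> server name
        self.log: List[str] = []            # the hub also logs what it routes

    def register(self, server_name: str, handler: Handler) -> None:
        self.servers[server_name] = handler

    def route(self, message_name: str, server_name: str) -> None:
        self.routes[message_name] = server_name

    def dispatch(self, message: Message) -> None:
        pending = [message]
        while pending:
            msg = pending.pop(0)
            name = str(msg["name"])
            target = self.routes.get(name)
            if target is None:
                self.log.append(f"dropped {name}")
                continue
            self.log.append(f"{name} -> {target}")
            pending.extend(self.servers[target](msg))


received: List[Message] = []


def recognizer(msg: Message) -> List[Message]:
    """Stand-in for a speech recognition server; it only records the message."""
    received.append(msg)
    return []


hub = ToyHub()
hub.register("rec", recognizer)
hub.route("rec:audio_available", "rec")
hub.dispatch({"name": "rec:audio_available", "source": "audio"})
```

Because routing is keyed on message names held in a table, adding or replacing a server means changing the table, not recompiling the hub against that server's interface.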
GCSI Flexibility: Design Benefits
• Message-passing means that the hub doesn’t need compile-time knowledge of server APIs (vs. CORBA, e.g.)
• Hub scripting allows the programmer to dictate the flow of control of messages
  • So programs can integrate synchronous and asynchronous servers without modifying the servers themselves
  • So programmers can insert simple tools and filters to convert data among formats without modifying the servers themselves
• Hub script behavior is controlled by the hub state
  • So programs can easily modify the message flow of control in real time
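The idea that script behavior depends on hub state, so that message flow can change at run time, can be sketched as follows. This is an illustration under stated assumptions, not the GCSI hub scripting language: `ScriptedHub`, the rule tuples, the `output_mode` state key and the two servers are all hypothetical.

```python
from typing import Callable, Dict, List, Optional, Tuple

State = Dict[str, str]
Condition = Callable[[State], bool]
Server = Callable[[str], str]


class ScriptedHub:
    """Picks a target server from ordered rules that can inspect hub state."""

    def __init__(self) -> None:
        self.state: State = {}
        self.servers: Dict[str, Server] = {}
        # Each rule is (message name, condition on hub state, target server).
        self.rules: List[Tuple[str, Condition, str]] = []

    def add_rule(self, message_name: str, condition: Condition, target: str) -> None:
        self.rules.append((message_name, condition, target))

    def dispatch(self, message_name: str, payload: str) -> Optional[str]:
        # The first rule whose name matches and whose condition holds wins.
        for name, condition, target in self.rules:
            if name == message_name and condition(self.state):
                return self.servers[target](payload)
        return None  # no rule fired


def tts(text: str) -> str:
    return "tts:" + text        # stand-in for a text-to-speech server


def display(text: str) -> str:
    return "display:" + text    # stand-in for a text display server


hub = ScriptedHub()
hub.servers["tts"] = tts
hub.servers["display"] = display
hub.add_rule("present", lambda s: s.get("output_mode") == "text", "display")
hub.add_rule("present", lambda s: True, "tts")   # default rule

hub.state["output_mode"] = "speech"
spoken = hub.dispatch("present", "hello")

hub.state["output_mode"] = "text"                # change the flow at run time
shown = hub.dispatch("present", "hello")
```

Neither server function is modified when the routing changes; only the hub state differs between the two calls. A format-converting filter could be introduced the same way, as one more rule and one more small server.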
GCSI Obtainability: Open Source
• Open source licensing simplifies software distribution
  • Puts source code in the hands of researchers while preserving the intellectual property rights of the developers
• Open source can simplify commercialization
  • Joint MIT/MITRE GCSI open source license is MIT X Consortium* (no use restrictions)
• Open source infrastructure is a platform for open source components
  • Contributions from MITRE, MIT, CMU, Colorado...
*Plus US Gov’t use rights - see http://communicator.sourceforge.net/download/opensourcelicense.html
GCSI Obtainability: Installation
• Resource restrictions impose focus on most common platforms in the Communicator program
  • Intel Linux
  • Sparc Solaris
  • Windows NT
• Also known to work or have worked on other configurations (e.g., HP-UX, SGI IRIX, PPC Linux), but these configurations are not supported
• Open source supports community action
  • Programmers have source if they want to enable a new OS (and, we hope, contribute their modifications to the code base)
GCSI Learnability: Training
• Communicator program participants have had the option of attending a two-or-three-day introduction to the GCSI at MITRE-Bedford
  • Building servers
  • Scripting the Hub
  • Logging
  • Building an end-to-end system
• Course materials available to program participants for download as a self-guided tutorial
GCSI Learnability: Support Materials
• Documentation, in HTML and PDF (400 pages)
• Extensive examples
  • Basic server development
  • Backchannel audio connections (“brokering”)
  • GUI embedding
  • Toy end-to-end dialogue system
• At least two sites have succeeded in creating dialogue systems using the GCSI in a short period of time without attending our training course
GCSI Embeddability
• Embeddability means
  • Compatibility with other software packages
  • Compatibility with external main loops (CORBA, Java Swing, etc.)
• GCSI addresses these concerns
  • Thread-safe server library with well-defined API, with distinguished symbol prefixes
  • Event-based programming model implements the default Communicator server loop in C, Python and Allegro Common Lisp
[Figure: the Hub alongside GCSI paired with Swing and GCSI paired with CORBA]
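One way to picture compatibility with an external main loop is a server that never blocks and exposes a single "process whatever is pending" call for the host loop to invoke. The sketch below assumes that design; `EmbeddableServer` and its methods are hypothetical and do not reproduce the GCSI server library.

```python
import queue
from typing import Callable, Dict, List


class EmbeddableServer:
    """Event-driven server that an outside main loop can drive step by step."""

    def __init__(self) -> None:
        # queue.Queue is thread-safe, so another thread may post events.
        self._inbox: "queue.Queue" = queue.Queue()
        self._handlers: Dict[str, Callable[[object], None]] = {}

    def on(self, event_name: str, callback: Callable[[object], None]) -> None:
        self._handlers[event_name] = callback

    def post(self, event_name: str, payload: object) -> None:
        self._inbox.put((event_name, payload))

    def process_pending(self) -> int:
        """Handle queued events without blocking; return how many had a handler."""
        handled = 0
        while True:
            try:
                name, payload = self._inbox.get_nowait()
            except queue.Empty:
                return handled
            callback = self._handlers.get(name)
            if callback is not None:
                callback(payload)
                handled += 1


server = EmbeddableServer()
transcripts: List[object] = []
server.on("recognized", transcripts.append)
server.post("recognized", "washington dc")

# Stand-in for a GUI or ORB main loop that owns the thread of control.
for _tick in range(3):
    server.process_pending()   # give the server a turn
    # ... the host loop would do its own work here ...
```

A default server loop would then amount to calling `process_pending()` repeatedly, while an embedding application keeps its own loop and calls the same method on each iteration.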
GCSI Maintenance
• Requirement for prompt support favors in-house development over third-party tools
• Bug queue (bugs-darpacomm@linus.mitre.org), feature enhancement surveys for major releases
• Enhancements in GalaxyCommunicator 3.0 release
  • Better simultaneous session management
  • Message continuations
  • Improved configuration management support
  • Memory management improvements
  • Hub scripting improvements
  • New XDR-based communications protocol
  • Major brokering enhancements
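The slides do not describe how the 3.0 protocol lays out its messages, so the following only illustrates what XDR encoding of two primitive types looks like under the XDR standard: integers are 4 bytes in big-endian order, and strings are a 4-byte length followed by the bytes, zero-padded to a multiple of 4. The function names are hypothetical and the sketch says nothing about the actual GCSI wire format.

```python
import struct
from typing import Tuple


def xdr_pack_int(value: int) -> bytes:
    """Signed 32-bit integer, big-endian."""
    return struct.pack(">i", value)


def xdr_pack_string(text: str) -> bytes:
    """Length prefix, then the bytes, then zero padding to a 4-byte boundary."""
    data = text.encode("ascii")
    padding = (4 - len(data) % 4) % 4
    return struct.pack(">I", len(data)) + data + b"\x00" * padding


def xdr_unpack_string(buffer: bytes, offset: int = 0) -> Tuple[str, int]:
    """Return the decoded string and the offset of the next item."""
    (length,) = struct.unpack_from(">I", buffer, offset)
    start = offset + 4
    data = buffer[start:start + length]
    next_offset = start + length + (4 - length % 4) % 4
    return data.decode("ascii"), next_offset


# A made-up two-field record: a session number and a message name.
encoded = xdr_pack_int(7) + xdr_pack_string("hub")
name, end = xdr_unpack_string(encoded, 4)
```

A fixed, architecture-neutral encoding of this kind lets servers on different platforms, such as the Intel Linux and Sparc Solaris machines mentioned earlier, exchange binary data without agreeing on native byte order.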
Leveraging the GCSI
• Exploration of service standards for dialogue components
• Delivery platform for readily consumable open-source dialogue components (e.g., audio servers, recognizers, parsers, synthesizers)
  • MITRE, CMU, Colorado among the Communicator sites planning such releases
• Exploration of domain portability issues
• Exploration and definition of "best practice" in dialogue system development
Getting Started
The GCSI download consists of a scriptable central hub, libraries for constructing compliant spoke servers in C, Java, Allegro Common Lisp, and Python, extensive examples, documentation and sample servers.
The GCSI is hosted by the MITRE Corporation. It is available directly from MITRE (http://fofoca.mitre.org/download) or via SourceForge (http://communicator.sourceforge.net).
For more information on the DARPA Communicator program, visit the DARPA Communicator home page (http://www.darpa.mil/ito/research/com/index.html).