Sports Scores Speech Recognition System Major League Baseball Score System
Development Team Members • Dan Corkum (Director) • Jason NguyenTrieu • Dan Ragland (Producer) • Quang Vu • Andrew Wagner Sponsor: Jim Larson, Intel Corporation
Goals & Objectives • Develop a compelling Speech Recognition Application for Retrieval of Sports Information. • Incorporate Ease of Use Techniques including: Tapered Prompts, Global Commands, Barge-In, Repair Dialogs, and others. • Develop an Architecture that is both Robust and Modular. Design for Reuse.
Example Application Cellular Phone Application • Using Wireless Web • Embedded Windows CE (Auto PC)
Core Modules • “Web Viking” – Parse Internet Web Pages to retrieve sports information. • Data Warehousing & Querying – Database for storage of searchable information. • Client and Server Communication – Enables communication between Server and remote Clients. • VUI (Voice User Interface) Voice Prompts and Response System – The core engine that controls the entire VUI. • Dialog Database – Contains the content for the text-to-speech prompts and response criteria.
Web Viking • The Web Viking retrieves data from web sites, then parses and formats it so that the database interface can understand it. • There are three data collection scripts: Schedule, Scores, and Standing/Ranking. • The data comes from two sources: • Major League Baseball • ESPN • Two chances to get the right data: • First, we get the data from the MLB web site and parse it. If that fails for any reason, we try to get the data from the ESPN web site.
Web Viking • How is the data retrieved? • We used library modules available from CPAN (the Comprehensive Perl Archive Network). • The HTTP::Request module packages up the URL request. • The HTTP::Response module handles the data coming back. • How is the data parsed? • Match and strip off unnecessary data. • Regular expressions • Split • Format the data and check the result.
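The strip, match, check, and fall-back steps above can be sketched as follows. The original scripts were written in Perl; Java is used here only to stay consistent with the rest of the system. The "Team 5, Team 3" line format, the class name, and the method names are assumptions made for illustration, not the real MLB or ESPN page layout and not the project's actual code.

```java
import java.util.List;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ScoreFallbackSketch {
    // Hypothetical score line: "<team> <runs>, <team> <runs>".
    private static final Pattern SCORE_LINE = Pattern.compile(
            "^\\s*([A-Za-z .]+?)\\s+(\\d+)\\s*,\\s*([A-Za-z .]+?)\\s+(\\d+)\\s*$");

    /** Strips markup, matches the score line, and returns "home|runs|away|runs". */
    static Optional<String> parseScore(String raw) {
        String text = raw.replaceAll("<[^>]*>", " ");      // strip unnecessary data
        Matcher m = SCORE_LINE.matcher(text);
        if (!m.matches()) {
            return Optional.empty();                       // result check failed
        }
        return Optional.of(String.join("|",
                m.group(1), m.group(2), m.group(3), m.group(4)));
    }

    /** Tries each source page in order and returns the first one that parses. */
    static Optional<String> firstGood(List<String> pagesInPriorityOrder) {
        for (String page : pagesInPriorityOrder) {
            Optional<String> parsed = parseScore(page);
            if (parsed.isPresent()) {
                return parsed;
            }
        }
        return Optional.empty();
    }
}
```

Passing the MLB page first and the ESPN page second to `firstGood` gives the "two chances" behavior: the second page is consulted only when the first fails to parse.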
Database & Queries • The Database was implemented using MS Access. • It stores team names, the scores associated with each team, league/division ranking information, and the schedule for each game. • The Database Handler was written in Java. • Its primary purpose is to query the database and return the results to the Sports Score server.
Client & Server Communication • Because this is an Internet-based application, the server is designed to support multiple clients simultaneously. • Communication is implemented using TCP (Transmission Control Protocol), a reliable and widely used Internet protocol. • The maximum number of clients supported by the Sports Score server is administrator-configurable, based on the performance needs of the server.
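One way an administrator-configurable client limit might be enforced is with a counting semaphore, as in this minimal sketch. The slides do not say how the server actually applies the limit, so the class and method names here are hypothetical.

```java
import java.util.concurrent.Semaphore;

final class ClientLimiterSketch {
    private final Semaphore slots;

    /** maxClients would come from the administrator's configuration. */
    ClientLimiterSketch(int maxClients) {
        this.slots = new Semaphore(maxClients);
    }

    /** Returns true if a new connection may be served, false if the server is full. */
    boolean tryAdmit() {
        return slots.tryAcquire();
    }

    /** Called when a client disconnects, freeing its slot. */
    void release() {
        slots.release();
    }
}
```

An accept loop would call `tryAdmit()` after each `ServerSocket.accept()` and close the socket immediately when it returns false.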
Client & Server Communication • Both server and client-side communications are data independent. • Data is encapsulated in a packet before transmission. The data wrapper records what type of data is encapsulated and its size. • Data packeting allows for multiple information types (ping, data request, communications termination, etc.). • Labeling each packet with a type allows for quick identification and routing of information to the necessary destinations within the server/client.
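The type-plus-size wrapper described above can be illustrated with a small framing sketch. The one-byte type code, four-byte length, and the three type names are assumptions; the slides do not give the project's actual wire layout.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

final class PacketSketch {
    // Hypothetical packet types, taken from the examples named on the slide.
    enum Type { PING, DATA_REQUEST, TERMINATE }

    final Type type;
    final byte[] payload;

    PacketSketch(Type type, byte[] payload) {
        this.type = type;
        this.payload = payload;
    }

    /** Wraps the payload as: 1 byte type, 4 byte size, then the data itself. */
    byte[] encode() {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buffer);
            out.writeByte(type.ordinal());
            out.writeInt(payload.length);
            out.write(payload);
            out.flush();
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads the header first, so the receiver can route on type before touching the data. */
    static PacketSketch decode(byte[] wire) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(wire));
            int code = in.readUnsignedByte();
            if (code >= Type.values().length) {
                throw new IllegalArgumentException("unknown packet type " + code);
            }
            int size = in.readInt();
            if (size < 0 || size > in.available()) {
                throw new IllegalArgumentException("invalid packet size " + size);
            }
            byte[] payload = new byte[size];
            in.readFully(payload);
            return new PacketSketch(Type.values()[code], payload);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because the payload is opaque bytes, the same framing carries any data type, which is what keeps the communication layer data independent.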
VUI (Voice User Interface) Voice Prompts and Response System User Interface and Underlying Logic
VUI Design Considerations Two Options For Design: 1. Dialog logic coded directly into code. 2. Dialog logic entered into a data structure and presented by separate internal logic.
VUI Advantages & Disadvantages of Hard-Coded Dialogs • Advantages: • Fast initial implementation • Ultimate flexibility of features • Disadvantages: • Duplicated code • Difficult to provide consistent global functionality • Hard-coded grammars
VUI Advantages & Disadvantages of Dialog Database • Advantages: • Good design: data separated from presentation • Consolidation of code • Easy to create and maintain dialogs • Features aided by use of recursion • Computer-generated grammars • Disadvantages: • Much work required before any results seen • Difficult to customize specific components
VUI Decision: Dialog Database • Sports Score dialogs all follow the same basic pattern • Implementation could be modularized by separating the dialogs from their presentation logic • The gains made by the ease of entry and flexibility for the end-user outweighed the losses in implementation time • Some features require recursion
VUI Features • Tapered, User-Level Sensitive Prompts • Tapered, User-Level Sensitive Help • Barge-In capability • User shortcut capability (users can answer future prompts from any prompt) • Navigational user commands (“back”, “quit”, etc.) • Enumerated user commands to allow the user to say a number as an alternative to the command
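Two of these features, tapered prompts and enumerated commands, can be sketched in a few lines. This sketch assumes that tapering means more experienced users hear shorter prompt variants; the slides do not define the user levels, and all names below are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Optional;

final class PromptSketch {
    private final List<String> variants;   // most verbose first, tersest last

    PromptSketch(String... variants) {
        this.variants = Arrays.asList(variants);
    }

    /** Picks the prompt variant for a user level, clamping to the available range. */
    String forUserLevel(int level) {
        int index = Math.max(0, Math.min(level, variants.size() - 1));
        return variants.get(index);
    }

    /** Accepts either a command word or its 1-based position in the command list. */
    static Optional<String> resolveCommand(List<String> commands, String utterance) {
        String said = utterance.trim().toLowerCase(Locale.ROOT);
        if (said.matches("\\d{1,3}")) {
            int n = Integer.parseInt(said);
            return (n >= 1 && n <= commands.size())
                    ? Optional.of(commands.get(n - 1))
                    : Optional.empty();
        }
        return commands.contains(said) ? Optional.of(said) : Optional.empty();
    }
}
```

With the command list `scores`, `schedule`, `standings`, saying "2" resolves to the same command as saying "schedule".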
VUI Queries • All query parameters are accumulated in an XML document • When a query occurs, the document is sent to the server • The server returns an XML document containing results • The results are read to the user based on administrator-defined result strings
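Accumulating query parameters in an XML document might look like the sketch below. It uses the standard JAXP DOM API rather than XML4J, and the `query` root element and the parameter names are hypothetical; the slides do not show the project's actual schema.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

final class QueryDocumentSketch {
    private final Document doc;
    private final Element root;

    QueryDocumentSketch() {
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
        root = doc.createElement("query");   // hypothetical root element name
        doc.appendChild(root);
    }

    /** Adds a parameter as a child element, or overwrites it if the user answered again. */
    void setParameter(String name, String value) {
        NodeList existing = root.getElementsByTagName(name);
        Element element;
        if (existing.getLength() > 0) {
            element = (Element) existing.item(0);
        } else {
            element = doc.createElement(name);
            root.appendChild(element);
        }
        element.setTextContent(value);
    }

    /** Returns the stored value, or null if the parameter has not been collected yet. */
    String getParameter(String name) {
        NodeList found = root.getElementsByTagName(name);
        return found.getLength() == 0 ? null : found.item(0).getTextContent();
    }

    int parameterCount() {
        return root.getChildNodes().getLength();
    }
}
```

Each answered prompt calls `setParameter`; once every required parameter is present, the document would be serialized and sent to the server.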
Why XML? • XML is fast becoming the industry standard for data transfer over the Internet • XML’s hierarchical structure lends itself to this application • Several XML parsers already exist for various platforms (we used IBM’s XML4J) • The HTML-like nature of XML makes results easy to read, even for a human.
How Query Results Are Read • The administrator defines parameter-value pairs as criteria for which response is read • Each response consists of segments of literal text along with parameter values (which can be drawn either from the client or server)
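The selection of a response by parameter-value criteria, followed by filling its text segments with parameter values, can be sketched as below. The `{name}` placeholder syntax and the parameter names are assumptions for illustration; the slides do not show how the administrator actually writes result strings.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ResponseSketch {
    private static final Pattern SLOT = Pattern.compile("\\{(\\w+)\\}");

    private final Map<String, String> criteria;   // parameter-value pairs that must all match
    private final String template;                // literal text with {parameter} slots

    ResponseSketch(Map<String, String> criteria, String template) {
        this.criteria = criteria;
        this.template = template;
    }

    boolean matches(Map<String, String> params) {
        for (Map.Entry<String, String> c : criteria.entrySet()) {
            if (!c.getValue().equals(params.get(c.getKey()))) {
                return false;
            }
        }
        return true;
    }

    /** Replaces each {name} slot with its value; unknown slots are left untouched. */
    String render(Map<String, String> params) {
        Matcher m = SLOT.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = params.get(m.group(1));
            String replacement = value != null ? value : m.group();
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    /** Reads out the first response whose criteria match the merged client and server parameters. */
    static Optional<String> respond(List<ResponseSketch> responses, Map<String, String> params) {
        for (ResponseSketch response : responses) {
            if (response.matches(params)) {
                return Optional.of(response.render(params));
            }
        }
        return Optional.empty();
    }
}
```

The `params` map here stands for the union of values collected from the user and values returned by the server.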
VUI The Results • The front-end is very customizable • Dialogs can be built simply and quickly • The system administrator needs no knowledge of programming concepts • The overall behavior of the system could be changed without changing each prompt • The computer speech engine is accessed in only one area of code, so it could be swapped with minimal effort
Dialog Structure • The Dialog System consists of: • Prompts • Responses • Help System • All Dialogs are tapered (Prompts, Responses, & Help) • Repair Dialogs – Example: Two teams from same city (New York Mets and Yankees)
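The repair-dialog example above (two teams from the same city) could be detected as in this minimal sketch. The city-to-teams map and the wording of the repair prompt are hypothetical.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Optional;

final class RepairDialogSketch {
    /**
     * Returns a repair prompt when the city names more than one team,
     * or empty when no repair is needed (zero or one matching team).
     */
    static Optional<String> repairPrompt(Map<String, List<String>> teamsByCity, String city) {
        List<String> teams = teamsByCity.getOrDefault(city, Collections.<String>emptyList());
        if (teams.size() <= 1) {
            return Optional.empty();
        }
        return Optional.of("Did you mean the " + String.join(" or the ", teams) + "?");
    }
}
```

When a prompt is returned, the dialog would ask it and wait for the user to name one of the listed teams before continuing with the query.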
Summary • We not only developed a powerful Speech Recognition Application for Retrieval of Sports Information, we also developed a reusable framework which can be easily modified for use in other applications. • We incorporated Ease of Use Techniques including: Tapered Prompts, Global Commands, Barge-In, Repair Dialogs, and others.
More Information is available on the Web: http://www.cs.pdx.edu/~danr/public/capstone/