
Sports Scores Speech Recognition System

Major League Baseball Score System. Development team: Dan Corkum (Director), Jason NguyenTrieu, Dan Ragland (Producer), Quang Vu, and Andrew Wagner. Sponsor: Jim Larson, Intel Corporation.

Presentation Transcript


  1. Sports Scores Speech Recognition System: Major League Baseball Score System

  2. Development Team Members • Dan Corkum (Director) • Jason NguyenTrieu • Dan Ragland (Producer) • Quang Vu • Andrew Wagner • Sponsor: Jim Larson, Intel Corporation

  3. Goals & Objectives • Develop a compelling Speech Recognition Application for Retrieval of Sports Information. • Incorporate Ease of Use Techniques including: Tapered Prompts, Global Commands, Barge-In, Repair Dialogs, and others. • Develop an Architecture that is both Robust and Modular. Design for Reuse.

  4. Example Application Cellular Phone Application • Using Wireless Web • Embedded Windows CE (Auto PC)

  5. Core Modules • “Web Viking” – Parse Internet Web Pages to retrieve sports information. • Data Warehousing & Querying – Database for storage of searchable information. • Client and Server Communication – Enables communication between Server and remote Clients. • VUI (Voice User Interface) Voice Prompts and Response System – The core engine that controls the entire VUI. • Dialog Database – Contains the content for the text-to-speech prompts and response criteria.

  6. Architecture - Server

  7. Architecture - Client

  8. Web Viking • The purpose of the Web Viking is to retrieve data from web sites, then parse and format it so that the database interface can understand it. • There are three data collection scripts: Schedule, Scores, and Standing/Ranking. • The data comes from two sources: Major League Baseball and ESPN. • This gives two chances to get the right data: first, we get data from the MLB web site and parse it; if that fails for any reason, we try to get the data from the ESPN web site.
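The two-source fallback described on this slide can be sketched as follows. This is a minimal Python illustration, not the team's Perl scripts; the function name `fetch_with_fallback` and the idea of passing each source as a (name, fetch function) pair are assumptions made for the example.

```python
def fetch_with_fallback(sources, parse):
    """Try each (name, fetch) pair in order and return the first usable result.

    `sources` is a list of (name, zero-argument fetch function) pairs and
    `parse` turns the fetched text into a dict (or None if nothing usable).
    Both are hypothetical interfaces chosen for this sketch.
    """
    errors = []
    for name, fetch in sources:
        try:
            parsed = parse(fetch())
        except Exception as exc:  # a download or parse failure: try the next source
            errors.append(f"{name}: {exc}")
            continue
        if parsed:
            return name, parsed
        errors.append(f"{name}: no usable data")
    raise RuntimeError("all sources failed: " + "; ".join(errors))
```

A caller would list the MLB fetcher first and the ESPN fetcher second, so ESPN is only contacted when the MLB attempt raises an error or yields nothing.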

  9. Web Viking • How is the data retrieved? • We used library functions available from CPAN (the Comprehensive Perl Archive Network). • The HTTP::Request module packages up the URL request. • The HTTP::Response module handles the data coming back. • How is the data parsed? • Match and strip off unnecessary data. • Regular expressions • Split • Format the data and check the result.
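The strip, match, split, and check steps listed on this slide might look roughly like the sketch below. It is written in Python rather than the project's Perl, and the page layout it expects (one "Away 5, Home 3" line per game once tags are removed) is purely an assumption for illustration; the real MLB and ESPN pages are not described in the slides.

```python
import re

# Assumed line format after tags are stripped: "Away 5, Home 3".
SCORE_LINE = re.compile(
    r"^\s*(?P<away>[A-Za-z .]+?)\s+(?P<away_runs>\d+),\s*"
    r"(?P<home>[A-Za-z .]+?)\s+(?P<home_runs>\d+)\s*$"
)

def strip_tags(html):
    """Strip off unnecessary data: anything that looks like an HTML tag."""
    return re.sub(r"<[^>]+>", "", html)

def parse_scores(page):
    """Split the page into lines and keep only those that match a score."""
    games = []
    for line in strip_tags(page).split("\n"):
        match = SCORE_LINE.match(line)
        if match is None:
            continue  # not a score line, so it is discarded
        games.append({
            "away": match.group("away"),
            "away_runs": int(match.group("away_runs")),
            "home": match.group("home"),
            "home_runs": int(match.group("home_runs")),
        })
    return games
```

Lines that do not match the pattern are dropped, so an empty result list is a simple signal that the page format changed and the result check failed.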

  10. Database & Queries • The Database was implemented using MS Access. • It stores team names, the scores associated with each team, league/division ranking information, and the schedule for each game. • The Database Handler was written in Java. • Its primary purpose is to query the database and return the results to the Sports Score server.
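As a rough illustration of what a handler query could look like, the sketch below uses Python's built-in sqlite3 module as a stand-in for MS Access and the Java handler. The table layout, the column names, and the sample rows are all invented for the example; the slides do not give the real schema.

```python
import sqlite3

def open_demo_database():
    """Create an in-memory database with a hypothetical scores table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE scores (game_date TEXT, team TEXT, opponent TEXT, "
        "runs INTEGER, opponent_runs INTEGER)"
    )
    # Made-up sample rows, used only to exercise the query.
    conn.executemany(
        "INSERT INTO scores VALUES (?, ?, ?, ?, ?)",
        [
            ("2000-05-01", "Mets", "Giants", 3, 5),
            ("2000-05-02", "Mets", "Giants", 7, 2),
            ("2000-05-02", "Yankees", "Indians", 4, 1),
        ],
    )
    return conn

def latest_score(conn, team):
    """Return the most recent game for a team, or None if there is none."""
    row = conn.execute(
        "SELECT game_date, opponent, runs, opponent_runs FROM scores "
        "WHERE team = ? ORDER BY game_date DESC LIMIT 1",
        (team,),  # bound parameter, so the team name is never spliced into SQL
    ).fetchone()
    if row is None:
        return None
    return dict(zip(("game_date", "opponent", "runs", "opponent_runs"), row))
```

Keeping the query behind a single function such as `latest_score` mirrors the slide's point that the handler sits between the database and the score server.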

  11. Client & Server Communication • Because this is an Internet-based application, the server is designed to support multiple clients simultaneously. • Communication is implemented using TCP (Transmission Control Protocol), a reliable and widely used Internet protocol. • The maximum number of clients supported by the Sports Score server is administrator-configurable, based on the performance needs of the server.
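One way to cap the number of simultaneous TCP clients is sketched below with Python's standard socketserver module. This is an assumption about how such a limit could work, not the project's actual server; the class names `LimitedServer` and `EchoHandler` and the `max_clients` parameter are hypothetical.

```python
import socketserver
import threading

class EchoHandler(socketserver.StreamRequestHandler):
    """Placeholder handler: echoes each line back to the client."""
    def handle(self):
        for line in self.rfile:
            self.wfile.write(line)

class LimitedServer(socketserver.ThreadingTCPServer):
    """Threaded TCP server that refuses connections beyond max_clients."""
    daemon_threads = True
    allow_reuse_address = True

    def __init__(self, address, handler, max_clients, bind_and_activate=True):
        # One semaphore slot per permitted simultaneous client.
        self._slots = threading.BoundedSemaphore(max_clients)
        super().__init__(address, handler, bind_and_activate)

    def verify_request(self, request, client_address):
        # Returning False makes socketserver close the new connection at once.
        return self._slots.acquire(blocking=False)

    def process_request_thread(self, request, client_address):
        try:
            super().process_request_thread(request, client_address)
        finally:
            self._slots.release()  # free the slot when the client is done
```

An administrator-set limit would then be passed in at startup, for example `LimitedServer(("0.0.0.0", 9000), EchoHandler, max_clients=50).serve_forever()`, where the port and the limit are illustrative values.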

  12. Client & Server Communication • Both server-side and client-side communications are data-independent. • Data is encapsulated in a packet before transmission. The data wrapper records what type of data is encapsulated and its size. • Data packeting allows for multiple information types (ping, data request, communications termination, etc.). • Labeling each packet with a type allows for quick identification and routing of information to the necessary destinations within the server/client.
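A type-and-size wrapper of the kind described on this slide could be laid out as in the sketch below. The header format (a 1-byte type followed by a 4-byte payload length), the numeric type codes, and the function names are assumptions for illustration; the slides do not specify the real wire format.

```python
import struct
from enum import IntEnum

class PacketType(IntEnum):
    # Hypothetical codes for the packet kinds named on the slide.
    PING = 1
    DATA_REQUEST = 2
    TERMINATE = 3

# Header: 1-byte type, 4-byte payload size, network byte order, no padding.
HEADER = struct.Struct("!BI")

def pack_packet(packet_type, payload=b""):
    """Wrap a payload with its type and size."""
    return HEADER.pack(packet_type, len(payload)) + payload

def unpack_packet(buffer):
    """Return (type, payload, remaining bytes) for the first packet in buffer."""
    if len(buffer) < HEADER.size:
        raise ValueError("incomplete header")
    type_code, size = HEADER.unpack_from(buffer)
    end = HEADER.size + size
    if len(buffer) < end:
        raise ValueError("incomplete payload")
    return PacketType(type_code), buffer[HEADER.size:end], buffer[end:]

def route(packet_type, payload, handlers):
    """Dispatch a payload to the handler registered for its packet type."""
    return handlers[packet_type](payload)
```

Because the size travels in the header, the receiver can split a TCP byte stream back into packets without inspecting the payload, which is what keeps the transport layer independent of the data it carries.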

  13. VUI (Voice User Interface) Voice Prompts and Response System: User Interface and Underlying Logic

  14. VUI Design Considerations • Two options for design: 1. Dialog logic written directly into the program code. 2. Dialog logic entered into a data structure and presented by separate internal logic.

  15. VUI: Advantages & Disadvantages of Hard-Coded Dialogs • Advantages: Fast initial implementation; Ultimate flexibility of features • Disadvantages: Duplicated code; Difficult to provide consistent global functionality; Hard-coded grammars

  16. VUI: Advantages & Disadvantages of Dialog Database • Advantages: Good design (data separated from presentation); Consolidation of code; Easy to create and maintain dialogs; Features aided by use of recursion; Computer-generated grammars • Disadvantages: Much work required before any results seen; Difficult to customize specific components

  17. VUI Decision: Dialog Database • Sports Score dialogs all follow the same basic pattern • Implementation could be modularized by separating the dialogs from their presentation logic • The gains made by the ease of entry and flexibility for the end-user outweighed the losses in implementation time • Some features require recursion

  18. VUI Features • Tapered, User-Level Sensitive Prompts • Tapered, User-Level Sensitive Help • Barge-In capability • User shortcut capability (users can answer future prompts from any prompt) • Navigational user commands (“back”, “quit”, etc.) • Enumerated user commands to allow the user to say a number as an alternative to the command
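Several of these features fall out naturally when dialogs are stored as data. The sketch below assumes that "tapered" means the prompt wording gets shorter as the user's experience level rises; the dialog table, the prompt texts, and the function names are all hypothetical.

```python
# Hypothetical dialog entry: prompts ordered from novice (long) to expert (short).
DIALOGS = {
    "choose_team": {
        "prompts": [
            "Please say the name of the team whose score you would like to hear.",
            "Which team would you like?",
            "Team?",
        ],
        "commands": ["Yankees", "Mets", "Mariners"],
    },
}

# Navigational commands that are accepted at every prompt.
GLOBAL_COMMANDS = {"back", "quit", "help"}

def prompt_for(dialog_id, user_level):
    """Pick the prompt for this user level, clamping to the shortest one."""
    prompts = DIALOGS[dialog_id]["prompts"]
    return prompts[min(max(user_level, 0), len(prompts) - 1)]

def interpret(dialog_id, utterance):
    """Classify recognized text as a global command, a command, or unknown."""
    text = utterance.strip().lower()
    if text in GLOBAL_COMMANDS:
        return ("global", text)
    commands = DIALOGS[dialog_id]["commands"]
    # Enumerated commands: "2" selects the second command in the list.
    if text.isdecimal() and 1 <= int(text) <= len(commands):
        return ("command", commands[int(text) - 1])
    for command in commands:
        if command.lower() == text:
            return ("command", command)
    return ("unrecognized", utterance)
```

Because the global commands are checked in one place, they behave the same at every prompt without being repeated in each dialog entry.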

  19. VUI Queries • All query parameters are accumulated in an XML document • When a query occurs, the document is sent to the server • The server returns an XML document containing results • The results are read to the user based on administrator-defined result strings
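The query round trip can be illustrated with Python's standard xml.etree.ElementTree module in place of the XML4J parser the team used. The element and attribute names (`query`, `results`, `param`, `name`) are assumptions; the slides do not show the real document format.

```python
import xml.etree.ElementTree as ET

def build_query(params):
    """Serialize the accumulated query parameters as one XML document."""
    root = ET.Element("query")
    for name, value in params.items():
        ET.SubElement(root, "param", name=name).text = str(value)
    return ET.tostring(root, encoding="unicode")

def read_params(xml_text):
    """Read name/value pairs back out of a query or results document."""
    root = ET.fromstring(xml_text)
    return {element.get("name"): element.text for element in root.iter("param")}
```

On the client, each answered prompt would add one entry to `params`; when the last one is filled in, `build_query(params)` produces the document sent to the server, and `read_params` can decode the reply if the server uses the same element layout.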

  20. Why XML? • XML is fast becoming the industry standard for data transfer over the Internet • XML’s hierarchical structure lends itself to this application • Several XML parsers already exist for various platforms (we used IBM’s XML4J) • The HTML-like nature of XML makes results easy to read, even for a human.

  21. How Query Results Are Read • The administrator defines parameter-value pairs as criteria for which response is read • Each response consists of segments of literal text along with parameter values (which can be drawn either from the client or server)
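A response table driven by parameter-value criteria might be modeled as in the sketch below. The parameter names, the criteria, and the response wording are hypothetical, since the slides do not show an actual administrator-defined entry.

```python
# Each response has criteria (parameter-value pairs that must all match) and
# segments: literal text, or ("param", name) to splice in a parameter value.
RESPONSES = [
    {
        "criteria": {"status": "final"},
        "segments": ["The ", ("param", "team"), " scored ", ("param", "runs"),
                     " against the ", ("param", "opponent"), "."],
    },
    {
        "criteria": {"status": "scheduled"},
        "segments": ["The ", ("param", "team"), " play the ",
                     ("param", "opponent"), " next."],
    },
]

def render_response(client_params, server_params):
    """Return the first response whose criteria match, or None."""
    # Values may come from either side; server values win on a name clash.
    values = {**client_params, **server_params}
    for response in RESPONSES:
        criteria = response["criteria"]
        if all(values.get(name) == wanted for name, wanted in criteria.items()):
            return "".join(
                segment if isinstance(segment, str) else str(values[segment[1]])
                for segment in response["segments"]
            )
    return None
```

The rendered string is what would be handed to the text-to-speech engine; changing the wording only requires editing the table, not the code.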

  22. VUI: The Results • The front-end is very customizable • Dialogs can be built simply and quickly • The system administrator needs no knowledge of programming concepts • The overall behavior of the system could be changed without changing each prompt • The computer speech engine is accessed in only one area of code, so it could be swapped with minimal effort

  23. Dialog Structure • The Dialog System consists of: Prompts, Responses, and a Help System. • All Dialogs are tapered (Prompts, Responses, & Help). • Repair Dialogs – Example: two teams from the same city (New York: Mets and Yankees).
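The New York example suggests a repair step like the one sketched below: when a city maps to more than one team, the system asks a follow-up question instead of guessing. The lookup table, the prompt wording, and the function name are assumptions made for illustration.

```python
# Small illustrative lookup; a real system would cover every MLB city.
TEAMS_BY_CITY = {
    "new york": ["Mets", "Yankees"],
    "seattle": ["Mariners"],
}

def resolve_team(city):
    """Return the team for a city, or a repair prompt if it is ambiguous."""
    teams = TEAMS_BY_CITY.get(city.strip().lower(), [])
    if len(teams) == 1:
        return {"team": teams[0], "prompt": None}
    if len(teams) > 1:
        options = " or the ".join(teams)
        return {"team": None, "prompt": f"Did you mean the {options}?"}
    return {"team": None, "prompt": "Sorry, I did not recognize that city. Which team?"}
```

The caller would speak the returned prompt and feed the user's next answer back into the normal team dialog.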

  24. Dialog Structure Overview

  25. Summary • We not only developed a powerful Speech Recognition Application for Retrieval of Sports Information, we also developed a reusable framework which can be easily modified for use in other applications. • We incorporated Ease of Use Techniques including: Tapered Prompts, Global Commands, Barge-In, Repair Dialogs, and others.

  26. More Information is available on the Web: http://www.cs.pdx.edu/~danr/public/capstone/
