1 / 27

VOICE USER INTERFACE

VOICE USER INTERFACE. BY R.SELVI K.PRIYA I-MCA. INTRODUCTION. voice portal can be defined as “speech enabled access to Web based information”. In other words, a voice portal provides telephone users with a natural language interface to access and retrieve Web content. Voice based

tilly
Download Presentation

VOICE USER INTERFACE

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. VOICE USER INTERFACE BY R.SELVI K.PRIYA I-MCA

  2. INTRODUCTION • voice portal can be defined as “speech • enabled access to Web based information”. In other words, a voice portal provides telephone users with a natural language interface to access and retrieve Web content.

  3. Voice based internet access uses rapidly advancing speech recognition technology to give users any time, anywhere communication and access-the Human Voice- over an office, wireless, or home phone.

  4. Why Voice? • Natural speech also has an advantage as • an input modality, allowing for hands-free and eyes-free use. With • proper design, voice commands can be created that are easy for a user to • remember • Speech can be used to create an interface that is • easy to use and requires a minimum of user attention.

  5. VUI (Voice User Interface) • Automated speech recognition (ASR) is the • engine that drives today's voice user interface (VUI) systems. These let • customers break the 'menu barrier' and perform more complex • transactions over the phone. • The best VUI systems are "speaker independent"-they understand • naturally spoken dialog regardless of the speaker.

  6. ASR systems can be enabled with voice authentication, • eliminating the need for PINs and passwords. • the 'voice Web' is evolving, where • browsers or Wireless Application Protocol (WAP)-enabled devices • display information based on what the user vocally asks for. • Good VUI systems have multiple fallback strategies for speech • recognition failure, such as asking callers to spell names or breaking a • question into parts.

  7. VoiceXML is a markup language • specification that defines the key elements of voice-enabled transactions, • which also allows repurposing of this data across any number of • platforms and systems

  8. Next Generation Network Services • It is the next step in the world • communications traditionally enabled by three separate networks: the • public switched telephone network (PSTN) voice network, the wireless • network and the data network (the Internet). NGNs converge all three of • these networks-voice, wireless and internet- into a common a common • packet infrastructure.

  9. Three types of services drive NGNs: real-time and non real-time • communication services, content services, and transaction services. • The • services-driven NGN gives service provides greater control, security, • and reliability while reducing their operating costs. Service providers • can quickly and costs effectively build new revenue.

  10. Next-generation Network Benefits •  More choices-Open systems promote multiple-vendor • participation in the marketplace. •  Lower cost-Solutions built on open standards cut • development time for solution providers. •  Innovative Services- Working to standards frees developers • to concentrate on adding value. •  Lower Risk- Compatibility with products and technologies • from many suppliers decreases the risk of ownership.

  11. Deploying NGN Services • The NGN begins with media servers, which provide advanced • media-processing capabilities. The value of a media server is its • flexibility for supporting advanced media-processing services like basic • voice announcements, interactive voice response (IVR), conferencing, • messaging, text-to-speech (TTS), and speech recognition.

  12. Voice Recognition and AuthorizationSpeaker Verification • The speaker-specific characteristics of speech are due to • differences in physiological and behavioral aspects of the speech • production system in humans. The main physiological aspect of the • human speech production system is the vocal tract shape.

  13. The vocal tract modifies the spectral content of an acoustic wave • as it passes through it, thereby producing speech. Hence, it is common in • speaker verification systems to make use of features derived only from • the vocal tract.

  14. It is possible to represent the vocal-tract in a parametric form as the • transfer function H(z). In order to estimate the parameters of H(z) from • the observed speech waveform, it is necessary to assume some form for • H(z). • Figure 3 illustrates the differences • in the models for two speakers saying the same vowel. • Figure 3:

  15. Speaker Modeling : • The purpose of voicemodeling is to build a model that captures these variations in the extracted set of features. There are two types of models that have been used extensively in speaker verification and speech recognition systems: • Stochastic models and template models. • The stochastic model treats the speech production process as a parametric random process and assumes that the parameters of the underlying stochastic process can be estimated in a precise, well defined manner. The template model attempts to model • the speech production process in a non-parametric manner by retaining a number of sequences of feature vectors derived from multiple utterances of the same word by the same person.

  16. A verypopular stochastic model for modeling the speech production process is • the Hidden Markov Model (HMM). HMMs are extensions to the conventional Markov models, wherein the observations are a • probabilistic function of the state, i.e., the model is a doubly embedded • stochastic process where the underlying stochastic process is not directly • observable (it is hidden).

  17. For speech signals, another type of HMM, called a left-right model • or a Bakis model, is found to be more useful. A left-right model has the • property that as time increases, the state index increases (or stays the • same)-- that is the system states proceed from left to right.

  18. Pattern Matching: • The pattern matching process involves the comparison of a given • set of input feature vectors against the speaker model for the claimed • identity and computing a matching score.

  19. Speech Recognition Challenges • The basic challenges faced by voice application • developers included the following: • 1. Variability of speech patterns: • in many different ways. • Processing power: • Extracting meaning:

  20. Conclusion. • Speech-enabled Internet portals or voice portals are quickly • becoming the hottest trend in E-commerce – broadening the access to • internet content to everyone with the most universal communications • device of all, a telephone. • The potential for voice portals is as wide as the reach of • telephones, which today number 1.3 billion around the world. • A voice portal provides telephone users with a natural • language interface to access and retrieve web content.

  21. A voice portal can also • provide users access to virtual personal assistance and web-based unified • messaging applications • Voice portals can also cut operating expenses by freeing up agent • time and replacing human operators with an easy-to-use automated • solution.

  22. The voice portal reference system is a packaged, integrated • hardware and • Software reference system for building hardened E-Business and • speech-enabled voice portal solutions.

  23. QUERIES ?????????????

  24. THANK YOU

More Related