Voice-enabled Image Identification System Design. Aashish P. Shrestha, Ming Ming Zheng. Multimedia Signal Processing, University of Bridgeport, Connecticut. Prof. B. Barkana. Spring 2009.
Introduction • Voice-enabled applications are widely used today. They form a sub-area of speech recognition. The task of a voice-enabled application is to let a machine accept and recognize commands spoken in a normal human voice.
Overview • Speech Recognition Engine (SAPI) • Voice Signal Processing • Image Identification • System Design • System Performance • Conclusion
Speech Recognition Engine • Microsoft Speech Application Programming Interface (SAPI): SAPI includes a speech recognition engine that converts a speaker's voice into text by comparing the input voice with its voice data. It can also convert text into synthesized speech.
Voice Signal Processing Three main members used with the SpSharedRecoContext object: • ISpEventSource: interface for receiving engine events, such as the start of a speech signal. • GetRecognizer: method that returns a reference to the current recognizer object associated with the context. • ISpeechRecoResult: interface that returns the recognition result, that is, the outcome of comparing the input voice with the voice data in the speech engine.
Image Identification • Each image is identified by its file name: the system takes the file name as the keyword and saves it in the database. • The final output maps each speech result to the image whose keyword was spoken. • Example: Select “Apple”
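The file-name-as-keyword idea can be sketched in a few lines. This is a minimal illustration in Python, not the authors' C# implementation; the function names `build_image_index` and `find_image` and the example paths are hypothetical, and a plain dictionary stands in for the database.

```python
from pathlib import Path


def build_image_index(image_paths):
    """Map each image's keyword (its lowercased file name without extension) to its path."""
    index = {}
    for path in image_paths:
        keyword = Path(path).stem.lower()  # "images/Apple.jpg" -> "apple"
        index[keyword] = path
    return index


def find_image(index, recognized_text):
    """Return the image path for a recognized word, or None if no keyword matches."""
    return index.get(recognized_text.strip().lower())


# Hypothetical example files; a real system would read these from the database.
index = build_image_index(["images/Apple.jpg", "images/Banana.png"])
print(find_image(index, "Apple"))
```

Lowercasing both the keyword and the recognized text keeps the lookup independent of how the engine capitalizes its output.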
System Design [Figure: architecture flow. Components: User's Voice, Recognizer, Speech Engine SAPI (Voice Data), Image Database, Back-end Admin Module, Output. Arrows are numbered 1 to 4.]
System Design • In the first step, the speech engine initializes and loads the voice data according to the database, which is where the picture information is stored. • In the second step, the user speaks into the microphone. If the input voice matches the voice data in the speech engine, the system moves on to step three and shows the matching image. Meanwhile, in step four, the system echoes the recognized text and speaks it aloud using the system voice.
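The four steps above can be sketched as a small control flow. This is an illustrative Python outline under stated assumptions, not the original C#/SAPI code: the recognizer is replaced by already-recognized text, and `show` and `speak` are hypothetical callbacks standing in for the image display and the system voice.

```python
def load_vocabulary(database):
    """Step 1: load the words the engine should listen for from the image database."""
    return set(database)


def handle_utterance(recognized_text, vocabulary, database, show, speak):
    """Steps 2 to 4: match the recognized text, show the image, then speak the text."""
    word = recognized_text.strip().lower()
    if word not in vocabulary:   # Step 2: no match, so nothing is shown or spoken
        return False
    show(database[word])         # Step 3: display the matching image
    speak(word)                  # Step 4: echo the text with the system voice
    return True


# Hypothetical stand-ins: lists record what would be displayed and spoken.
shown, spoken = [], []
database = {"apple": "images/apple.jpg"}
vocabulary = load_vocabulary(database)
handle_utterance("Apple", vocabulary, database, shown.append, spoken.append)
```

Passing the display and speech functions in as arguments keeps the matching logic testable without a microphone or speakers.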
System Requirements • Hardware: PC with speakers and microphone. • Software: Windows 2000/XP/Vista, Microsoft Access, Microsoft SAPI 5.1, C# .NET
System Maintenance • A back-end Database Admin Module: • Add a Picture
System Maintenance • Edit or Delete items:
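The add, edit and delete operations of the admin module might look roughly like the sketch below. It is an assumption-laden illustration: Python's built-in `sqlite3` module with an in-memory database stands in for the Microsoft Access database, and the `pictures` table, its columns and the function names are hypothetical.

```python
import sqlite3


def open_database():
    """Create an in-memory database with one table of keyword-to-path entries."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE pictures (keyword TEXT PRIMARY KEY, path TEXT NOT NULL)"
    )
    return conn


def add_picture(conn, keyword, path):
    """Add a picture under a lowercased keyword."""
    conn.execute(
        "INSERT INTO pictures (keyword, path) VALUES (?, ?)", (keyword.lower(), path)
    )


def edit_picture(conn, keyword, new_path):
    """Point an existing keyword at a different image file."""
    conn.execute(
        "UPDATE pictures SET path = ? WHERE keyword = ?", (new_path, keyword.lower())
    )


def delete_picture(conn, keyword):
    """Remove the entry for a keyword."""
    conn.execute("DELETE FROM pictures WHERE keyword = ?", (keyword.lower(),))


def list_pictures(conn):
    """Return all (keyword, path) rows sorted by keyword."""
    return conn.execute(
        "SELECT keyword, path FROM pictures ORDER BY keyword"
    ).fetchall()


conn = open_database()
add_picture(conn, "Apple", "images/apple.jpg")
add_picture(conn, "Banana", "images/banana.png")
edit_picture(conn, "apple", "images/apple_v2.jpg")
delete_picture(conn, "banana")
print(list_pictures(conn))
```

Making the keyword the primary key means each spoken word maps to at most one image, which matches the one-keyword-per-file-name scheme described earlier.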
Demonstration • We will demonstrate our system.
Advantages and Drawback Advantages: 1. Accurate 2. Fast 3. Robust Drawback: Recognition is sometimes easily affected by a noisy environment.
Conclusion This project shows that a voice-enabled application can be robust and reliable. Such applications have been on the market for about two decades. Voice commands can also be easily integrated with other applications that need touch-free control.