Sensory – Leader in Speech Technologies PRESENTATION NAME
Sensory Background • Profitable, VC-Backed, Private Company Founded in 1994 • Market Position • Sensory is #1 in Consumer Embedded Speech Recognition for Command and Control • Sensory has 85% Market Share of Dedicated Speech Recognition ICs • Sensory is the #1 provider of speech recognition in Bluetooth headsets • Small and Medium Footprint Embedded Software Solutions • Unique, Patented Neural Network and HMM Technologies
Markets served: Telecom, Automotive, Consumer, Toy, Other (customer logos shown on the slide include SHUAIXIAN and AVON) • 50,000,000 Products Shipped • More Customers, More Products, More Experience!
What Does Sensory Sell? • Chips • Low cost, low power, 8-bit micros & DSPs with: • Speech recognition (speaker independent & dependent) • Speaker verification (voice biometrics!) • Speech & music synthesis • Voice record, voice morphing, pitch detect, beat detect… • Software/Technology • Speech control of Bluetooth headsets • Continuous digit dialing on ARM9, etc. • Name dialing for up to thousands of speaker-independent (SI) names • Speaker verification for biometric phone security
“Any product with a user interface can and will have a voice user interface”
Sensory’s Suite of Audio Technologies • Categories: Recognition, Synthesis, Enhancement • Speaker independent • Speaker dependent • Word spotting • Continuous digits • Speaker verification • SingBack/BarkBack • Speech compression • Text to speech • Music synthesis • Voice record & play • Pitch/tempo control • Voice morphing • Real Time Lip Synch • Noise cancellation • Noise reduction • Animated speech • Beat detection • Sound sourcing • Pitch detection • Sonic Network • Sensory has 10 patents issued and 11 more applied for
REAL Products the RSC 4xx has been designed into • Automotive Stereos • Cordless Phones • Hands Free Kits • Automotive security • Remotes • Light Switches • Ceiling Fans • Weather Stations • Sunroof control • Robots • Lamps • Toys • Educational Products • Seasonal Products • Novelty Products • Clocks • Hand Held Dialers and Organizers • Answering Machines
Languages Supported • American English • Spanish • German • Italian • UK English • French • Japanese • Korean • Mandarin Other languages can be created to fit customer needs.
Fluent Soft • Speech Recognition • Hybrid Engine (NN and HMM) • Small Footprint • Dynamic Vocabulary Build • Continuous Digits • Text-to-Speech • Micro-footprint TTS for name and command only solution • Open TTS • Hybrid speech playback engine
Fluent Soft Recognizer • Features and technologies • Smallest footprint solution w/fast response time • Minimal horsepower: 10-50 MIPS implementations • Minimal RAM: 4-10KB Code RAM implementations • Minimal ROM/Flash: 20KB-200KB implementations • Text To Recognition (TTR) Vocabulary Generation • Speaker Independent = NO training • Dynamically build recognition sets on the fly on the device, by syncing with your PC or from text • Probabilistic phonetic dictionary based on the rules of the language provides scalable capabilities • Scalable Speaker Independent Vocabularies • Up to 10k words or 4k phrases / names per set • n-Best for large recognition sets • Natural, Continuous Digits • Word Spotting for natural and seamless UI
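To illustrate the "n-Best for large recognition sets" bullet, here is a minimal sketch of n-best selection in general: the recognizer scores every entry in the vocabulary and the application receives the top few candidates rather than a single answer. This is not Sensory's API; the function name `n_best` and the example scores are assumptions made up for illustration.

```python
import heapq
from typing import Dict, List, Tuple


def n_best(scores: Dict[str, float], n: int) -> List[Tuple[str, float]]:
    """Return the n highest-scoring (phrase, score) pairs, best first.

    `scores` maps each vocabulary entry to a hypothetical recognizer
    confidence; a higher value means a better match.
    """
    return heapq.nlargest(n, scores.items(), key=lambda item: item[1])


if __name__ == "__main__":
    # Made-up confidences for three name-dialing entries.
    example = {"call home": 0.82, "call helen": 0.79, "call office": 0.10}
    for phrase, score in n_best(example, 2):
        print(phrase, score)
```

With a list like this, the device could ask the user to confirm between close candidates instead of dialing the wrong name.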
Hardware Supported • Many Options Available • Processors • ARM9 • ARM7 - TBD • Intel XScale & StrongARM • TI OMAP (both ARM9 and DSP cores supported) • Bridge support for the OMAP too • Hitachi SH3 & SH4 • Motorola Dragonball i.MX • Motorola MCORE 210 & 310 • Custom Porting Available
O/S Supported • Operating Systems • Symbian • MontaVista Embedded Linux • Microsoft – PC/x86, WinCE, WinMobile, .NET, Smartphone 2002 & 2003 • Palm 5x and 6x • Custom Porting for proprietary Operating Systems Available • Custom Porting for devices with no Operating System too! • Memory Requirements & MIPS • Highly dependent on memory access speed, processor overhead, and features desired
Sensory and Bluetooth Headsets • How does the Speech Recognition Work? • A button press or software trigger starts the recognizer listening in a constrained window • Saves battery life • Improves accuracy • Speech recognizer engine utilizes the best of Hidden Markov Modeling and Neural Networks • Small footprint engine requires • Minimal horsepower: 10-50 MIPS implementations • Minimal RAM: 4-10KB Code RAM implementations • Minimal ROM/Flash: 20KB-200KB implementations • Integrated synthesis for speech response and prompting • ADPCM, CELP, MELP, SX, and other implementations • Data rates from 32 kbps down to 1 kbps • Text to speech implementations available
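As a rough illustration of what the 32 kbps to 1 kbps range means for storing spoken prompts, the arithmetic can be sketched as follows. The function name `prompt_bytes` and the 2-second prompt length are assumptions for illustration, 1 kbps is taken as 1000 bits per second, and codec framing overhead is ignored.

```python
def prompt_bytes(bitrate_kbps: float, seconds: float) -> int:
    """Bytes needed to store `seconds` of audio coded at `bitrate_kbps`.

    Assumes 1 kbps = 1000 bit/s and ignores any container or framing overhead.
    """
    bits = bitrate_kbps * 1000 * seconds
    return int(bits // 8)


if __name__ == "__main__":
    # Compare the two ends of the quoted range for a hypothetical 2 s prompt.
    for rate in (32, 1):
        print(rate, "kbps:", prompt_bytes(rate, 2.0), "bytes")
```

Under these assumptions a 2-second prompt takes 8000 bytes at 32 kbps and 250 bytes at 1 kbps, which is why lower-rate coding matters when the whole engine is budgeted at 20KB-200KB of ROM/Flash.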
Bluetooth Headsets w/CSR BC-5 Now Can Have Speech I/O Abilities! • Headset market is in danger of being commoditized • Basic connectivity offered by all players • Gross margins will suffer for those without differentiation • Negotiating leverage with major OEMs will decrease • Differentiation has been limited largely to “Style” • More interesting plastic design, color, etc. is not a defensible differentiator • Most headsets are not user friendly • Increased feature sets make matters worse • Confusing button push sequences and holding times • LED or “Beep” status & prompt messages are not understood by many users • Differentiation is possible through more appealing features and superior user friendliness! • Voice interface is the only way to do this! • Small product with mic & speaker – no room for more buttons! • Challenge – How do you add value & features without changing form factor? • Answer – Sensory’s Speech I/O
Bluetooth Headset Value Added Features through Voice I/O • Basic access to information • Press button & make requests • One button, many functions • What can I say? • Handsfree control • Accept/reject incoming calls • Respond to prompts • What extension? • Voice Dialing • Speed dial for favorites • Last number redial • Continuous digit dialing • Dialing by name • Voice prompting instead of “beeps” • Confirmation of incoming and outgoing phone numbers • Natural sounding voices available in any language • Easier to interpret than beeps and flashes
Alpha Headset Usage Model • Basic Operation - Just Hit the Button and Say: • Pair Mode – to make initial phone connection • Battery Check – to check battery level • What Can I Say? – Lists all available commands • Am I Connected? – Tells connection status • Redial – Calls last outgoing call • Call back – Calls last received call • More Voice Dialing. Hit the SAME button and say: • Call Home • Call Office • Call Voicemail • Call Favorite 1 • Call Favorite 2 • Call Favorite 3 • Call Favorite 4 • Call Favorite 5 • Handsfree Receiving Incoming Calls • “Call from 408 625-3300, Would you like to accept?” • YES/NO, ACCEPT/REJECT • Focus on high-value, low-risk features
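The usage model above amounts to a small fixed command set behind a single button. A minimal sketch of how an application might map a recognized phrase to an action is shown below. The recognizer itself is not modeled, this is not Sensory's API, and the action names (`enter_pairing`, `dial`, and so on) and the function `interpret` are hypothetical names chosen for illustration.

```python
from typing import Optional, Tuple

# Basic commands from the usage model, mapped to hypothetical action names.
BASIC_COMMANDS = {
    "pair mode": "enter_pairing",
    "battery check": "report_battery",
    "what can i say": "list_commands",
    "am i connected": "report_connection",
    "redial": "redial_last_outgoing",
    "call back": "call_last_received",
}

# Voice-dialing targets: Home, Office, Voicemail, and Favorite 1-5.
DIAL_TARGETS = ["home", "office", "voicemail"] + [
    f"favorite {i}" for i in range(1, 6)
]


def interpret(phrase: str) -> Tuple[str, Optional[str]]:
    """Map a recognized phrase to (action, argument).

    Unrecognized phrases return ("unknown", None) so the caller can
    re-prompt, for example with the "What can I say?" list.
    """
    # Normalize case, question marks, and extra whitespace.
    text = " ".join(phrase.lower().replace("?", "").split())
    # Check basic commands first so "call back" is not treated as a dial.
    if text in BASIC_COMMANDS:
        return (BASIC_COMMANDS[text], None)
    if text.startswith("call "):
        target = text[len("call "):]
        if target in DIAL_TARGETS:
            return ("dial", target)
    return ("unknown", None)


if __name__ == "__main__":
    for spoken in ("What Can I Say?", "Call Favorite 3", "Play Music"):
        print(spoken, "->", interpret(spoken))
```

Keeping the vocabulary this small and checking exact phrases matches the slide's emphasis on high-value, low-risk features: every utterance either maps to one known action or is rejected.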
With Voice Recognition for user input… • User Input much more intuitive, and dramatically wider range of features enabled • “Am I Connected” • “Pair Mode” • “Call Home” • “Call VoiceMail” • “Check Battery Level” • “Adjust Bass”, “Adjust Treble” • “Play Music”, “Skip Song”, etc • “Conference” (calls together) • “Help” or “What can I say?” • With voice interface: adding more input options does not increase confusion on button pushes • Removes the key barrier to having a roadmap of increased features – keeps market from commoditizing!
With Voice Synthesis for messaging to user • With voice synthesis, headset response is • much more useful for conveying a wide range of information • unlikely to be misunderstood • able to convey more specific messages than beeps/LEDs • Examples • “Incoming call from 408-297-9729, do you want to accept?” • “Your headset is connected” • “Battery Level high” • “Calling Last Outgoing Call” • “Call Terminated” • “Use the up/down button to adjust tone” • “Headset Mute is on” • “You are in Pair Mode” • “Your Dialing commands are Call Favorites 1-5, Call Home, Call Voicemail, etc…” • Etc. • Good voice messaging design will reduce tech support calls from users who do not understand their headset modes/beeps/LED flashes.
Voice Interface Headsets enable new business models & compelling network features • Voice commands enable simple handsfree access to key services • “Call Verizon”, “Call T Mobile”, etc. – prebuilt into headsets sold through a carrier for simple access to carrier support • “Call Information” to automatically dial 411, which is a major revenue generator for carriers • Makes YOUR headset more attractive to carriers, since it is a legitimate revenue-increasing enabler for them • Simple access to major 3rd party services enables marketing co-promotion and revenue generation from headset sales • “Call Goog411”, “Call TellMe”, etc. enable partnerships for headset OEMs and enhanced services for users
Thanks! Contact Information: Erika Fratzke Sales Associate Sensory, Inc. 503 546 6378 x212 efratzke@sensoryinc.com