From the Lab to Ubiquity Speech Technology’s Road to Mainstream. Eric Chang, Ph.D. Assistant Managing Director MSR Asia Advanced Technology Center. Impact of Disruptive Technology. 9/3/2004, 6:03 AM, Pacific Green Bay. Human Language Capability. Technology Adoption Lifecycle.
From the Lab to Ubiquity: Speech Technology’s Road to Mainstream. Eric Chang, Ph.D., Assistant Managing Director, MSR Asia Advanced Technology Center
Impact of Disruptive Technology 9/3/2004, 6:03 AM, Pacific Green Bay
Technology Adoption Lifecycle • Successful technology adoptions increase exponentially • But the process is not smooth: there is a “chasm”* (*Geoffrey Moore, Crossing the Chasm)
Technology Adoption Lifecycle • Visionaries • Early Adopters • Early Majority • Late Majority • Laggards
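The talk does not give sizes for the five segments. As a rough illustration only, the sketch below assumes the conventional split from Rogers' diffusion-of-innovations model (2.5%, 13.5%, 34%, 34%, 16%) and maps it onto the segment names used on this slide. Both the mapping and the name `cumulative_adoption` are assumptions made for this example, not something the speaker states.

```python
from itertools import accumulate

# Assumed segment shares (percent of the eventual market), taken from the
# conventional Rogers split; the slide names are mapped onto it for illustration.
SEGMENT_SHARES = [
    ("Visionaries", 2.5),
    ("Early Adopters", 13.5),
    ("Early Majority", 34.0),
    ("Late Majority", 34.0),
    ("Laggards", 16.0),
]


def cumulative_adoption(shares=SEGMENT_SHARES):
    """Return (segment, cumulative percent adopted) once each segment has adopted."""
    names = [name for name, _ in shares]
    totals = accumulate(share for _, share in shares)
    return list(zip(names, totals))


if __name__ == "__main__":
    for name, total in cumulative_adoption():
        print(f"{name:<15} {total:5.1f}%")
```

Under this assumed split, the first two segments together account for 16% of the market, which is why a technology that stalls before reaching the Early Majority stays a niche product.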
Visionaries • Technology for technology’s sake • Technology fans • Won’t be a major market
Early Adopter • Adopts technology for its benefits • Not afraid to try something new
Early Majority • Will not be the first to adopt a new technology • Practical and utilitarian • Heavily influenced by what other people are doing • Strong “viral” effect
Late Majority • Will adopt a technology only when necessary • By this stage, a few dominant technology providers have emerged.
Laggards • Suspicious of new technology • Can get along fine without new technology
Discontinuous Technology • Not a simple extension or repackaging of existing technology • Automobiles replacing horses and carts • Telephones replacing telegraphs
Technology Adoption Lifecycle [Diagram: adoption curve with a “Chasm”; segments labeled Visionaries, Early Adopters, Early Majority, Late Majority, Laggards]
Into the Tornado [Diagram: adoption curve with a “Tornado”; segments labeled Visionaries, Early Adopters, Early Majority, Late Majority, Laggards]
Technology Adoption Lifecycle • Visionaries: Speech Search & Transcription • Early Adopters: Dictation • Early Majority: Call Center Automation • Late Majority: Leapfrog • Laggards
Early Adopter: Dictation • Continuous dictation first sold in 1996 • Shelfware • English, Chinese, and Japanese dictation in Office XP in 2001 • Current status: vital for people who need it; no viral effect yet; difficulties in handling speaker accents, wearing a microphone, and individual vocabulary
Language is a live medium • Dalian: 腕狭子 • Japan: WaiShaTsu • England: White Shirt • US: Suit • Japan: SeBiRo • England: Savile Row
Technology Adoption Lifecycle • Visionaries: Speech Search & Transcription • Early Adopters: Dictation • Early Majority: Call Center Automation • Late Majority: Leapfrog • Laggards
The Business Value of Speech for Call Centers • Cost: $5/call to $0.20/call, reduced call time, fewer agents • Satisfaction: less time in queue, increased system usage, customer retention • Productivity: customer focus, less time per call, efficient agents • Revenue: new revenue opportunities, up-sell/cross-sell
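To make the cost figures above concrete, here is a minimal sketch of the arithmetic behind moving calls from agents ($5/call) to speech automation ($0.20/call). Only the two per-call costs come from the slide. The call volume and automation rate in the usage example are hypothetical inputs, and `annual_savings` is a name invented for this example.

```python
AGENT_COST_PER_CALL = 5.00      # from the slide: $5/call
AUTOMATED_COST_PER_CALL = 0.20  # from the slide: $.20/call


def annual_savings(calls_per_year: int, automation_rate: float) -> float:
    """Estimate yearly savings when a fraction of calls is fully automated.

    automation_rate is the share of calls (0.0 to 1.0) handled without an agent.
    """
    if calls_per_year < 0:
        raise ValueError("calls_per_year must be non-negative")
    if not 0.0 <= automation_rate <= 1.0:
        raise ValueError("automation_rate must be between 0 and 1")
    automated_calls = calls_per_year * automation_rate
    saving_per_call = AGENT_COST_PER_CALL - AUTOMATED_COST_PER_CALL
    return automated_calls * saving_per_call


if __name__ == "__main__":
    # Hypothetical call center: 1,000,000 calls a year, half of them automated.
    print(f"${annual_savings(1_000_000, 0.5):,.0f} saved per year")
```

The estimate ignores the cost of building and running the speech system, so it is an upper bound on the saving rather than a forecast.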
Call Center Examples • Merrill Lynch: automation rate raised from 82% to 90%; first-year savings of $6.3M • Amtrak: 61% increase in satisfaction; 75% increase in automation rate; 90% increase in ticket sales • Thrifty Car Rental: 40% increase in CSR productivity; $1 million first-year savings
The Business Value of Speech for Operators • Mobile operators need to make money from value-added services! [Chart: revenue in US$M]
Speech Makes Value-Added Services Usable • Calendaring / Email • Location Based • E-Commerce / Alerts • Places: Auto Services • SMS / MMS • Voice Dialing • Search / Browsing
Microsoft Speech Server & SDK [Diagram labels: ASP.NET Speech, Speech SDK] • Extends ASP.NET and Visual Studio • Call center + multimodal solution • Unifies web & call center • Reduces TCO • Introduced in March 2004 • Strong partner ecosystem • Strategic partnerships with Intel, Intervoice, ScanSoft • 200+ Beta Program Applications • 300+ Partner Applications • 30,000 Speech SDK Users
Speech Recognition: Approaching Human Error Rate • Microsoft licensed CMU Sphinx-II • Whisper in MSR • Speech in Office XP • Speech in Tablet/Office 11 • Speech in Longhorn [Chart: recognition error rate at each milestone, with the human error rate as reference]
Text to Speech: Approaching Human Naturalness [Chart: naturalness, with human naturalness as reference]
Technology Adoption Lifecycle • Visionaries: Meeting Transcription • Early Adopters: Dictation • Early Majority: Call Center Automation • Late Majority: Leapfrog • Laggards
What is Leapfrog? $120 $50 $15
Leapfrog’s Technology • Crucial technology • Speech compression technology • Uses a simple touch-sensitive screen with new books and a speech coding chip • Sound business model • >80% “highly recommend” rate on Amazon • Invented by a lawyer trying to teach his 3-year-old child to read!
Technology Adoption Lifecycle: Late Majority, Case of Leapfrog
Summary • Disruptive technology adoption follows the technology adoption lifecycle • Natural language understanding is hard • Domain-free reasoning & common sense hardest • Truly human-level understanding likely elusive • Adoption of speech technology will increase • 2-3 years: telephony, multimodal, accessibility. • 7-10 years: intelligent assistance, meeting search/transcription, speech everywhere.
Acknowledgement • Faculty Forum Speech, 2003, Kai-Fu Lee • Crossing the Chasm, Geoffrey Moore • Inside the Tornado, Geoffrey Moore. Thanks! echang@microsoft.com http://Research.microsoft.com/users/echang