IBM India Research Laboratory Overview with an effort to be in the context of FIRE

An overview of the IBM India Research Laboratory, its focus areas and technical competencies, and of IBM India, which has the second-largest IBM population outside the US.

Presentation Transcript


  1. IBM India Research Laboratory Overview, with an effort to be in the context of FIRE
     Debapriyo Majumdar (debapriyo@in.ibm.com)
     IBM India Research Lab, Bangalore

  2. IBM Research - Overview
     • The largest private research institution in the world
     • Annual R&D budget of around $6B (includes development as well)
     • Over 3,000 researchers
       • Mathematics, Computer Science, Physics, Service Science, …
     • Over 40,000 US patents since 1993
       • Most patents of all companies in the world in the last 15 years
     • Eight labs across the world

  3. IBM Research Labs Worldwide
     • Watson: established 1961 (Columbia University: 1945)
     • Almaden: established 1986 (San Jose: 1952)
     • Zürich: established 1956
     • Haifa: established 1972
     • Tokyo: established 1982
     • Beijing: established 1995
     • Austin: established 1995
     • India (Delhi/Bangalore): established 1998/2005

  4. IBM India and India Research Lab
     • IBM India: the second-largest IBM population outside the US (over 75,000)
       • Current technical population: 40,000+
     • India Research Lab
       • Delhi, since 1998
       • Bangalore, since 2005
       • About 150 technical people
     • Locations shown on the map: Delhi, Kolkata, Mumbai, Pune, Hyderabad, Chennai, Bangalore

  5. IRL Focus Areas
     Business Areas: Service Delivery, Emerging Solutions, Software, Infrastructure Services, Application Services, Contact Center Services, Telecom, Others (Banking, etc.), Systems
     Technical Competencies
     • Computer Science
       • Distributed Systems: systems mgmt., middleware
       • Information Management: IE, data mining
       • Interaction Technologies: speech
       • Programming Technologies: parallel and hi-perf. prog.
       • Software Engineering: model-driven, distributed dev.
     • Math Science
       • Operations Research
       • Algorithms
       • Optimization
       • Game Theory
     • Service Science
       • Service Engineering
       • Service Productivity
       • Service Management
       • Service Quality
       • Service Supply Chains

  6. Why do we care? Services dominate the world’s GDP… (chart panels: Japan, United States, China, India)

  7. IRL Focus Areas
     Business Areas: Service Delivery, Emerging Solutions, Software, Infrastructure Services, Application Services, Contact Center Services, Telecom, Others (Banking, etc.), Systems
     In the context of FIRE
     Technical Competencies
     • Computer Science
       • Distributed Systems: systems mgmt., middleware
       • Information Management: IE, data mining
       • Interaction Technologies: speech
       • Programming Technologies: parallel and hi-perf. prog.
       • Software Engineering: model-driven, distributed dev.
     • Math Science
       • Operations Research
       • Algorithms
       • Optimization
       • Game Theory
     • Service Science
       • Service Engineering
       • Service Productivity
       • Service Management
       • Service Quality
       • Service Supply Chains

  8. Information and Knowledge Management @ IRL
     • Speech recognition and synthesis
       • Hindi, Indian English & Hinglish
     • Translation: Hindi → English and English → Hindi
     • UIMA Annotators (rule based)
       • with IIT-Bombay
     • Linking structured and unstructured data
     • Learning attributes from noisy or incomplete information
       • For example, customer transaction logs
     • More…

  9. Challenges
     • Data
       • Noisy
       • Incomplete
       • Could be ill-structured
     • Problem
       • Defining the problem is often our job too
     • Focus on the application
       • What you build must work
       • Users must be satisfied
     • Firefighting

  10. Some Examples…

  11. Speech - Core Technologies
     • Desktop speech recognition (Hindi & Indian English)
       • More than 1,100 speakers
       • More than 250 hours of broadband speech data
       • Vocabulary of 75,000 words
       • Accuracy: 90-95%
     • Telephony speech recognition
       • 500 speakers each for Hindi, English & Hinglish
       • A prototype of a movie booking system in Indian English

  12. SENSEI: Voice and Accent Training for Call-Centers
     • Challenges
       • Increase in the number of call centers in India
       • Agents need to speak in a foreign accent
       • Very high attrition rates in call centers
       • Hiring involves evaluation and training
     • Solution: Sensei, a tool that is used for:
       • Candidate Screening: evaluates a candidate’s pronunciation, grammar and fluency
       • On-board Training: evaluates correctness of sounds produced, syllable stress, speaking rate and fluency
       • Monitoring: analyzes pre-recorded calls to determine whether the agent maintained the required quality of voice/accent
     • Application: cost reduction by automating accent training and evaluation
     The Hindu, 30 Oct. 2006

  13. Machine Translation: Linguistic & Statistical

  14. English-Hindi Machine Translation system

  15. Speech Recognition in IBM-IRL
     • Nitendra Rajput, “Statistical Language Modeling for Hindi Speech Recognition,” National Symposium on Modelling and Shallow Parsing of Indian Languages, MSPIL 2006.
     • M. Kumar, N. Rajput, A. Verma, “Hybrid Baseform Builder for Phonetic Languages,” International Conference on Intelligent Sensor and Information Processing, Jan 2005, Chennai.
     • Mohit Kumar, Nitendra Rajput, Ashish Verma, “A large-vocabulary continuous speech recognition system for Hindi,” IBM Journal of Research and Development, Vol. 48, No. 5/6, 2004.
     • Nitendra Rajput, L. Venkata Subramaniam, Ashish Verma, “Adapting Phonetic Decision Trees Between Languages for Continuous Speech Recognition,” Proceedings: IEEE International Conference on Spoken Language Processing (ICSLP 2000), Beijing, China, Oct 16-20, 2000.
     • Niloy Mukherjee, Nitendra Rajput, L. Venkata Subramaniam, Ashish Verma, “On Deriving a Phoneme Model for a New Language,” Proceedings: IEEE International Conference on Spoken Language Processing (ICSLP 2000), Beijing, China, Oct 16-20, 2000.
     • Raghavendra Udupa U, Tanveer A. Faruquie, Hemanta K. Maji, “An algorithmic framework for the decoding problem in statistical machine translation,” COLING 2004.
     • R. Udupa and T. Faruquie, “An English-Hindi statistical machine translation system,” in Proceedings of the 1st IJCNLP, Sanya, Hainan Island, China.
     • Tanveer Faruquie, Nitendra Rajput, Vimal Raj, “Improving automatic call classification using machine translation,” IEEE ICASSP 2007, Honolulu, Hawaii, USA, Jan 2007.

  16. EROCS: Entity RecOgnition in Context of Structured data
     Original text: I have noticed in my statement that you have deducted Rs750 from my a/c (#20310284) as account maintenance fee. Can you please explain why you have charged this money?
     Extracted entities and keywords/features:
       CustID: 0205492
       SavingID: 20310284
       Unhappy, Simple Saving A/C
       Complaint: I have noticed in my statement that you have deducted Rs750 from my a/c (#20310284) as account maintenance fee. Can you please explain why you have charged this money?
     Complaint metadata + customer/account data are brought together by automatically linking the complaint with the customer/account.
     Exploit linked information analysis in core business
     • Up-sell/cross-sell, customer segmentation, campaign assessment, churn analysis

  17. EROCS: Entity RecOgnition in Context of Structured data Linkage Discovery
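
The linking idea on the EROCS slides can be illustrated with a small sketch. This is not the EROCS algorithm, which the slides do not describe; it is a minimal exact-match illustration that assumes identifiers appear verbatim in the text. The `ACCOUNTS` table, its second row, and the function names `extract_terms` and `link` are hypothetical; only the first record's values come from the slide.

```python
import re

# Hypothetical structured records. The first row uses the values shown on the
# slide; the second row is invented purely so there is something to rule out.
ACCOUNTS = [
    {"CustID": "0205492", "SavingID": "20310284", "Product": "Simple Saving A/C"},
    {"CustID": "0311877", "SavingID": "20310999", "Product": "Simple Saving A/C"},
]

def extract_terms(text):
    """Return the set of digit runs of length 4 or more found in free text."""
    return set(re.findall(r"\d{4,}", text))

def link(text, records):
    """Return the record sharing the most identifier values with the text,
    or None when no identifier matches (assumption: exact string match)."""
    terms = extract_terms(text)
    best, best_score = None, 0
    for rec in records:
        score = len(terms & {rec["CustID"], rec["SavingID"]})
        if score > best_score:
            best, best_score = rec, score
    return best

complaint = (
    "I have noticed in my statement that you have deducted Rs750 "
    "from my a/c (#20310284) as account maintenance fee."
)
linked = link(complaint, ACCOUNTS)
```

Here `linked` is the record whose `SavingID` appears in the complaint, so the complaint can be analyzed together with that customer's account data. Real text is noisy and incomplete, as the Challenges slide notes, so a practical system would need approximate matching and more context than exact identifiers.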

  18. CallAssist
     • Present relevant transaction data and follow-up questions to the agent within seconds
     • Consistent, high-quality customer experience
     • Reduces agent training cost
     • Reduces privacy concerns
     Transcript of Call
       Customer to Agent: Hi, I am John…. ….. status of a DVD player …. …….
       Agent to Customer: …tell me the brand…?
       Customer to Agent: …… I bought a Sony….

  19. That’s all for now…
     • IBM Research – India
     • Technical areas, applications, services
     • Examples on:
       • Speech-related work…
       • Translation…
       • Information Extraction…
     FIRE: It has been a great start!

  20. Thank you!