Automatic Transcript Generation Helmer Strik A2RT Dept. of Language & Speech University of Nijmegen
Problem & Solution
Problem:
• We have audio from radio & TV
• We need transcripts
Solution:
• ASR: Automatic Speech Recognition
History of ASR It all started more than 100 years ago
History of ASR
• 1870 - Alexander Graham Bell: make speech visible, for the hearing impaired
• 1952 - AT&T Bell Laboratories: 1st ASR - ten English digits
• 2001 - ASR is ‘everywhere’:
  • PC: dictation + ‘Command & Control’
  • mobile phones (hands-free)
  • call centers
  • tap phone calls
First: A/D-conversion
Before ASR: A/D-conversion
Speech (analogue & continuous) → mic. + sound card → WAV file (digital & discrete)
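The A/D step can be sketched in a few lines of Python. This is only an illustration, not the setup used in the talk: the 440 Hz tone stands in for the analogue signal, and the 16 kHz sample rate and 16-bit depth are assumptions (common choices for speech). Sampling makes the signal discrete in time, and rounding to integers makes it discrete in amplitude.

```python
import io
import math
import struct
import wave

SAMPLE_RATE = 16000  # samples per second (assumed, not from the slides)
BITS = 16            # bits per sample (assumed, not from the slides)


def analogue_signal(t):
    """Stand-in for a continuous signal: a 440 Hz tone with amplitude 0.5."""
    return 0.5 * math.sin(2 * math.pi * 440.0 * t)


def digitize(duration_s):
    """Sample the signal at SAMPLE_RATE and quantize each sample to an integer."""
    n_samples = round(duration_s * SAMPLE_RATE)
    max_int = 2 ** (BITS - 1) - 1  # 32767 for 16-bit audio
    return [round(analogue_signal(i / SAMPLE_RATE) * max_int)
            for i in range(n_samples)]


def write_wav(samples, fileobj):
    """Write mono PCM samples to a WAV container (file path or file object)."""
    with wave.open(fileobj, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(BITS // 8)
        w.setframerate(SAMPLE_RATE)
        w.writeframes(struct.pack("<%dh" % len(samples), *samples))


# Usage: 10 ms of "speech" written to an in-memory WAV file.
samples = digitize(0.01)
buf = io.BytesIO()
write_wav(samples, buf)
```

Passing a file name instead of `buf` would write the same data to disk.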
What is ASR?
Answer: conversion from speech to text
X (unknown speech signal) → ASR → W (a string of words)
How: probabilistic approach
Find the W that maximizes P(W|X)
P(W|X) = P(X|W) * P(W) / P(X)
P(X) does not depend on W, so it is enough to maximize P(X|W) * P(W)
• P(W) - language model
• P(X|W) - acoustic model
  • Whole-word models, or
  • Phoneme models + lexicon
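A toy version of this search, assuming the acoustic and language model scores are already available as plain dictionaries. The candidate word strings and all probabilities below are invented for illustration, and `decode` is a hypothetical helper; a real recognizer searches a far larger hypothesis space.

```python
import math


def decode(acoustic_scores, language_model):
    """Return the W that maximizes P(X|W) * P(W).

    P(X) is the same for every W, so it is left out of the comparison.
    Scores are added in the log domain to avoid underflow.
    """
    best_w, best_score = None, -math.inf
    for w, p_x_given_w in acoustic_scores.items():
        p_w = language_model.get(w, 0.0)
        if p_x_given_w <= 0.0 or p_w <= 0.0:
            continue  # an impossible hypothesis can never win
        score = math.log(p_x_given_w) + math.log(p_w)
        if score > best_score:
            best_w, best_score = w, score
    return best_w


# Invented numbers: the acoustics slightly prefer the wrong string,
# but the language model makes the sensible one win.
acoustic = {"recognize speech": 0.30, "wreck a nice beach": 0.35}  # P(X|W)
lm = {"recognize speech": 0.020, "wreck a nice beach": 0.001}      # P(W)

best = decode(acoustic, lm)
print(best)
```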
ASR
ASR =
• Phoneme models (HMMs)
• Lexicon
• Language model
Phoneme models + lexicon give P(X|W); the language model gives P(W)
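The role of the lexicon can be sketched as a lookup from words to phoneme sequences, so that a word string W is turned into the chain of phoneme models that is scored against X. The entries and phoneme symbols below are illustrative assumptions, not taken from the talk, and `words_to_phonemes` is a hypothetical helper.

```python
# Illustrative lexicon: word -> phoneme sequence (symbols are assumed).
LEXICON = {
    "we": ["w", "iy"],
    "need": ["n", "iy", "d"],
    "transcripts": ["t", "r", "ae", "n", "s", "k", "r", "ih", "p", "t", "s"],
}


def words_to_phonemes(words, lexicon):
    """Expand a word sequence into the phoneme sequence given by the lexicon."""
    phonemes = []
    for word in words:
        if word not in lexicon:
            raise KeyError("word not in lexicon: " + word)
        phonemes.extend(lexicon[word])
    return phonemes


phonemes = words_to_phonemes("we need transcripts".split(), LEXICON)
print(phonemes)
```

A word missing from the lexicon cannot be expanded, which is why the sketch raises an error rather than guessing a pronunciation.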
Training
HMMs & LMs are trained:
Speech + manual transcripts (lexicon) → training procedure → ASR:
• HMMs (Hidden Markov Models)
• Language models
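For the language model half of training, a minimal sketch is to count word pairs in the manual transcripts and turn the counts into probabilities. The slides do not say which kind of language model is used; the bigram model, the unsmoothed maximum-likelihood estimate, and the three example transcripts are all assumptions made for illustration. HMM training is not shown.

```python
from collections import Counter, defaultdict


def train_bigram_lm(transcripts):
    """Estimate P(word | previous word) by counting bigrams (no smoothing)."""
    bigram_counts = defaultdict(Counter)
    for line in transcripts:
        # <s> and </s> mark the start and end of each transcript line.
        words = ["<s>"] + line.lower().split() + ["</s>"]
        for prev, cur in zip(words, words[1:]):
            bigram_counts[prev][cur] += 1

    lm = {}
    for prev, counter in bigram_counts.items():
        total = sum(counter.values())
        lm[prev] = {word: count / total for word, count in counter.items()}
    return lm


# Invented transcripts standing in for manually transcribed broadcasts.
transcripts = [
    "the news at eight",
    "the weather at nine",
    "the news at nine",
]
lm = train_bigram_lm(transcripts)
print(lm["the"])
```

Without smoothing, any word pair that never occurs in the transcripts gets probability zero, so a practical system would need more data and some form of smoothing.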
Decoding
Automatic Transcript Generation:
X (unknown speech signal) → ASR → W (the automatic transcripts)