Topic: Tuning the Labels
Kishore Prahallad
Email: skishore@cs.cmu.edu
Carnegie Mellon University & International Institute of Information Technology Hyderabad
Speech Technology - Kishore Prahallad (skishore@cs.cmu.edu)
Objective of this Lecture
• To tune the labels (phone boundaries) to get better-quality output
Better Labels
• Research:
  • Automatic segmentation models such as HMMs or neural networks could be tuned to obtain better labels.
• Practical:
  • Use an existing state-of-the-art speech segmentation algorithm.
  • Manually verify and correct the misaligned labels.
  • For small databases, manual correction is more apt.
  • Emulabel is the tool best suited for this purpose.
Install Emulabel
Step 1: Untar the package (see the course web site for the Emulabel package)
  $ tar xvfz EMULABEL.tar.gz
  $ cd EMULABEL
  $ ls    (type ls to see the contents)
Step 2: Untar emu-linux
  $ tar xvfz emu-linux-1.4.2.tar.gz
  $ cd emulabel
Step 3: Log in as root to install Emulabel
  $ su ....
  # ./doinstall.sh    (install Emulabel)
Step 4: Go back to the EMULABEL directory
  # cd ..
Step 5: Install the Tcl/Tk versions ....
  # rpm -i tcl-8.0.5-35.i386.rpm --force
  # rpm -i tk-8.0.5-35.i386.rpm --force
  # rpm -i tclx-8.0.5-35.i386.rpm --force
Step 6: Check whether it runs
  # exit    (come out of the root login)
Step 7: Go to the voice directory
  $ emulabel etc/emu_lab    (this command should invoke the Emulabel GUI)
Emulabel invoked…
Step 1
Press return
Step 2
A list of wave files appears here. Select one to label.
Step 3
Wave files and the red labels appear
Step 3
Move these red markers to move the boundaries
Step 3
Listen to the red-marked region by right-clicking the mouse
Manual Labels
• Save the manually corrected labels.
• Labels are stored in the lab/ directory.
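To review what a correction session actually changed, the saved labels can be compared against a copy made beforehand. The following is a minimal sketch, not part of the lecture material. It assumes the label files use the xlabel-style text layout (an optional header terminated by a line containing only "#", then rows of "end_time color phone"), and that the original labels were backed up before editing. The function names parse_labels and boundary_shifts and the 5 ms threshold are hypothetical choices made for illustration.

```python
from typing import List, Tuple

Label = Tuple[float, str]  # (segment end time in seconds, phone name)


def parse_labels(text: str) -> List[Label]:
    """Parse xlabel-style label text (assumed format, see lead-in)."""
    lines = text.splitlines()
    # Skip the header if a line consisting only of '#' is present.
    for i, line in enumerate(lines):
        if line.strip() == "#":
            lines = lines[i + 1:]
            break
    labels = []
    for line in lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # ignore blank or malformed rows
        labels.append((float(fields[0]), fields[2]))
    return labels


def boundary_shifts(before: List[Label], after: List[Label],
                    min_shift: float = 0.005) -> List[Tuple[str, float, float]]:
    """Return (phone, old_end, new_end) for boundaries moved by at least min_shift seconds."""
    if [name for _, name in before] != [name for _, name in after]:
        raise ValueError("phone sequences differ; boundaries cannot be compared")
    return [(name, t0, t1)
            for (t0, name), (t1, _) in zip(before, after)
            if abs(t1 - t0) >= min_shift]


if __name__ == "__main__":
    # Inline sample data; in practice, read the backed-up and corrected files instead.
    original = "separator ;\nnfields 1\n#\n 0.2200 125 pau\n 0.3100 125 k\n 0.4500 125 ae\n"
    corrected = "separator ;\nnfields 1\n#\n 0.2200 125 pau\n 0.3350 125 k\n 0.4500 125 ae\n"
    for name, t0, t1 in boundary_shifts(parse_labels(original), parse_labels(corrected)):
        print(f"{name}: {t0:.4f} -> {t1:.4f} ({(t1 - t0) * 1000:+.1f} ms)")
```

Running this over each utterance gives a list of the boundaries that were moved, which is a convenient starting point when deciding which utterances to resynthesize and listen to.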
Evaluation
• Compare the voice samples synthesized before labeling vs. after labeling
Additional Reading for the lecture
• http://festvox.org
• 11-752 CMU Course Lecture Notes
• http://festvox.org/festtut/notes/festtut_toc.html
• http://festvox.org/bsv/bsv-pitchmarks-sect.html