
Building a Catalan diphone voice

Ariadna Font Llitjos, May 10, 2001


Presentation Transcript


  1. Building a Catalan diphone voice. Ariadna Font Llitjos, May 10, 2001

  2. Defining the phoneset
     • Most Catalan phones (34) plus 2 Spanish phones (th and jj)
     • Reason: all Catalan speakers also have the Spanish phones, and many words borrowed from Spanish are in most Catalan speakers' lexicons
     • Left out phones that need a much finer classification than the one made for English phones (beta, gamma, etc.)

  3. Generating Diphone Schema
     • Mostly the same as Spanish, but with the new set of phones
     • Catalan has 8 vowels (not counting stress), whereas Spanish has only 5, so a level had to be added to vheight (high, mid-high, mid-low, low) (draw graph on the board)
     • Mapping Catalan phones to a predefined set of phones
     • Overgenerative: the schema includes phone combinations that are not legal in Catalan, which makes the voice better suited to pronouncing foreign or nonsense words built from Catalan phones
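The overgeneration point can be illustrated with a minimal Python sketch. It is not the festvox schema code: the phone symbols and the tiny phone subset below are assumptions chosen for illustration, and only the four vheight levels come from the slide. Every ordered phone pair is emitted with no phonotactic filtering, which is why illegal combinations end up in the diphone list.

```python
from itertools import product

# Hypothetical subset of the phone set (the real voice uses 36 phones).
# Each vowel is tagged with one of the four vheight levels from the slide.
VOWELS = {
    "i": "high",
    "e": "mid-high",
    "E": "mid-low",
    "a": "low",
}
CONSONANTS = ["p", "t", "k", "s", "l"]
SILENCE = "pau"


def diphone_list(vowels, consonants, silence):
    """Return every ordered phone pair, with no phonotactic filtering."""
    phones = list(vowels) + list(consonants) + [silence]
    return [
        f"{left}-{right}"
        for left, right in product(phones, repeat=2)
        # Silence followed by silence is not a useful diphone.
        if not (left == silence and right == silence)
    ]


diphones = diphone_list(VOWELS, CONSONANTS, SILENCE)
print(len(diphones), "diphones from", len(VOWELS) + len(CONSONANTS) + 1, "phones")
```

Because the list grows roughly with the square of the phone count, adding the extra Catalan vowels increases the number of prompts noticeably compared with a 5-vowel inventory.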

  4. Mapping Catalan phones to a predefined set of phones
     • Options: Spanish and English
     • My choice: English
     • Reasons:
       • English has more vowel phones, so it is a more appropriate target than Spanish
       • Spanish phones have already been mapped to English phones, so it is better to map the Catalan phones directly to English than indirectly through Spanish
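A small Python sketch of why the direct mapping is attractive. All three tables are hypothetical and cover vowels only; the symbols are illustrative and are not taken from the actual voice. Routing the 8 Catalan vowels through a 5-vowel Spanish set collapses distinctions before they ever reach the English phones.

```python
# Hypothetical vowel tables, for illustration only.
CATALAN_TO_SPANISH = {
    "a": "a", "ax": "a",
    "e": "e", "E": "e",
    "i": "i",
    "o": "o", "O": "o",
    "u": "u",
}
SPANISH_TO_ENGLISH = {"a": "aa", "e": "eh", "i": "iy", "o": "ow", "u": "uw"}
CATALAN_TO_ENGLISH = {
    "a": "aa", "ax": "ax",
    "e": "ey", "E": "eh",
    "i": "iy",
    "o": "ow", "O": "ao",
    "u": "uw",
}


def compose(first, second):
    """Chain two phone mappings: source -> intermediate -> target."""
    return {source: second[middle] for source, middle in first.items()}


indirect = compose(CATALAN_TO_SPANISH, SPANISH_TO_ENGLISH)
distinct_direct = len(set(CATALAN_TO_ENGLISH.values()))
distinct_indirect = len(set(indirect.values()))
print("direct mapping keeps", distinct_direct, "distinct vowels")
print("mapping via Spanish keeps", distinct_indirect, "distinct vowels")
```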

  5. Generating and recording the prompts
     • 1109 prompts (recorded on festvox0)
     • Lots of room noise (typing, door, talking, etc.)
     • Microphone not always in the same position
     • Power, and even intonation and duration, varied over the course of the recording process

  6. Labeling nonsense words
     • Automatically:
       • make_labs
       • make_diph_index
     • Manually:
       • Find a set of diphones that are wrong and look them up in dic/afldiph.est
       • Edit and correct the corresponding file with emulabel
       • Rerun make_diph_index (etc.)

  7. Extracting pitchmarks and LPC coefficients
     • Automatically:
       • make_pm_wav (edit to modify the pitch range for the speaker)
       • find_powerfactors (reports the overall power differences between files and calculates a table of power modifiers, one for each file)
       • make_lpc
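The idea of a per-file power modifier table can be sketched in Python as follows. This is not the algorithm find_powerfactors uses; it is a simplified stand-in that assumes RMS level as the power measure and the mean RMS across files as the target. The file names and sample values are made up.

```python
import math


def rms(samples):
    """Root-mean-square level of a sequence of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def power_factors(recordings):
    """Map each file name to the gain that brings its RMS to the mean RMS.

    Assumes every recording is non-empty and not pure silence.
    """
    levels = {name: rms(samples) for name, samples in recordings.items()}
    target = sum(levels.values()) / len(levels)
    return {name: target / level for name, level in levels.items()}


# Hypothetical recordings: the second was made further from the microphone.
recordings = {
    "afl_0001": [0.5, -0.5, 0.5, -0.5],
    "afl_0002": [0.25, -0.25, 0.25, -0.25],
}
factors = power_factors(recordings)
for name, factor in sorted(factors.items()):
    print(name, round(factor, 3))
```

A factor above 1 marks a file that was recorded more quietly than average, which is the kind of drift described for the recording sessions (microphone position, varying power).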

  8. Testing phone synthesis
     • (SayPhones '(pau o l a pau s o k l a r i a d n a pau))
     • Catalan voice
     • Spanish voice
     • English voice (modifying the phones)

  9. Catalan voice is still quite bad
     • Bad example
     • But it does have a basic Spanish phone… and without it, it would sound like this
     • And here is how kal_diphone sounds

  10. Added tokenization
     • To be able to say numbers in Catalan (modeled on the Spanish tokenizer)
     • Show file
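As a rough illustration of what number tokenization has to do, here is a Python sketch that expands the digits 0 to 99 into Catalan words. It is not the tokenizer from the voice (that one follows the Spanish tokenizer in Festival); the function names are hypothetical, and the range is limited to keep the example short.

```python
# Cardinal forms as read aloud in counting ("u" rather than "un").
UNITS = [
    "zero", "u", "dos", "tres", "quatre", "cinc", "sis", "set", "vuit", "nou",
    "deu", "onze", "dotze", "tretze", "catorze", "quinze", "setze",
    "disset", "divuit", "dinou",
]
TENS = {
    2: "vint", 3: "trenta", 4: "quaranta", 5: "cinquanta",
    6: "seixanta", 7: "setanta", 8: "vuitanta", 9: "noranta",
}


def catalan_number(n):
    """Spell out an integer from 0 to 99 in Catalan."""
    if not 0 <= n <= 99:
        raise ValueError("only 0-99 is covered in this sketch")
    if n < 20:
        return UNITS[n]
    tens, unit = divmod(n, 10)
    if unit == 0:
        return TENS[tens]
    # The twenties are joined with "-i-"; higher tens use a plain hyphen.
    joiner = "-i-" if tens == 2 else "-"
    return TENS[tens] + joiner + UNITS[unit]


def expand_tokens(text):
    """Replace one- and two-digit tokens with words; leave the rest as-is."""
    words = []
    for token in text.split():
        if token.isascii() and token.isdigit() and len(token) <= 2:
            words.append(catalan_number(int(token)))
        else:
            words.append(token)
    return " ".join(words)


print(expand_tokens("tinc 42 llibres"))
```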

  11. Added some lexical entries
     • Letters of the alphabet, symbols, punctuation, some content words…

  12. Phrasing, duration and intonation
     • Not there yet
     • Nor can I get it to SayText

  13. Summary: building a diphone voice
     • Define phoneset
     • Generate diphone schema
     • Generate prompts
     • Record prompts
     • Label prompts
     • Extract pitchmarks and LPC coefficients
     • Test phone synthesis
     • Hand correct labels
     • Add tokenizer
     • Add lexicon
     • Add prosody, durations and intonation
     • Test and evaluate voice
     • Package for distribution
