Realtime Voicewriting Education. 2005 INTERSTENO CONGRESS VIENNA, AUSTRIA Presented by Phillip A. Kaufman, CRI.
Just about anyone who fluently speaks English, German, Italian, French, Spanish, Dutch or Japanese can use IBM ViaVoice or Dragon NaturallySpeaking to produce short documents. But there are several significant differences between the common use of these applications and realtime voicewriting. Let’s examine those differences.
The average user dictates fairly slowly at his/her own pace, pausing to compose a thought, then dictating, then pausing to compose another thought, then dictating, et cetera. The realtime voicewriter repeats everything that is said by individuals in a colloquy or an individual giving a soliloquy, and has no control over what is said or the speed at which it is said.
The average user is likely to use ViaVoice or NatSpeak simultaneously for computer command and control, websurfing, e-mail and dictating documents. The realtime voicewriter’s sole use of the speech recognition engine when voicewriting is the instantaneous transcription of the dialogue or monologue of others.
Most average users dictate short documents. Realtime voicewriters typically dictate nonstop for at least 30 minutes, and often for several hours.
The average user is usually the only person who sees the rough draft of what he or she has dictated. Some errors are okay because the document will be corrected before others read it. The realtime voicewriter’s text is often viewed by other people as it is produced; therefore, accuracy must be near perfect.
The average user may take his time dictating long strings of commands and punctuation, or go back and format later. The realtime voicewriter cannot pause to think about formatting. He applies specialized dictation and software techniques to compress what must be dictated, and inserts those short utterances at the right moments to produce all required formatting on the fly.
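The idea of compressing formatting into short utterances can be sketched roughly as follows. This is only an illustration, not how ViaVoice, NaturallySpeaking or any CAT package implements macros: the spoken tokens ("quex", "anser", "peerk", "kwerk"), the `MACROS` table and the `expand` function are all hypothetical names invented for this example.

```python
# Hypothetical short utterances mapped to the formatting they stand for.
# None of these tokens come from a real product; they are assumptions.
MACROS = {
    "quex": "\nQ.  ",   # start a new question line
    "anser": "\nA.  ",  # start a new answer line
    "peerk": ". ",      # period
    "kwerk": "? ",      # question mark
}


def expand(recognized_words):
    """Turn a stream of recognized words into formatted text."""
    text = ""
    for word in recognized_words:
        if word in MACROS:
            # Drop the space left by the previous word, then add formatting.
            text = text.rstrip(" ") + MACROS[word]
        else:
            text += word + " "
    return text.strip()


dictated = ["quex", "where", "were", "you", "kwerk", "anser", "home", "peerk"]
print(expand(dictated))
```

One short utterance replaces what would otherwise be a spoken string such as "new line, Q, period, space, space", which is the point of the compression.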
The average user can dictate sloppily and obtain acceptable results. The realtime voicewriter must learn and practice diligently to develop extremely good enunciation and diction along with control of his voice and breathing.
The average user can use an average computer and is often unaware of performance and maintenance concerns. The realtime voicewriter must have a high-end computer, and know how to perform adjustments and maintenance to obtain the best and most consistent performance from that computer.
The average user typically uses speech recognition for the same type of subject matter. Most realtime voicewriters encounter a very wide variety of subject matters and must have a very broad knowledge base.
And the list could go on and on… There is a lot to learn. It is not magic.
It is through mastery of
• Rapid, clear dictation and specialized dictation techniques
• Computer operation, performance and maintenance as they relate to speech recognition
• The application of specialized, structured principles of speech recognition software manipulation
• CAT software tools
that an individual becomes capable of realtime voicewriting.
EDUCATION IS KEY
• WELL-STRUCTURED AND COMPREHENSIVE
• BEST PROVIDED THROUGH SCHOOLS
Most students need to be led to the water and coaxed to drink
• THE TEACHER’S INFLUENCE IS AS VITAL AS THE TEXTBOOKS, CURRICULUM AND OTHER RESOURCES
• THERE IS A LOT TO LEARN IN ORDER TO MASTER REALTIME VOICEWRITING
Typical, uneducated preparation and/or underdeveloped training techniques produce less than optimal results.
MOST TRAINING PROVIDED TO DATE HAS BEEN FROM NOTHING MORE THAN SOFTWARE MANUALS
What would a well-developed realtime voicewriting curriculum cover and how would it be structured?
• Introduction to realtime voicewriting equipment
• Dictation practice equipment and audio sources
• Introduction to developing dictation skills (dictation skill development continues throughout the course and includes over 40 specially developed exercises and a structured set of speed drills)
• The history of realtime voicewriting
• Computer operating system basic knowledge for realtime voicewriting
• Basic computer performance considerations for realtime voicewriting
• How speech recognition works
• An overview of realtime voicewriting career options
• Intermediate dictation skills (speakers, punctuation and macros)
• The average user’s speech recognition experience (creating a set of user files like everyone else does)
• Word processor basics
• Three preparation steps before creating realtime voicewriting user files
• The vocabulary tools in NaturallySpeaking
• The Voice-Ed Foundation/Specialization Vocabulary theory in NaturallySpeaking
• Creating a structured set of realtime voicewriting files in NaturallySpeaking
• Voice-Ed golden rules of user file care and re-creation
• Improving recognition with the correction tool in NaturallySpeaking
• Improving recognition with the Vocabulary Editor in NaturallySpeaking
• Other accuracy-improving techniques in DNS
• Advanced computer maintenance for realtime voicewriting
• Understanding and troubleshooting computer audio for speech recognition
• Creation of a better set of DNS user files
• The vocabulary tools in IBM ViaVoice
• Creating a set of user files in ViaVoice
• Improving recognition with the correction tool in ViaVoice
• Improving recognition with the Macro Editor in ViaVoice
• Realtime voicewriting with CAT software
• Correcting and editing in the CAT software
• CAT system globals for realtime voicewriting
• Vocal health and overcoming colds and other problems with your voice
• Specialized vocabularies
• Dictation skills for obtaining accuracy with fast talkers
• Digital room audio recording for court reporting
• Dictation and software techniques for court reporting
• Dictation and software techniques for captioning
• Captioning academics for realtime voicewriters
• Dictation and software techniques for CART
• CART academics for realtime voicewriters
Academic classes that apply to all career specialties (grammar, punctuation, computer basics, terminology, et cetera) should be taught simultaneously with the basic realtime voicewriting classes.
A MODERATE LEVEL OF COMPETENCY IN BASIC REALTIME VOICEWRITING SHOULD BE ACHIEVED BEFORE FOCUSING ON A SPECIALIZATION
• CART
• Captioning
• Court Reporting
• Medical Transcription
SOME CAREERS DO NOT REQUIRE SPECIAL ACADEMICS
• Telephone and Internet Transcription Services
• Other corporate realtime transcription venues
Teachers would need to have:
• a laptop or desktop computer equivalent to that on which the students will learn
• one or both speech recognition engines
• a speech silencer mask and, optionally, an open-mic headset
• a USB speech processor
• instructors’ copies of all course books and materials
• speed building dictation tapes
• course outlines, syllabi and lesson plans
• a set of tests and quizzes
• the CAT software chosen by the school
• preparatory training
Students would need to have:
• a speech silencer mask and, optionally, an open-mic, sound-isolating headset
• one or two cassette recorder/players
• a high-end laptop computer
• ViaVoice Pro USB or ViaVoice and NatSpeak Pro
• a USB speech processor
• textbooks, audio exercises and other materials
• speed building dictation tapes, CDs and/or online practice dictation recordings
• CAT software, student or full version (not necessary to begin studying; it can be added and learned late in the program)
WILL UNMANNED, AUTOMATED SPEECH RECOGNITION EVER REPLACE STENOTYPISTS AND REALTIME VOICEWRITERS?
It’s not likely, at least not within the next few decades. Why? Because a few things take a human’s input to accomplish, and those things are almost certain to be required for some time into the future in order to obtain the most accurate realtime transcription. These include:
• Instantaneous insertion of punctuation at the appropriate places.
• Speaker identification with automated formatting thereof.
• Homonym and other forms of conflict resolution. The speech recognition engines handle many of these very well because of grammatical modeling, but a speech recognition engine can only function on probabilities, while the human mind can discern countless variables.
• There will always be someone participating in a discourse who mumbles or does not speak loudly or clearly enough for the microphones and computer to pick up his/her speech and convert it to text accurately.
• People often talk over each other. It would be almost impossible for a speech recognition system to separate out the words of each speaker.
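To make the remark about probabilities concrete, here is a toy sketch of how a statistical model might choose between homonyms using only the preceding word. It is an illustration under stated assumptions, not the method used by ViaVoice or NaturallySpeaking: the word-pair counts are invented, and `BIGRAM_COUNTS` and `pick_homonym` are hypothetical names.

```python
# Invented counts of how often one word follows another (assumption:
# a real engine would learn far richer statistics from large text corpora).
BIGRAM_COUNTS = {
    ("over", "there"): 9,
    ("over", "their"): 1,
    ("lost", "their"): 8,
    ("lost", "there"): 1,
}


def pick_homonym(previous_word, candidates):
    """Return the candidate seen most often after previous_word."""
    return max(
        candidates,
        key=lambda word: BIGRAM_COUNTS.get((previous_word, word), 0),
    )


sound_alikes = ["their", "there", "they're"]
print(pick_homonym("over", sound_alikes))
print(pick_homonym("lost", sound_alikes))
```

The model picks whatever was most frequent in its counts. When the preceding word has never been seen, every candidate scores zero and the choice is arbitrary, which is exactly the kind of case where a human listener who understands the context does better.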
Certainly, speech recognition will improve year after year. But the original pioneers and experts in the field of speech recognition engine development say that they do not predict any major breakthroughs in the next couple of decades, and that the improvements will be incrementally smaller and smaller. The uses of speech recognition are certain to grow rapidly. It will probably be everywhere. But accurate realtime transcription of multiple speakers is a special use of speech recognition. HAL won’t be taking over our jobs in our lifetimes.
REALTIME VOICEWRITING: REAL OPPORTUNITIES FOR STUDENTS, TEACHERS, SCHOOLS, THE INDUSTRY… AND SERVICES FOR THE WORLD