Sound (some material from Heim, Chap 13)
Learning outcomes • Describe the basics of human hearing • Explain the difference between visual and auditory interaction • Describe the classes and subclasses of sound output and the attributes of each • Describe the classes and subclasses of sound input and recognition and the attributes of each
Hearing • Provides information about the environment: distances, directions, objects etc. • Physical apparatus: • outer ear – protects the inner ear and amplifies sound • middle ear – transmits sound waves as vibrations to the inner ear • inner ear – chemical transmitters are released and cause impulses in the auditory nerve • Sound • pitch – sound frequency • loudness – amplitude • timbre – type or quality
Sound is vibration http://www.hsc.csu.edu.au/ipt/mm_systems/3288/digitising_sound_answers.htm
Timbre is harmonic structure • A sine wave is all energy on the ‘first harmonic’ or ‘fundamental’ frequency (sounds like O) • Other shapes of sound wave come from a distribution of energy into other multiples of the fundamental http://hyperphysics.phy-astr.gsu.edu/hbase/audio/geowv.html http://www.sfu.ca/sonic-studio/handbook/Triangle_Wave.html
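The idea that a wave shape is a distribution of energy over multiples of the fundamental can be sketched in a few lines of Python. This is an illustrative sketch only: the function names are made up for this example, and the harmonic weights are the standard Fourier series coefficients for ideal square and triangle waves, not something taken from the slides.

```python
import math

def harmonic_wave(t, fundamental_hz, amplitudes):
    """Sample a wave at time t (seconds) as a sum of sine harmonics.

    amplitudes[k - 1] is the weight of harmonic k, i.e. the component
    at k times the fundamental frequency.
    """
    return sum(
        a * math.sin(2 * math.pi * k * fundamental_hz * t)
        for k, a in enumerate(amplitudes, start=1)
    )

def square_amplitudes(n_harmonics):
    # Ideal square wave: odd harmonics only, weight 4 / (pi * k).
    return [4 / (math.pi * k) if k % 2 == 1 else 0.0
            for k in range(1, n_harmonics + 1)]

def triangle_amplitudes(n_harmonics):
    # Ideal triangle wave: odd harmonics only, weight 8 / (pi^2 * k^2),
    # with the sign alternating between successive odd harmonics.
    weights = []
    for k in range(1, n_harmonics + 1):
        if k % 2 == 1:
            sign = 1 if (k - 1) // 2 % 2 == 0 else -1
            weights.append(sign * 8 / (math.pi ** 2 * k ** 2))
        else:
            weights.append(0.0)
    return weights

# A pure sine wave puts all of its energy on the first harmonic.
SINE = [1.0]
```

Calling `harmonic_wave(t, 440.0, SINE)` gives a pure tone, while passing `square_amplitudes(n)` or `triangle_amplitudes(n)` at the same fundamental gives the same pitch with a different timbre. Adding more harmonics brings the sum closer to the ideal shape.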
Hearing (cont) • Humans can hear frequencies from about 20 Hz to 15 kHz (up to roughly 20 kHz in young listeners) • less accurate at distinguishing high frequencies than low • Higher frequencies disappear as you get older • Auditory system filters sounds • can attend to sounds over background noise • for example, the cocktail party phenomenon • Hearing aids disrupt this filtering • Hearing is involuntary • A sudden sound ‘grabs’ attention before we think • And some sounds are harder to ignore (e.g. baby crying) • ‘Listening’ is (largely) voluntary • We choose whether to process the meaning, especially if the sound is language (although something like hearing your name is pretty much involuntary)
What if… • You are in a noisy environment • Night clubbing • Phone call/text message? • Your hearing is below average • You are deaf
Sound versus Visual • “Sound exists in time and over space, vision exists in space and over time.” (Gaver, 1989) • Sound is only there when it is playing/made • Vision is there until it is replaced
Sound Interaction • Computer Output/Generation (input to human) • Non speech • Music • Auditory Icons and Earcons • Speech • Computer Input/Recognition • Speech • Non speech • Environmental • Music
Computer Output: Music • Can be pre-recorded or generated • Movies • Games • Immersive experiences • Activates your brain in a different way from language • Acts almost entirely independently from hand-to-eye processing
Generating music • Exciting area for artists • Everything from pseudo-real to completely abstract • There are jazz music generators that only skilled listeners can tell apart from actual musicians • Serato – DJ software (www.serato.com) • Auckland company doing fantastic things • Several UOA grads there
Auditory Icons and Earcons • The difference between these two is subtle • Auditory icons: emphasis on ‘natural’ sounds and metaphor with the real world • e.g. the sound of filling a bottle with water to match moving a large file • Earcons: ‘artificial’ (generated) sounds • e.g. a more abstract metaphorical relationship to the action, or purely a convention (like corporate colour schemes) • Examples (Windows): hardware fail, insert, remove
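Because earcons are generated rather than recorded, a pair of them can be sketched with nothing but the Python standard library. This is a minimal sketch under stated assumptions: the rising motif for "inserted" and falling motif for "removed" are an illustrative convention chosen here (they are not the actual Windows sounds), and the names `tone`, `earcon` and `write_wav` are hypothetical helpers.

```python
import math
import struct
import wave

SAMPLE_RATE = 44100  # samples per second (assumed; a common audio rate)

def tone(freq_hz, millis, volume=0.5):
    """Return 16-bit samples for one sine note with a linear fade-out."""
    n = SAMPLE_RATE * millis // 1000
    samples = []
    for i in range(n):
        envelope = 1.0 - i / n  # fade out to avoid a click at the note end
        value = volume * envelope * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE)
        samples.append(int(value * 32767))
    return samples

def earcon(note_freqs, note_millis=120):
    """Concatenate short notes into one motif."""
    samples = []
    for freq in note_freqs:
        samples.extend(tone(freq, note_millis))
    return samples

def write_wav(path, samples):
    """Save mono 16-bit samples as a WAV file."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes = 16 bits per sample
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(struct.pack("<%dh" % len(samples), *samples))

# Illustrative convention: pitch goes up on insert, down on remove.
INSERTED = earcon([440.0, 660.0])
REMOVED = earcon([660.0, 440.0])
```

Calling `write_wav("inserted.wav", INSERTED)` produces a file that can be played back. The two motifs share the same notes, so they sound related, but the direction of the pitch change carries the meaning, which is the kind of learned convention that distinguishes an earcon from an auditory icon.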
Auditory Icons and Earcons • Redundant Encoding • It aids memory by adding additional associations. • Can alert without interrupting (well, at least leaves the visual field clear) • An alternative communications channel. • Positive/Negative Feedback • Auditory alarms might be crucial to the safe operation of computer-operated machinery or mission-critical environments • Too many alarms • Annoying • Ignored
Using Sound in Interaction Design • Learnability of the mapping between the icon and the object represented • “Oink” and “bow wow” have high articulatory directness (low distance between ‘appearance’ and function [or denotation]) • A swishing sound accompanying a paintbrush tool also has high articulatory directness • A system beep, on the other hand, carries no information about what it denotes (but we may quickly learn to associate it with an error; and its square wave structure is somewhat unpleasant, so it suits an error better than feedback on success)
Can you remember earcons? • How many? • How often do you hear them? • Can you intuitively tell what these mean? • Examples: On, Off, Sleep, Mis-recognized, Disambiguate
Speech Output • Eyes-free operation • Alternative output channel • Good for checking your essays • Navigation is hard • Backtracking • Finding the location of a particular thing
Speech Output • Recorded • Menu choices for telephone systems • Books or other multimedia experiences • Generated (‘text-to-speech’, TTS) • Synthesizer built into Office • See http://office.microsoft.com/en-nz/powerpoint-help/using-the-speak-text-to-speech-feature-HA102066711.aspx • Google Translate has a nice one too (better, I think) • Can give pronunciation rules (the Google one sounds British to me, see also http://www.bell-labs.com/project/tts/sable.html) • Still sound a little artificial • Best synthesizers have a physical model of the tongue and breath to give natural flow between phonemes
Sound Input • Speech • Environmental • Music
Speech Recognition • Two distinct applications: • Transaction • Transcription • Transaction • Telephone menu systems • Choose from a limited number of options, works OK • Automatic speech recognition (ASR) • Built into operating systems • Siri (iPhone) and Android are more or less usable • This is a triumph of Artificial Intelligence • Very difficult, ongoing research problem • Not just about recognizing phonemes but also finding the ‘right’ interpretation (helped e.g. by statistical word-triple (trigram) frequencies, but better if the AI is ‘deeper’)
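How word-triple frequencies help pick the ‘right’ interpretation can be sketched with a toy trigram model. Everything here is an assumption made for illustration: the four-sentence corpus, the candidate transcriptions, the function names, and the use of add-one smoothing. Real recognizers combine far larger language models with acoustic scores.

```python
from collections import Counter
import math

def trigrams(words):
    """Yield word triples, padding so the first and last words have context."""
    padded = ["<s>", "<s>"] + words + ["</s>"]
    return [tuple(padded[i:i + 3]) for i in range(len(padded) - 2)]

def train(sentences):
    """Count triples and their two-word contexts in a (tiny, made-up) corpus."""
    tri, context, vocab = Counter(), Counter(), set()
    for sentence in sentences:
        words = sentence.lower().split()
        vocab.update(words)
        for t in trigrams(words):
            tri[t] += 1
            context[t[:2]] += 1
    return tri, context, len(vocab) + 1  # +1 for the end-of-sentence marker

def log_prob(model, sentence):
    """Log-probability of a sentence with add-one smoothing."""
    tri, context, vocab_size = model
    return sum(
        math.log((tri[t] + 1) / (context[t[:2]] + vocab_size))
        for t in trigrams(sentence.lower().split())
    )

def best(model, candidates):
    """Pick the candidate transcription the model finds most likely."""
    return max(candidates, key=lambda c: log_prob(model, c))

CORPUS = [
    "we went to their house",
    "they went to their car",
    "put it over there",
    "there is a house",
]
MODEL = train(CORPUS)
# "their" and "there" sound the same, so only word context can separate them.
CHOICE = best(MODEL, ["i went to there house", "i went to their house"])
```

With this corpus `CHOICE` is "i went to their house", because the triple "went to their" has been seen and "went to there" has not. The model knows nothing about meaning, which is the sense in which a ‘deeper’ AI could do better.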
Searching Speech and Audio • Sound files do not afford easy opportunities for indexing and searching • Speech recognition can be used to transcribe speech files and create transcripts that can be searched like any other text file • So long as recognition accuracy is OK, which it isn't at the moment • Tune identification apps • Hum a bit of the tune and it tells you what it is! (e.g. SoundHound)
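Once speech has been transcribed, searching it is ordinary text indexing. The sketch below assumes the recognizer has already produced time-stamped transcript segments. The file names, segment texts and function names are all hypothetical, and recognition errors are ignored.

```python
from collections import defaultdict

# Hypothetical recognizer output: file name -> list of (start_seconds, text).
TRANSCRIPTS = {
    "lecture1.wav": [
        (0.0, "welcome to the lecture on sound"),
        (12.5, "hearing is involuntary"),
    ],
    "lecture2.wav": [
        (3.0, "speech recognition is a hard problem"),
        (40.0, "sound files are hard to search"),
    ],
}

def build_index(transcripts):
    """Map each word to the (file, start time) of every segment containing it."""
    index = defaultdict(list)
    for filename, segments in transcripts.items():
        for start_seconds, text in segments:
            for word in set(text.lower().split()):
                index[word].append((filename, start_seconds))
    return index

def search(index, query):
    """Return segments that contain every word of the query, in sorted order."""
    words = query.lower().split()
    if not words:
        return []
    hits = set(index.get(words[0], []))
    for word in words[1:]:
        hits &= set(index.get(word, []))
    return sorted(hits)

INDEX = build_index(TRANSCRIPTS)
```

Here `search(INDEX, "hard problem")` returns `[("lecture2.wav", 3.0)]`, so a player could jump straight to three seconds into that file. Any word the recognizer got wrong simply cannot be found, which is why recognition accuracy limits this approach.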
Summary • Describe the basics of human hearing • Explain the difference between visual and auditory interaction • Sound is transitory • Describe the classes and subclasses of sound output and the attributes of each • Non speech • Music • Earcons • Speech • Describe the classes and subclasses of sound input and recognition and the attributes of each • Speech • Transaction • Transcription