Sound

Some material from Heim, Chap. 13; sound examples on YouTube

Learning outcomes • Describe the basics of human hearing • Explain the difference between visual and auditory interaction • Describe the classes and subclasses of sound output and the attributes of each • Describe the classes and subclasses of sound input and recognition and the attributes of each

Hearing • Provides information about the environment: distances, directions, objects, etc. • Physical apparatus: • outer ear – protects the inner ear and amplifies sound • middle ear – transmits sound waves as vibrations to the inner ear • inner ear – chemical transmitters are released and cause impulses in the auditory nerve • Sound • pitch – sound frequency • loudness – amplitude • timbre – type or quality

Sound is vibration http://www.hsc.csu.edu.au/ipt/mm_systems/3288/digitising_sound_answers.htm

Timbre is harmonic structure • A sine wave is all energy on the ‘first harmonic’ or ‘fundamental’ frequency (sounds like O) • Other shapes of sound wave come from a distribution of energy into other multiples of the fundamental http://hyperphysics.phy-astr.gsu.edu/hbase/audio/geowv.html http://www.sfu.ca/sonic-studio/handbook/Triangle_Wave.html
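The idea that timbre comes from how energy is spread over multiples of the fundamental can be sketched with additive synthesis. This is only an illustration: the sample rate, the 200 Hz fundamental, and the names `additive_wave`, `pure` and `squareish` are assumptions chosen for the example, not part of the lecture material.

```python
import math

SAMPLE_RATE = 8000  # samples per second (assumed for illustration)


def additive_wave(fundamental, harmonic_amplitudes, n_samples):
    """Sum sine waves at multiples of the fundamental frequency.

    harmonic_amplitudes maps harmonic number k -> relative amplitude,
    so {1: 1.0} is a pure sine wave (all energy on the fundamental).
    """
    samples = []
    for n in range(n_samples):
        t = n / SAMPLE_RATE
        value = sum(
            amp * math.sin(2 * math.pi * k * fundamental * t)
            for k, amp in harmonic_amplitudes.items()
        )
        samples.append(value)
    return samples


# One period of a 200 Hz tone is 40 samples at 8000 samples per second.
pure = additive_wave(200, {1: 1.0}, 40)

# Odd harmonics with amplitude 1/k approximate a square wave:
# same pitch as `pure`, but a different (harsher) timbre.
squareish = additive_wave(200, {k: 1.0 / k for k in range(1, 20, 2)}, 40)
```

Both waves repeat 200 times per second, so they have the same pitch; only the distribution of energy across harmonics, and therefore the wave shape, differs.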

Hearing (cont) • Humans can hear frequencies from about 20 Hz to 15 kHz (up to about 20 kHz in young listeners) • less accurate at distinguishing high frequencies than low • Higher frequencies disappear as you get older • Auditory system filters sounds • can attend to sounds over background noise • for example, the cocktail party phenomenon • Hearing aids disrupt this filtering • Hearing is involuntary • A sudden sound ‘grabs’ attention before we think • And some sounds are harder to ignore (e.g. a baby crying) • ‘Listening’ is voluntary (largely) • We choose whether to process the meaning, especially if the sound is language (although something like hearing your name is pretty well involuntary)

What if… • You are in a noisy environment • Night clubbing • Phone call/text message? • Your hearing is below average • You are deaf

Sound versus Visual • “Sound exists in time and over space, vision exists in space and over time.” (Gaver, 1989) • Sound is only there when it is playing/made • Vision is there until it is replaced

Sound Interaction • Computer Output/Generation (input to human) • Non speech • Music • Auditory Icons and Earcons • Speech • Computer Input/Recognition • Speech • Non speech • Environmental • Music

Computer Output: Music • Can be pre-recorded or generated • Movies • Games • Immersive experiences • Activates your brain in a different way from language • Acts almost entirely independently from hand-to-eye processing

Generating music • Exciting area for artists • Everything from pseudo real to completely abstract • There are jazz music generators that only skilled people can differentiate from actual musicians • Serato – DJ software (www.serato.com) • Auckland company doing fantastic things • Several UOA grads there

Auditory Icons and Earcons • The difference between these two is subtle • Auditory icons: emphasis on ‘natural’ sounds and metaphor with real world • e.g. sound of filling a bottle with water to match moving a large file • Earcons: ‘artificial’ sounds (generated) • more abstract metaphorical relationship to action, or purely a convention (like corporate colour schemes) • e.g. the Windows sounds for hardware fail, insert, remove
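As a rough sketch of what a generated earcon might look like, the code below renders short two-note motifs and writes them as WAV data using only the standard library. The event names, the note frequencies, and the convention that a rising pair means "inserted" while a falling pair means "removed" are all assumptions made up for this example; they are not the actual Windows sounds.

```python
import io
import math
import struct
import wave

SAMPLE_RATE = 22050  # samples per second (assumed)

# Hypothetical earcon vocabulary: event name -> sequence of note frequencies (Hz).
EARCONS = {
    "device_inserted": [440.0, 660.0],  # rising motif
    "device_removed": [660.0, 440.0],   # falling motif
}


def tone(freq, duration_ms=120, volume=0.5):
    """Return one sine-wave note as a list of floats in [-1, 1]."""
    n = SAMPLE_RATE * duration_ms // 1000
    return [volume * math.sin(2 * math.pi * freq * i / SAMPLE_RATE) for i in range(n)]


def render_earcon(event):
    """Concatenate the notes of the motif assigned to an event."""
    samples = []
    for freq in EARCONS[event]:
        samples.extend(tone(freq))
    return samples


def write_wav(samples, fileobj):
    """Write mono 16-bit PCM WAV data to a file name or file-like object."""
    with wave.open(fileobj, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(SAMPLE_RATE)
        w.writeframes(b"".join(struct.pack("<h", int(s * 32767)) for s in samples))


buffer = io.BytesIO()
write_wav(render_earcon("device_inserted"), buffer)
```

Passing a file name such as "inserted.wav" instead of `buffer` would produce a file that can be played to judge whether the motif is learnable.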

Auditory Icons and Earcons • Redundant Encoding • It aids memory by adding additional associations • Can alert without interrupting (well, at least leaves the visual field clear) • An alternative communications channel • Positive/Negative Feedback • Auditory alarms might be crucial to the safe operation of computer-operated machinery or mission-critical environments • Too many alarms • Annoying • Ignored

Using Sound in Interaction Design • Learnability of the mapping between the icon and the object represented • “Oink” and “bow wow” have high articulatory directness (low distance between ‘appearance’ and function [or denotation]) • A swishing sound accompanying a paintbrush tool also has high articulatory directness • A system beep, on the other hand, carries no information about what it denotes (but we may quickly learn to associate it with an error; and the square wave structure is a bit toward unpleasant, so it’s better for an error than feedback on success)

Can you remember earcons? • How many? • How often do you hear them? • Can you intuitively tell what these mean? • Examples: On, Off, Sleep, Mis-recognized, Disambiguate

Speech Output • Eyes-free operation • Alternative output channel • Good for checking your essays • Navigation is hard • Backtracking • Finding the location of a particular thing

Speech Output • Recorded • Menu choices for telephone systems • Books or other multimedia experiences • Generated (‘text-to-speech’, TTS) • Synthesizer built into Office • See http://office.microsoft.com/en-nz/powerpoint-help/using-the-speak-text-to-speech-feature-HA102066711.aspx • Google Translate has a nice one too (better, I think) • Can give pronunciation rules (the Google one sounds British to me, see also http://www.bell-labs.com/project/tts/sable.html) • Still sound a little artificial • Best synthesizers have a physical model of the tongue and breath to give natural flow between phonemes

Sound Input • Speech • Environmental • Music

Speech Recognition • Two distinct applications: • Transaction • Transcription • Transaction • Telephone menu systems • Choose from a limited number of options, works OK • Automatic speech recognition (ASR) • Built into operating systems • Siri (iPhone) and Android are more or less usable • This is a triumph of Artificial Intelligence • Very difficult, ongoing research problem • Not just about recognizing phonemes but also finding the ‘right’ interpretation (helped e.g. by statistical word triple frequencies, but better if AI is ‘deeper’)
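To make the point about word triple frequencies concrete, here is a toy sketch of choosing between two interpretations that sound alike. The tiny corpus, the candidate sentences, and the crude add-one log score are all assumptions for illustration; a real recognizer would use far larger models and combine them with acoustic evidence.

```python
import math
from collections import Counter

# Toy training text (made up for this example).
corpus = (
    "i want ice cream . we want ice cream now . "
    "i scream when scared ."
).split()


def trigrams(words):
    """Return the list of consecutive word triples in a word sequence."""
    return [tuple(words[i:i + 3]) for i in range(len(words) - 2)]


trigram_counts = Counter(trigrams(corpus))


def score(sentence):
    """Sum of log(count + 1) over the sentence's word triples.

    Unseen triples contribute log(1) = 0, so sentences built from
    frequently observed triples score higher.
    """
    return sum(math.log(trigram_counts[t] + 1) for t in trigrams(sentence.split()))


# Two interpretations of (roughly) the same phoneme sequence.
candidates = ["i want ice cream", "i want i scream"]
best = max(candidates, key=score)
```

Here `best` is "i want ice cream", because its word triples occur in the toy corpus while those of the alternative do not.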

Searching Speech and Audio • Sound files do not afford easy opportunities for indexing and searching • Speech recognition can be used to transcribe speech files and create transcripts that can be searched like any other text file • So long as recognition accuracy is OK, which it isn't at the moment • Tune identification apps • Hum a bit of the tune and it tells you what it is! (e.g. SoundHound)
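Once a transcript exists, searching audio reduces to text search plus a way back to the right point in the recording. The sketch below assumes a hypothetical transcript format of (start time in seconds, recognized text) pairs; the segment contents and the function names are invented for the example.

```python
from collections import defaultdict

# Hypothetical recognizer output: (start time in seconds, recognized text).
transcript = [
    (0.0, "welcome to the lecture on sound"),
    (4.5, "sound is vibration"),
    (9.0, "timbre is harmonic structure"),
]


def build_index(segments):
    """Map each word to the start times of the segments that contain it."""
    index = defaultdict(list)
    for start, text in segments:
        for word in set(text.lower().split()):
            index[word].append(start)
    return index


index = build_index(transcript)


def find(word):
    """Return the start times at which a word was recognized."""
    return index.get(word.lower(), [])
```

Calling `find("sound")` gives the offsets 0.0 and 4.5, which a player could use to jump straight to those points in the audio. Any word the recognizer got wrong is simply missing from the index, which is why recognition accuracy matters so much here.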

Summary • Describe the basics of human hearing • Explain the difference between visual and auditory interaction • Sound is transitory • Describe the classes and subclasses of sound output and the attributes of each • Non speech • Music • Earcons • Speech • Describe the classes and subclasses of sound input and recognition and the attributes of each • Speech • Transaction • Transcription
