EE2F1 Multimedia (1): Speech & Audio Technology. Lecture 10: Audio Technology. Martin Russell, Electronic, Electrical & Computer Engineering, School of Engineering, The University of Birmingham. See http://www.howstuffworks.com.


Presentation Transcript


  1. EE2F1 Multimedia (1): Speech & Audio Technology • Lecture 10: Audio Technology • Martin Russell • Electronic, Electrical & Computer Engineering • School of Engineering • The University of Birmingham See http://www.howstuffworks.com

  2. Human Hearing – reminder • Human ear – quoted as 20Hz to 20kHz range • Upper limit drops with age – many can’t hear above 15kHz • Other animals can hear up to 50kHz (ultrasonic whistle) • Lower limit – below 40Hz we ‘feel’ the frequency with our bodies – the ear becomes less sensitive at low frequencies • For speech, we need 300Hz to 3.4kHz – based on intelligibility tests – hence telephone lines have a bandwidth of 3.1kHz

  3. Human hearing • We have two ears! • Allows direction finding – two methods are used • Ear nearest to the sound hears the sound both: • Louder since we have one on each side of our head • Sooner since the travel time of the sound is less (this also produces a phase shift between the ears) • Using this we can identify the direction of the sound along the Left-Right axis • ….but the ear can also identify front-back direction (but not as accurately as L-R) • ….and also to a lesser extent up-down
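The "sooner" cue can be put into rough numbers. The sketch below uses a simplified model that is not from the slides: a distant source, a straight-line path difference of d·sin(angle) between the ears, an assumed ear spacing of 0.18 m and an assumed speed of sound of 343 m/s. All names and constants are illustrative.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumption: air at roughly 20 degrees C
EAR_SPACING_M = 0.18        # assumption: rough distance between the ears


def interaural_time_difference(angle_deg):
    """Arrival-time difference in seconds between the two ears.

    angle_deg: 0 = source straight ahead, 90 = source fully to one side.
    Simplified model: path difference = ear spacing * sin(angle).
    """
    path_difference_m = EAR_SPACING_M * math.sin(math.radians(angle_deg))
    return path_difference_m / SPEED_OF_SOUND_M_S


# A source fully to one side arrives roughly half a millisecond earlier
# at the nearer ear under these assumptions.
side_delay_s = interaural_time_difference(90)
```

In this model a source at 30 degrees (front) and one at 150 degrees (behind) give essentially the same delay, which is consistent with front-back direction being harder to judge than left-right.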

  4. Analogue sound • 1877 – Thomas Edison – metal cylinder phonograph. Record and playback in mono. Low bandwidth (few kHz). 1885 – Wax cylinders for general use. • 1900 – 78’s – 10 inch disc, holds 3 minutes of audio. Mechanical playback (motor and horn). Low bandwidth (5 kHz) • 1948 – LP’s – 33⅓ RPM, 20 minutes per side, electrically assisted playback. High bandwidth (18 kHz). Stereo LPs followed in 1958 • 1949 – 45’s – single track per side http://www.history-of-rock.com/record_formats.htm

  5. Mechanical modulation • All these formats relied on a modulated groove which moved a mechanical stylus. • Prone to wear – mechanical contact • Mechanical damage produces scratches, clicks and pops • Modulated groove produces movement in the stylus which moves a magnet. Coil is placed next to magnet and current is induced producing electrical signal

  6. Analogue Radio • First was AM radio - Medium wave (520-1630kHz) and long wave (150-300kHz) • Mono, 4.5kHz bandwidth. • In USA, stereo AM introduced. Many competing standards, but still low quality. • FM radio – VHF (88-108MHz) • Stereo, 15kHz bandwidth. • Ideal would be to send Left and Right as 2 channels – but mono receivers would need to be stereo and add channels together. Mono cost > stereo cost! • Solution: send L+R and L-R. Mono just receives L+R • Stereo decodes both and uses op-amps to de-multiplex L and R
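The L+R / L-R idea can be sketched in a few lines. This is only an illustration of the sum-and-difference arithmetic on lists of sample values; it ignores the actual FM multiplex (pilot tone, subcarrier), and the function names are made up for this example.

```python
def encode_sum_difference(left, right):
    """Return the (L+R, L-R) signals for two equal-length sample lists."""
    sum_signal = [l + r for l, r in zip(left, right)]
    diff_signal = [l - r for l, r in zip(left, right)]
    return sum_signal, diff_signal


def decode_stereo(sum_signal, diff_signal):
    """Recover L and R: L = (S + D) / 2, R = (S - D) / 2."""
    left = [(s + d) / 2 for s, d in zip(sum_signal, diff_signal)]
    right = [(s - d) / 2 for s, d in zip(sum_signal, diff_signal)]
    return left, right


left_in = [0.5, -0.25, 1.0]
right_in = [0.25, 0.75, -1.0]
sum_sig, diff_sig = encode_sum_difference(left_in, right_in)

mono_output = sum_sig  # a mono receiver simply plays L+R
left_out, right_out = decode_stereo(sum_sig, diff_sig)
```

The design point is that the mono receiver needs no extra circuitry at all, while the stereo receiver does the adding and subtracting.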

  7. http://www.st-andrews.ac.uk/~www_pa/Scots_Guide/RadCom/part21/page1.html

  8. Magnetic Tape • All need electronics – tape is magnetic • 1940’s – commercial use of magnetic tape. Reel-to-reel, not suitable for domestic use • 1964 – Philips introduce the “Cassette” for dictation (speech only) machines. Mono and low quality by design! • 1966 – 8-track cartridge for stereo use, mainly in cars. Competed with cassette for several years, but Philips improved cassette (stereo and hi-fi) and it won.

  9. Magnetic tape • Thin plastic tape coated with ferric oxide powder • If exposed to magnetic field, Fe2O3 is permanently magnetized See http://howstuffworks.com

  10. Digital Audio - CD’s • 1982 - First digital audio introduced to consumers • 12cm single sided disc, stereo, 44.1kHz, 16 bit • Main marketing claim – indestructible • Sales slow to start – Audiophiles didn’t like the sound • Eventually convenience becomes main marketing force • Technology: Laser pickup reads ‘pits’ and ‘bumps’ representing 1’s and 0’s

  11. CD construction • CD is sandwich of plastic with reflective aluminium • Laser focuses on one bit at a time See www.howstuffworks.com/cd.htm

  12. Pits and bumps – 1’s and 0’s • Reflective layer has small bumps 125nm high – this is ¼ wavelength of the laser • Laser beam covers both the bump and the surrounding area • Light from the surrounding areas is reflected, plus light from the bump – but the surround light has 125 + 125nm = 250nm extra to travel (half a wavelength), so the two reflections are 180 degrees out of phase and cancel out – a 0 • Where there is no bump, all the light is reflected (in phase) – a 1 • Tracking – laser needs to stay over the bumps – most common way is the Three-beam pickup
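The cancellation argument can be checked numerically. The sketch below assumes a bump height of 125 nm (nanometres) equal to a quarter of the laser wavelength, and treats the bump and surround reflections as two waves of equal amplitude; real pickups are more complicated, and the names here are illustrative.

```python
import cmath
import math

BUMP_HEIGHT_NM = 125                 # quarter wavelength, as on the slide
WAVELENGTH_NM = 4 * BUMP_HEIGHT_NM   # so the wavelength is 500 nm


def reflected_amplitude(bump_height_nm):
    """Relative amplitude (0..1) of the two combined reflections.

    The extra round-trip path is twice the bump height (down and back).
    """
    extra_path_nm = 2 * bump_height_nm
    phase = 2 * math.pi * extra_path_nm / WAVELENGTH_NM
    # One wave is the reference, the other is shifted by `phase`.
    return abs(1 + cmath.exp(1j * phase)) / 2


phase_shift_deg = 360 * (2 * BUMP_HEIGHT_NM) / WAVELENGTH_NM  # 180 degrees
over_flat_area = reflected_amplitude(0)           # in phase: full reflection
over_bump = reflected_amplitude(BUMP_HEIGHT_NM)   # out of phase: cancels
```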

  13. Data rate • CD’s are uncompressed – 16 bit, 44.1kHz sampling, stereo, 74 minute max length • Capacity = 16 * 44100 * 2 * 60 * 74 / 8 = 783,216,000 bytes (about 783 Mbytes, or 747 MiB) • Error checking is built in – but for audio some errors can be hidden by interpolation • E.g. samples go 341, 368, (error), 422, 448 • Simply average the samples around the error • (368+422)/2 = 395 – linear interpolation • Note Data CD’s are lower capacity – more error checking
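Both calculations on this slide are easy to reproduce. The sketch below uses the standard CD sampling rate of 44.1 kHz, and marks an unreadable sample with None, which is a convention chosen for this example only; real CD players conceal errors inside the decoder hardware.

```python
def cd_capacity_bytes(bits=16, sample_rate_hz=44100, channels=2, minutes=74):
    """Uncompressed audio capacity in bytes."""
    total_bits = bits * sample_rate_hz * channels * 60 * minutes
    return total_bits // 8


def conceal_errors(samples):
    """Replace each isolated None with the average of its two neighbours.

    Assumes errors are isolated and never at the first or last position.
    """
    repaired = list(samples)
    for i, value in enumerate(repaired):
        if value is None:
            repaired[i] = (repaired[i - 1] + repaired[i + 1]) // 2
    return repaired


capacity = cd_capacity_bytes()                          # 783216000 bytes
repaired = conceal_errors([341, 368, None, 422, 448])   # middle value -> 395
```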

  14. Mono and stereo • Mono – one sound source in front of the listener • Information can be delivered – mono still used for speech • No directional information – not relevant for many apps • Stereo – we have two ears, so use two sound sources • Directional information – can localise sound L/R making a ‘soundstage’ • Most modern analogue sources are stereo

  15. Stereo basics • Two discrete channels for Left and Right • Sound of equal amplitude and in phase appears to come from the centre between the two speakers • Sound louder on one channel and in phase appears to come from between the speakers but closer to the louder one • Sounds that are out of phase appear to come from nowhere in particular – ambience or the surroundings • Fundamental idea is amplitude variations steer sound left to right, phase variations steer sound front to surround

  16. Multi channel sound • Human ear detects direction of sound (front to back) • Outer ear (pinna) produces frequency signature depending on direction – hence complex shape of the outer ear – a mechanical filter • Hafler Stereo – uses a third loudspeaker to reproduce the out of phase sounds

  17. Cinema Audio • “Don Juan”, Warner Brothers 1926 – first commercial film with sound (music but no dialogue) • “The Jazz Singer”, Warner Brothers 1927 – (music, effects plus some dialogue) • Sound-on-disc: record player playing wax record synchronised to projector • Sound-on-disc replaced with sound-on-film in 1930s See http://www.howstuffworks.com

  18. Sound-on-film • Two technologies: • Optical • Magnetic • Optical technology: • Transparent strip recorded on side of film • Width of strip varies – variable area soundtrack • Light source focussed onto strip – shines through onto photocell • Varying width gives varying current • A variation is the variable density soundtrack

  19. Magnetic Sound-on-Film • Became popular in 1950s • Advantages • Stereo, better sound quality • Drawbacks • Added after filming, more expensive, shorter life, more easily damaged • Up to 6 tracks, but too expensive • Stereo optical tracks too noisy until 1965

  20. Dolby noise reduction • Split audio into 4 bands • Pre-emphasis • Companding • Re-combination See http://www.howstuffworks.com

  21. Surround Stereo • Two optical strips used to encode 4 channels: • Left • Right • Centre • Rear • Encoding based on Boolean logic See http://www.howstuffworks.com

  22. Dolby Stereo [Diagram: speaker layout – labels: screen, Left, Centre, Right, Surround, Audience]

  23. Home Systems • Dolby Surround • Dolby ProLogic • Four tracks of information on two physical tracks • ‘Backwards compatible’ with stereo • Dolby Surround uses Left and Right speakers to produce ‘phantom’ centre speaker • ProLogic sends centre sound to actual centre speaker See http://www.howstuffworks.com

  24. Digital sound • First used in “Jurassic Park” • DTS – Digital Theatre System • Optical track on film synchronises with audio on CD • Six audio tracks compressed onto 1 or 2 CDs • Initially viewed as temporary solution, but has lasted much longer than expected See http://www.howstuffworks.com

  25. Digital sound-on-film • Dolby Digital • Space between sprocket holes used to encode information See http://www.howstuffworks.com

  26. TV and videotape audio • Analogue TV – 15 kHz bandwidth, mono. Outside UK, stereo analogue TV introduced • Videotape (V2000, Betamax, VHS) – all used a simple linear audio track – similar to cassette. Low quality • Stereo video introduced in late 1970’s – simply split the mono track in half, so quality even poorer • Hi-Fi introduced in 1984 – first was Sony Betamax, VHS followed a year later. Note that modern Hi-Fi VHS machines use analogue FM for audio – not NICAM!

  27. Nicam 728 - Stereo TV • Near Instantaneous Companded Audio Multiplex • http://tallyho.bc.nu/~steve/nicam.html • Sound is digitised to 14 bits accuracy at a sampling rate of 32kHz. • The upper frequency limit of a sound channel is 15kHz due to anti-aliasing filters at the encoder. • The 14 bit original sound samples are companded digitally to 10 bits for transmission. (Digital companding ensures that the encoding and decoding algorithms can track perfectly).

  28. Nicam (continued) • Near Instantaneous Companding - 1ms worth of sound data has to be input before the companding process can do its work. The "Audio Multiplex" term implies that the system is not limited just to stereo operation. • Each packet consists of a 24 bit header • Plus a 'payload' of 704 data bits – total of 728 bits per packet, one packet per ms, i.e. 728 bits/ms or 728kbit/s • Header contains information to enable data to be de-compressed [Diagram: NICAM packet – 24 bit header followed by 704 bits of data]
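The packet arithmetic can be checked directly from the two figures that fix everything else: a total of 728 bits per packet and a 704 bit payload, with one packet per millisecond. Note that these two figures leave 24 bits for the header. The variable names below are illustrative.

```python
PACKET_BITS = 728          # total bits per packet (gives the "728" in the name)
PAYLOAD_BITS = 704         # sound data bits per packet
PACKETS_PER_SECOND = 1000  # one packet per millisecond

header_bits = PACKET_BITS - PAYLOAD_BITS           # bits left for the header
bit_rate_bps = PACKET_BITS * PACKETS_PER_SECOND    # overall bit rate
payload_share = PAYLOAD_BITS / PACKET_BITS         # fraction carrying sound
```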

  29. Companding; Non-linear coding • Takes 1 ms of raw 14-bit samples (i.e. 32 samples at 32kHz) and reduces them from 14 bits to 10. • The numerically largest (positive or negative) sample of the 32 is used to choose which 10 bits of all 32 samples in the block will actually be transmitted. • Note 3 range bits to signal negative values as well • Compression factor is about 0.7 (10/14) – but is lossy – zeros replace the discarded Least Significant Bits
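The block-companding idea can be sketched as follows. This is a simplified illustration, not a bit-exact NICAM coder: it assumes signed 14-bit samples, picks one shift (0 to 4 bits) per block from the largest sample, and ignores parity, range-code signalling and interleaving. Function names are made up for this example.

```python
MAX_SHIFT = 4            # 14 bits - 10 bits
MIN_10_BIT = -512        # signed 10-bit range
MAX_10_BIT = 511


def compand_block(samples):
    """Reduce a block of signed 14-bit samples to signed 10-bit values.

    Returns (shift, coded). The shift is the number of least significant
    bits dropped, chosen so the largest sample still fits in 10 bits.
    """
    lowest, highest = min(samples), max(samples)
    shift = 0
    while shift < MAX_SHIFT and not (
        MIN_10_BIT <= (lowest >> shift) and (highest >> shift) <= MAX_10_BIT
    ):
        shift += 1
    return shift, [s >> shift for s in samples]


def expand_block(shift, coded):
    """Restore 14-bit scale; the dropped low bits come back as zeros."""
    return [c << shift for c in coded]


quiet_shift, quiet_coded = compand_block([100, -200, 511, -512])
loud_shift, loud_coded = compand_block([8191, -8192, 5, 0])
loud_decoded = expand_block(loud_shift, loud_coded)
```

A quiet block is carried exactly (shift 0), while a loud block loses its four lowest bits, which is the "louder sounds tolerate less accuracy" trade-off in miniature.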

  30. Compression • Human ear is non-linear – can tolerate more distortion (or less accuracy) for louder sounds – basic companding • Masking – loud sound can mask lower-amplitude sound of similar frequency • Perceptual coding – MP3’s!

  31. Digital broadcast formats - MPEG • Digital TV (Sky, Cable, ITV Digital) - all provide stereo only (now….) • MPEG-1 (ISO/IEC 11172-3) provides single-channel ('mono') and two-channel ('stereo' or 'dual mono') coding at 32, 44.1, and 48 kHz sampling rate. The predefined bit rates range from 32 to 448 kbit/s for Layer I, from 32 to 384 kbit/s for Layer II, and from 32 to 320 kbit/s for Layer III.

  32. MPEG (continued) • MPEG-2 BC (ISO/IEC 13818-3) provides a backwards compatible multichannel extension to MPEG-1; up to 5 main channels plus a 'low frequency enhancement' (LFE) channel can be coded; the bit rate range is extended up to about 1 Mbit/s; • An extension of MPEG-1 towards lower sampling rates 16, 22.05, and 24 kHz for bitrates from 32 to 256 kbit/s (Layer I) and from 8 to 160 kbit/s (Layer II & Layer III). • http://mpeg.telecomitalialab.com/faq/audio.htm

  33. Summary • Analogue audio • Digital audio • Cinema audio • Home systems • Video and TV audio
