C. Shannon: Communication in Presence of Noise
• the combination of channel and signal spectrum should be as flat (as random-like) as possible
[Figure: energy of the signal and level of noise in the channel, plotted against frequency]
[Figure: two panels showing energy of the signal and level of noise in the channel over the resource space]
Forces of Nature
• if the receiver could be controlled
  • put more resources (introduce less noise) where there is more signal
  • biological system optimized for information extraction from sensory signals
• if the signal could be controlled (e.g. in communication)
  • put more signal where there is less noise
  • sensory signal optimized for a given communication channel
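The rule "put more signal where there is less noise" is usually formalized as water-filling: signal energy is poured over the noise profile until the sum of signal and noise is flat wherever any signal is placed. The following is a minimal sketch of that idea, not something taken from the slides; the function name `water_fill`, the bisection search, and the example noise levels are all assumptions made for illustration.

```python
import numpy as np

def water_fill(noise, total_power, iters=100):
    """Spread total_power over channels so that signal + noise is as flat as possible."""
    noise = np.asarray(noise, dtype=float)
    # The common "water level" lies between the quietest channel and
    # the loudest channel plus the whole power budget.
    lo, hi = noise.min(), noise.max() + total_power
    for _ in range(iters):
        level = 0.5 * (lo + hi)
        used = np.maximum(level - noise, 0.0).sum()
        if used > total_power:
            hi = level  # level too high: budget exceeded
        else:
            lo = level  # level feasible: try a higher one
    # Channels whose noise is above the water level get no signal at all.
    return np.maximum(lo - noise, 0.0)

# Hypothetical noise levels in four frequency channels.
noise = np.array([1.0, 0.5, 2.0, 4.0])
signal = water_fill(noise, total_power=3.0)
```

In this example the quietest channel receives the most signal, the noisiest channel receives none, and signal plus noise is equal across the three channels that are used.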
Roman Jakobson
• Born in Moscow in 1896
• Co-founder of the Moscow Linguistic Circle (1915)
• Prague Linguistic Circle (1926)
• Free French University, New York (1942-1946)
• Professor at Columbia, Harvard, and M.I.T.
• Died in 1982
“We speak in order to be heard in order to be understood”
Radio Rex (1917)
[Figure labels: Newton, l/4, beer]
Where is the message? /u/ /o/ /a/ /e/ /iy/
• “limited commercial success” (John Pierce, 1969)
[Figures: time-frequency representations of speech; axes: time, frequency]
Short-term Spectrum (10-20 ms)
• get spectral components
[Figure: spectrogram labeled /j/ /u/ /ar/ /j/ /o/ /j/ /o/; axes: time, frequency]
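A short-term spectrum of this kind can be sketched as follows: the signal is cut into short overlapping frames, each frame is windowed, and the power spectrum of each frame is computed. The 20 ms frame, 10 ms step, Hamming window, and the function name `short_term_spectrum` are assumptions chosen for illustration; the slides only give the 10-20 ms range.

```python
import numpy as np

def short_term_spectrum(x, fs, frame_ms=20.0, hop_ms=10.0):
    """Return the power spectrum of each short frame (rows: frames, columns: frequency bins)."""
    frame_len = int(round(fs * frame_ms / 1000.0))
    hop = int(round(fs * hop_ms / 1000.0))
    window = np.hamming(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    # Cut the signal into overlapping frames and taper each one.
    frames = np.stack([x[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    # Squared magnitude of the Fourier transform of every frame.
    return np.abs(np.fft.rfft(frames, axis=1)) ** 2

# Test signal: one second of a 1000 Hz tone sampled at 8 kHz.
fs = 8000
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 1000.0 * t)
spec = short_term_spectrum(x, fs)
```

With these settings each frame has 160 samples, so the frequency bins are 50 Hz apart and the tone falls in bin 20 of every frame.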
Spectral resolution of hearing
• spectral resolution of hearing decreases with frequency (critical bands of hearing, perception of pitch, …)
• what happens outside the critical band does not affect decoding of the sound in the critical band
[Figure: critical bandwidth [Hz] versus frequency [Hz], both axes from 50 to 10000]
[Figure: threshold of perception of the tone versus noise bandwidth]
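The growth of the critical bandwidth with frequency can be put into numbers with a published analytic approximation. The sketch below uses the Zwicker and Terhardt (1980) formula; the slides show only a measured curve, so this formula and the function name `critical_bandwidth_hz` are assumptions brought in for illustration.

```python
import numpy as np

def critical_bandwidth_hz(f_hz):
    """Zwicker-Terhardt approximation of the critical bandwidth at frequency f_hz."""
    f_khz = np.asarray(f_hz, dtype=float) / 1000.0
    return 25.0 + 75.0 * (1.0 + 1.4 * f_khz ** 2) ** 0.69

freqs = np.array([100.0, 500.0, 1000.0, 5000.0, 10000.0])
bandwidths = critical_bandwidth_hz(freqs)
```

Under this approximation the bandwidth stays close to 100 Hz at low frequencies and grows steadily above roughly 500 Hz, which is the sense in which spectral resolution decreases with frequency.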
[Figure: energies in “critical bands” versus frequency]
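Energies in critical bands can be sketched by warping the frequency axis to the Bark scale and summing the power spectrum inside each band. The warping below is the one used in the PLP literature, 6·asinh(f/600); the rectangular, non-overlapping 1-Bark bands and the function names are simplifying assumptions, not the band shapes used in the slides.

```python
import numpy as np

def hz_to_bark(f_hz):
    """Bark-scale warping of frequency as used in PLP analysis."""
    return 6.0 * np.arcsinh(np.asarray(f_hz, dtype=float) / 600.0)

def critical_band_energies(power, fs):
    """Sum a one-sided power spectrum inside rectangular 1-Bark-wide bands."""
    freqs = np.linspace(0.0, fs / 2.0, len(power))
    # Index of the 1-Bark band that each frequency bin falls into.
    band = np.floor(hz_to_bark(freqs)).astype(int)
    return np.bincount(band, weights=power)

# Flat spectrum with 129 bins (0 to 4 kHz at an 8 kHz sampling rate).
power = np.ones(129)
energies = critical_band_energies(power, fs=8000)
```

For a flat input spectrum the higher bands collect more bins than the lower ones, because each band covers a wider range in Hz as frequency increases.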
• $\text{loudness} = \text{intensity}^{0.33}$
• intensity (power spectrum): $\text{intensity} \approx \text{signal}^2$ [W/m²]
• loudness is measured in sones
[Figure: intensity → $|\cdot|^{0.33}$ → loudness]
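The effect of the 0.33 power law is easiest to see with numbers. This is a minimal sketch, assuming relative (unitless) intensities and an illustrative function name `intensity_to_loudness`; it ignores any scaling constant needed to express the result in sones.

```python
import numpy as np

def intensity_to_loudness(intensity, exponent=0.33):
    """Compress intensity with a power law to approximate perceived loudness."""
    return np.power(np.asarray(intensity, dtype=float), exponent)

# Relative intensities spanning three orders of magnitude.
intensity = np.array([1.0, 10.0, 100.0, 1000.0])
loudness = intensity_to_loudness(intensity)
ratio_per_decade = loudness[1] / loudness[0]
```

Each tenfold increase in intensity raises the loudness by a factor of about 2.14, so a 1000-fold range of intensity is compressed to a range of roughly 10 in loudness.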
Not all spectral details are important
a) compute the Fourier transform of the auditory spectrum and truncate it (cepstrum)
b) approximate the auditory spectrum by an autoregressive model
[Figure: 6th order AR model and 14th order AR model; axes: power (loudness) versus frequency (tonality)]
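Option b) can be sketched as follows: the autocorrelation is obtained as the inverse Fourier transform of the spectrum, and the autoregressive coefficients follow from the Yule-Walker equations. A low model order keeps only the broad spectral peaks. The function names and the direct linear solve (instead of the Levinson-Durbin recursion) are assumptions made to keep the example short.

```python
import numpy as np

def ar_from_power_spectrum(power, order):
    """Fit an all-pole (AR) model to a one-sided power spectrum."""
    # Autocorrelation is the inverse Fourier transform of the power spectrum.
    r = np.fft.irfft(power)
    # Yule-Walker equations: R a = r[1..order], with R a Toeplitz matrix.
    idx = np.abs(np.arange(order)[:, None] - np.arange(order)[None, :])
    a = np.linalg.solve(r[idx], r[1:order + 1])
    # Power of the prediction error, which sets the gain of the model.
    gain = r[0] - a @ r[1:order + 1]
    return a, gain

def ar_spectrum(a, gain, n_bins):
    """Power spectrum of the all-pole model on n_bins points from 0 to pi."""
    w = np.linspace(0.0, np.pi, n_bins)
    k = np.arange(1, len(a) + 1)
    denom = 1.0 - np.exp(-1j * np.outer(w, k)) @ a
    return gain / np.abs(denom) ** 2

# Sanity check: a spectrum generated by a known 2nd order model is recovered.
true_a = np.array([1.2, -0.72])
power = ar_spectrum(true_a, 1.0, 257)
a, gain = ar_from_power_spectrum(power, order=2)
```

Raising `order` lets the model follow more spectral detail, which is the difference between the 6th and 14th order fits mentioned above.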
It’s about time (to talk about TIME)
A buffer several hundred milliseconds long appears to be consistent with perception
• data in the buffer interact
• data outside the buffer do not interact with the data in the buffer
temporal buffer => filter
Frequency response (peak around 5 Hz) Impulse response (effective length around 200 ms)
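A temporal filter of this kind is applied along time to the trajectory of each frequency band, one value per analysis frame. The filter plotted in the slides is not specified, so the sketch below uses the coefficients of the published RASTA band-pass filter (Hermansky and Morgan, 1994) as an assumption; its response is not claimed to match the plotted curves. The function name `rasta_filter` is illustrative, and the causal form used here omits the four-frame advance of the original definition.

```python
import numpy as np

def rasta_filter(traj, pole=0.98):
    """Band-pass filter the time trajectory of one band (one value per frame)."""
    # FIR part: a five-point slope estimator with zero gain for constant input.
    b = 0.1 * np.array([2.0, 1.0, 0.0, -1.0, -2.0])
    fir = np.convolve(traj, b)[:len(traj)]
    # IIR part: a single-pole integrator that smooths the slope estimate.
    out = np.zeros(len(traj))
    prev = 0.0
    for n in range(len(traj)):
        prev = fir[n] + pole * prev
        out[n] = prev
    return out

# A constant offset in the trajectory is removed after the initial transient.
constant = np.full(1000, 5.0)
filtered = rasta_filter(constant)

# Impulse response of the filter.
impulse = np.zeros(100)
impulse[0] = 1.0
h = rasta_filter(impulse)
```

Because the numerator coefficients sum to zero, slowly varying components of the trajectory are suppressed, while the pole limits how quickly the output can change from frame to frame.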
[Figure: three panels over time [s]: spectrogram (short-term Fourier spectrum); Perceptual Linear Prediction (PLP) (12th order model); RASTA-PLP]
Machine recognition of speech trained on data from New Jersey Labs
[Figure: filter; spectrogram; spectrum from RASTA-PLP]