
Understanding the soundscape concept: the role of sound recognition and source identification



Presentation Transcript


  1. Understanding the soundscape concept: the role of sound recognition and source identification • David Chesmore, Audio Systems Laboratory, Department of Electronics, University of York

  2. Overview of Presentation • Role of soundscape analysis • Instrument for Soundscape Recognition, Identification and Evaluation (ISRIE) • Soundscape description language • Applications • Conclusions

  3. Role of Soundscape Analysis • Potential applications: • identifying relevant sound elements in a soundscape (e.g. high intensity sounds) • determining positive and negative sounds • biodiversity studies • tranquil areas • preserving important soundscapes • planning and noise abatement studies

  4. Soundscape Analysis Options • Manual • Advantage: subjective • Disadvantages: time consuming, limited resources, subjective, very large storage requirements • Automatic • Advantages: objective (once trained), continuous analysis possible, much reduced data storage requirements • Disadvantage: reliability of sound element classification

  5. How to Automatically Classify Sounds? • Major issues to address: • separation and localisation of sounds in the soundscape (especially with multiple simultaneous sounds) • classification of sounds depends on feature overlap and number of elements • Number of elements, localisation, etc. depend on the application

  6. Instrument for Soundscape Recognition, Identification and Evaluation (ISRIE) • ISRIE is a collaborative project between York, Southampton and Newcastle Universities • One of three projects arising from the EPSRC Noisy Futures Sandpit • York - sound separation + sound classification • Southampton - applications + interface with users • Newcastle - sound localisation + arrays

  7. Aim of ISRIE • Aim is to produce an instrument capable of automatically identifying sounds in a soundscape by: • separating sounds in 3-d • localising sounds from the 3-d field • classifying sounds into a restricted range of categories

  8. Outline of ISRIE [Block diagram: Sensor → Localisation + Separation → Classification; outputs: location (alt, az), duration, SPL, LEQ, category]

  9. Sound Separation - Sensor • B-format microphone as sensor • Provides 3D directional information • A coincident microphone array reduces convolutive separation problems to instantaneous ones • More compact and practical than multi-microphone solutions • Outputs: • W – omni-directional component • X – fig-8 response along x-axis • Y – fig-8 response along y-axis • Z – fig-8 response along z-axis
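The four outputs can be illustrated by encoding a single mono sample arriving from a known direction into a virtual B-format frame. This is a minimal sketch, assuming the traditional convention in which W is attenuated by 1/√2; the slides do not say which convention was used, and the function name `encode_b_format` is hypothetical.

```python
import math

def encode_b_format(sample, azimuth_deg, elevation_deg):
    """Encode one mono sample into (W, X, Y, Z) for a source at the given direction.

    Assumption: traditional B-format scaling, with W attenuated by 1/sqrt(2).
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    w = sample / math.sqrt(2.0)                # omni-directional component
    x = sample * math.cos(az) * math.cos(el)   # fig-8 along the x-axis
    y = sample * math.sin(az) * math.cos(el)   # fig-8 along the y-axis
    z = sample * math.sin(el)                  # fig-8 along the z-axis
    return w, x, y, z
```

A source straight ahead (azimuth 0, elevation 0) appears only in W and X; a source at azimuth 90 appears only in W and Y.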

  10. Overview of Separation Method • Use Coincident Microphone array • Transform into Time-Frequency Domain • Find Direction Of Arrival (DOA) vector for each Time-Frequency point. • Filter sources based on known or estimated positions in 3D space
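The slides do not state how the DOA vector is obtained for each time-frequency point. One common option for B-format signals, shown here only as an assumed sketch, is the active intensity vector Re(W* · [X, Y, Z]). The function name `doa_from_tf_point` is hypothetical, and the inputs are the complex transform coefficients of the four channels at one time-frequency point.

```python
import math

def doa_from_tf_point(w, x, y, z):
    """Return (azimuth_deg, elevation_deg) for one time-frequency point.

    Assumption: direction is taken from the intensity vector Re(conj(W) * [X, Y, Z]),
    which points towards the source when a single source dominates the point.
    """
    ix = (w.conjugate() * x).real
    iy = (w.conjugate() * y).real
    iz = (w.conjugate() * z).real
    azimuth = math.degrees(math.atan2(iy, ix))
    elevation = math.degrees(math.atan2(iz, math.hypot(ix, iy)))
    return azimuth, elevation
```

Points whose estimated direction lies close to a known or estimated source position can then be kept and the rest discarded, which is the filtering step in the last bullet.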

  11. Assumptions • Approximately W-Disjoint Orthogonal • Sparse in time-frequency domain, i.e. the power in any time-frequency window is attributed to one source. • Sound sources are geographically spaced (sparse) • Noise sources have unique Direction of Arrival (DOA).

  12. The Dual Tree Complex Wavelet Transform (DT-CWT) • Efficient filterbank structure • Approximately shift invariant

  13. STFT separation

  14. DT-CWT separation

  15. Separation results - speech • 3 male speakers • Recorded in the anechoic chamber at ISVR • Mixed to virtual B-format, with known locations spaced around the microphone

  16. Source Estimation and Tracking • Examples used known source locations. In many deployment scenarios, this is acceptable. • More versatility could be provided by finding source locations and tracking • Two approaches considered • 3D histogram approach • Clustering using plastic self organising map

  17. Results • 2 speakers – directional geodesic histogram • Position of peaks at (0,0) and (10,20) degrees • Blur between peaks due to the 2 sources only approximating the assumptions
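The idea of locating sources from histogram peaks can be sketched as follows. This is a simplified illustration, not the geodesic histogram named on the slide: it bins azimuth and elevation on a flat grid, which distorts bin areas near the poles. The function name, bin width and input format are assumptions.

```python
from collections import Counter

def doa_histogram_peaks(doas, bin_deg=5, num_sources=2):
    """Bin (azimuth, elevation) estimates and return the centres of the fullest bins.

    Assumption: a flat azimuth/elevation grid is used instead of a geodesic one.
    """
    counts = Counter(
        (round(az / bin_deg) * bin_deg, round(el / bin_deg) * bin_deg)
        for az, el in doas
    )
    return [centre for centre, _ in counts.most_common(num_sources)]
```

Estimates that fall between the two sources, such as the blur mentioned above, land in sparsely filled bins and do not affect the reported peaks as long as the true source bins dominate.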

  18. Signal Classification • What features? TDSC • Which classifier? ANN – MLP, LVQ • Which sounds?

  19. ISRIE Sound Categories

  20. Time-Domain Signal Coding • Purely time-domain technique • Successfully used for: • Species recognition • birds, crickets, bats, wood-boring insects • Heart sound recognition • Current applications • Environmental sound • Vehicle recognition

  21. Time-Domain Signal Coding [Figure: waveform against time, with an epoch marked]
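The slides do not define an epoch or its descriptors in text. In the usual description of TDSC, an epoch is the stretch of signal between successive zero crossings, described by its duration D in samples and its shape S, the number of local minima of the rectified waveform inside the epoch. A minimal sketch under that assumption (the function name `tdsc_epochs` is hypothetical):

```python
def tdsc_epochs(samples):
    """Split a signal into epochs at zero crossings; return (duration, shape) pairs.

    duration = number of samples in the epoch
    shape    = number of local minima of |x| strictly inside the epoch
    The final, possibly incomplete, epoch is included.
    """
    epochs = []
    start = 0
    for i in range(1, len(samples) + 1):
        end_of_signal = i == len(samples)
        # A sign change relative to the start of the current epoch closes it.
        if end_of_signal or (samples[i] >= 0) != (samples[start] >= 0):
            epoch = [abs(v) for v in samples[start:i]]
            shape = sum(
                1
                for k in range(1, len(epoch) - 1)
                if epoch[k] < epoch[k - 1] and epoch[k] < epoch[k + 1]
            )
            epochs.append((len(epoch), shape))
            start = i
    return epochs
```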

  22. Multiscale TDSC (MTDSC) • New method of D-S data presentation • Replaces S-matrix, A-matrix or D-matrix • Multiscale • Made from groups of epochs in powers of 2 (512, 256, etc.) • Inspired by wavelets

  23. MTDSC Value in frame n=4

  24. MTDSC Example • Logarithmic chirp, 100 Hz to 24 kHz • Epoch frame length 2^m

  25. MTDSC (cont) • Currently use shape but will investigate: • epoch duration (zero-crossings interval) only • epoch duration and shape • epoch duration, shape and energy • Also use the mean; variance and higher-order statistics can also be used for larger values of m (e.g. 9)
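The multiscale idea, grouping epochs into frames whose lengths are powers of 2 and summarising each frame by a statistic such as the mean shape, can be sketched as below. The exact layout of the MTDSC data is not given in the slides, so the dictionary structure, the function name and the choice to drop incomplete frames are assumptions.

```python
def mtdsc_means(shapes, max_m=3):
    """Return {m: [mean shape per frame]} for frames of 2**m epochs, m = 1..max_m.

    Assumption: incomplete trailing frames are dropped.
    """
    scales = {}
    for m in range(1, max_m + 1):
        frame_len = 2 ** m
        num_frames = len(shapes) // frame_len
        scales[m] = [
            sum(shapes[f * frame_len:(f + 1) * frame_len]) / frame_len
            for f in range(num_frames)
        ]
    return scales
```

Small values of m follow rapid changes in the signal, while large values (the slide mentions m = 9, i.e. 512 epochs) give smoother long-term averages over which variance and higher-order statistics become meaningful.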

  26. MTDSC Results (1) [Diagram: audio → MTDSC data generation & stacking → 3-output LVQ network, outputs 1 to 3] • Winning output determines result • Overall network accuracy: 76% • Some categories better than others • Road, Rail – 93%
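"Winning output determines result" corresponds, in a trained LVQ network, to picking the label of the codebook vector nearest to the input. The sketch below shows only that decision step; the prototype values and the "other" label in the usage note are illustrative and do not come from the presentation, and training is not shown.

```python
def lvq_classify(features, prototypes):
    """Return the label of the nearest prototype (squared Euclidean distance).

    prototypes: list of (label, vector) pairs from an already trained network.
    """
    best_label, best_dist = None, float("inf")
    for label, proto in prototypes:
        dist = sum((f - p) ** 2 for f, p in zip(features, proto))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label
```

For example, with hypothetical prototypes `[("road", [1.0, 0.2]), ("rail", [0.3, 0.9]), ("other", [0.0, 0.0])]`, the input `[0.9, 0.3]` is assigned to "road".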

  27. MTDSC Results (2) • 3 different Japanese cicada species used for biodiversity studies (2 common, 1 rare) in northern Japan • 21 test files from field recordings including 1 with -6dB SNR • Backpropagation MLP classifier • 20 out of 21 test files correctly classified • ~ 95% accuracy

  28. Practical ISRIE [Block diagram as in slide 8: Sensor → Localisation + Separation → Classification; outputs: location (alt, az), duration, SPL, LEQ, category; user-supplied data: approximate location, required sound category]

  29. Restricting Location [Figure: cone of acceptance around the target, with labels a and b; automatic rejection of signals]
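A cone of acceptance can be tested by comparing the angle between an estimated direction of arrival and the target direction against a threshold. The slide does not give the geometry of its cone, so this sketch assumes a circular cone described by a single half-angle; the function names are hypothetical.

```python
import math

def unit_vector(azimuth_deg, elevation_deg):
    """Convert (azimuth, elevation) in degrees to a 3-D unit vector."""
    az, el = math.radians(azimuth_deg), math.radians(elevation_deg)
    return (math.cos(az) * math.cos(el), math.sin(az) * math.cos(el), math.sin(el))

def in_cone(doa, target, half_angle_deg):
    """True if the DOA (az, el) lies within half_angle_deg of the target (az, el)."""
    u, v = unit_vector(*doa), unit_vector(*target)
    # Clamp to [-1, 1] to guard against rounding error before acos.
    dot = max(-1.0, min(1.0, sum(a * b for a, b in zip(u, v))))
    return math.degrees(math.acos(dot)) <= half_angle_deg
```

Signals whose direction fails the test are rejected automatically; only those inside the cone are passed on for classification.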

  30. Further Automated Analysis • At present, ISRIE only provides a classified sound element in a small range of categories • Can we create a soundscape description language (SDL)? • Needs to be flexible enough to accommodate manually and automatically generated soundscapes • Take inspiration from speech recognition, natural language, bioacoustics (e.g. automated ID of insects, birds, bats, cetaceans)

  31. sonotag = G(L,θ,φ,d,t,D,a,c,p,G) where: • L = label • θ,φ = direction of sound • d = estimated distance to sound • t = onset time • D = duration • a = received sound pressure level • c = classification (a = automatic, m = manual) • p = level of confidence in classification • G = geotag = G(ll,lo,al), with ll = latitude, lo = longitude, al = altitude • Other possibilities exist
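The ten sonotag fields map naturally onto a record type. The sketch below is one possible representation, not part of the presentation: the class name, field names and validation rules are assumptions, and units are left unstated because the slide does not give them.

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass
class Sonotag:
    label: str
    direction: Tuple[float, float]       # the two direction values
    distance: float                      # estimated distance to sound
    onset: str                           # onset time, kept as written (e.g. "11:52.5")
    duration: float
    level: float                         # received sound pressure level
    classification: str                  # "a" = automatic, "m" = manual
    confidence: float                    # level of confidence in classification
    geotag: Tuple[float, float, float]   # (latitude, longitude, altitude)

    def __post_init__(self):
        # Assumed sanity checks; the slide does not specify any validation.
        if self.classification not in ("a", "m"):
            raise ValueError("classification must be 'a' or 'm'")
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must lie between 0 and 1")
```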

  32. Example of Monaural Sonotags • 18 s recording of O. viridulus at nature reserve in Yorkshire in 2003 • G(plane,-1,-1,100,11:52.5,5,35,a,0.96,(53.914,-0.845,10)) • G(Bird1,-1,-1,100,12:02,5,41,a,0.99,(53.914,-0.845,10)) • G(O. viridulus,-1,-1,1,11:50,1.5,50,a,0.99,(53.914,-0.845,10)) • G(O. viridulus,-1,-1,1,11:45,2,50,a,0.99,(53.914,-0.845,10))

  33. Example of 3-D Sonotags • Treat separated sounds as monaural recordings for classification • G(speaker1,10,20,2,14:00,5,42,a,0.92,(53.9,-0.9,10)) • G(speaker2,0,0,1.5,14:00,5,43,a,0.96,(53.9,-0.9,10))
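Sonotags written in the textual form used in these examples can be read back mechanically. The following is a minimal sketch of such a parser; the function name and dictionary keys are assumptions, and it relies on labels containing no commas, which holds for every example shown.

```python
import re

# Outer G(...) with nine comma-separated fields followed by a bracketed geotag.
SONOTAG_RE = re.compile(r"^G\((.+),\((.+)\)\)$")

def parse_sonotag(text):
    """Parse one sonotag string into a dict of typed fields."""
    match = SONOTAG_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a sonotag: {text!r}")
    fields = match.group(1).split(",")
    if len(fields) != 9:
        raise ValueError(f"expected 9 fields before the geotag, got {len(fields)}")
    label, dir1, dir2, dist, onset, dur, level, cls, conf = fields
    lat, lon, alt = (float(v) for v in match.group(2).split(","))
    return {
        "label": label,
        "direction": (float(dir1), float(dir2)),
        "distance": float(dist),
        "onset": onset,                    # kept as text, e.g. "14:00"
        "duration": float(dur),
        "level": float(level),
        "classification": cls,             # "a" or "m"
        "confidence": float(conf),
        "geotag": (lat, lon, alt),
    }
```

Applied to the first 3-D example, this yields the label "speaker1" with direction (10.0, 20.0) and confidence 0.92.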

  34. Applications (1) • BS 4142 assessments • PPG 24 assessments • Noise nuisance applications • Other acoustic consultancy problems • Soundscape recordings • Future noise policy

  35. Applications (2) • Biodiversity assessment, endangered species monitoring • Alien invasive species (e.g. Cane Toad in Australia) • Anthropogenic noise effects on animals • Habitat fragmentation • Tranquility studies

  36. Conclusions • ISRIE has been shown to be successful in separating and classifying urban sounds • Much work is still to be done, especially in classification • Automated soundscape description is possible but a flexible and formal framework is needed
