
Adaptive Methods for Speaker Separation in Cars

Julien Bourgeois (Julien.Bourgeois@daimlerchrysler.com), DaimlerChrysler Research and Technology.


Presentation Transcript


  1. Adaptive Methods for Speaker Separation in Cars. Julien Bourgeois (Julien.Bourgeois@daimlerchrysler.com), DaimlerChrysler Research and Technology.

  2. General context • Several simultaneous speakers (sources) s1(t), s2(t), spatially located • Road noise, spatially diffuse • Mixing system: the microphones record the observations x1(t) … x4(t) • Separation algorithm (software): delivers individual speech flows • Goal: provide individual speech input for each passenger
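The setting on this slide can be written as a convolutive mixing model. The following is a sketch under stated assumptions: the slide only names the sources s1(t), s2(t) and the observations x1(t) and x4(t); the four-microphone count, the room impulse responses h_{mn}, the noise terms v_m, and the separation filters w_{nm} are notation introduced here for illustration.

```latex
% Observation at microphone m (assuming M = 4 microphones and N = 2 speakers):
x_m(t) = \sum_{n=1}^{N} \sum_{\tau} h_{mn}(\tau)\, s_n(t-\tau) + v_m(t),
\qquad m = 1, \dots, M
% Output n of the separation algorithm, one per passenger:
y_n(t) = \sum_{m=1}^{M} \sum_{\tau} w_{nm}(\tau)\, x_m(t-\tau),
\qquad n = 1, \dots, N
```

The goal stated on the slide then amounts to choosing the filters w_{nm} so that each y_n(t) contains the speech of one passenger only.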

  5. Plan of the presentation • Overview of existing methods • Supervised/Informed separation vs. Blind separation • Blind separation and prior spatial information • Conclusion and future work

  6. Existing methods: CASA vs. multichannel techniques • CASA (computational auditory scene analysis): • Single-microphone separation • Heuristics based on an analysis of the human auditory system • Requires a lot of data (training of parameters) • Multi-microphone techniques: • Speech varies much faster than the coherence relating two (or more) microphones.

  7. Existing methods: Beamforming (figure labels: direction of interest, filters, output) • Prior information on the target position • Constrain the response in the direction of interest • Minimize the output power • Problem: target cancellation if the prior spatial information is not perfect.
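The two beamforming steps listed above (constrain the response in the direction of interest, minimize the output power) correspond to the classical minimum-variance distortionless-response (MVDR) solution at a single frequency. The sketch below is illustrative only and is not taken from the talk: the far-field model, the function names `steering_vector` and `mvdr_weights`, and the diagonal loading factor are all assumptions.

```python
import numpy as np

def steering_vector(mic_positions, direction, freq_hz, c=343.0):
    """Far-field steering vector at one frequency (hypothetical helper).

    mic_positions: (M, 3) array in metres.
    direction: vector pointing from the array towards the source.
    """
    direction = np.asarray(direction, dtype=float)
    direction = direction / np.linalg.norm(direction)
    # Microphones further along `direction` receive the wavefront earlier.
    delays = -(np.asarray(mic_positions, dtype=float) @ direction) / c
    return np.exp(-2j * np.pi * freq_hz * delays)

def mvdr_weights(R, d, loading=1e-6):
    """Weights w minimising the output power w^H R w subject to w^H d = 1.

    R: (M, M) Hermitian covariance of the microphone signals.
    d: (M,) steering vector of the direction of interest.
    """
    M = R.shape[0]
    # Light diagonal loading (an assumption) keeps the solve well conditioned.
    R_loaded = R + loading * (np.trace(R).real / M) * np.eye(M)
    Rinv_d = np.linalg.solve(R_loaded, d)
    return Rinv_d / (d.conj() @ Rinv_d)
```

The beamformer output for a snapshot `x` is `w.conj() @ x`. If `d` does not match the true target direction and `R` contains target speech, the power minimization treats the target as interference, which is the target cancellation problem named on the slide.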

  8. Existing methods: Blind Source Separation (BSS) (figure labels: sources, acoustic mixing, dependent observations, BSS, independent outputs) • First applications to speech separation at the end of the 1990s • Only requirement: statistically independent sources • Difficult optimization problem: maximizing a nonlinear function (independence measure) • With many microphones, target cancellation can also appear • Permutation ambiguity.
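The permutation ambiguity can be illustrated with a toy instantaneous mixture: any unmixing matrix that recovers the sources in a different order, or at a different scale, yields outputs that are just as independent. This is a minimal sketch; the mixing matrix, the Laplacian stand-in for speech, and all variable names are assumptions chosen for illustration, and real in-car mixing is convolutive rather than instantaneous.

```python
import numpy as np

rng = np.random.default_rng(0)
# Two independent sources; Laplacian samples stand in for speech (assumption).
s = rng.laplace(size=(2, 10_000))
A = np.array([[1.0, 0.6],
              [0.5, 1.0]])          # hypothetical instantaneous mixing matrix
x = A @ s                           # dependent observations

W_ideal = np.linalg.inv(A)          # recovers s1, s2 in the original order
P = np.array([[0.0, 1.0],
              [1.0, 0.0]])          # swaps the two outputs
D = np.diag([2.0, -0.5])            # arbitrary rescaling
W_other = D @ P @ W_ideal           # equally valid for an independence criterion

y_ideal = W_ideal @ x               # rows: s1, s2
y_other = W_other @ x               # rows: 2*s2, -0.5*s1
```

Both `y_ideal` and `y_other` have statistically independent rows, so an independence measure alone cannot tell which output belongs to which speaker.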

  9. The question is… • Is it possible to merge beamforming and BSS and combine their advantages? • In cars, prior knowledge of the speaker positions is available, so separating blindly is suboptimal.

  10. Blind separation and prior spatial information (figure labels: sources, acoustic mixing, dependent observations, BSS, independent outputs; prior info: positions) • Initialising BSS according to the speakers' positions helps the optimisation procedure a lot • Permutation problem solved • Target cancellation problem solved
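One possible way to initialise BSS from known seat positions is sketched below, under strong assumptions: a free-field (reverberation-free) propagation model, one frequency at a time, and a null-steering start point obtained by pseudo-inverting the matrix of steering vectors. The slide does not say how the initialisation is done in the talk; `free_field_steering` and `geometric_init` are hypothetical names.

```python
import numpy as np

def free_field_steering(mic_positions, source_position, freq_hz, c=343.0):
    """Phase-only free-field steering vector from a source position to each mic."""
    mic_positions = np.asarray(mic_positions, dtype=float)
    source_position = np.asarray(source_position, dtype=float)
    dist = np.linalg.norm(mic_positions - source_position, axis=1)
    return np.exp(-2j * np.pi * freq_hz * dist / c)

def geometric_init(mic_positions, speaker_positions, freq_hz):
    """Initial unmixing matrix W0 of shape (N, M) for N speakers and M mics.

    Row n passes speaker n with unit gain and places nulls on the other
    speakers, according to the assumed free-field model.
    """
    D = np.stack(
        [free_field_steering(mic_positions, p, freq_hz) for p in speaker_positions],
        axis=1,
    )  # (M, N): one steering vector per speaker
    # W0 @ D equals the identity when D has full column rank.
    return np.linalg.pinv(D)
```

Because row n of `W0` is tied to speaker n from the start, the output order is fixed by the geometry, which is one way to read the "permutation problem solved" bullet.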

  11. BSS is not that blind… • BSS performance depends dramatically on the type of mixing: • Strictly causal • Not strictly causal

  13. Beamforming is not that informed… (figure labels: direction of interest, filters, output) • Perfect prior spatial information is actually not necessary: the target cancellation problem can be solved if one can detect the activity/silences of each speaker. • The detection problem is closely related to the IDIAP smart meeting room projects.
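One common way to exploit such an activity detector, given here as an assumption rather than as the method used in the talk, is to estimate the covariance that the beamformer minimizes only from frames in which the target speaker is silent. The target then never contributes to the power being minimized, so it cannot be cancelled even if the steering direction is slightly off. The function name `gated_covariance` and the boolean mask `target_active` are hypothetical; the detector that produces the mask is the hard part and is not shown.

```python
import numpy as np

def gated_covariance(X, target_active):
    """Covariance estimated from target-silent frames only.

    X: (M, T) complex array, one microphone snapshot per frame at one frequency.
    target_active: (T,) boolean mask, True where the target speaker is talking.
    """
    X = np.asarray(X)
    quiet = ~np.asarray(target_active, dtype=bool)
    X_quiet = X[:, quiet]
    if X_quiet.shape[1] == 0:
        raise ValueError("no target-silent frames available")
    # Sample covariance of interference plus noise.
    return (X_quiet @ X_quiet.conj().T) / X_quiet.shape[1]
```

The resulting matrix can replace the full covariance in a power-minimizing beamformer, with adaptation frozen while the target is active.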

  14. Conclusion and future work • Combining BSS with a beamformer brings no gain. • We may inform BSS efficiently in the case of non-causal mixings (algorithmic rotation of the microphone array)

  16. Thank you!
