Adaptive Methods for Speaker Separation in Cars
DaimlerChrysler Research and Technology
Julien Bourgeois
Julien.Bourgeois@daimlerchrysler.com
General context
[Figure: sources s1(t), s2(t) and road noise, mixing system, microphones x1(t) ... x4(t), separation algorithm (software), individual speech flows]
• Several simultaneous speakers (sources), spatially located: s1(t), s2(t)
• Road noise, spatially diffuse
• Mixing system: sources and noise are picked up by the microphones as x1(t) ... x4(t)
• Separation algorithm (software): turns the microphone signals into individual speech flows
• Goal: provide individual speech input for each passenger
Plan of the presentation
• Overview of existing methods
• Supervised/Informed separation vs. Blind separation
• Blind separation and prior spatial information
• Conclusion and future work
Existing methods: CASA vs. Multichannel Techniques
• CASA (computational auditory scene analysis):
  • Single-microphone separation
  • Heuristics based on an analysis of the human auditory system
  • Requires a lot of data (training of parameters)
• Multi-microphone techniques:
  • Speech moves much faster than… the coherence relating two (or more) microphones.
Existing methods: Beamforming
[Figure: direction of interest, filters, output]
• Beamforming:
  • Uses prior information on the target position
  • Constrains the response in the direction of interest
  • Minimizes the output power
• Target cancellation occurs if the prior spatial information is not perfect.
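The "constrain the response, minimize the output power" recipe can be sketched with a narrowband MVDR beamformer, used here as one common instance of this idea rather than as the method of the talk. The array geometry (four microphones, half-wavelength spacing), the angles, and the power levels are assumptions chosen for illustration.

```python
import numpy as np

def steering_vector(angle_deg, n_mics=4, spacing_wavelengths=0.5):
    """Far-field steering vector of a uniform linear array (assumed geometry)."""
    phase = 2j * np.pi * spacing_wavelengths * np.sin(np.deg2rad(angle_deg))
    return np.exp(phase * np.arange(n_mics))

def mvdr_weights(cov, d):
    """Minimize the output power w^H cov w subject to w^H d = 1."""
    r_inv_d = np.linalg.solve(cov, d)
    return r_inv_d / (d.conj() @ r_inv_d)

# Hypothetical scene: target at 0 degrees, interferer at 40 degrees.
d_target = steering_vector(0.0)
d_interf = steering_vector(40.0)
# Interference-plus-noise covariance (interferer power 1, sensor noise 0.01).
cov = np.outer(d_interf, d_interf.conj()) + 0.01 * np.eye(4)

w = mvdr_weights(cov, d_target)
target_gain = abs(w.conj() @ d_target)  # held at 1 by the constraint
interf_gain = abs(w.conj() @ d_interf)  # pushed towards 0 by the minimization
print(target_gain, interf_gain)
```

The constraint keeps the assumed target direction undistorted while the power minimization places a null on the interferer. If the assumed direction is wrong, the same minimization treats the real target as interference, which is the target cancellation problem named above.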
Existing methods: Blind Source Separation
[Figure: sources, acoustic mixing, dependent observations, BSS, independent outputs]
• Blind Source Separation (BSS):
  • First applications to speech separation at the end of the 1990s
  • Only requirement: statistically independent sources
  • Difficult optimization problem: maximizing a nonlinear function (independence measure)
  • With many microphones, target cancellation can also appear
  • Permutation ambiguity
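A toy illustration of the BSS principle and of the permutation ambiguity, under strong simplifying assumptions: the mixing is instantaneous (a real car cabin is convolutive), the sources are synthetic Laplacian signals, and the independence measure is the summed absolute excess kurtosis, maximized by a plain grid search. None of these choices come from the talk.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20000
# Two independent, super-Gaussian (speech-like) sources; Laplacian is an assumption.
sources = rng.laplace(size=(2, n))
mixing = np.array([[1.0, 0.6], [0.5, 1.0]])  # hypothetical instantaneous mixing
observed = mixing @ sources

# Step 1: whiten the observations (decorrelate, unit variance).
observed = observed - observed.mean(axis=1, keepdims=True)
eigval, eigvec = np.linalg.eigh(np.cov(observed))
whitener = eigvec @ np.diag(eigval ** -0.5) @ eigvec.T
white = whitener @ observed

def excess_kurtosis(y):
    return np.mean(y ** 4, axis=1) / np.mean(y ** 2, axis=1) ** 2 - 3.0

def rotation(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

# Step 2: after whitening only a rotation is left; pick the most non-Gaussian one.
thetas = np.linspace(0.0, np.pi / 2, 181)
scores = [np.abs(excess_kurtosis(rotation(t) @ white)).sum() for t in thetas]
outputs = rotation(thetas[int(np.argmax(scores))]) @ white

# |correlation| of each output (rows) with each source (columns).
match = np.abs(np.corrcoef(np.vstack([outputs, sources]))[:2, 2:])
print(np.round(match, 2))
```

Each row of `match` should have a single entry close to 1, but nothing in the criterion decides which output carries which source, or with which sign. That is the permutation ambiguity listed above.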
The question is…
• Is it possible to merge beamforming and BSS, and combine their advantages?
• In cars, prior knowledge of the speaker positions is available, so separating blindly is suboptimal.
Blind separation and prior spatial information
[Figure: sources, acoustic mixing, dependent observations, BSS, independent outputs; prior info: positions]
Initialising BSS according to the speakers' positions helps the optimisation procedure a lot.
• Permutation problem solved
• Target cancellation problem solved
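One way such a position-based initialisation could look, assuming a frequency-domain BSS with one demixing matrix per frequency bin: predict the mixing matrix from the assumed speaker directions and start from its inverse. The two-microphone geometry, the spacing, the angles, and the function names are all hypothetical, and free-field propagation is assumed (no cabin reflections).

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s
MIC_SPACING = 0.05      # m, hypothetical two-microphone array

def steering_vector(angle_deg, freq_hz):
    """Relative phase at the two microphones for a far-field source."""
    delay = MIC_SPACING * np.sin(np.deg2rad(angle_deg)) / SPEED_OF_SOUND
    return np.array([1.0, np.exp(-2j * np.pi * freq_hz * delay)])

def initial_demixing(angles_deg, freq_hz):
    """Invert the mixing matrix predicted from the assumed speaker positions."""
    assumed_mixing = np.column_stack(
        [steering_vector(a, freq_hz) for a in angles_deg]
    )
    return np.linalg.inv(assumed_mixing), assumed_mixing

# Assumed positions: driver at -30 degrees, co-driver at +30 degrees.
freqs = [500.0, 1000.0, 2000.0]
products = []
for f in freqs:
    w0, a = initial_demixing([-30.0, 30.0], f)
    products.append(w0 @ a)  # identity: output k is tied to speaker k

print(np.round(np.abs(products[0]), 6))
```

Because row k of every initial matrix nulls the other assumed direction, output k starts out attached to speaker k in every bin, which is how the initialisation can fix the output order before the blind adaptation refines the filters.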
BSS is not that blind…
• BSS performance depends dramatically on the type of mixing:
  • Strictly causal
  • Non-strictly causal
Beamforming is not that informed…
[Figure: direction of interest, filters, output]
• Perfect prior spatial information is actually not necessary: the target cancellation problem can be solved if one can detect the activity and silences of each speaker.
• The detection problem is closely related to the IDIAP smart meeting room projects.
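The effect of silence detection can be sketched with a narrowband simulation, assuming an oracle activity mask (the detection itself is the hard part and is not modelled), an MVDR beamformer as a stand-in, and a steering direction that is 5 degrees off. Geometry, angles, and power levels are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
N_MICS, N_SNAP = 4, 4000

def steering_vector(angle_deg):
    """Half-wavelength uniform linear array (assumed geometry)."""
    phase = 1j * np.pi * np.sin(np.deg2rad(angle_deg))
    return np.exp(phase * np.arange(N_MICS))

def complex_noise(*shape):
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

def mvdr_weights(cov, d):
    r_inv_d = np.linalg.solve(cov, d)
    return r_inv_d / (d.conj() @ r_inv_d)

def sample_cov(x):
    return x @ x.conj().T / x.shape[1]

d_true = steering_vector(0.0)     # actual target direction
d_assumed = steering_vector(5.0)  # imperfect prior: 5 degrees off
d_interf = steering_vector(40.0)

# Oracle activity mask (assumed given): the target speaks in the first half only.
target_active = np.arange(N_SNAP) < N_SNAP // 2
target = complex_noise(N_SNAP) * target_active
interf = complex_noise(N_SNAP)
snapshots = (np.outer(d_true, target) + np.outer(d_interf, interf)
             + 0.1 * complex_noise(N_MICS, N_SNAP))

# Adapt on everything vs. adapt only while the target is silent.
w_always = mvdr_weights(sample_cov(snapshots), d_assumed)
w_gated = mvdr_weights(sample_cov(snapshots[:, ~target_active]), d_assumed)

gain_always = abs(w_always.conj() @ d_true)  # target largely cancelled
gain_gated = abs(w_gated.conj() @ d_true)    # target largely preserved
print(gain_always, gain_gated)
```

When the filters adapt while the target speaks, the slightly mis-steered beamformer suppresses the target along with the interferer. Restricting the adaptation to the target's silences leaves only the interferer to be minimized, so the imperfect prior costs little.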
Conclusion and future work
• Combining BSS with a beamformer brings no gain.
• BSS may be informed efficiently in the case of non-causal mixing (algorithmic rotation of the microphone array).