Sound Source Separation using 3D Correlogram, Fuzzy Logic, and Neural Networks A RESEARCH PROJECT Eduardo Dias Trama
Table of Contents • INTRODUCTION • PROJECT OVERVIEW • THE PREPROCESSOR • THE LEARNING PROCESSOR • THE SEPARATION PROCESSOR • PROJECT EXPERIMENTS • CONCLUSION
INTRODUCTION • Overview of sound source separation • Sound separation methods • Related applications of sound separation
Overview of sound source separation • What is sound separation? • Psychoacoustic properties • Timbre • How can sound be modeled?
Sound separation methods • CASA (Computational Auditory Scene Analysis), Marrian • Spatial and Periodicity-and-Harmonicity • CASA: 3D Correlogram analysis • Blind source separation and prediction-driven
Related applications of sound separation • Sound and voice recognition • Noise removal • Compression
PROJECT OVERVIEW • Overview • Auditory model analysis • Sound data library and classification • Sound data matching • Complete sound separation system
Overview • What is a piano sound? • Memory • Clustering
Auditory model analysis • Properties • Grouping • Past knowledge • Correlation
Sound data library and classification • Sound memory • How much information is needed for later analysis? • Does it matter if audio data is compressed? • Structure of classification
THE PREPROCESSOR • The Cochlea Filter Model • Correlogram • 3-D Correlogram
The Cochlea Filter Model • Filtering: basilar membrane (BM) • Detection: inner hair cell (IHC) • Compression: automatic gain control (AGC) • Cochleagram
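A minimal sketch of this stage, assuming a gammatone-style filter bank as a stand-in for basilar membrane filtering, half-wave rectification for the inner hair cells, and a fixed power-law compression in place of the AGC; the channel count, centre frequencies, and filter form are illustrative and not the project's actual cochlea model.

```python
import numpy as np

def gammatone_ir(fc, fs, duration=0.025, order=4):
    """Impulse response of a gammatone filter centred at fc (Hz)."""
    t = np.arange(int(duration * fs)) / fs
    erb = 24.7 + 0.108 * fc                      # equivalent rectangular bandwidth
    b = 1.019 * erb
    return t**(order - 1) * np.exp(-2 * np.pi * b * t) * np.cos(2 * np.pi * fc * t)

def cochleagram(x, fs, n_channels=32, f_lo=100.0, f_hi=4000.0):
    """Filter -> half-wave rectify -> compress; one row per cochlear channel."""
    fcs = np.geomspace(f_lo, f_hi, n_channels)   # assumed log spacing of best frequencies
    rows = []
    for fc in fcs:
        y = np.convolve(x, gammatone_ir(fc, fs), mode="same")  # basilar membrane filtering
        y = np.maximum(y, 0.0)                                  # inner-hair-cell detection (rectification)
        y = y ** 0.3                                            # crude compression standing in for AGC
        rows.append(y)
    return np.array(rows), fcs
```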
Correlogram • Short-time autocorrelations of the neural firing rates as a function of cochlear place (best frequency) versus time • Correlogram movie
Correlogram • Speech processing • Extract the formants of voiced and unvoiced sounds • Short duration • Auto-correlation window size
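As a sketch of how one frame could be computed, the snippet below takes the short-time autocorrelation of each cochlear channel over a window; the window length and maximum lag are assumed values, not the ones used in the project.

```python
import numpy as np

def correlogram_frame(cochleagram, start, win=512, max_lag=256):
    """One correlogram frame: windowed autocorrelation of each cochlear channel.

    Returns an array of shape (n_channels, max_lag): rows are cochlear place
    (best frequency), columns are lag.  start must leave at least win samples.
    """
    n_channels = cochleagram.shape[0]
    frame = np.zeros((n_channels, max_lag))
    window = np.hanning(win)
    for ch in range(n_channels):
        seg = cochleagram[ch, start:start + win] * window
        ac = np.correlate(seg, seg, mode="full")[win - 1:win - 1 + max_lag]
        frame[ch] = ac / (ac[0] + 1e-12)          # normalise so lag 0 equals 1
    return frame
```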
Correlogram Frame • Vertical axis shows low to high frequencies from bottom to top • Horizontal axis represents the lag or time delay
Correlogram Frame • Dark areas in the image show activity in the Correlogram frame • Vertical lines: cochlear channels firing in the same period
Correlogram Frame • Horizontal bands are indicators of large amounts of energy within a frequency band
3-D Correlogram • A series of Correlograms over time • Frequency information comes from a cochlea filter bank • A finite time/frequency analysis • It depends on the initial time
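A short sketch of the stacking step, assuming a per-frame function like the one above is passed in; window, hop, and lag sizes are again illustrative.

```python
import numpy as np

def correlogram_3d(cochleagram, frame_fn, win=512, hop=256, max_lag=256):
    """Stack correlogram frames over time: result is (n_frames, n_channels, max_lag).

    frame_fn is any function that turns a window of the cochleagram into a
    single (n_channels, max_lag) frame, e.g. the sketch above.
    """
    n_samples = cochleagram.shape[1]
    starts = range(0, n_samples - win, hop)
    return np.stack([frame_fn(cochleagram, s, win, max_lag) for s in starts])
```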
THE LEARNING PROCESSOR • Creating the network input • Classification • Artificial neural network fuzzy classification
Creating the network input • Responsible for learning each Correlogram frame of a selected sound • It should be exposed to many small variations of the target (selected) sound • The total number of neural nets (NN) is: NN = FB x CF
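A toy reading of this count, assuming FB is the number of filter-bank channels and CF the number of correlogram frames per learned sound (the slide does not define either symbol, so the values below are placeholders):

```python
# Hypothetical reading of NN = FB x CF: one small network per
# (filter-bank channel, correlogram frame) cell of the 3-D correlogram.
FB = 32        # filter-bank (cochlear) channels -- assumed value
CF = 40        # correlogram frames per learned sound -- assumed value

NN = FB * CF   # total number of small neural nets to train
print(f"{NN} networks, one per channel/frame cell")
```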
Classification • Class • Family • Length • Frequency range • Number of Correlogram frames • Sufficient to classify one particular sound • Make the matching process faster • Intensive parallel processing
Artificial neural network fuzzy classification • Fuzzy IF-THEN rules to describe a classifier • An adaptive-network-based fuzzy classifier to solve fuzzy classification problems • ANFIS (adaptive-network-based fuzzy inference system)
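The sketch below shows the forward pass of a zero-order Sugeno fuzzy system, the kind of rule structure ANFIS tunes; the rule centres, widths, and consequents are invented for illustration and are not the project's trained classifier.

```python
import numpy as np

def gauss_mf(x, c, sigma):
    """Gaussian membership function, common in ANFIS-style systems."""
    return np.exp(-0.5 * ((x - c) / sigma) ** 2)

def sugeno_forward(x, rules):
    """Forward pass of a zero-order Sugeno fuzzy system.

    Each rule is (centres, sigmas, consequent): IF x1 is A1 AND x2 is A2 ...
    THEN output = consequent.  Firing strengths are products of memberships.
    """
    strengths = np.array([np.prod(gauss_mf(x, c, s)) for c, s, _ in rules])
    consequents = np.array([q for _, _, q in rules])
    return np.dot(strengths, consequents) / (strengths.sum() + 1e-12)

# Toy example: two rules on a 2-D feature vector (all numbers illustrative)
rules = [
    (np.array([0.2, 0.8]), np.array([0.3, 0.3]), 1.0),   # "target class"
    (np.array([0.7, 0.1]), np.array([0.3, 0.3]), 0.0),   # "not target"
]
print(sugeno_forward(np.array([0.25, 0.75]), rules))      # close to 1.0
```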
THE SEPARATION PROCESSOR • Choosing method for sound matching • The Matching Fuzzy Logic sound library • Sound separation
Choosing method for sound matching • Preamble, search, matching and interpolation • Target and precision • Fuzzy clustering algorithms
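Fuzzy c-means is one common fuzzy clustering algorithm; a minimal version is sketched below, with the cluster count, fuzzifier m, and iteration count as assumed parameters rather than the project's settings.

```python
import numpy as np

def fuzzy_c_means(X, n_clusters=3, m=2.0, n_iter=100, seed=0):
    """Minimal fuzzy c-means: returns (centres, membership matrix U).

    X: (n_samples, n_features); U[i, j] is the degree to which sample i
    belongs to cluster j (each row sums to 1).  m > 1 controls fuzziness.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    U = rng.random((n, n_clusters))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        W = U ** m
        centres = (W.T @ X) / W.sum(axis=0)[:, None]          # weighted cluster centres
        d = np.linalg.norm(X[:, None, :] - centres[None, :, :], axis=2) + 1e-12
        U = 1.0 / (d ** (2 / (m - 1)))                        # standard FCM membership update
        U /= U.sum(axis=1, keepdims=True)
    return centres, U
```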
The Matching Fuzzy Logic sound library • A set of fuzzy sound elements will be used for matching (FIS) • The initial values for search need to be determined by external inputs • ANFIS (Adaptive Neuro-Fuzzy Inference Systems)
Sound separation • Search, match and extract • Step 1: Input process • Step 2: Classification • Step 3: Choosing what to separate • Step 4: Dynamics and pitch extraction • Step 5: Re-synthesis
Step 1: Input process • Analog to digital conversion • Cochlea filter bank • Cochleagram • Correlogram frames • Neuro-Fuzzy input matrix
Step 3: Choosing what to separate • Rule 1: Assume that the human auditory system can recognize one or more sounds from the audio input mixture • Rule 2: One recognizable sound should be selected for separation • Rule 3: Assume that complete or partial information about the selected audio class exists in the sound library
Step 5: Re-synthesis • Re-synthesis of selected sound Correlogram frames at unit pitch • Apply dynamics to each Correlogram frame • Correlogram frame inversion
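The "apply dynamics" step might look like the sketch below, assuming the selected sound's unit-pitch correlogram frames are simply scaled by a per-frame amplitude envelope taken from the matching stage; the correlogram frame inversion back to a waveform is not sketched here.

```python
import numpy as np

def apply_dynamics(frames, envelope):
    """Scale each unit-pitch correlogram frame by an extracted amplitude envelope.

    frames:   (n_frames, n_channels, n_lags) correlogram of the library sound
    envelope: (n_frames,) per-frame gains estimated from the mixture (assumed input)
    """
    return frames * envelope[:, None, None]
```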
PROJECT EXPERIMENTS • Experiment setup • Experiment procedures • Experiment results
Experiment procedures • Recorded wave data: 5 sec. @ 44100 Hz sample rate, 16-bit resolution, two channels (stereo) • Down-sampled to 11025 Hz and mixed down to one channel (mono) • Mixed combinations without delay • Mixed combinations with 0.5 sec. delay
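A sketch of this data preparation and mixing, assuming SciPy for reading and resampling; the file names and helper functions are hypothetical, but the rates, channel handling, and 0.5 s delay follow the procedure above.

```python
import numpy as np
from scipy.io import wavfile
from scipy.signal import resample_poly

def load_and_downsample(path):
    """Load a stereo 44.1 kHz recording, mix to mono, and resample to 11025 Hz."""
    fs, x = wavfile.read(path)                      # expects 16-bit, 44100 Hz, stereo
    x = x.astype(np.float64).mean(axis=1)           # stereo -> mono
    return resample_poly(x, up=1, down=4), 11025    # 44100 / 4 = 11025

def mix(a, b, fs, delay_s=0.0):
    """Mix two sources, optionally delaying the second by delay_s seconds."""
    b = np.concatenate([np.zeros(int(delay_s * fs)), b])
    n = max(len(a), len(b))
    a = np.pad(a, (0, n - len(a)))
    b = np.pad(b, (0, n - len(b)))
    return a + b

# Hypothetical usage mirroring the experiment setup
# piano, fs = load_and_downsample("piano.wav")
# flute, _  = load_and_downsample("flute.wav")
# mix_no_delay   = mix(piano, flute, fs)
# mix_half_delay = mix(piano, flute, fs, delay_s=0.5)
```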
Experiment results • Single sound source • Two sound sources without delay • Two sound sources with delay • Modeling ANFIS for Correlogram frames • Correlogram frame channel training (classification) • Correlogram frame channel evaluation (matching)