Explore segmentation and event detection in soccer audio, utilizing domain knowledge for improved analysis. Features observations, segmentation techniques, decision rules, and detection results.
Segmentation and Event Detection in Soccer Audio • Lexing Xie, Prof. Dan Ellis • EE6820, Spring 2001 • April 24th, 2001
The problem • Event detection in sports video • In this project: the audio part • Our approach • Segmentation + Event Detection • Incorporate domain knowledge
Outline • Related work • Observations on soccer audio • Segmentation: features, decision scheme, result • Event detection: scope, feature metric, result • Generalization • Next step
Related Work • Audio segmentation • Speech-silence discrimination [Rabiner78] • Speech / music / mixture segmentation [Saunders96] [Scheirer97] [Williams99] • Sports audio analysis • Classify excited speech [Rui2000] • Keyword/event template matching [Chang96] [Rui2000]
Observations #1 • Sound Types • Foreground speech • Noisy vocal sound with visible phoneme structure • Background noise • Ambient crowd, whistles, cheers, etc. • Acoustics [Fahy2001] • Sound intensity in open space: • Sound attenuation in air • Production conditions • Frequency response of microphone • Automatic Gain Control
Observations #2 • Large variety across games • Commentator “verbosity” • Audience “excitability” → not labeling and training • In different languages → not ASR • Not template-matching & training • Assumptions on temporal characteristics • Short-term dynamics • Long-term variety
Segmentation Algorithm • Commentary vs. crowd segmentation • Pipeline: sound → feature extraction (1st formant energy, fricative energy) → decision rules (energy > global avg. & adaptive threshold) → post-processing (morphological operations) → seg. boundary
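The pipeline on this slide can be sketched in a few lines. This is a minimal illustration, not the project's code: it assumes the per-frame band energy has already been computed, that frames above the global average are commentary, and that a window radius of one frame is enough for the morphological step. The names `threshold_mask`, `smooth`, `boundaries` and the parameters `scale` and `radius` are hypothetical, and the adaptive threshold mentioned on the slide is left out.

```python
def threshold_mask(energies, scale=1.0):
    """Mark frames whose energy exceeds scale * global average (assumed rule)."""
    avg = sum(energies) / len(energies)
    return [e > scale * avg for e in energies]

def dilate(mask, radius):
    """1-D binary dilation: a frame is set if any neighbour in the window is set."""
    n = len(mask)
    return [any(mask[max(0, i - radius):min(n, i + radius + 1)]) for i in range(n)]

def erode(mask, radius):
    """1-D binary erosion: a frame stays set only if the whole window is set."""
    n = len(mask)
    return [all(mask[max(0, i - radius):min(n, i + radius + 1)]) for i in range(n)]

def smooth(mask, radius=1):
    """Closing (fills short gaps) followed by opening (drops short bursts)."""
    closed = erode(dilate(mask, radius), radius)
    return dilate(erode(closed, radius), radius)

def boundaries(mask):
    """Turn a frame mask into (start, end, label) runs; end is exclusive."""
    segments = []
    start = 0
    for i in range(1, len(mask) + 1):
        if i == len(mask) or mask[i] != mask[start]:
            label = "commentary" if mask[start] else "crowd"
            segments.append((start, i, label))
            start = i
    return segments
```

For example, `boundaries(smooth(threshold_mask(energies)))` on a frame-energy list with one short dip inside a loud stretch returns a single commentary segment, because the closing step fills the dip.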
Segmentation Result • [Figure: audio track segmented into alternating commentary and crowd regions]
Detection #1 • Detecting audio events in crowd noise • Examples: crowd cheering, whistle, … • Subjective definition • Pipeline: seg. boundaries → pick up crowd, chop into units → feature calculation (spectral: centroid, roll-off; energies: E, Er1, Er2; feature contours and moments of the contours) → distance metric → most distinctive segment
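As a rough illustration of the spectral features named on this slide, the sketch below computes the centroid and roll-off of one frame with NumPy. It rests on several assumptions: a Hann window, a roll-off point at 85% of the cumulative magnitude, and plain mean-square energy standing in for E. The slides do not define Er1 and Er2, so they are not computed here. The function name `spectral_features` and the parameter `rolloff_fraction` are hypothetical.

```python
import numpy as np

def spectral_features(frame, sample_rate, rolloff_fraction=0.85):
    """Return (centroid_hz, rolloff_hz, energy) for a single audio frame."""
    frame = np.asarray(frame, dtype=float)
    # Magnitude spectrum of the Hann-windowed frame (the window is an assumption).
    magnitude = np.abs(np.fft.rfft(frame * np.hanning(len(frame))))
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)
    cumulative = np.cumsum(magnitude)
    total = cumulative[-1]
    if total == 0.0:
        return 0.0, 0.0, 0.0
    # Centroid: magnitude-weighted mean frequency.
    centroid = float(np.dot(freqs, magnitude) / total)
    # Roll-off: lowest frequency below which rolloff_fraction of the magnitude lies.
    idx = min(int(np.searchsorted(cumulative, rolloff_fraction * total)), len(freqs) - 1)
    rolloff = float(freqs[idx])
    # Mean-square energy of the unwindowed frame.
    energy = float(np.mean(frame ** 2))
    return centroid, rolloff, energy
```

Applying this to every frame of a crowd unit gives the feature contours. Their moments (mean, variance, and so on) could then form the per-unit feature vector.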
Detection #2 • Compute Mahalanobis distances [Duda 73] • Feature element normalization and decorrelation • Pick up distinctive segments • Largest distance to all other segments (typically top 5~10%) • Clustering: detecting outliers • Merge adjacent segments
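The steps on this slide can be sketched as follows. This is one possible reading, not the project's implementation. It assumes that "largest distance to all other segments" means the mean pairwise Mahalanobis distance, that a single covariance matrix is estimated from all segments, and that the top 10% are kept. The names `distinctive_segments`, `merge_adjacent` and `top_fraction` are hypothetical.

```python
import numpy as np

def distinctive_segments(features, top_fraction=0.1):
    """Rank segments by mean Mahalanobis distance to all other segments."""
    X = np.asarray(features, dtype=float)            # shape (n_segments, n_features)
    # The inverse covariance normalizes and decorrelates the feature elements.
    inv_cov = np.linalg.pinv(np.atleast_2d(np.cov(X, rowvar=False)))
    diff = X[:, None, :] - X[None, :, :]             # pairwise differences, (n, n, d)
    d2 = np.einsum("ijk,kl,ijl->ij", diff, inv_cov, diff)
    dist = np.sqrt(np.maximum(d2, 0.0))              # clip tiny negative round-off
    score = dist.sum(axis=1) / (len(X) - 1)          # mean distance to the others
    k = max(1, int(round(top_fraction * len(X))))
    picked = np.sort(np.argsort(score)[-k:])         # k most distinctive, in time order
    return picked, score

def merge_adjacent(indices):
    """Merge runs of consecutive segment indices into (first, last) pairs."""
    runs = []
    for i in indices:
        i = int(i)
        if runs and i == runs[-1][1] + 1:
            runs[-1][1] = i
        else:
            runs.append([i, i])
    return [tuple(r) for r in runs]
```

The covariance here is estimated from all segments, outliers included, so a very large outlier inflates the variance along its own direction. This masking effect is one reason to keep `top_fraction` small.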
Detection Results • The game: River Plate vs. Los Andes • [Figure: detected events on a 0 to 128 sec timeline, with marks at 49.1, 55.0, 95.2 and 100.2 sec and labels Start, Attacking.., Foul!, Penalty kick, GOAL!] • Assumptions: • The majority are Unimportant • We do have Important parts! • Cluster analysis helps
Generalization • Segmentation tasks • Other Sports (baseball, tennis, etc.) • Film sound track (Sense and Sensibility) • Detection of sparse audio events • Surveillance video • [Figure: example track segmented into speech, music and silence]
Next step • More experiments • Improve decision scheme • Improve GMM in segmentation • Use cluster analysis in detection • New features • Wish list • Classification of speech segments • Other interesting noise patterns • Investigate sound mixtures
Summary • Segmentation • Use energy features • Best result: precision 95%, recall 92% • Event detection • Use feature distance • Interesting segments retrieved • More work to follow
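The slides do not say how the precision and recall figures were measured. As a reference for the definitions only, here is a minimal frame-level sketch. It assumes that predicted and reference labels are aligned frame by frame and that commentary is the positive class. The function name `precision_recall` is hypothetical.

```python
def precision_recall(predicted, reference, positive="commentary"):
    """Frame-level precision and recall for one class (assumed evaluation scheme)."""
    pairs = list(zip(predicted, reference))
    tp = sum(p == positive and r == positive for p, r in pairs)  # correct detections
    fp = sum(p == positive and r != positive for p, r in pairs)  # false alarms
    fn = sum(p != positive and r == positive for p, r in pairs)  # misses
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```

Precision is the fraction of frames labelled commentary that really are commentary. Recall is the fraction of true commentary frames that were found.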