AVICAR Progress Since April 2006
AVICAR: Recording Hardware • 8 mics, pre-amps, wooden baffle. Best place = sunvisor. • 4 cameras, glare shields, adjustable mounting. Best place = dashboard. • System is not permanently installed; mounting requires 10 minutes.
AVICAR: Data Summary • 100 talkers • 5 noise conditions: engine idling; 35 mph, windows closed / windows open; 55 mph, windows closed / windows open • 4 types of utterances: isolated digits; phone numbers; isolated letters (e-set = articulation test); TIMIT sentences • Publicly available (samples online)
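The condition grid described on this slide can be written out as a small sketch. The label strings and variable names below are illustrative assumptions, not the corpus's actual file-naming scheme, and the sketch only enumerates the design cells; it does not claim that every talker was recorded in every cell.

```python
from itertools import product

# Labels are hypothetical; they paraphrase the slide, not the corpus metadata.
NOISE_CONDITIONS = [
    "engine idling",
    "35 mph, windows closed",
    "35 mph, windows open",
    "55 mph, windows closed",
    "55 mph, windows open",
]
UTTERANCE_TYPES = [
    "isolated digits",
    "phone numbers",
    "isolated letters",
    "TIMIT sentences",
]
N_TALKERS = 100  # from the slide

# Every (noise condition, utterance type) pair in the design.
cells = list(product(NOISE_CONDITIONS, UTTERANCE_TYPES))

for noise, utterance in cells:
    print(f"{noise:24s} | {utterance}")
```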
Current Asynchronous AVSR (Chu and Huang, 2001) • Improves accuracy on the CMU office commands database (10 speakers, 78 words, 3 lip features) • Not yet applied to AVICAR (because the lip-tracking problem is not yet solved)
Audio Speech Recognition • Best Audio Enhancement: MVDR+MMSElogSA, Direction-Based VAD • Best Digit Recognition (99% average, 95% at 55mph with windows down): delay & sum beamforming, spectrum-based VAD, cross-training
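For readers unfamiliar with delay-and-sum beamforming, here is a minimal sketch of the textbook version, not the AVICAR implementation. It assumes the per-microphone arrival delays are already known and are whole numbers of samples; the function name `delay_and_sum` and the simulated 8-channel signal are assumptions made for illustration.

```python
import numpy as np

def delay_and_sum(signals, delays):
    """Align each channel by its arrival delay (in samples) and average.

    signals: array of shape (n_mics, n_samples)
    delays:  integer delay of the target at each mic, in samples
    Note: np.roll wraps around at the edges, which is acceptable for this
    toy example but not for real recordings.
    """
    signals = np.asarray(signals, dtype=float)
    out = np.zeros(signals.shape[1])
    for channel, d in zip(signals, delays):
        out += np.roll(channel, -int(d))  # advance the channel by its delay
    return out / signals.shape[0]

# Simulated example: one source, 8 mics, independent noise on each mic.
rng = np.random.default_rng(0)
source = np.sin(2 * np.pi * 5 * np.arange(1000) / 1000)
delays = np.arange(8)
clean = np.stack([np.roll(source, d) for d in delays])
noisy = clean + rng.normal(scale=1.0, size=clean.shape)

enhanced = delay_and_sum(noisy, delays)
err_single = np.mean((noisy[0] - source) ** 2)  # error of one raw microphone
err_beam = np.mean((enhanced - source) ** 2)    # error after beamforming
print(f"single mic MSE: {err_single:.3f}, beamformed MSE: {err_beam:.3f}")
```

The target adds coherently after alignment while the independent noise averages down, which is why the beamformed error is smaller than that of any single microphone in this simulation.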
Digression: AdaBoost • Simple features (rectangles; 4-8 additions to compute each feature). About 480,000 different features computed in training. Boosting chooses features, one at a time, to minimize total error.
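The two ideas on this slide can be sketched as follows: rectangle sums computed from an integral image with a handful of additions, and boosting that picks one feature at a time to minimize weighted error. This is a generic, simplified sketch of AdaBoost with threshold stumps, not the face tracker's code; all function names and the toy data are assumptions.

```python
import numpy as np

def integral_image(img):
    """Cumulative-sum table padded with a zero row and column."""
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1))
    ii[1:, 1:] = np.cumsum(np.cumsum(img, axis=0), axis=1)
    return ii

def rect_sum(ii, top, left, height, width):
    """Pixel sum of a rectangle from 4 table lookups."""
    b, r = top + height, left + width
    return ii[b, r] - ii[top, r] - ii[b, left] + ii[top, left]

def stump_predict(F, j, thr, pol):
    """Classify by thresholding feature j; pol flips the direction."""
    return np.where(pol * (F[:, j] - thr) >= 0, 1, -1)

def best_stump(F, y, w):
    """Search all features and thresholds for the lowest weighted error."""
    best = (np.inf, 0, 0.0, 1)
    for j in range(F.shape[1]):
        for thr in np.unique(F[:, j]):
            for pol in (1, -1):
                err = w[stump_predict(F, j, thr, pol) != y].sum()
                if err < best[0]:
                    best = (err, j, thr, pol)
    return best

def adaboost(F, y, n_rounds):
    """F: (n_samples, n_features) feature values; y: labels in {-1, +1}."""
    w = np.full(len(y), 1.0 / len(y))
    stumps = []
    for _ in range(n_rounds):
        err, j, thr, pol = best_stump(F, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)  # avoid log(0)
        alpha = 0.5 * np.log((1 - err) / err)
        # Raise the weight of misclassified samples, lower the rest.
        w = w * np.exp(-alpha * y * stump_predict(F, j, thr, pol))
        w /= w.sum()
        stumps.append((alpha, j, thr, pol))
    return stumps

def predict(stumps, F):
    score = sum(a * stump_predict(F, j, t, p) for a, j, t, p in stumps)
    return np.where(score >= 0, 1, -1)

# Rectangle sum on a tiny image.
img = np.arange(16, dtype=float).reshape(4, 4)
ii = integral_image(img)
center_sum = rect_sum(ii, 1, 1, 2, 2)  # same as img[1:3, 1:3].sum()

# Toy boosting problem: only feature 1 carries the label.
x = np.linspace(0, 1, 40)
F = np.column_stack([np.sin(37 * x), x, np.cos(23 * x)])
y = np.where(x >= 0.5, 1, -1)
stumps = adaboost(F, y, n_rounds=2)
print("selected features:", [j for _, j, _, _ in stumps])
```

In a real detector the columns of `F` would be rectangle-feature values computed with `rect_sum` on training windows, and the exhaustive threshold search shown here would be replaced by a faster sorted scan.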
Face Tracking • Face Tracker (AdaBoost): Works perfectly for some faces, not all… • Manual face & lip segmentation: 480 faces (120 frames from 24 talkers) • Currently retraining AdaBoost with new segmentations
AVICAR: Lip Tracking • Lip Tracker (SVM): Seems to work perfectly (so far) once it’s been initialized • What’s missing: the trackers do not yet exploit the fact that face & lip positions & sizes in AVICAR are highly predictable!