
AVICAR Progress Since April 2006


walda

Presentation Transcript


  1. AVICAR Progress Since April 2006

  2. AVICAR: Recording Hardware • 8 mics, pre-amps, wooden baffle. Best place = sunvisor. • 4 cameras, glare shields, adjustable mounting. Best place = dashboard. • System is not permanently installed; mounting requires 10 minutes.

  3. AVICAR: Data Summary • 100 talkers • 5 noise conditions: engine idling; 35 mph with windows closed; 35 mph with windows open; 55 mph with windows closed; 55 mph with windows open • 4 types of utterances: isolated digits; phone numbers; isolated letters (e-set = articulation test); TIMIT sentences • Publicly available (samples online)

  4. Current Asynchronous AVSR (Chu and Huang, 2001) • Improves accuracy for the CMU office commands database (10 speakers, 78 words, 3 lip features) • Not yet applied to AVICAR (because the lip tracking problem is not yet solved)

  5. Current Asynchronous AVSR (Chu and Huang, 2001)

  6. Audio Speech Recognition • Best Audio Enhancement: MVDR+MMSElogSA, Direction-Based VAD • Best Digit Recognition (99% average, 95% at 55mph with windows down): delay & sum beamforming, spectrum-based VAD, cross-training
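
The delay & sum beamforming named on this slide can be illustrated with a minimal sketch. It assumes the arrival delay of the talker's speech at each microphone is already known and is a whole number of samples; the function name `delay_and_sum` and the `delays` argument are hypothetical and are not taken from the AVICAR software. Each channel is advanced by its delay so the speech lines up across microphones, then the channels are averaged, which reinforces the aligned speech relative to uncorrelated noise.

```python
def delay_and_sum(channels, delays):
    """Align each microphone channel by its known delay, then average.

    channels: list of equal-length sample lists, one per microphone.
    delays:   arrival delay of the talker at each microphone, in whole
              samples (assumed known; hypothetical input for this sketch).
    """
    n = len(channels[0])
    out_len = n - max(delays)  # samples available on every channel after alignment
    out = []
    for t in range(out_len):
        # Read each channel at its own delayed position so the speech lines up.
        aligned = [ch[t + d] for ch, d in zip(channels, delays)]
        out.append(sum(aligned) / len(channels))
    return out


# Toy usage: the same source reaches three microphones 0, 1, and 2 samples late.
source = [0, 1, 0, -1, 0, 1, 0, -1]
mics = [source + [0, 0], [0] + source + [0], [0, 0] + source]
enhanced = delay_and_sum(mics, [0, 1, 2])
```

With the 8-microphone array described earlier, `channels` would hold 8 lists. Real delays are rarely whole samples, so a practical system would estimate them and use fractional-delay filtering, which this sketch omits.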

  7. Digression: AdaBoost • Simple features (rectangles; 4-8 additions to compute each feature). About 480,000 different features computed in training. Boosting chooses features, one at a time, to minimize total error.
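
The two ideas on this slide can be sketched as follows: rectangle features computed from an integral image with a handful of additions, and boosting that picks one feature per round by weighted error. This is a simplified illustration, not the tracker's actual code. The names (`integral_image`, `rect_sum`, `two_rect_feature`, `best_stump`, `adaboost`, `predict`) are hypothetical, and the weak learner is assumed to be a single-threshold stump on one feature.

```python
import math


def integral_image(img):
    """Summed-area table with an extra zero row and column."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, x, y, w, h):
    """Pixel sum of a w-by-h rectangle at (x, y): 4 lookups, 3 add/subtracts."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def two_rect_feature(ii, x, y, w, h):
    """Left rectangle minus the equally sized rectangle to its right."""
    return rect_sum(ii, x, y, w, h) - rect_sum(ii, x + w, y, w, h)


def best_stump(values, labels, weights):
    """Threshold and polarity with the lowest weighted error for one feature."""
    best = (float("inf"), 0, 1)
    for thr in sorted(set(values)):
        for pol in (1, -1):
            err = sum(w for v, y, w in zip(values, labels, weights)
                      if (1 if pol * v >= pol * thr else -1) != y)
            if err < best[0]:
                best = (err, thr, pol)
    return best


def adaboost(features, labels, rounds):
    """features[j][i] is feature j on example i; labels are +1 or -1."""
    n = len(labels)
    weights = [1.0 / n] * n
    chosen = []
    for _ in range(rounds):
        # Choose the single feature whose best stump has the lowest weighted error.
        err, thr, pol, j = min(best_stump(f, labels, weights) + (idx,)
                               for idx, f in enumerate(features))
        err = min(max(err, 1e-10), 1 - 1e-10)  # keep the logarithm finite
        alpha = 0.5 * math.log((1 - err) / err)
        chosen.append((j, thr, pol, alpha))
        # Re-weight so that misclassified examples count more in the next round.
        preds = [1 if pol * v >= pol * thr else -1 for v in features[j]]
        weights = [w * math.exp(-alpha * y * p)
                   for w, y, p in zip(weights, labels, preds)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return chosen


def predict(chosen, feature_values):
    """Sign of the alpha-weighted vote of the chosen stumps."""
    score = sum(alpha * (1 if pol * feature_values[j] >= pol * thr else -1)
                for j, thr, pol, alpha in chosen)
    return 1 if score >= 0 else -1
```

In a detector of this kind, `features` would hold one row per candidate rectangle feature (the slide mentions about 480,000), each evaluated on every training window. The exhaustive threshold search here is quadratic in the number of examples and is written for clarity rather than speed.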

  8. Face Tracking • Face Tracker (AdaBoost): Works perfectly for some faces, not all… • Manual face & lip segmentation: 480 faces (120 frames from 24 talkers) • Currently retraining AdaBoost with new segmentations

  9. AVICAR: Lip Tracking • Lip Tracker (SVM): Seems to work perfectly (so far) once it’s been initialized • What’s missing: Face & lip positions & sizes in AVICAR are highly predictable!!!
