
Harmony in Motion


Presentation Transcript


  1. Harmony in Motion ♫ Zohar Barzelay, Yoav Y. Schechner. Dept. Elect. Eng., Technion – Israel Institute of Technology. Ack: Einav Namer, Yael Waissman, ISF

  2. ♫ “Harmony in Motion” ♫ Barzelay, Schechner. Violin-guitar: raw

  3. ♫ “Harmony in Motion” ♫ Barzelay, Schechner. Violin: Detected and Recovered

  4. ♫ “Harmony in Motion” ♫ Barzelay, Schechner. Guitar: Detected and Recovered

  5. Video features: track all, then find the best. Barzelay & Schechner, Harmony in Motion

  6. Finding an Audio-Visual Object (AVO)

  7. Spatial matching: corresponding images? Features are spatial edges. There are always unmatched features. A good image match has many “coincidences”.

  8. Spatial matching: feature-based; feature = significant change in space (edge, corner); maximize coincidences; no need to match everything. Audio-visual matching: feature-based; feature = significant change in time (temporal edge); maximize coincidences; no need to match everything.

  9. Feature-based Cross-Modal Matching

  10. Feature-based Cross-Modal Matching

  11. Feature-based Cross-Modal Matching. [Plot: acceleration vs. time in frames]

  12. Feature-based Cross-Modal Matching. [Plots: amplitude vs. t; binary (0/1) ‘Visual Onsets’ and ‘Audio Onsets’ vs. t]
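The step from a tracked feature to binary ‘Visual Onsets’ (slides 11 and 12) can be sketched roughly as below. This is only an illustration, not the authors' detector: the second-difference acceleration estimate, the `threshold` parameter (standing in for a significance level) and the `min_gap` parameter (standing in for pruning) are all assumptions introduced here.

```python
import math

def visual_onsets(track, threshold, min_gap=3):
    """Return a 0/1 list marking frames with a significant acceleration peak.

    track: list of (x, y) positions of one tracked feature, one per frame.
    threshold: hypothetical significance level on acceleration magnitude.
    min_gap: hypothetical pruning distance (in frames) between kept onsets.
    """
    n = len(track)
    accel = [0.0] * n
    for t in range(1, n - 1):
        # Second central difference approximates acceleration.
        ax = track[t + 1][0] - 2 * track[t][0] + track[t - 1][0]
        ay = track[t + 1][1] - 2 * track[t][1] + track[t - 1][1]
        accel[t] = math.hypot(ax, ay)

    onsets = [0] * n
    last = -min_gap
    for t in range(1, n - 1):
        is_peak = accel[t] >= accel[t - 1] and accel[t] > accel[t + 1]
        if is_peak and accel[t] > threshold and t - last >= min_gap:
            onsets[t] = 1
            last = t
    return onsets

# A feature moving at constant speed that stops abruptly at frame 4.
track = [(x, 0) for x in [0, 1, 2, 3, 4, 4, 4, 4, 4]]
print(visual_onsets(track, threshold=0.5))
```

A feature that moves smoothly produces no onsets; only the abrupt change in motion is marked.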

  13. Audio-Visual Coincidences

  14. Audio Pre-processing. [Figure: waveform (amplitude vs. t), transform F, spectrogram (energy over frequency vs. t)]

  15. Audio Onsets: the beginning of new sounds, i.e. a significant change in audio. [Figure: spectrogram (frequency vs. t) and its temporal derivative]
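The idea on slide 15, that audio onsets show up in the temporal derivative of the spectrogram, can be sketched as follows. Keeping only energy increases and summing them over frequency is a common onset-detection choice and is an assumption here, as is the `threshold` parameter; the slides do not give the exact detector.

```python
def audio_onsets(spectrogram, threshold):
    """Return a 0/1 list marking frames where new sound energy appears.

    spectrogram[t][f]: non-negative energy at frame t and frequency bin f.
    threshold: hypothetical significance level on the summed energy increase.
    """
    n = len(spectrogram)
    strength = [0.0] * n
    for t in range(1, n):
        # Temporal derivative per bin; keep increases only, since a new
        # sound adds energy while a decaying one removes it.
        strength[t] = sum(
            max(cur - prev, 0.0)
            for cur, prev in zip(spectrogram[t], spectrogram[t - 1])
        )
    return [1 if s > threshold else 0 for s in strength]

# Two bins: a sound starts in bin 0 at frame 2, another in bin 1 at frame 4.
spec = [[0, 0], [0, 0], [5, 0], [4, 0], [4, 6], [3, 5]]
print(audio_onsets(spec, threshold=1.0))
```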

  16. Handling pitch-drift

  17. Handling pitch-drift. [Figure: directional-derivative spectrogram vs. non-directional-derivative spectrogram]

  18. Visual Matching. [Plots: binary (0/1) onsets vs. t]

  19. Visual Matching. [Plots: amplitude and binary (0/1) onsets vs. t]

  20. Ranking Criterion. [Plots: binary (0/1) onsets vs. t, with coincidences and inconsistencies marked]
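One way to read the ranking criterion on slide 20 is as a score that rewards coincidences and penalizes inconsistencies. The sketch below assumes that an inconsistency is a visual onset with no audio onset in the same frame, and that unmatched audio onsets are not penalized (in line with “no need to match everything”). The exact formula is not given on the slide, so treat this as an illustration only.

```python
def rank_score(audio_on, visual_on):
    """Score one visual feature against the audio onsets.

    Both inputs are 0/1 lists of equal length, one entry per frame.
    Assumed criterion: coincidences minus inconsistencies.
    """
    coincidences = sum(1 for a, v in zip(audio_on, visual_on) if a and v)
    inconsistencies = sum(1 for a, v in zip(audio_on, visual_on) if v and not a)
    return coincidences - inconsistencies

audio = [0, 1, 0, 0, 1, 0]
good_feature = [0, 1, 0, 0, 1, 0]   # both onsets coincide with audio
poor_feature = [1, 1, 0, 1, 0, 0]   # one coincidence, two inconsistencies
print(rank_score(audio, good_feature), rank_score(audio, poor_feature))
```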

  21. Residual Audio Onsets. [Plots: binary (0/1) onsets vs. t, with coincidences and residual onsets marked]

  22. Sequential Object Detection. [Plots: amplitude and binary (0/1) residual onsets vs. t]
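Slides 21 and 22 suggest a greedy loop: pick the best-matching visual feature, remove the audio onsets it explains, then repeat on the residual onsets. The sketch below follows that reading. The score (coincidences minus inconsistencies), the stopping rule, and the feature names are assumptions made for illustration.

```python
def score(audio_on, visual_on):
    # Assumed criterion: coincidences minus visual onsets with no audio onset.
    return sum((1 if a else -1) for a, v in zip(audio_on, visual_on) if v)

def detect_objects(audio_on, features, max_objects=2):
    """Greedily detect audio-visual objects.

    audio_on: 0/1 list of audio onsets.
    features: dict mapping a (hypothetical) feature name to its 0/1 onset list.
    Returns the detected feature names and the residual audio onsets.
    """
    residual = list(audio_on)
    remaining = dict(features)
    found = []
    for _ in range(max_objects):
        if not remaining or not any(residual):
            break
        best = max(remaining, key=lambda k: score(residual, remaining[k]))
        if score(residual, remaining[best]) <= 0:
            break  # nothing left that matches the residual audio
        visual_on = remaining.pop(best)
        found.append(best)
        # Residual onsets: audio onsets not explained by the chosen feature.
        residual = [0 if v else a for a, v in zip(residual, visual_on)]
    return found, residual

audio = [1, 0, 1, 0, 1, 1, 0, 1]
features = {
    "violin": [1, 0, 1, 0, 0, 0, 0, 1],
    "guitar": [0, 0, 0, 0, 1, 1, 0, 0],
    "background": [0, 1, 0, 1, 0, 0, 1, 0],
}
print(detect_objects(audio, features, max_objects=3))
```

In this toy input the first pass explains three onsets, the second pass explains the remaining two, and the loop stops once no residual onsets are left.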

  23. ♫ “Harmony in Motion” ♫ Barzelay, Schechner. Speech: raw

  24. ♫ “Harmony in Motion” ♫ Barzelay, Schechner. Speech A-B-C: Detected & Recovered

  25. ♫ “Harmony in Motion” ♫ Barzelay, Schechner. Speech 1-2-3: Detected & Recovered

  26. Audio Isolation

  27. Audio Pre-processing. [Figure: waveform (amplitude vs. t), transform F, spectrogram (energy over frequency vs. t)]

  28. Audio Isolation. [Figure: spectrogram (frequency vs. t) with corresponding onsets]

  29. Audio Isolation: Harmonic Sounds. [Figure: spectrogram (frequency vs. t) with corresponding onsets]

  30. Fourier representation. [Figure: waveform (amplitude vs. t), transform F, spectrogram (energy over frequency vs. t) and phase]

  31. Filtered audio. [Figure: filtered spectrogram (energy over frequency vs. t) combined with the old phase, inverse transform F⁻¹, waveform (amplitude vs. t)]
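Slide 31 indicates that the filtered spectrogram amplitude is combined with the old phase before transforming back to a waveform. A minimal single-frame sketch of that step is shown below. The binary mask over frequency bins (`keep_bins`) is an assumption, and a real system would work on overlapping windowed frames rather than one frame with a plain DFT.

```python
import cmath
import math

def dft(x):
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * math.pi * f * k / n) for k in range(n))
            for f in range(n)]

def idft(spectrum):
    n = len(spectrum)
    return [sum(spectrum[f] * cmath.exp(2j * math.pi * f * k / n)
                for f in range(n)).real / n
            for k in range(n)]

def filter_frame(frame, keep_bins):
    """Zero the amplitude outside keep_bins and reuse the original phase."""
    n = len(frame)
    filtered = []
    for f, coeff in enumerate(dft(frame)):
        amplitude, old_phase = abs(coeff), cmath.phase(coeff)
        # Keep bin f and its mirror n - f so the output stays real-valued.
        keep = f in keep_bins or (n - f) % n in keep_bins
        filtered.append(cmath.rect(amplitude if keep else 0.0, old_phase))
    return idft(filtered)

# A frame mixing two tones (bins 2 and 5); keep only the tone in bin 2.
n = 16
frame = [math.cos(2 * math.pi * 2 * k / n) + math.cos(2 * math.pi * 5 * k / n)
         for k in range(n)]
isolated = filter_frame(frame, keep_bins={2})
print([round(s, 3) for s in isolated])
```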

  32. Limitations: Temporal Tolerance (¼ sec). [Plots: binary (0/1) onsets vs. t]

  33. Limitations: Audio Sparsity. Overlapping audio onsets: • Sounds may overlap in time • Onsets should not. [Figure: time-frequency overlap (frequency vs. t)]

  34. Detection Parameters. Visual edges, feature detection: • edge scale • significance level • pruning. [Plot: acceleration vs. time]

  35. ♫ “Harmony in Motion” ♫ Barzelay, Schechner. Dual Violin

  36. ♫ “Harmony in Motion” ♫ Barzelay, Schechner

  37. ♫ “Harmony in Motion” ♫ Barzelay, Schechner

  38. Feature-based Cross-Modal Association • Features: Temporal Audio/Visual Edges. • Simultaneous Objects + Sounds. • A General Concept.
