
Finding Duplicates and Generating Audio Thumbs




Chris Burges, Erin Renshaw, Dan Plastina†, John Platt and Rico Malvar
Communication, Collaboration and Signal Processing Group
†Windows Digital Media Division

Problem Statement
• In my music collection:
  • Find duplicate files
  • Find junk audio files
  • Generate 15-second audio thumbnails for browsing
• All automatically!

Finding Duplicates
1. Load audio file
2. Compute traces in fixed, user-chosen window
3. Compare against fingerprints (FPs) from previous files
4. If a match, declare a duplicate
5. Else, save FP for that file
6. Repeat from (1) 'til done

How Does It Work?
• Use RARE fingerprints (FPs) = 64 numbers encoding 6 sec of audio
• Duplicate Detection
  • Find files with similar FPs, or FPs that match 'junk'
• Audio Thumbnails
  • Generate FPs for entire file
  • Find repeated FPs within file
  • Form clumps of repeated FPs
  • Take well-separated, high-energy clump as thumbnail

Generating Thumbnails
Figure: 'Spectral fullness' for a 5-verse Dylan song.

Results
• Results on 40,991 songs: (figure not preserved in this transcript)
• Using 6 quality measures (e.g. 'Includes sung title?'), blind testing against thumbs chosen 30 seconds into the song gave 28% improvement.
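The Finding Duplicates steps can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the poster does not say how RARE fingerprints are computed or how similarity is scored, so the `compute_traces` callable, the Euclidean distance, the `threshold` parameter, and the choice to store the first trace as a file's fingerprint are all assumptions made for this example.

```python
import math
from typing import Callable, Dict, Iterable, List, Sequence

# The poster describes a fingerprint as 64 numbers encoding 6 sec of audio.
Fingerprint = Sequence[float]


def find_duplicates(
    paths: Iterable[str],
    compute_traces: Callable[[str], List[Fingerprint]],  # hypothetical helper
    threshold: float,  # assumed match threshold, not from the poster
) -> Dict[str, str]:
    """Map each duplicate file to the earlier file it matched."""
    stored: Dict[str, Fingerprint] = {}   # one saved FP per non-duplicate file
    duplicates: Dict[str, str] = {}

    for path in paths:                    # steps 1 and 6: load, then repeat
        traces = compute_traces(path)     # step 2: traces in the chosen window
        if not traces:
            continue                      # nothing to compare or save

        # Step 3: compare every trace against FPs saved from previous files.
        # Euclidean distance is an assumption standing in for the real score.
        match = None
        for known_path, fp in stored.items():
            if any(math.dist(trace, fp) <= threshold for trace in traces):
                match = known_path
                break

        if match is not None:
            duplicates[path] = match      # step 4: declare a duplicate
        else:
            stored[path] = traces[0]      # step 5: save an FP (assumed: first)

    return duplicates


# Usage with made-up fingerprints in place of real audio analysis.
fake_traces = {
    "a.mp3": [[0.0] * 64],
    "b.mp3": [[0.01] * 64],   # nearly identical to a.mp3
    "c.mp3": [[5.0] * 64],    # clearly different
}
result = find_duplicates(
    ["a.mp3", "b.mp3", "c.mp3"], fake_traces.__getitem__, threshold=1.0
)
print(result)
```

Comparing several traces from a window against a single stored fingerprint lets a match succeed even when two copies of a song are slightly offset in time, which is presumably why the poster computes traces over a window rather than at one fixed position.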
