
“Reinventing the wheel”: A novel approach to music player interfaces

“Reinventing the wheel”: A novel approach to music player interfaces. Tim Pohle, Peter Knees, Markus Schedl, Elias Pampalk, and Gerhard Widmer. IEEE Transactions on Multimedia, Vol. 9, No. 3, April 2007. Presented by Yi-Tang Wang.


Presentation Transcript


  1. “Reinventing the wheel”: A novel approach to music player interfaces Tim Pohle, Peter Knees, Markus Schedl, Elias Pampalk, and Gerhard Widmer IEEE Transactions on Multimedia, Vol. 9, No. 3, April 2007 Presented by Yi-Tang Wang

  2. Outline • Introduction • Audio-Based Similarity • Web-Based Similarity • Problem Modeling • Evaluation and Results • Conclusion & future work

  3. Introduction • A novel music player interface using a wheel • Generating a circular playlist from personal repositories • Keeps on playing similar tracks • Not only audio-based similarity is used, but also text-based similarity

  4. Audio-Based Similarity • MFCCs ( Mel frequency cepstral coefficients ) • Discarding the higher-order MFCCs • beneficial for the ability to compare different frames, but possibly at the cost of discarding musically meaningful information.

  5. Audio-Based Similarity • The wave files were downsampled to 22 kHz • 19 MFCCs per frame • The temporal order of the frames is ignored • The distribution of MFCC coefficients is modeled with a Gaussian mixture model (GMM)

  6. Audio-Based Similarity • Similarity between pieces of music • Compute the distance between two GMMs • Likelihood • Compute the probability that the MFCCs of song A were generated by the model of song B • Drawback: all MFCC coefficients need to be stored

  7. Audio-Based Similarity • Sampling • Only store the GMM parameters, instead of storing MFCCs • Sample from one GMM • compute the likelihood given another GMM • Corresponds roughly to re-creating a song
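The sampling idea above can be sketched in a few lines. This is only an illustration under stated assumptions: the GMMs use diagonal covariances, each model is a plain list of `(weight, means, variances)` tuples, and the symmetric combination of the four log-likelihood terms is one common way to turn the likelihoods into a distance; the slides do not say which exact formula was used. The names `sample_gmm`, `log_likelihood`, and `sampled_distance` are hypothetical.

```python
import math
import random

# Assumption: a GMM is a list of (weight, means, variances) tuples
# with diagonal covariance; this layout is illustrative only.

def sample_gmm(gmm, n, rng):
    """Draw n points from the mixture: pick a component, then sample it."""
    weights = [w for w, _, _ in gmm]
    samples = []
    for _ in range(n):
        _, means, variances = rng.choices(gmm, weights=weights, k=1)[0]
        samples.append([rng.gauss(m, math.sqrt(v))
                        for m, v in zip(means, variances)])
    return samples

def log_likelihood(x, gmm):
    """log p(x | gmm), computed with log-sum-exp for numerical stability."""
    terms = []
    for w, means, variances in gmm:
        lp = math.log(w)
        for xi, m, v in zip(x, means, variances):
            lp += -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        terms.append(lp)
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))

def mean_log_likelihood(samples, gmm):
    return sum(log_likelihood(x, gmm) for x in samples) / len(samples)

def sampled_distance(gmm_a, gmm_b, n=500, seed=0):
    """Symmetric distance estimate from samples (assumed formula)."""
    rng = random.Random(seed)
    sa = sample_gmm(gmm_a, n, rng)  # "re-created" song A
    sb = sample_gmm(gmm_b, n, rng)  # "re-created" song B
    return (mean_log_likelihood(sa, gmm_a) + mean_log_likelihood(sb, gmm_b)
            - mean_log_likelihood(sa, gmm_b) - mean_log_likelihood(sb, gmm_a))
```

Only the GMM parameters are needed at comparison time, which is the storage saving the slide refers to.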

  8. Web-Based Similarity • Cultural, social, historical, and contextual aspects should be taken into account • WWW information • Query Google with the artist’s name + ”music” • The 50 top-ranked pages are retrieved • Remove all terms that occur on fewer than c pages • c is chosen such that about 10000 terms remain

  9. Web-Based Similarity • Term frequency tf_{ta} • a: artist, t: term • number of occurrences of t in documents related to a • Document frequency df_t • number of pages in which t occurs • Term weight per artist • term frequency × inverse document frequency
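A minimal sketch of the term weighting described above. The slide only states "term frequency × inverse document frequency"; the logarithmic variant used here, (1 + log2 tf) · log2(N / df), is an assumption, as are the function name `tfidf_weights` and the input layout (one list of tokens per retrieved page).

```python
import math

def tfidf_weights(pages_by_artist):
    """pages_by_artist: {artist: [page, ...]}, each page a list of terms."""
    all_pages = [p for pages in pages_by_artist.values() for p in pages]
    n_pages = len(all_pages)

    # Document frequency df_t: number of pages in which term t occurs.
    df = {}
    for page in all_pages:
        for term in set(page):
            df[term] = df.get(term, 0) + 1

    weights = {}
    for artist, pages in pages_by_artist.items():
        # Term frequency tf_ta: occurrences of t in all pages of artist a.
        tf = {}
        for page in pages:
            for term in page:
                tf[term] = tf.get(term, 0) + 1
        # Assumed weighting: (1 + log2 tf) * log2(N / df).
        weights[artist] = {
            t: (1 + math.log2(c)) * math.log2(n_pages / df[t])
            for t, c in tf.items()
        }
    return weights
```

Terms that appear on every page get weight 0 under this variant, which matches the intuition that they say nothing about an individual artist.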

  10. Web-Based Similarity • Each artist is described by a vector of term weights • Cosine normalization is applied to the vector • Euclidean distance between the normalized vectors is a simple similarity measure • In the paper, a SOM is used to derive the similarity instead
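For illustration, cosine normalization and the Euclidean distance between term-weight vectors can be written as below. The function names are hypothetical, and the vectors are assumed to be plain lists of equal length.

```python
import math

def cosine_normalize(vec):
    """Scale a vector to unit length; an all-zero vector is returned as is."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)

def euclidean(u, v):
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))
```

After normalization the squared Euclidean distance equals 2 − 2·cos(u, v), so it depends only on the angle between the two artists' vectors and not on how much text was retrieved for each.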

  11. Web-Based Similarity - SOM • SOM: self-organizing map • a subtype of artificial neural networks • trained using unsupervised learning • produces a low-dimensional representation of the training samples while preserving the topological properties of the input space • A rectangular 2-D grid is used in this paper for text-based similarity between songs

  12. Web-Based Similarity - SOM • A SOM consists of units • A model vector in the high-dimensional input data space is assigned to each of the units • Model vectors that belong to units close to each other on the 2-D grid are also close to each other in the data space • Training chooses the model vectors

  13. Web-Based Similarity - SOM • Batch-SOM algorithm • Initialization • Randomly initialize the model vectors • 1st step • for each data item x_i, the Euclidean distance between x_i and each model vector is calculated • each data item x_i is assigned to the unit c_i that represents it best

  14. Web-Based Similarity - SOM • 2nd step • the neighborhood relationship between two units j and k is usually defined by a Gaussian-like function • h_{jk} = exp(-d_{jk}^2 / r_t^2) • d_{jk} = distance between the units on the map, r_t = neighborhood radius • r_t decreases with each iteration (the adaptation strength decreases gradually)
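The two Batch-SOM steps can be sketched as follows. The slides give the assignment step and the neighborhood function h_{jk} = exp(-d_{jk}^2 / r_t^2) but not the update rule; the sketch assumes the usual batch update, in which each model vector becomes the neighborhood-weighted mean of all data items. The linear radius schedule, the initialization from random data items, and all names and default values are assumptions as well.

```python
import math
import random

def batch_som(data, rows, cols, iterations=20,
              r_start=2.0, r_end=0.5, seed=0):
    """Train a rows x cols SOM; returns (model vectors, unit per data item)."""
    rng = random.Random(seed)
    dim = len(data[0])
    units = [(r, c) for r in range(rows) for c in range(cols)]
    # Assumed initialization: copy randomly chosen data items.
    models = [list(rng.choice(data)) for _ in units]

    def best_unit(item):
        # Step 1: index of the unit whose model vector is closest to item.
        return min(range(len(units)),
                   key=lambda k: sum((x - m) ** 2
                                     for x, m in zip(item, models[k])))

    for t in range(iterations):
        # Neighborhood radius r_t shrinks linearly (assumed schedule).
        radius = r_start + (r_end - r_start) * t / max(1, iterations - 1)
        bmu = [best_unit(item) for item in data]

        # Step 2 (assumed): each model vector becomes the mean of all data
        # items, weighted by h_jk = exp(-d_jk^2 / r_t^2).
        new_models = []
        for rj, cj in units:
            num = [0.0] * dim
            den = 0.0
            for item, k in zip(data, bmu):
                rk, ck = units[k]
                d2 = (rj - rk) ** 2 + (cj - ck) ** 2
                h = math.exp(-d2 / radius ** 2)
                den += h
                for i in range(dim):
                    num[i] += h * item[i]
            new_models.append([v / den for v in num])
        models = new_models

    return models, [units[best_unit(item)] for item in data]
```

The second return value gives the grid position of each data item, which is what a "same or adjacent unit" comparison between artists would operate on.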

  15. Web-Based Similarity - SOM • Two artists are considered similar if they are mapped to the same or adjacent units • Newer experiments have shown that a 6 × 6 grid might be better for this collection

  16. Combining the two approaches • A constant value is added to the audio-based distance matrix for all songs of dissimilar artists • The constant is half of the maximum audio-based distance • This adds a penalty to transitions between songs by dissimilar artists
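A sketch of this combination step. The predicate `similar` is a hypothetical stand-in for the SOM-based test (artists mapped to the same or adjacent units); the function name and the list-of-lists matrix layout are assumptions too.

```python
def combine_distances(audio_dist, artists, similar):
    """Add half the maximum audio distance between tracks of dissimilar artists.

    audio_dist: square matrix (list of lists) of audio-based distances.
    artists:    artist of each track, in matrix order.
    similar:    hypothetical predicate similar(a, b) -> bool.
    """
    n = len(audio_dist)
    penalty = 0.5 * max(max(row) for row in audio_dist)
    combined = [row[:] for row in audio_dist]  # leave the input untouched
    for i in range(n):
        for j in range(n):
            if i != j and not similar(artists[i], artists[j]):
                combined[i][j] += penalty
    return combined
```

For example, with distances `[[0, 1, 4], [1, 0, 2], [4, 2, 0]]`, artists `["x", "x", "y"]`, and `similar = lambda a, b: a == b`, the penalty is 2, so the entries between the third track and the other two grow to 6 and 4 while the distance between the first two tracks stays 1.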

  17. Previous work • P. Knees, M. Schedl, T. Pohle and G.Widmer, “An Innovative Three Dimensional User Interface for Exploring Music Collections Enriched with Meta-Information from the Web,” ACM MM’06 • Audio-based similarity – Fluctuation Patterns • Using SOM only on audio-based data • Labeling SOM with information from www • A 3-D browsing system

  18. Problem Modeling • Map the playlist generation problem to the Traveling Salesman Problem (TSP) • The cities correspond to the tracks in the collection • The distances are determined by the similarities between the tracks • Finding an optimal route = producing a circular playlist

  19. TSP Problem • Greedy algorithm • All edges are examined in order of increasing length and added to the route if they keep it a valid tour fragment • Minimum spanning tree (MinSpan) • Find a minimum spanning tree and traverse it depth-first • Connect the nodes in the order in which they are first visited • LKH • Lin-Kernighan algorithm proposed in 1971 • Start with a randomly generated tour • Delete edges from the route and recombine the remaining tour fragments
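Of the three heuristics, the minimum-spanning-tree variant is the shortest to write down. The sketch below builds the tree with Prim's algorithm and reads the tour off a depth-first preorder traversal; the choice of Prim, the start node 0, and the function names are assumptions, since the slides do not specify how the tree is built.

```python
def minspan_tour(dist):
    """Circular tour from an MST: visit nodes in depth-first preorder."""
    n = len(dist)
    in_tree = [False] * n
    best = [float("inf")] * n   # cheapest known edge into the tree
    parent = [-1] * n
    children = [[] for _ in range(n)]
    best[0] = 0.0

    # Prim's algorithm: repeatedly attach the closest node outside the tree.
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]),
                key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            children[parent[u]].append(u)
        for v in range(n):
            if not in_tree[v] and dist[u][v] < best[v]:
                best[v] = dist[u][v]
                parent[v] = u

    # Depth-first traversal; nodes are listed when first visited.
    tour, stack = [], [0]
    while stack:
        u = stack.pop()
        tour.append(u)
        stack.extend(reversed(children[u]))
    return tour

def tour_length(dist, tour):
    """Length of the closed tour, including the edge back to the start."""
    return sum(dist[tour[i]][tour[(i + 1) % len(tour)]]
               for i in range(len(tour)))
```

The returned list is the playlist order; because the playlist is circular, `tour_length` also counts the jump from the last track back to the first.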

  20. TSP Problem • One-dimensional SOM • Train a 1-D cyclic SOM • The unit order gives a circular playlist • As many units as tracks? • Recursive approach • Combining subtours in a greedy manner

  21. Evaluation & Results • Collection 1 • 2545 tracks, 13 genres • A Cappella (4.4%), Acid Jazz (2.7%), Blues (2.5%), Bossa Nova (2.8%), Celtic (5.2%), Electronica (21.1%), Folk Rock (9.4%), Italian (5.6%), Jazz (5.3%), Metal (16.1%), Punk Rock (10.2%), Rap (12.9%), and Reggae (1.8%) • 103 artists • for each artist, minimum - 8 tracks, maximum - 61 tracks

  22. Evaluation & Results • Collection 2 • 3456 tracks, 7 genres • Classical (14.7%), Dance (15.0%), Hip-Hop (14.5%), Jazz (13.6%), Metal (14.9%), Pop (11.6%), and Punk (15.6%) • 339 artists • for each artist, minimum 1 track, maximum 317 tracks

  23. Fluctuations Between Genres • A Cappella, Acid Jazz, Blues, Bossa Nova, Celtic, Electronica, Folk Rock, Italian, Jazz, Metal, Punk Rock, Rap, and Reggae (collection 1)

  24. Shannon Entropy • Estimate how locally coherent a playlist is • Count how many of n consecutive tracks belonged to each genre • n = 2…12 • Typical album contains about 12 tracks • Average over the whole playlist • SOM yields better results on web-enhanced data than LKH on audio only data
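The entropy measure above can be sketched as follows: for every window of n consecutive tracks, count the genres, compute the Shannon entropy of that distribution, and average over all window positions. Wrapping the window around the end of the list (because the playlist is circular) and using log base 2 are assumptions; the function names are hypothetical.

```python
import math
from collections import Counter

def window_entropy(genres):
    """Shannon entropy (in bits) of the genre distribution in one window."""
    counts = Counter(genres)
    n = len(genres)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def mean_local_entropy(playlist_genres, n):
    """Average entropy over all windows of n consecutive tracks."""
    total = len(playlist_genres)
    acc = 0.0
    for start in range(total):
        # Assumption: the playlist is circular, so windows wrap around.
        window = [playlist_genres[(start + k) % total] for k in range(n)]
        acc += window_entropy(window)
    return acc / total
```

Lower values mean more locally coherent playlists: with n = 2, the order a, a, b, b averages 0.5 bits, whereas the alternating order a, b, a, b averages 1 bit.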

  25. Shannon Entropy

  26. Long-Term Consistency • SOM algorithm on combined data

  27. Long-Term Consistency • MinSpan algorithm on audio similarity data

  28. Long-Term Consistency • Greedy algorithm on audio similarity data

  29. Long-Term Consistency

  30. User Study • 10 test persons using collection 2 • Create a large playlist • Extract 10 seed tracks • Randomly choose a start point • Select tracks at intervals of 3 degrees • Generate two playlists per seed track • One by adding the next nine tracks • One by randomly choosing tracks from the same genre

  31. User Study • Users rate each playlist from 1 to 5 • Sum up the rating scores • Calculate the difference tsp_{i,j} - gen_{i,j} • i: playlist number, j: user

  32. User Interface

  33. User Interface • The user interface is very intuitive and its handling extremely easy • Apple’s iPod • Users’ opinion • A scanning function to skip 10 seconds when pressing • Genres containing only a few tracks are quite difficult to locate • Not usable when finding a specific track

  34. Summary of Evaluation Result • all TSP algorithms provided better results with respect to our playlist evaluation criteria when using the web based extension • the combined similarity measure reduces the number of unexpected placements of tracks in the playlist

  35. Summary of Evaluation Result • LKH and greedy algorithm • best small-scale genre entropy values • large-scale genre distributions are quite fragmented • SOM-based algorithm • highest entropy values • the least fragmented long-term genre distributions • MinSpan algorithm • in the middle field regarding the entropy values

  36. Conclusion & future work • a new approach to conveniently access the music stored in mobile sound players • The whole collection is ordered in a circular playlist and thus accessible with only one input wheel • two different similarity measures — one relying on timbre information, the other on a combination of timbre and community metadata gathered from artist related web pages

  37. Conclusion & future work • Problems to solve • Not possible to precisely select a desired piece • only tracks selectable that are representative for a region • zooming or hierarchical structuring techniques • The user does not know in advance which region on the wheel contains which style of music

  38. Conclusion & future work • M. Schedl, T. Pohle, P. Knees, and G.Widmer, “Assigning and visualizing music genres by web-based co-occurrence analysis,” in Proc. 7th Int. Conf. Music Information Retrieval (ISMIR’06), Victoria, Canada, Oct. 2006.

  39. Thank You
