Song Genre and Artist Classification via Supervised Learning from Lyrics
Adam Sadovsky, Xing Chen
CS 224N Final Project
Introduction
• Goal: develop a classifier that predicts a song's genre and/or artist using only its lyrics
• Corpus: approximately fifteen popular albums per genre (rock, rap/hip-hop, and country), collected with the EvilLyrics program [1]
• Classifiers: Maxent and SVM, evaluated with cross-validation
• Part-of-Speech (POS) features: every word in the corpus is tagged using the Stanford Log-linear Part-Of-Speech Tagger [2]
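The cross-validation step mentioned above can be sketched roughly as follows. This is a minimal illustration, not the project's code: the function names are hypothetical, the folds are contiguous rather than shuffled, and a majority-class baseline stands in for the Maxent and SVM classifiers the slides actually use.

```python
from collections import Counter


def k_fold_indices(n_items, k):
    """Split range(n_items) into k contiguous folds of near-equal size."""
    sizes = [n_items // k + (1 if i < n_items % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cross_validate(examples, labels, k, train_fn):
    """Return mean held-out accuracy over k folds (assumes k <= len(examples)).

    train_fn(train_examples, train_labels) must return a predict(example) callable.
    """
    accuracies = []
    for held_out in k_fold_indices(len(examples), k):
        held = set(held_out)
        train_x = [x for i, x in enumerate(examples) if i not in held]
        train_y = [y for i, y in enumerate(labels) if i not in held]
        predict = train_fn(train_x, train_y)
        correct = sum(predict(examples[i]) == labels[i] for i in held_out)
        accuracies.append(correct / len(held_out))
    return sum(accuracies) / k


def train_majority_baseline(train_x, train_y):
    """Stand-in learner (an assumption): always predicts the most frequent label."""
    most_common = Counter(train_y).most_common(1)[0][0]
    return lambda example: most_common
```

In practice the songs would be shuffled (or stratified by genre) before splitting, and `train_fn` would wrap a real classifier.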
Feature Selection
Look at differences between…
• Bag-of-Words: artist diction and content
• Word Endings: artist style
• Line Length: song pattern and rhythm
• Number of Lines: song length
• Repetition: style and rhythm
• Punctuation: writing style
• Part-of-Speech statistics: writing style

Example lyrics
Country: Brooks & Dunn - Again
ain't it funny, the turns life puts you through. don't know what's round the bend, man, you don't know where it's leadin' you. close your eyes, say a prayer, take it on the chin. it's a dawn sun, comes back again. baby, i thought that love was over and gone forever... never gonna come back to me. never gonna hold me again.
Rock: Pearl Jam - Come Back
If I keep holding out Will the light shine through? Under this broken roof It's only rain that I feel I've been wishin' out the days Oh oh oh Come back … Know that I still remain true I've been wishin' out the days Please say that if you hadn't have gone now I wouldn't have lost you another way From wherever you are Oh oh oh oh Come back
Sample POS tags: {…PRP VBP VBG …}
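One way the surface features listed above might be computed from raw lyrics is sketched below. The feature names, the three-character suffix length, and the definition of repetition (fraction of lines that duplicate another line) are all assumptions made for illustration; the slides do not specify them. POS statistics are left out because they require an external tagger.

```python
import re
import string
from collections import Counter


def extract_features(lyrics, suffix_len=3):
    """Map one song's lyrics (newline-separated lines) to a feature dict."""
    lines = [line.strip() for line in lyrics.splitlines() if line.strip()]
    words = re.findall(r"[a-z']+", lyrics.lower())
    features = Counter()

    for word in words:
        # Bag-of-words: one count per distinct word.
        features["word=" + word] += 1
        # Word endings: last few characters, e.g. "in'" from "singin'".
        features["ending=" + word[-suffix_len:]] += 1

    # Number of lines and average line length (in words).
    features["num_lines"] = len(lines)
    features["avg_line_len"] = len(words) / len(lines) if lines else 0.0

    # Repetition: fraction of lines that duplicate another line.
    distinct = len({line.lower() for line in lines})
    features["repeated_line_frac"] = 1 - distinct / len(lines) if lines else 0.0

    # Punctuation: count of each punctuation character.
    for ch in lyrics:
        if ch in string.punctuation:
            features["punct=" + ch] += 1
    return dict(features)


# Made-up sample text, not taken from any song in the corpus.
sample = "la la la\nla la la\nkeep on singin', friend"
sample_features = extract_features(sample)
```

Keeping every feature in one flat dictionary with prefixed keys makes it easy to ablate a feature family by dropping all keys that share a prefix.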
Genre Classification
Attempt to distinguish between rap, rock, and country

Performance: SVM Confusion Matrix {country, rap, rock}
   a    b    c   <-- classified as
 157    4   32 |  a = country
   6  152    3 |  b = rap
  46    8  119 |  c = rock

Feature Performance
Best Alone (Maxent / SVM)
• Bag-of-words (75% / 72%)
• Word endings (73% / 72%)
• POS tags (61% / 61%)
Most Significant Ablations
• Bag-of-Words (3% / 4%)
• Word endings (3% / 3%)
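The SVM confusion matrix can be summarized numerically with a short sketch like the one below. It assumes the Weka-style layout the matrix appears to follow, in which each row is the true genre and each column is the predicted genre; the function names are illustrative only.

```python
LABELS = ["country", "rap", "rock"]

# Rows: true genre; columns: predicted genre (assumed Weka-style layout).
CONFUSION = [
    [157, 4, 32],   # country
    [6, 152, 3],    # rap
    [46, 8, 119],   # rock
]


def accuracy(matrix):
    """Fraction of all songs that fall on the diagonal."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def per_class_recall(matrix, labels):
    """For each true genre, the share of its songs labeled correctly."""
    return {label: matrix[i][i] / sum(matrix[i]) for i, label in enumerate(labels)}


def per_class_precision(matrix, labels):
    """For each predicted genre, the share of predictions that were correct."""
    return {
        label: matrix[i][i] / sum(row[i] for row in matrix)
        for i, label in enumerate(labels)
    }
```

Under that row/column assumption, this matrix gives an overall accuracy of 428/527 (about 81.2%). Rap is the easiest genre to recognize (152 of 161 songs), while rock is the hardest (119 of 173), with most rock errors being songs labeled as country (46).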
Classifying Artists
• Classifier might perform better when each group of lyrics is by the same artist
• Two new datasets: {beatles, u2, blink_182} (all rock) and {snoop_dogg, beatles, garth_brooks} (rap, rock, country)
• Results:
References
[1] http://www.evillabs.sk/evillyrics/
[2] http://nlp.stanford.edu/software/tagger.shtml