Multimodal Music Mood Classification using Audio and Lyrics

Title: Multimodal Music Mood Classification using Audio and Lyrics
Publication Type: Conference Paper
Year of Publication: 2008
Conference Name: International Conference on Machine Learning and Applications
Authors: Laurier, C., Grivolla, J., & Herrera, P.
Conference Start Date: 11/12/2008
Conference Location: San Diego, California (USA)
Keywords: classification, emotion, lyrics, mir, mood, Music

In this paper we present a study on music mood classification using audio and lyrics information. The mood of a song is expressed by means of musical features, but a relevant part also seems to be conveyed by the lyrics. We evaluate each factor independently and explore the possibility of combining both, using Natural Language Processing and Music Information Retrieval techniques. We show that standard distance-based methods and Latent Semantic Analysis are able to classify the lyrics significantly better than random, but their performance is still considerably lower than that of audio-based techniques. We then introduce a method based on differences between language models that achieves performance closer to that of audio-based classifiers. Moreover, integrating this method into a multimodal system (audio+text) improves overall performance. We demonstrate that lyrics and audio information are complementary and can be combined to improve a classification system. Indeed, Juslin [5] reported that 29% of people mentioned the lyrics as a factor in how music expresses emotions, which shows the relevance of studying the lyrics in this context.

Our focus is to study the complementarity of the lyrics and the audio information to automatically classify songs by mood. In this paper, we first present different approaches using audio and lyrics separately, and then propose a multimodal classification system integrating the two modalities.
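
The abstract does not give implementation details, so the following is only a rough sketch of the general idea, not the authors' method. It assumes that "differences between language models" can be approximated by ranking terms on the difference in document frequency between two mood classes, and that the multimodal step can be approximated by a weighted average of an audio score and a lyrics score. All function names, the toy lyrics, and the audio probability are hypothetical.

```python
from collections import Counter


def document_frequencies(docs):
    """Return the fraction of documents in which each term appears."""
    counts = Counter()
    for doc in docs:
        counts.update(set(doc.lower().split()))
    return {term: n / len(docs) for term, n in counts.items()}


def discriminative_terms(pos_docs, neg_docs, k=4):
    """Rank terms by the absolute difference in document frequency
    between the two classes (an assumed stand-in for the paper's method)."""
    pos_df = document_frequencies(pos_docs)
    neg_df = document_frequencies(neg_docs)
    vocab = set(pos_df) | set(neg_df)
    diff = {t: pos_df.get(t, 0.0) - neg_df.get(t, 0.0) for t in vocab}
    # Sort by largest absolute difference, breaking ties alphabetically.
    terms = sorted(vocab, key=lambda t: (-abs(diff[t]), t))[:k]
    return terms, diff


def lyrics_score(text, terms, diff):
    """Map a lyric to [0, 1]; 0.5 is neutral, higher favours the positive class."""
    words = set(text.lower().split())
    raw = sum(diff[t] for t in terms if t in words)
    max_abs = sum(abs(diff[t]) for t in terms) or 1.0
    return 0.5 + 0.5 * raw / max_abs


def fuse(audio_prob, lyrics_prob, audio_weight=0.5):
    """Late fusion (assumed): weighted average of the two modality scores."""
    return audio_weight * audio_prob + (1.0 - audio_weight) * lyrics_prob


# Toy, made-up lyrics for a "happy" versus "not happy" classifier.
happy = ["sunshine smile dance", "dance all night smile"]
not_happy = ["tears fall alone", "alone in the rain tears"]

terms, diff = discriminative_terms(happy, not_happy, k=4)
text_prob = lyrics_score("we dance and smile", terms, diff)
combined = fuse(audio_prob=0.6, lyrics_prob=text_prob)
```

In this sketch the audio probability is a placeholder for the output of any audio-based classifier; the point is only that a lyrics-derived score and an audio-derived score can be produced independently and then combined.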

