Semantic Annotation of Music Collections: A Computational Approach

Title: Semantic Annotation of Music Collections: A Computational Approach
Publication Type: PhD Thesis
Year of Publication: 2012
University: Universitat Pompeu Fabra
Authors: Sordo, M.
Advisor: Serra, X., & Celma, Ò.
Number of Pages: 239
Date Published: 12/2011
City: Barcelona (Spain)
Keywords: audio tag classification, folksonomies, music information retrieval, Music tagging, Semantic categorization, semantic space, Social music
Abstract: Music consumption has changed drastically in the last few years. With the arrival of digital music, the cost of production has dropped substantially. The expansion of the World Wide Web has helped to promote the exploration of much more music content. Online stores, such as iTunes or Amazon, own music collections in the order of millions of songs. Accessing these large collections effectively is still a big challenge. In this dissertation we focus on the problem of annotating music collections with semantic words, also called tags. The methods used in this dissertation build on techniques from the fields of information retrieval, machine learning, and signal processing. We propose an automatic music annotation algorithm that uses content-based audio similarity to propagate tags among songs. The algorithm is evaluated extensively using multiple music collections of varying size and data quality, including a large music collection of more than half a million songs, annotated with social tags derived from a music community. We assess the quality of our proposed algorithm by comparing it with several state-of-the-art approaches. We also discuss the importance of using evaluation measures that cover different dimensions: per-song and per-tag evaluation. Our proposal achieves state-of-the-art results and ranked high in the MIREX 2011 evaluation campaign. The results also show some limitations of automatic tagging, related to data inconsistencies, correlation of concepts, and the difficulty of capturing some personal tags with content information. This is more evident in music communities, where users can annotate songs with any free-text word. To tackle these issues, we present an in-depth study of the nature of music folksonomies. Specifically, we study whether tag annotations made by a large community (i.e. a folksonomy) correspond with a more controlled, structured vocabulary defined by experts in the music and psychology fields. Results reveal that some tags are clearly defined and understood both by the experts and by the wisdom of crowds, while it is difficult to reach a consensus on the meaning of other tags. Finally, we extend our previous work to a wide range of semantic concepts. We present a novel way to uncover facets implicit in social tagging, and classify the tags with respect to these semantic facets. The latter findings can help to understand the nature of social tags, and thus be beneficial for further improvement of semantic tagging of music. Our findings have significant implications for music information retrieval systems that assist users in exploring large music collections, digging for content they might like.