Active Learning for User-Tailored Refined Music Mood Detection

Title: Active Learning for User-Tailored Refined Music Mood Detection
Publication Type: Master Thesis
Year of Publication: 2011
Authors: Sarasúa, Á.
Preprint/postprint document: http://mtg.upf.edu/system/files/publications/Sarasua-Alvaro-Master-thesis-2011.pdf
Abstract: Mood detection is an area of increasing interest in Music Information Retrieval (MIR). This thesis, which builds on the work of Cyril Laurier and Perfecto Herrera at the Music Technology Group (MTG), addresses two interrelated problems: first, the need to expand the current mood tags (happy, sad, aggressive and relaxed) to more specific and complex emotions; second, the need for customization that these complex emotions require. For the first problem, four new mood classifiers are implemented: mysterious, humorous, sentimental and triumphant. New song collections are created and user-validated for this task. For the second problem (customization), the use of active learning techniques is explored. Active learning is based on the idea that a system can converge to its best performance more quickly if it can smartly select the songs with which it is trained. A state-of-the-art review of active learning techniques is presented, and uncertainty-based techniques are tested on both the new and the existing song collections. An analysis of variance (ANOVA) showed no significant improvement of active learning over standard full-set batch training. Active learning, though, would require processing fewer instances to achieve equivalent performance. As an additional experiment, a new happy collection containing classical music is created. Different cases are studied and compared: a system trained with the old collection (which does not contain classical music) and tested on the new one, and vice versa; a system trained on both collections combined and tested by 10-fold cross-validation; and so on. The results suggest that systems trained on certain genres do not generalize well to others.
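As a rough illustration of the uncertainty-based selection mentioned in the abstract, the sketch below picks the songs a binary mood classifier is least sure about, so that they can be labeled next. It is a minimal example under stated assumptions, not the method used in the thesis: the function name `least_confident`, the probability values, and the "closest to 0.5" criterion are all hypothetical choices made for illustration.

```python
def least_confident(probabilities, k=1):
    """Return the indices of the k items whose predicted probability
    for the positive class is closest to 0.5 (most uncertain)."""
    ranked = sorted(
        range(len(probabilities)),
        key=lambda i: abs(probabilities[i] - 0.5),
    )
    return ranked[:k]


# Hypothetical classifier outputs: P(song is "happy") for five unlabeled songs.
pool_probs = [0.95, 0.52, 0.10, 0.40, 0.71]

# Songs to send to the user for labeling in the next round.
query = least_confident(pool_probs, k=2)
print(query)
```

In an active learning loop, the selected songs would be labeled by the user, added to the training set, and the classifier retrained before the next selection round.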