Scalability, generality and temporal aspects in automatic recognition of predominant musical instruments in polyphonic music

Publication Type: Conference Paper
Year of Publication: 2009
Conference Name: Conference of the International Society for Music Information Retrieval (ISMIR)
Authors: Fuhrmann, F., Haro, G. G., & Herrera, P.
Conference Start Date: 26/10/2009
Conference Location: Kobe, Japan
Abstract: In this paper we present an approach towards the classification of pitched and unpitched instruments in polyphonic audio. In particular, the presented study accounts for three aspects currently lacking in the literature: model scalability to polyphonic data, model generalisation with respect to the number of instruments, and incorporation of perceptual information. Our goal is therefore a unifying recognition framework which enables the extraction of the main instruments’ information. The applied methodology consists of training classifiers with audio descriptors, using extensive datasets to model the instruments sufficiently. All data consist of real-world music, including categories of 11 pitched and 3 percussive instruments. We designed our descriptors by temporal integration of the raw feature values, which are directly extracted from the polyphonic data. Moreover, to evaluate the applicability of modelling temporal aspects in polyphonic audio, we studied the performance of different encodings of the temporal information. Along with accuracies of 63% and 78% for the pitched and percussive classification tasks, respectively, the results show both the importance of temporal encoding and strong limitations in modelling it accurately.
Preprint/postprint document: http://mtg.upf.edu/files/publications/fuhrmann_haro_herrera_ismir09.pdf