What/when causal expectation modelling in monophonic pitched and percussive audio
Title | What/when causal expectation modelling in monophonic pitched and percussive audio |
Publication Type | Conference Paper |
Year of Publication | 2007 |
Conference Name | NIPS-Workshop Music, Brain, & Cognition |
Authors | Hazan, A., Brossier, P., Marxer, R., & Purwins, H. |
Abstract |
A causal system for representing a musical stream and generating further expected events is presented. Starting from an auditory front-end that extracts low-level features (e.g. spectral shape, MFCCs, pitch) and mid-level features such as onsets and beats, an unsupervised clustering process builds and maintains a set of symbols that represent the events of the musical stream in terms of both their timbre and their timing.
Event timing is represented by inter-onset intervals expressed relative to the beat. These symbols are then processed by an expectation module based on Prediction by Partial Matching (PPM), a multiscale technique based on n-grams. To characterise the system's capacity to generate expectations that match its own transcription, we use a weighted average F-measure that takes into account the uncertainty associated with the unsupervised encoding of the musical sequence. The potential of the system is demonstrated on audio streams containing drum loops or monophonic singing voice. In preliminary experiments, we show that the induced representation is useful for generating expectation patterns in a causal way. During exposure, we observe a globally decreasing prediction entropy combined with structure-specific variations. |
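To make the what/when encoding concrete, here is a minimal, non-causal sketch in Python (NumPy / scikit-learn). All feature values are synthetic, the function and variable names are invented for illustration, and `KMeans` merely stands in for the paper's incrementally maintained unsupervised clustering:

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

def beat_relative_iois(onset_times, beat_period):
    """Inter-onset intervals expressed relative to the beat period."""
    return np.diff(onset_times) / beat_period

# Toy stream: an eighth-note pattern at 120 BPM (beat period 0.5 s).
onsets = np.array([0.0, 0.25, 0.5, 1.0, 1.25, 1.5, 2.0])
iois = beat_relative_iois(onsets, beat_period=0.5)  # [0.5 0.5 1. 0.5 0.5 1.]

# Synthetic per-event timbre descriptors (stand-ins for spectral shape / MFCCs).
timbre = rng.normal(size=(len(iois), 3))

# Joint "what" (timbre) and "when" (timing) description of each event;
# clustering discretizes these into the symbol set used downstream.
events = np.column_stack([timbre, iois])
symbols = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(events)
print(symbols)  # one discrete symbol per event, fed to the expectation module
```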
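The expectation module builds on Prediction by Partial Matching, which blends n-gram models of several orders and "escapes" to shorter contexts when the current context is unseen. The following toy sketch (escape method A, no exclusion mechanism, so a simplification rather than the authors' implementation; all names are illustrative) also prints the entropy of the prediction before each event, mirroring the decreasing prediction entropy the abstract reports:

```python
import math
from collections import Counter, defaultdict

class ToyPPM:
    """Toy Prediction by Partial Matching: back-off over n-gram orders."""

    def __init__(self, max_order=2):
        self.max_order = max_order
        # counts[k][context] -> Counter of next-symbol frequencies
        self.counts = [defaultdict(Counter) for _ in range(max_order + 1)]
        self.alphabet = set()

    def update(self, history, symbol):
        self.alphabet.add(symbol)
        for k in range(self.max_order + 1):
            ctx = tuple(history[-k:]) if k else ()
            self.counts[k][ctx][symbol] += 1

    def predict(self, history):
        """Blend the n-gram orders from longest to shortest context."""
        probs = dict.fromkeys(self.alphabet, 0.0)
        escape = 1.0
        for k in range(self.max_order, -1, -1):
            ctx = tuple(history[-k:]) if k else ()
            counter = self.counts[k].get(ctx)
            if not counter:
                continue
            total = sum(counter.values())
            for sym, n in counter.items():
                probs[sym] += escape * n / (total + 1)  # method-A discounting
            escape /= total + 1  # mass passed down to shorter contexts
        for sym in probs:  # leftover escape mass spread uniformly
            probs[sym] += escape / len(probs)
        return probs

model, history = ToyPPM(max_order=2), []
for sym in "abcabcabcabc":  # stand-in for the clustered event symbols
    p = model.predict(history)
    if p:
        entropy = -sum(q * math.log2(q) for q in p.values() if q > 0)
        print(f"entropy before '{sym}': {entropy:.2f} bits")
    model.update(history, sym)
    history.append(sym)
```

On this repeating toy sequence the per-event entropy drops as the model is exposed to more of the stream, which is the qualitative behaviour the experiments describe.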
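The weighted average F-measure is described only at a high level. One plausible reading, assumed here, is that each matched event contributes to the true-positive count in proportion to the confidence of its unsupervised symbol assignment (e.g. a cluster membership probability). A minimal sketch under that assumption; the paper's exact weighting may differ:

```python
import numpy as np

def weighted_f_measure(match_weights, n_predicted, n_reference):
    """F-measure where each predicted/transcribed match counts with a weight.

    match_weights : confidence of each matched event, e.g. the cluster
                    membership probability of its symbol (assumed form).
    """
    tp = float(np.sum(match_weights))  # soft true-positive count
    precision = tp / n_predicted if n_predicted else 0.0
    recall = tp / n_reference if n_reference else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)

# Hypothetical example: 4 matches with varying encoding confidence,
# out of 5 predicted and 6 transcribed events.
print(weighted_f_measure(np.array([1.0, 0.9, 0.6, 0.8]), 5, 6))  # ~0.60
```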
Preprint/Postprint Document | files/publications/79ee88-NIPS-2007-ahazan.pdf |