Tonal Description of Music Audio Signals

Title: Tonal Description of Music Audio Signals
Publication Type: PhD Thesis
Year of Publication: 2006
University: Universitat Pompeu Fabra
Authors: Gómez, E.
Advisor: Serra, X.
Academic Department: Department of Information and Communication Technologies
Abstract: This dissertation is about tonality. More precisely, it is concerned with the problems that arise when computer programs try to extract tonal descriptors automatically from musical audio signals. This doctoral dissertation proposes and evaluates a computational approach to the automatic description of tonal aspects of music through the analysis of polyphonic audio signals.

In this context, we define tonal description at different abstraction levels, differentiating between low-level signal descriptors (e.g. tuning frequency or pitch class distribution) and high-level textual labels (e.g. chords or keys). These high-level labels require a musical analysis and the use of tonality cognition models. We also establish different temporal scales of description, defining some instantaneous features as attached to a certain time instant, and other global descriptors as related to a wider segment (e.g. a section of a song).
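To make the low-level end of this hierarchy concrete, the following is a minimal sketch of a pitch class distribution descriptor: spectral peak frequencies are mapped to the 12 equal-tempered pitch classes relative to a tuning reference and accumulated by magnitude. This is a deliberate simplification of HPCP-style features, not the exact algorithm of the thesis, and the peak data is made up for illustration.

```python
# Sketch of a magnitude-weighted pitch class distribution from spectral
# peaks. Assumes equal temperament around a tuning reference (440 Hz by
# default); peak values below are hypothetical.
from math import log2

def pitch_class(freq_hz, tuning_hz=440.0):
    """Index 0..11 (C..B) of the nearest equal-tempered pitch class."""
    semitones_from_a4 = round(12 * log2(freq_hz / tuning_hz))
    return (semitones_from_a4 + 9) % 12  # shift so that C = 0

def pitch_class_distribution(peaks, tuning_hz=440.0):
    """Normalized 12-bin distribution from (frequency, magnitude) peaks."""
    bins = [0.0] * 12
    for freq, mag in peaks:
        bins[pitch_class(freq, tuning_hz)] += mag
    total = sum(bins)
    return [b / total for b in bins] if total else bins

# Hypothetical spectral peaks of a C major chord (C4, E4, G4).
peaks = [(261.63, 1.0), (329.63, 0.8), (392.00, 0.9)]
print([round(v, 2) for v in pitch_class_distribution(peaks)])
```

Note that the tuning reference is a parameter: estimating it from the signal before binning (one of the low-level steps discussed below) makes the distribution robust to recordings tuned away from 440 Hz.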

Throughout this PhD thesis, we have proposed a number of algorithms that directly process digital audio recordings of acoustic instruments in order to extract tonal descriptors. These algorithms focus on the computation of pitch class distribution descriptors, the estimation of the key of a piece, the visualization of the evolution of its tonal center, and the measurement of the similarity between two musical pieces. These algorithms have been validated and evaluated quantitatively. First, we have evaluated low-level descriptors, such as pitch class distribution features and the estimation of the tuning frequency (with respect to 440 Hz), and their independence from timbre, dynamics, and other factors external to tonal characteristics. Second, we have evaluated the method for key finding, obtaining an accuracy of around 80%. This evaluation was carried out on a music collection of 1400 pieces with different characteristics. We have studied the influence of different aspects such as the tonal model employed, the advantage of a cognition-inspired model versus machine learning methods, the location of the tonality within a musical piece, and the influence of the musical genre on the definition of a tonal center. Third, we have proposed the extracted features as a tonal representation of an audio signal, useful for measuring similarity between two pieces and for establishing the structure of a musical piece. To this end, we have evaluated the use of tonal descriptors to identify versions of the same song, obtaining an improvement of 55% over the baseline.
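The cognition-inspired key-finding step mentioned above can be sketched in its classic template-matching form: correlate the observed pitch class distribution with the Krumhansl-Kessler probe-tone profiles rotated to every tonic, and report the key whose profile correlates best. The profile values below are the published Krumhansl-Kessler ratings; the input distribution is a toy example, and this sketch stands in for (rather than reproduces) the exact model evaluated in the thesis.

```python
# Template-based key estimation: Pearson correlation between a pitch
# class distribution and rotated Krumhansl-Kessler key profiles.
from math import sqrt

PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F",
                 "F#", "G", "G#", "A", "A#", "B"]

# Krumhansl-Kessler probe-tone profiles (major and minor).
MAJOR = [6.35, 2.23, 3.48, 2.33, 4.38, 4.09,
         2.52, 5.19, 2.39, 3.66, 2.29, 2.88]
MINOR = [6.33, 2.68, 3.52, 5.38, 2.60, 3.53,
         2.54, 4.75, 3.98, 2.69, 3.34, 3.17]

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def estimate_key(pcd):
    """Return (tonic, mode) maximizing correlation with a rotated profile."""
    best = None
    for tonic in range(12):
        for mode, profile in (("major", MAJOR), ("minor", MINOR)):
            # Rotate the profile so its tonic lands on `tonic`.
            rotated = [profile[(i - tonic) % 12] for i in range(12)]
            r = pearson(pcd, rotated)
            if best is None or r > best[0]:
                best = (r, PITCH_CLASSES[tonic], mode)
    return best[1], best[2]

# Toy distribution peaking on C, E and G, suggesting C major.
pcd = [0.25, 0.01, 0.08, 0.01, 0.18, 0.08,
       0.01, 0.20, 0.01, 0.08, 0.01, 0.08]
print(estimate_key(pcd))  # expected: ('C', 'major')
```

Machine-learning alternatives compared in the evaluation would replace the fixed profiles with templates learned from labeled data, while keeping the same matching scheme.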

From a more general standpoint, this dissertation substantially contributes to the field of computational tonal description:

  • It provides a multidisciplinary review of tonality induction, covering both signal processing methods and tonality induction models;
  • It defines a set of requirements for low-level tonal features;
  • It provides a quantitative evaluation of the proposed methods with respect to similar ones for audio key finding. This quantitative evaluation is divided into different stages, analyzing the influence of each one;
  • It supports the idea that some application contexts do not need an accurate symbolic transcription, thus bridging the gap between audio-oriented and symbolic-oriented methods without requiring a perfect transcription;
  • It extends the current literature, which deals mainly with classical music, to other musical genres;
  • It shows the usefulness of tonal descriptors for music similarity;
  • It provides an optimized method that is used in a real system for music visualization and retrieval, working with over a million musical pieces.
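One way tonal descriptors support the similarity and version-identification use cases listed above can be sketched as a transposition-invariant comparison: since a cover may be performed in a different key, two pieces' pitch class profiles are compared under all 12 circular shifts and the best match is kept. The similarity measure (a plain dot product) and the data are illustrative assumptions, not the thesis' exact method.

```python
# Transposition-invariant similarity between two 12-bin pitch class
# profiles: compare under all 12 circular shifts, keep the best match.

def rotate(v, n):
    """Circularly shift a 12-bin profile up by n semitones."""
    return [v[(i - n) % 12] for i in range(12)]

def best_shift_similarity(a, b):
    """Max dot product between a and all 12 transpositions of b."""
    return max(sum(x * y for x, y in zip(a, rotate(b, s)))
               for s in range(12))

# A toy profile and its transposition by 3 semitones (as in a cover
# performed in a different key): the best-shift similarity recovers the
# same score as comparing the piece with itself in the original key.
a = [0.3, 0.0, 0.1, 0.0, 0.2, 0.1, 0.0, 0.2, 0.0, 0.1, 0.0, 0.0]
b = rotate(a, 3)
print(best_shift_similarity(a, b) == sum(x * x for x in a))  # True
```

The same shift-and-compare idea extends from global profiles to frame-level descriptor sequences when temporal structure also matters.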