New datasets released

07.10.2015

We have just released two datasets from two papers that will be presented at ISMIR this year.

- Semantic Artist Similarity Dataset

The Semantic Artist Similarity dataset consists of two corpora of artist entities with their corresponding biography texts, together with the list of the top-10 most similar artists within the same corpus, which is used as ground truth. The smaller corpus contains 268 artists and the larger one 2,336 artists; both were gathered from Last.fm. The former is mapped to the MIREX Audio and Music Similarity evaluation dataset, so that its similarity judgments can be used as ground truth. For the latter, we use the similarity between artists as provided by the Last.fm API.

Oramas, S., Sordo M., Espinosa-Anke L., & Serra X. (2015). A Semantic-based Approach for Artist Similarity. 16th International Society for Music Information Retrieval Conference.
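
As an illustration of how a top-10 ground-truth list like the one described above might be used, the following is a minimal sketch of a precision-at-k check for one artist. The artist identifiers, the dictionary layout and the `precision_at_k` helper are hypothetical and are not taken from the dataset or the paper; the actual file format and evaluation measure may differ.

```python
def precision_at_k(predicted, ground_truth, k=10):
    """Fraction of the first k predicted artists found in the ground-truth list."""
    if k <= 0:
        return 0.0
    relevant = set(ground_truth)
    hits = sum(1 for artist in predicted[:k] if artist in relevant)
    return hits / k


# Hypothetical toy data; these identifiers do not come from the dataset.
ground_truth = {"artist_a": ["artist_b", "artist_c", "artist_d"]}
predictions = {"artist_a": ["artist_c", "artist_x", "artist_b", "artist_y"]}

score = precision_at_k(predictions["artist_a"], ground_truth["artist_a"], k=4)
print(score)
```

Here two of the four predicted artists appear in the toy ground truth, so the score is the share of correct predictions among the first four.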


- FlaBase: A Flamenco Music Knowledge Base

The ultimate aim of FlaBase is to gather all available online editorial, biographical and musicological information related to flamenco music. Its content is the result of curation and extraction processes that combine several data sources (Wikipedia, MusicBrainz and flamenco websites). FlaBase is stored in JSON format. This first release of FlaBase contains information about 1,102 artists, 74 palos (flamenco genres), 2,860 albums, 13,311 tracks, and 771 Andalusian locations.

Oramas, S., Gómez F., Gómez E., & Mora J. (2015). FlaBase: Towards the Creation of a Flamenco Music Knowledge Base. 16th International Society for Music Information Retrieval Conference.
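
Since FlaBase is distributed as JSON, a first look at its contents could be as simple as the sketch below, which parses JSON text and counts records per entity type. The record layout, the `type` field and the `count_entities` helper are assumptions made for illustration only; the real FlaBase schema is not described here and may be organised differently.

```python
import json
from collections import Counter

# Hypothetical miniature sample; the real FlaBase schema may differ.
sample = """
[
  {"type": "artist", "name": "Example Artist"},
  {"type": "palo", "name": "Example Palo"},
  {"type": "artist", "name": "Another Artist"}
]
"""


def count_entities(json_text, type_field="type"):
    """Count records per entity type in a JSON array of objects."""
    records = json.loads(json_text)
    return Counter(record[type_field] for record in records)


counts = count_entities(sample)
print(counts)
```

With the actual files, the same idea would apply after reading the file contents and adapting the field names to whatever the release uses.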
