Improving the description of instrumental sounds by using ontologies and automatic content analysis

Title: Improving the description of instrumental sounds by using ontologies and automatic content analysis
Publication Type: Master Thesis
Year of Publication: 2012
Authors: Vaquero, C.
Preprint/postprint document: files/publications/Carlos-Vaquero-Master-Thesis-2012.pdf
Abstract: Browsing sound collections in a social database is a complex task when sounds and tags are not classified uniformly; the relation between tag concepts and sounds, as well as the types of sounds associated with them, can vary greatly from one user to another. Social databases allow users to take many different approaches and apply different classification systems when tagging content, since a sound can be described and tagged according to different characteristics, from different perspectives, and for different purposes. Browsing such databases is often complex and inaccurate, returning results that are often very distant in either their sound characteristics or the tags used. Collaborative sound databases are a perfect environment in which to study the problems derived from an inaccurate or multidimensional description. Using an ontology as the basis of the classification tags applied to sounds may not only ease the browsing of sounds through the collection, but also help to establish common definitions within its community of users. This thesis defines a methodology to build a sound collection by using a proposed ontology of tags and the content analysis of its sounds. To this end, a corpus of 700 samples has been recorded, classified according to a designed ontology, integrated into the database, and analyzed. In addition, similarity measures between content-based descriptions and semantic descriptions of these sounds are defined by extracting six different models, making it possible to automatically describe any new sounds later integrated into the collection. Finally, the proposed models are evaluated in three different experiments and a preliminary survey of expert users' acceptance.