Audio Fingerprinting

Title: Audio Fingerprinting
Publication Type: Master Thesis
Year of Publication: 2011
Authors: Garzón Lorenzo, A.
preprint/postprint document

The online multimedia catalogue is growing every day, opening new ways for users to enjoy and share audio collections. It would be valuable to find ways to organize this enormous amount of audio information without the need for expensive human-generated metadata. Computers are very accurate and fast at searching text documents, but user-friendly interaction with audio requires bridging the semantic gap that currently prevents content-based audio search. Nevertheless, in the last ten years, many commercial applications that use low-level audio information to identify music or video have been remarkably successful, returning accurate results in very short times while referencing huge content databases. However, audio effects and degradations such as filtering, dynamic range compression, echo, delay, distortion, perceptual coding, and added background speech or noise make the identification task much harder. In this research work we focus on developing a robust and accurate fingerprinting algorithm that works for all kinds of audio content, including music and speech. The goal beyond this thesis would be to include it in a live application that analyzes an incoming unknown audio excerpt and identifies it within a few seconds by comparing its extracted fingerprint with thousands of pre-computed fingerprints stored in a database. We developed a novel fingerprint that achieves high discriminatory power and robustness. At this stage, computational speed and database search optimization are not treated as key factors; they are deferred until a full and efficient system is implemented.
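
The identification step described above, comparing the fingerprint of an unknown excerpt against pre-computed fingerprints in a database, can be illustrated with a minimal Python sketch. It does not reproduce the fingerprint developed in the thesis, which this abstract does not specify. It assumes, purely for illustration, that each fingerprint is a fixed-length binary string stored as an integer, that similarity is measured by bit error rate, and that degradation can be simulated by random bit flips. The names `bit_error_rate`, `identify`, `degrade`, the 256-bit length, and the 0.25 threshold are all hypothetical. The search is a plain linear scan, in line with the abstract's statement that search optimization is deferred.

```python
import random


def bit_error_rate(a: int, b: int, n_bits: int) -> float:
    """Fraction of differing bits between two n_bits-wide binary fingerprints."""
    return bin(a ^ b).count("1") / n_bits


def identify(query: int, database: dict, n_bits: int, max_ber: float = 0.25):
    """Linear scan over the database (no search optimization).

    Returns (label, ber) for the closest entry, or (None, ber) when even the
    closest entry exceeds the assumed acceptance threshold max_ber.
    """
    best_label, best_ber = None, 1.0
    for label, fingerprint in database.items():
        ber = bit_error_rate(query, fingerprint, n_bits)
        if ber < best_ber:
            best_label, best_ber = label, ber
    if best_ber > max_ber:
        return None, best_ber
    return best_label, best_ber


def degrade(fingerprint: int, n_bits: int, flip_prob: float, rng: random.Random) -> int:
    """Simulate a degraded recording by flipping each bit with probability flip_prob."""
    mask = 0
    for i in range(n_bits):
        if rng.random() < flip_prob:
            mask |= 1 << i
    return fingerprint ^ mask


N_BITS = 256  # assumed fingerprint length, chosen for illustration only
rng = random.Random(0)

# Hypothetical database of 1000 pre-computed fingerprints (random stand-ins).
database = {f"track_{i}": rng.getrandbits(N_BITS) for i in range(1000)}

# A query derived from a known track with roughly 10% of its bits corrupted.
query = degrade(database["track_42"], N_BITS, 0.10, rng)
label, ber = identify(query, database, N_BITS)

# A query unrelated to anything in the database.
unknown_label, unknown_ber = identify(rng.getrandbits(N_BITS), database, N_BITS)
```

In this toy setting, unrelated random fingerprints differ in about half of their bits, so a degraded copy of a stored fingerprint remains far closer to its source than to any other entry. A real fingerprint has to earn that separation from the audio itself, which is where discriminatory power and robustness come in.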