Adaptation of the Seam Carving Technique for Improving Audio Time-Scaling

Title: Adaptation of the Seam Carving Technique for Improving Audio Time-Scaling
Publication Type: Master Thesis
Year of Publication: 2008
Authors: Tarrat, J. M.
Preprint/postprint document: files/publications/Josep-Maria-Tarrat-Master-Thesis.pdf

This Master Thesis adapts the Seam Carving algorithm, an Image Processing Technique (IPT) originally designed for image resizing, to an audio time-scaling application. We apply the algorithm to spectrogram images, which represent audio information, with the aim of overcoming certain limitations of an existing time-scale algorithm. That algorithm produces audible artifacts when the audio has a slow attack and a large time-scale factor is applied. We suggest that Seam Carving can improve on this because it works with a larger, customized analysis window than the time-scale algorithm does. Basically, we assume that a window with low transient energy can be time-scaled without artifacts and that, conversely, zones with high transient energy must keep their duration. Using IPT we can obtain results that are not possible with classical audio processing techniques alone. For instance, analyzing the spectrogram with IPT can reveal long-term patterns that are difficult to detect in the Short-Time Fourier Transform (STFT). The method first segments the audio signal based on rhythm information. The Seam Carving algorithm is then applied to each window of the spectrogram images resulting from the segmentation, generating an envelope of varying time-scaling factors. From the results, we argue that, with appropriate analysis parameters, the transformation quality improves. We also observed that the rhythm-based segmentation is crucial, so a user-supervised process would be useful.
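To make the core idea concrete, the following is a minimal sketch of generic seam carving applied to a magnitude spectrogram. It is not the thesis implementation. The function names (`transient_energy`, `find_vertical_seam`, `remove_seam`) are hypothetical, and the energy map used here (absolute frame-to-frame change in magnitude) is only an assumed stand-in for the transient measure described in the abstract. The spectrogram is assumed to be a NumPy array with frequency bins as rows and time frames as columns, so a seam picks one time frame per frequency bin.

```python
import numpy as np


def transient_energy(spec):
    """Assumed transient measure: absolute frame-to-frame change in magnitude."""
    diff = np.abs(np.diff(spec, axis=1))
    # Repeat the last column so the map has the same shape as the spectrogram.
    return np.pad(diff, ((0, 0), (0, 1)), mode="edge")


def find_vertical_seam(energy):
    """Return one time-frame index per frequency bin: a minimum-cost connected seam."""
    n_bins, n_frames = energy.shape
    cost = energy.astype(float).copy()
    # Dynamic programming: each cell adds the cheapest of its three upper neighbours.
    for i in range(1, n_bins):
        prev = cost[i - 1]
        left = np.concatenate(([np.inf], prev[:-1]))
        right = np.concatenate((prev[1:], [np.inf]))
        cost[i] += np.minimum(prev, np.minimum(left, right))
    # Backtrack from the cheapest cell in the last row.
    seam = np.empty(n_bins, dtype=int)
    seam[-1] = int(np.argmin(cost[-1]))
    for i in range(n_bins - 2, -1, -1):
        j = seam[i + 1]
        lo = max(j - 1, 0)
        hi = min(j + 2, n_frames)
        seam[i] = lo + int(np.argmin(cost[i, lo:hi]))
    return seam


def remove_seam(spec, seam):
    """Drop one time frame per frequency bin, shortening the spectrogram by one frame."""
    n_bins, n_frames = spec.shape
    mask = np.ones((n_bins, n_frames), dtype=bool)
    mask[np.arange(n_bins), seam] = False
    return spec[mask].reshape(n_bins, n_frames - 1)
```

Removing a seam shortens the window by one frame while steering around high-transient regions. In standard seam carving, enlarging works by duplicating low-cost seams instead. How seam positions are turned into an envelope of time-scaling factors is specific to the thesis and is not reproduced here.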