Fusion of musical contents, brain activity and short term physiological signals for music-emotion recognition

Title: Fusion of musical contents, brain activity and short term physiological signals for music-emotion recognition
Publication Type: Master Thesis
Year of Publication: 2017
Authors: Jarjoura, J.
Abstract: Detecting and classifying emotions from EEG signals has been reported to be a complex and subject-dependent task. In this study we propose a multimodal machine learning approach that combines EEG and audio features for music emotion recognition using a categorical model of emotions. The dataset consists of film music that was carefully created to induce strong emotions. Five emotion categories were adopted: Fear, Anger, Happy, Tender and Sad. EEG data were obtained from three male participants listening to the labeled music excerpts. Using Support Vector Machines (SVM), the stand-alone EEG system achieved emotion classification accuracies of 76.70%, 81.35% and 77.23% for the three participants. EEG and audio features were both extracted using a frame-based approach and combined through feature-level fusion, and machine learning techniques were then applied to study the improvement in emotion recognition obtained with multimodal features. The results show that the multimodal system outperformed the unimodal EEG system, with classification accuracies of 85.59%, 89.11% and 88.21% for the three subjects. Additionally, we evaluated the contribution of each audio feature to the classification performance of the multimodal system. Preliminary results indicate a significant contribution of individual audio features to the classification accuracy. We also found that several audio features that noticeably contributed to the classification accuracy were also reported in previous research studying the correlation between audio features and emotion ratings on the same dataset. These results suggest that the audio features carry relevant acoustic information that could improve the performance of the emotion recognition system. Furthermore, we propose a framework for emotion recognition from physiological signals measured over short durations. Results show that certain features from skin conductance and heart rate variability were effective in the emotion classification task, which points to the role of autonomic nervous system activation in emotion recognition.
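The feature-level fusion mentioned in the abstract can be illustrated with a minimal sketch. It assumes that each frame yields one EEG feature vector and one audio feature vector, and that fusion means concatenating the two per frame. The per-feature standardization step, the function names `zscore_columns` and `fuse_features`, and the toy values are assumptions made for illustration and are not taken from the thesis.

```python
from statistics import mean, pstdev


def zscore_columns(frames):
    """Standardize each feature column to zero mean and unit variance.

    Assumed preprocessing step, so that neither modality dominates
    the fused vector purely because of its numeric scale.
    """
    columns = list(zip(*frames))
    # Fall back to a divisor of 1.0 for constant columns to avoid dividing by zero.
    stats = [(mean(c), pstdev(c) or 1.0) for c in columns]
    return [[(v - m) / s for v, (m, s) in zip(row, stats)] for row in frames]


def fuse_features(eeg_frames, audio_frames):
    """Feature-level fusion: concatenate per-frame EEG and audio vectors."""
    if len(eeg_frames) != len(audio_frames):
        raise ValueError("both modalities must have the same number of frames")
    eeg = zscore_columns(eeg_frames)
    audio = zscore_columns(audio_frames)
    return [e + a for e, a in zip(eeg, audio)]


# Hypothetical toy data: 3 frames, 2 EEG features and 1 audio feature per frame.
eeg_frames = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
audio_frames = [[0.5], [0.5], [0.8]]
fused = fuse_features(eeg_frames, audio_frames)
```

Each row of `fused` is a single multimodal feature vector for one frame. In a setup like the one described, such rows could then be passed, together with the emotion labels, to a classifier such as an SVM.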
Keywords: Audio features, EEG, Machine learning, multi-modal classification
Final publication: https://doi.org/10.5281/zenodo.1095499