Today I read a paper titled “Emotional State Categorization from Speech: Machine vs. Human”.
The abstract is:
This paper presents our investigations on emotional state categorization from speech signals with a psychologically inspired computational model against human performance under the same experimental setup.
Based on psychological studies, we propose a multistage categorization strategy which allows establishing an automatic categorization model flexibly for a given emotional speech categorization task.
We apply the strategy to the Serbian Emotional Speech Corpus (GEES) and the Danish Emotional Speech Corpus (DES), where human performance was reported in previous psychological studies.
Our work is the first attempt to apply machine learning to the GEES corpus where the human recognition rates were only available prior to our study.
Unlike the previous work on the DES corpus, our work focuses on a comparison to human performance under the same experimental settings.
Our studies suggest that psychology-inspired systems yield behaviours that, to a great extent, resemble what humans perceived and their performance is close to that of humans under the same experimental setup.
Furthermore, our work also uncovers some differences between machine and humans in terms of emotional state recognition from speech.