• Feature Optimization of Speech Emotion Recognition   [iCBBE 2016]
  • Author(s)
  • Chunxia Yu
  • This paper divides speech emotion into four categories: Fear, Happy, Neutral and Surprise. Traditional features and their statistics are generally used to recognize speech emotion. To quantify each feature’s contribution to emotion recognition, a method based on the Back Propagation (BP) neural network is adopted, which yields the optimal subset of the features. In addition, two new characteristics of speech emotion are added to the selected features: the MFCC feature extracted from the fundamental frequency curve (MFCCF0) and the amplitude perturbation parameters extracted from the short-time average magnitude curve (APSAM). With the Gaussian Mixture Model (GMM), the highest average recognition rate over the four emotions is 82.25%, and the recognition rate for Neutral is 90%.
  • Feature Optimization, Speech Emotion Recognition
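
The abstract names amplitude perturbation parameters taken from the short-time average magnitude curve (APSAM) but does not give their formula. The following is a minimal sketch of one plausible reading, not the authors' definition: it assumes a shimmer-style measure, namely the mean absolute change between neighbouring frames of the curve divided by the mean level of the curve. The function names, the frame length and the hop size are all hypothetical choices made for illustration.

```python
def short_time_average_magnitude(signal, frame_len, hop):
    """Return the mean absolute amplitude of each frame.

    The list of per-frame values is the short-time average magnitude curve.
    Frames that would run past the end of the signal are dropped.
    """
    starts = range(0, len(signal) - frame_len + 1, hop)
    return [sum(abs(x) for x in signal[s:s + frame_len]) / frame_len
            for s in starts]


def amplitude_perturbation(curve):
    """Shimmer-style relative perturbation of a magnitude curve.

    ASSUMPTION: this is an illustrative definition, not the paper's.
    It is the mean absolute difference between neighbouring curve
    values, divided by the mean curve value.
    """
    if len(curve) < 2:
        raise ValueError("need at least two frames")
    mean_level = sum(curve) / len(curve)
    if mean_level == 0:
        return 0.0  # silent input: no perturbation to measure
    mean_step = sum(abs(b - a) for a, b in zip(curve, curve[1:])) / (len(curve) - 1)
    return mean_step / mean_level


# Toy signal: a quiet frame followed by a frame twice as loud.
toy_signal = [1, -1, 1, -1, 2, -2, 2, -2]
toy_curve = short_time_average_magnitude(toy_signal, frame_len=4, hop=4)
toy_value = amplitude_perturbation(toy_curve)
```

On the toy signal the curve has two frames with magnitudes 1 and 2, so the sketch reports a relative perturbation of 1 / 1.5. A steadier voice would give a value closer to zero, which is the kind of contrast such a feature is meant to capture.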
