Speech Emotion Recognition Application for Education


  • Weijia Xian




Speech Emotion Recognition; PAD Dimensions; Convolutional Neural Network (CNN); Least Squares Support Vector Machine (LSSVM).


This paper designs a speech emotion recognition application, based on convolutional neural networks, that analyzes human emotions from speech. The application helps teachers understand students' emotional states during learning and adjust their teaching methods accordingly, with the goal of improving students' learning efficiency. It describes emotions using the PAD (pleasure-arousal-dominance) dimensions, uses a convolutional neural network (CNN) to extract deep speech emotion features, and uses a least squares support vector machine (LSSVM) to recognize emotions, a combination intended to improve recognition accuracy.
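
The classification stage named in the abstract can be illustrated with a minimal sketch of a binary LSSVM, which replaces the quadratic program of a standard SVM with a single linear system. This is not the paper's implementation: the RBF kernel, the hyperparameters `gamma` and `sigma`, and the function names `rbf_kernel`, `lssvm_fit`, and `lssvm_predict` are all assumptions made for illustration. The input matrix `X` stands in for deep features that a CNN would extract from speech, and labels are assumed to be -1 or +1.

```python
import numpy as np

def rbf_kernel(A, B, sigma=1.0):
    # Pairwise squared Euclidean distances between rows of A and rows of B.
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))

def lssvm_fit(X, y, gamma=10.0, sigma=1.0):
    # Solve the LSSVM linear system
    #   [ 0   y^T             ] [ b     ]   [ 0 ]
    #   [ y   Omega + I/gamma ] [ alpha ] = [ 1 ]
    # where Omega[i, j] = y[i] * y[j] * K(x_i, x_j).
    n = X.shape[0]
    omega = np.outer(y, y) * rbf_kernel(X, X, sigma)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = y
    A[1:, 0] = y
    A[1:, 1:] = omega + np.eye(n) / gamma
    rhs = np.concatenate(([0.0], np.ones(n)))
    sol = np.linalg.solve(A, rhs)
    return sol[0], sol[1:]  # bias b, dual weights alpha

def lssvm_predict(X_train, y_train, b, alpha, X_new, sigma=1.0):
    # Decision function: sign(sum_i alpha_i * y_i * K(x, x_i) + b).
    K = rbf_kernel(X_new, X_train, sigma)
    return np.sign(K @ (alpha * y_train) + b)
```

Recognizing more than two emotion classes would require an extension such as one-vs-rest training of several binary models. The abstract does not say which multi-class scheme the application uses.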








How to Cite

Xian, W. (2022). Speech Emotion Recognition Application for Education. BCP Education & Psychology, 7, 378–383. https://doi.org/10.54691/bcpep.v7i.2691