A review of classification methods and databases used for speech emotion recognition


  • Shrikala Madhav Deshmukh, Amity University Mumbai
  • Sita Devulapalli, Amity University Mumbai






Artificial Neural Networks (ANN), Convolutional Neural Networks (CNN), Classification Methods, Database, Gaussian Mixture Model (GMM), Hidden Markov Model (HMM), Neural Network Classifier, Recurrent Neural Network (RNN), Speech Emotion Recognition (SER)


In today’s world, speech is an ideal way to interact with people, and speech emotion recognition (SER) plays an increasingly significant role in interactions between humans and computers. Automatic emotion recognition is therefore highly desirable for improving human-machine interaction, and the study of emotions has become a focus of attention. This paper reviews the classification methods and databases used for speech emotion recognition. It addresses two important aspects of SER: first, the choice of an appropriate classification method, and second, the creation of an emotional speech database or the selection of an appropriate existing one. The main purpose of this review is to analyze the effectiveness of several techniques widely used in the field of speech emotion recognition.




