UDC 004.93
COMBINING CLUSTERING AND CLASSIFICATION APPROACHES FOR SPEECH-BASED EMOTION RECOGNITION PROBLEM
A. S. Polyakova, M. Yu. Sidorov, E. S. Semenkin
Reshetnev Siberian State Aerospace University; 31, Krasnoyarsky Rabochy Av., Krasnoyarsk, 660037, Russian Federation; Ulm University; 43, Albert-Einstein-Allee, Ulm, 89081, Germany
Communication is a fundamental human ability that rests on both linguistic content and an emotional component. Automatic emotion recognition remains a challenge, especially when it relies solely on the voice, the primary means of human communication. Recognizing a speaker’s emotions is difficult because it requires a sequence of operations: voice activity detection, feature extraction, training, and classification. Selecting relevant features is an important step, and recognition performance also depends on the database used in the system. Speech-based emotion recognition is one of the most common tasks in computational linguistics, and classification accuracy is its main quality criterion. In this work, several data mining techniques (artificial neural networks, logistic regression, and support vector machines) are applied to the problem of automatic emotion recognition. To improve recognition performance, we combine pre-clustering with classification. Principal component analysis is used to select important features. The proposed approach was tested on an emotion recognition task based on acoustic characteristics.
Keywords: emotion recognition, clustering, classification, artificial neural networks, support vector machines.
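
The idea of pre-clustering followed by classification can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name a clustering algorithm, so k-means is assumed here; a nearest-class-centroid rule stands in for the neural network, logistic regression, and support vector machine classifiers; the principal component analysis step is omitted; and the two-dimensional "acoustic features" and emotion labels are made-up toy data. All function names are hypothetical.

```python
import random
from collections import defaultdict


def dist2(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def nearest(x, centers):
    """Index of the center closest to x."""
    return min(range(len(centers)), key=lambda j: dist2(x, centers[j]))


def kmeans(X, k, iters=20, seed=0):
    """Plain k-means (assumed clustering step); returns k cluster centers."""
    rng = random.Random(seed)
    centers = rng.sample(X, k)
    for _ in range(iters):
        groups = defaultdict(list)
        for x in X:
            groups[nearest(x, centers)].append(x)
        # Keep the old center if a cluster received no points.
        centers = [mean(groups[j]) if groups[j] else centers[j] for j in range(k)]
    return centers


def class_centroids(X, y):
    """Stand-in classifier: one centroid per class label."""
    by_label = defaultdict(list)
    for x, label in zip(X, y):
        by_label[label].append(x)
    return {label: mean(pts) for label, pts in by_label.items()}


def fit(X, y, k):
    """Cluster the training data, then train one classifier per cluster."""
    centers = kmeans(X, k)
    members = defaultdict(lambda: ([], []))
    for x, label in zip(X, y):
        xs, ys = members[nearest(x, centers)]
        xs.append(x)
        ys.append(label)
    models = {c: class_centroids(xs, ys) for c, (xs, ys) in members.items()}
    fallback = class_centroids(X, y)  # used if a cluster has no local model
    return centers, models, fallback


def predict(x, centers, models, fallback):
    """Route x to its cluster, then classify with that cluster's model."""
    model = models.get(nearest(x, centers), fallback)
    return min(model, key=lambda label: dist2(x, model[label]))


# Toy data (hypothetical): four tight groups, one per emotion label.
X_train = [
    [0.0, 0.0], [0.2, 0.1], [0.1, 0.3],
    [0.0, 2.0], [0.2, 2.1], [0.1, 1.9],
    [10.0, 0.0], [10.2, 0.1], [10.1, 0.3],
    [10.0, 2.0], [10.2, 2.1], [10.1, 1.9],
]
y_train = ["neutral"] * 3 + ["sad"] * 3 + ["anger"] * 3 + ["joy"] * 3

centers, models, fallback = fit(X_train, y_train, k=2)
print(predict([0.1, 0.2], centers, models, fallback))
print(predict([10.1, 1.8], centers, models, fallback))
```

The design choice illustrated is that each cluster gets its own local model, so a classifier only has to separate the emotions that actually occur in its region of the feature space.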

Polyakova Anastasiya Sergeevna – laboratory assistant, Department of Systems Analysis and Operations Research, Reshetnev Siberian State Aerospace University. E-mail: polyakova_nasty@mail.ru.

Sidorov Maxim Yurievich – Master’s Degree student, University of Ulm. E-mail: maxim.sidorov@uni-ulm.de.

Semenkin Eugene Stanislavovich – Dr. Sc., professor, Department of System Analysis and Operations Research, Reshetnev Siberian State Aerospace University. E-mail: eugenesemenkin@yandex.ru.