Human-human task-oriented conversations corpus for interaction quality modeling
Authors: Spirina A.V., Sidorov M.Yu., Sergienko R.B., Semenkin E.S., Minker W.
Journal: Siberian Aerospace Journal @vestnik-sibsau
Section: Mathematics, mechanics, computer science
Published in: Issue 1, Vol. 17, 2016.
Free access
Speech is the main modality of human communication. It can reveal a lot about the speaker: their emotions, intelligence, age, psychological profile and other properties. Such information is useful in several fields: in call centres, to improve the quality of service; in the design of Spoken Dialogue Systems, to better adapt a system to users' behaviour; and in automating the analysis of people's psychological state in situations that carry a high level of responsibility, for example in a space programme. One such characteristic is the Interaction Quality, a metric used in the field of Spoken Dialogue Systems to evaluate the quality of human-computer interaction. The Interaction Quality can also be applied to estimate the quality of human-human conversations. As with any investigation in speech analytics, modelling the Interaction Quality for human-human conversations requires a specific corpus of task-oriented dialogues. Although a large number of speech corpora exist, for some tasks, Interaction Quality modelling among them, it is still difficult to find an appropriate corpus. That is why we decided to build our own corpus based on dialogues between the customers and agents of one company. In this paper we describe the current state of this corpus. It contains 53 dialogues, corresponding to 1165 exchanges, and includes audio features, paralinguistic information and experts' labels. We plan to extend this corpus in both the feature set and the number of observations.
Keywords: interaction quality, human-human conversation, speech analysis, speech corpus
Short URL: https://sciup.org/148177555
IDR: 148177555