Ensemble Fusion Model for Enhanced Speech Emotion Recognition and Confusion Resolution
Authors: Rania Ahmed, Mahmoud Hussein, Arabi Keshk
Journal: International Journal of Information Technology and Computer Science (IJITCS)
In issue: Vol. 18, No. 1, 2026.
Free access
In the field of human-computer interaction, identifying emotion from speech and understanding the full context of spoken communication is a challenging task, because emotion is inherently imprecise and demands detailed speech analysis. In speech emotion recognition, various techniques have been employed to extract emotions from audio signals, including several well-established speech analysis and classification methods. Despite numerous advances in recent years, many studies still fail to consider the semantic information present in speech. Our study proposes a novel approach that captures both the paralinguistic and semantic aspects of the speech signal by combining state-of-the-art machine learning techniques with carefully crafted feature-extraction strategies. We address this task using feature-engineering-based techniques, extracting meaningful audio features such as energy, pitch, harmonics, pauses, central moments, chroma, zero-crossing rate, and Mel-frequency cepstral coefficients (MFCCs). These features capture important acoustic patterns that help the model learn emotional cues more effectively. This work is conducted primarily on the IEMOCAP dataset, a large and well-annotated emotional speech corpus. Framing the task as a multi-class classification problem, we extract 15 features from the audio signal and use them to train five machine learning classifiers. Additionally, we incorporate text-domain features to reduce ambiguity in emotional interpretation. We evaluate the model's performance using accuracy, precision, recall, and F-score across all experiments.
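The pipeline described above, extracting acoustic features from the waveform and feeding them to an ensemble of classifiers, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the feature set (RMS energy, zero-crossing rate, spectral centroid), the synthetic three-class data, and the specific classifiers are all assumptions chosen for a self-contained example.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

def extract_features(signal, sr=16000):
    """Compute a small illustrative acoustic feature vector from a waveform."""
    # Energy: root-mean-square amplitude over the whole clip
    rms = np.sqrt(np.mean(signal ** 2))
    # Zero-crossing rate: fraction of adjacent samples that change sign
    zcr = np.mean(np.abs(np.diff(np.sign(signal))) > 0)
    # Spectral centroid: magnitude-weighted mean frequency
    spectrum = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sr)
    centroid = np.sum(freqs * spectrum) / (np.sum(spectrum) + 1e-12)
    return np.array([rms, zcr, centroid])

# Synthetic stand-in for labeled emotional speech: three "classes" of tones
rng = np.random.default_rng(0)
X, y = [], []
for label, base_freq in enumerate([200, 400, 800]):
    for _ in range(30):
        t = np.arange(16000) / 16000.0
        f = base_freq * rng.uniform(0.9, 1.1)
        sig = np.sin(2 * np.pi * f * t) + 0.1 * rng.standard_normal(16000)
        X.append(extract_features(sig))
        y.append(label)
X, y = np.array(X), np.array(y)

# Fusion by soft voting: average class probabilities across classifiers
ensemble = VotingClassifier(
    [("lr", LogisticRegression(max_iter=1000)),
     ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
     ("svm", SVC(probability=True, random_state=0))],
    voting="soft",
)
ensemble.fit(X, y)
train_acc = ensemble.score(X, y)
```

A real system would replace the toy tones with IEMOCAP utterances, extend the feature vector to the 15 features listed in the abstract, and add the text-domain features used to resolve confusions between acoustically similar emotions.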
Keywords: Speech Emotion Recognition, Machine Learning, Multimodal, Ensemble Model
Short address: https://sciup.org/15020188
IDR: 15020188 | DOI: 10.5815/ijitcs.2026.01.05