A Novel Multimodal Sarcasm Detection Methodology with Emotion Recognition Using E-RS-GRU and KLKI-FUZZY Techniques
Authors: Ravi Teja Gedela, J.N.V.R. Swarup Kumar, Venkateswararao Kuna, Sasibhushana Rao Pappu
Journal: International Journal of Modern Education and Computer Science @ijmecs
Published in: issue 6, vol. 17, 2025.
Free access
Sarcasm is a subtle form of expression that is hard to detect, especially on modern communication platforms where messages extend beyond text to video, images, and audio. Traditional sarcasm detection methods rely solely on textual data and often fail to capture the nuanced emotional inconsistencies inherent in sarcastic remarks. To overcome these shortcomings, this paper introduces a novel multimodal framework that incorporates text, audio, and emoji data for more effective sarcasm detection and emotion classification. A key component of the framework is the Contextualized Semantic Self-Guided BERT (CS-SGBERT) model, which generates efficient word embeddings. First, frequency spectral analysis is performed on the audio data, followed by preprocessing and feature extraction, while the text data is preprocessed to extract lexicon and irony features. Meanwhile, emojis are analyzed for polarity scores, adding to a rich set of multimodal features. The fused features are then optimized using the Canberra-based Dingo Optimization Algorithm (C-DOA). The selected features and the word embeddings from the preprocessed text are fed to Entropy-based Robust Scaling - Gated Recurrent Units (E-RS-GRU) to detect sarcasm. Experimental results on the MUStARD dataset show that the proposed E-RS-GRU model achieves an accuracy of 76.65% and an F1-score of 76.9%, a relative improvement of 2.18% over the best-performing baseline and 1.25% over the best-performing state-of-the-art model. Additionally, a KLKI-Fuzzy model is proposed for emotion recognition. It dynamically adjusts membership functions through Kullback-Leibler Kriging Interpolation (KLKI) and processes features from all modalities to improve emotion classification. The KLKI-Fuzzy model exhibits enhanced emotion recognition performance with reduced fuzzification and defuzzification times.
Sarcasm Detection, Emotion Classification, Frequency Spectral Analysis, Feature Fusion, Feature Optimization
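The abstract names three standard building blocks without defining them: robust scaling (in E-RS-GRU), the Canberra distance (in C-DOA), and the Kullback-Leibler divergence (in KLKI). The sketch below shows only the textbook definitions of each, assuming plain Python lists as input. It is not the authors' implementation: the entropy-based variant of the scaling, the way the distance enters the Dingo optimizer, and the Kriging interpolation step are not specified in the abstract and are not reproduced here. The function names are hypothetical.

```python
import math
from statistics import median, quantiles


def robust_scale(values):
    """Textbook robust scaling: (x - median) / IQR.

    Assumption: this is the plain median/IQR form, not the
    entropy-based variant the paper proposes.
    """
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    med = median(values)
    if iqr == 0:
        # Degenerate spread: return zeros rather than divide by zero.
        return [0.0 for _ in values]
    return [(v - med) / iqr for v in values]


def canberra_distance(a, b):
    """Canberra distance: sum of |x - y| / (|x| + |y|), skipping 0/0 terms."""
    total = 0.0
    for x, y in zip(a, b):
        denom = abs(x) + abs(y)
        if denom > 0:
            total += abs(x - y) / denom
    return total


def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) for discrete distributions.

    Assumes p and q each sum to 1 and q is nonzero wherever p is nonzero.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
```

Robust scaling is less sensitive to outliers than mean/variance normalization, and the Canberra distance weights each coordinate by its own magnitude, which may explain their appeal for heterogeneous fused features; the abstract itself does not state the motivation.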
Short URL: https://sciup.org/15020062
IDR: 15020062 | DOI: 10.5815/ijmecs.2025.06.08