Classification of Multilingual Financial Tweets Using an Ensemble Approach Driven by Transformers

Author: Rupam Bhattacharyya

Journal: International Journal of Information Engineering and Electronic Business @ijieeb

Published in issue: No. 2, Vol. 17, 2025.

Free access

There is growing interest in multilingual tweet analysis using advanced deep learning techniques. Identifying the sentiments of Twitter (now known as X) users during an IPO (Initial Public Offering) is an important application area in the financial domain, yet few research works address it. In this paper, we introduce a multilingual dataset called the LIC IPO dataset. In addition to the dataset, this work offers a modified majority voting-based ensemble technique. This test-time ensembling technique is driven by fine-tuning state-of-the-art transformer-based pretrained language models used in multilingual natural language processing (NLP) research. We apply our technique to sentiment analysis over the LIC IPO dataset and report its performance alongside that of five transformer-based multilingual NLP models: a) Bernice, b) TwHIN-BERT, c) MuRIL, d) mBERT, and e) XLM-RoBERTa. We find that our test-time ensemble technique solves the multi-class sentiment classification problem defined over the proposed dataset better than the individual transformer models. Encouraging experimental outcomes confirm the efficacy of the proposed approach.
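
As a rough illustration of test-time majority voting over the five models' predictions, the following Python sketch may help. It is not the author's method: the abstract does not describe how the voting is modified, so this shows plain majority voting with a priority-based tie-break, which is an assumption. The function name `majority_vote`, the label strings, and the model priority order are all hypothetical.

```python
from collections import Counter


def majority_vote(predictions, priority):
    """Combine per-model labels for one tweet by plain majority voting.

    predictions: dict mapping model name -> predicted label.
    priority: list of model names, most trusted first, used only to
              break ties (an assumed rule, not taken from the paper).
    """
    counts = Counter(predictions.values())
    top = max(counts.values())
    tied = {label for label, n in counts.items() if n == top}
    if len(tied) == 1:
        return next(iter(tied))
    # Tie: take the label of the highest-priority model among the tied labels.
    for name in priority:
        if predictions.get(name) in tied:
            return predictions[name]
    raise ValueError("priority list does not cover any tied label")


# Hypothetical predictions for a single tweet; the label set is assumed.
preds = {
    "Bernice": "positive",
    "TwHIN-BERT": "positive",
    "MuRIL": "neutral",
    "mBERT": "negative",
    "XLM-RoBERTa": "neutral",
}
order = ["TwHIN-BERT", "Bernice", "XLM-RoBERTa", "MuRIL", "mBERT"]
label = majority_vote(preds, order)
```

Here two labels tie with two votes each, so the assumed tie-break defers to the first model in `order` whose label is among the tied ones. With five voters and three or more classes such ties are possible, which is one reason a plain majority vote usually needs some additional rule.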


IPO, BERT, Multilingual Tweet Processing, Sentiment Analysis, Financial Social Media, Natural Language Processing (NLP)

Short URL: https://sciup.org/15019735

IDR: 15019735   |   DOI: 10.5815/ijieeb.2025.02.02

Research article