Language model with uncertainty-based memory augmentation for the multi-hop question answering task
Authors: Sagirova A.R., Burtsev M.S.
Journal: Труды Московского физико-технического института (Proceedings of Moscow Institute of Physics and Technology) @trudy-mipt
Section: Informatics and Control
Article in issue: 3 (59), vol. 15, 2023.
Free access
Transformers have become the gold standard for many natural language processing tasks; however, models with self-attention mechanisms struggle to process long sequences due to their quadratic complexity, so handling long texts remains a challenge. To address this issue, we propose a two-stage method that first collects relevant information over the entire document and then combines it with the local context to solve the task. Our experimental results show that fine-tuning a pre-trained model with memory-augmented input, which includes the least uncertain global elements, improves the model's performance on the multi-hop question answering task compared to the baseline. We also find that the content of the global memory correlates with the supporting facts required for the correct answer.
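The abstract describes a two-stage pipeline: score every token of the full document by predictive uncertainty, keep the least uncertain tokens as global memory, and concatenate that memory with the local context before fine-tuning the reader model. Below is a minimal Python/PyTorch sketch of that idea; the entropy-based uncertainty score, the function names (token_entropy, select_memory, build_input), and the random stand-in logits are illustrative assumptions, not the authors' actual implementation.

```python
import torch

def token_entropy(logits: torch.Tensor) -> torch.Tensor:
    """Per-token predictive entropy of an LM; (seq_len, vocab) -> (seq_len,).
    Assumption: entropy is used as the token-level uncertainty measure."""
    log_probs = torch.log_softmax(logits, dim=-1)
    return -(log_probs.exp() * log_probs).sum(dim=-1)

def select_memory(token_ids: torch.Tensor, logits: torch.Tensor,
                  memory_size: int) -> torch.Tensor:
    """Stage 1: keep the `memory_size` least-uncertain tokens from the
    whole document as global memory, preserving document order."""
    entropy = token_entropy(logits)
    idx = torch.topk(-entropy, k=memory_size).indices.sort().values
    return token_ids[idx]

def build_input(memory_ids: torch.Tensor, local_ids: torch.Tensor,
                sep_id: int) -> torch.Tensor:
    """Stage 2: prepend the global memory to the local context, so the
    reader sees [global memory] SEP [local passage + question]."""
    sep = torch.tensor([sep_id], dtype=local_ids.dtype)
    return torch.cat([memory_ids, sep, local_ids])

if __name__ == "__main__":
    vocab, doc_len = 1000, 64
    doc_ids = torch.randint(0, vocab, (doc_len,))
    logits = torch.randn(doc_len, vocab)  # stand-in for real LM outputs
    memory = select_memory(doc_ids, logits, memory_size=8)
    model_input = build_input(memory, doc_ids[:16], sep_id=0)
    print(model_input.shape)  # 8 memory tokens + SEP + 16 local tokens
```

Sorting the selected indices keeps memory tokens in their original document order, which lets the downstream model read them as (fragmentary) text rather than an unordered bag of tokens; in practice the logits would come from a pre-trained language model's pass over the full document.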
Keywords: Transformer, global memory, multi-hop question answering
Short address: https://sciup.org/142239994
IDR: 142239994