A system for extracting symptom mentions from texts by means of neural networks

Authors: Serdyuk Yuri Petrovich, Vlasova Natalia Aleksandrovna, Momot Seda Rubenovna

Journal: Program Systems: Theory and Applications @programmnye-sistemy

Section: Medical informatics

Published in issue: 1 (56), vol. 14, 2023.

Free access

This paper presents a system for extracting symptom mentions from medical texts written in natural language (Russian). The system finds symptom mentions in a text, converts them to a standard form, and assigns each symptom found to a group of similar symptoms. A separate neural network handles each processing stage. We extract symptoms from three disease areas: allergic diseases, pulmonological diseases, and coronavirus infection (COVID-19). We present and describe an annotated corpus of sentences used to train the neural networks that extract symptom mentions. The sentences were annotated using a simple XML-like language. For the sentences fed directly to the neural network, we propose an extended BIO markup format. We evaluate the quality of symptom extraction under both strict and flexible testing. We describe possible approaches to normalizing and identifying symptom mentions, together with their implementation. We compare our results with those reported in similar studies and thereby show where our system stands among clinical decision support systems.
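To make the annotation pipeline mentioned in the abstract more concrete, the following is a minimal sketch of how a sentence marked up with XML-like tags could be turned into token-level BIO labels, and how strict and flexible span comparison might differ. It is not the authors' implementation. The tag name `<sym>`, the labels `B-SYM`/`I-SYM`, whitespace tokenization, and the reading of "flexible" as "any token overlap" are all assumptions made for illustration; the paper's extended BIO format is not reproduced here, only the standard BIO scheme.

```python
import re

# Hypothetical tag name; the paper's actual XML-like markup is not given in the abstract.
TAG_RE = re.compile(r"<sym>(.*?)</sym>")


def to_bio(marked: str) -> list[tuple[str, str]]:
    """Convert a sentence with <sym>...</sym> spans into (token, BIO tag) pairs."""
    pairs: list[tuple[str, str]] = []
    pos = 0
    for m in TAG_RE.finditer(marked):
        # Text before the tagged span lies outside any symptom mention.
        pairs += [(tok, "O") for tok in marked[pos:m.start()].split()]
        inside = m.group(1).split()
        pairs += [(tok, "B-SYM" if i == 0 else "I-SYM") for i, tok in enumerate(inside)]
        pos = m.end()
    pairs += [(tok, "O") for tok in marked[pos:].split()]
    return pairs


def spans(tags: list[str]) -> set[tuple[int, int]]:
    """Collect (start, end) token spans, end exclusive, from a BIO tag sequence."""
    found: set[tuple[int, int]] = set()
    start = None
    for i, tag in enumerate(tags + ["O"]):  # sentinel closes a trailing span
        if start is not None and not tag.startswith("I-"):
            found.add((start, i))
            start = None
        if tag.startswith("B-"):
            start = i
    return found


def strict_match(gold: tuple[int, int], pred: tuple[int, int]) -> bool:
    """Strict testing (assumed): boundaries must coincide exactly."""
    return gold == pred


def flexible_match(gold: tuple[int, int], pred: tuple[int, int]) -> bool:
    """Flexible testing (assumed): spans only need to share at least one token."""
    return gold[0] < pred[1] and pred[0] < gold[1]
```

For example, calling `to_bio("Patient reports <sym>dry cough</sym> and fever.")` labels `dry` as `B-SYM` and `cough` as `I-SYM`, with every other token labeled `O`. A predicted span covering only `cough` would fail `strict_match` against the gold span but pass `flexible_match`.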


Natural language processing, neural networks, information extraction, symptom mentions, annotated corpus, BERT models, COVID-19

Short URL: https://sciup.org/143180115

IDR: 143180115   |   DOI: 10.25209/2079-3316-2023-14-1-95-123

Research article