Use of recurrent neural networks for analysis of unprocessed multilingual text

Free access

This article discusses general concepts in the analysis of raw multilingual text. A neural network based on long short-term memory (LSTM) was designed for sequence tagging, with the additional ability to generate sequences at the character level. The network was trained to produce lemmas, part-of-speech tags, and morphological features. Sentence segmentation, tokenization, and dependency parsing were handled by UDPipe 1.2. The results show that the proposed architecture is relevant to current practice.
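To make the idea of LSTM-based tagging with a softmax output more concrete, the following is a minimal sketch in plain Python. It is not the authors' architecture: the class name `TinyLSTMTagger`, the hidden size, the one-hot character encoding, and the three-tag set are all assumptions chosen for illustration, and the weights are random and untrained. The sketch only shows how a single LSTM layer can read a token character by character and turn its final hidden state into a probability distribution over part-of-speech tags.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def softmax(scores):
    # Subtract the maximum for numerical stability before exponentiating.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(W, v):
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class TinyLSTMTagger:
    """Hypothetical character-level LSTM classifier (illustration only)."""

    def __init__(self, vocab, tags, hidden=8, seed=0):
        rng = random.Random(seed)
        self.vocab = {ch: i for i, ch in enumerate(vocab)}
        self.tags = list(tags)
        self.hidden = hidden
        in_dim = len(self.vocab) + hidden  # [one-hot input ; previous hidden state]
        # One weight matrix and bias per gate: input, forget, output, candidate.
        self.W = {name: rand_matrix(hidden, in_dim, rng) for name in "ifoc"}
        self.b = {name: [0.0] * hidden for name in "ifoc"}
        self.W_out = rand_matrix(len(self.tags), hidden, rng)

    def one_hot(self, ch):
        v = [0.0] * len(self.vocab)
        idx = self.vocab.get(ch)
        if idx is not None:  # unknown characters map to the zero vector
            v[idx] = 1.0
        return v

    def step(self, x, h, c):
        z = x + h  # list concatenation
        pre = {
            name: [a + b for a, b in zip(matvec(self.W[name], z), self.b[name])]
            for name in "ifoc"
        }
        i = [sigmoid(v) for v in pre["i"]]
        f = [sigmoid(v) for v in pre["f"]]
        o = [sigmoid(v) for v in pre["o"]]
        cand = [math.tanh(v) for v in pre["c"]]
        c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, cand)]
        h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
        return h_new, c_new

    def tag_probs(self, word):
        h = [0.0] * self.hidden
        c = [0.0] * self.hidden
        for ch in word:
            h, c = self.step(self.one_hot(ch), h, c)
        # Softmax over tag scores computed from the final hidden state.
        return softmax(matvec(self.W_out, h))


tagger = TinyLSTMTagger(vocab="abcdefghijklmnopqrstuvwxyz", tags=["NOUN", "VERB", "ADJ"])
probs = tagger.tag_probs("running")
best = tagger.tags[probs.index(max(probs))]
print(dict(zip(tagger.tags, probs)), best)
```

Because the weights are untrained, the printed distribution is close to uniform and the predicted tag is meaningless. In a trained system the same forward pass would be paired with a loss such as cross-entropy, and lemma generation would require a separate character-level decoder, which is omitted here.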

LSTM, softmax, UDPipe, neural network, machine learning, recurrent neural networks, lemmatization

Short URL: https://sciup.org/170187854

IDR: 170187854   |   DOI: 10.24411/2500-1000-2020-10697

Research article