Lexicographic problems of machine translation systems: on the way from literal to neural
Authors: Beliaeva L., Kamshilova O.
Journal: Вестник Волгоградского государственного университета. Серия 2: Языкознание (Science Journal of Volgograd State University. Linguistics) @jvolsu-linguistics
Published in: Issue 5, Vol. 23, 2024.
Free access
The article discusses current issues in how modern machine translation (MT) systems interpret out-of-vocabulary words, in the context of changing forms and methods of maintaining an automatic dictionary. It provides a critical outline of the typology of MT systems and of the strategies for their development, describes the impact of rapidly developing software and technologies on these strategies, and analyzes the changes they bring to the forms of dictionary support. The research shows that linguistic support and the structure of automatic dictionaries are fundamentally important for ensuring translation quality, whatever the type of MT system. Despite all the success of neural MT (NMT) systems, their automatically updated vocabulary databases do not record words that are terminologically specific and occur with low frequency in the specialized texts and text corpora on which the system is trained. An analysis of translations produced by two popular NMT systems, Google Translate and Yandex Translate, shows that they fail to process and unify the translation of words that are not entered in the system dictionaries, a task that users of all types of MT systems used to solve easily with the help of automatic dictionaries. With statistics-based automatic dictionaries, this remains a pressing problem and requires a special approach when editing MT results.
Keywords: machine translation, machine translation strategy, typology of machine translation systems, automatic dictionary, out-of-vocabulary words, linguistic support
Short URL: https://sciup.org/149147497
IDR: 149147497 | DOI: 10.15688/jvolsu2.2024.5.1