Electronic corpus of the Tatar language based on the model of linguistic knowledge graphs
Authors: Gatiatullin A.R., Mukhamedshin D.R., Prokopyev N.A., Suleymanov D.S.
Journal: Ontology of Designing @ontology-of-designing
Section: Applied ontologies of designing
Article in issue: No. 4 (54), vol. 14, 2024.
Free access
The article presents a new version of the electronic corpus of the Tatar language, rebuilt on a linguistic knowledge graph model for Turkic languages. Because linguistic data are represented as knowledge graphs, the new version can describe information at several linguistic levels: morphonological, syntactic, and semantic. This extends the functionality of the corpus and enables searches that take syntactic and semantic information into account. A distinctive feature of the implementation is that the underlying model closely follows the structural and functional characteristics of Turkic languages and serves as a foundation for developing software products for semantic text processing in these languages. These products include the linguistic portal "Turkic Morpheme" and the new version of the Tatar language electronic corpus, "Tugan Tel".
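The idea of searching across linguistic levels in a graph-based corpus can be illustrated with a minimal sketch. This is not the authors' model or the "Tugan Tel" implementation: the class `LinguisticGraph`, the relation names (`form`, `lemma`, `pos`, `synt_role`, `sem_class`), the tag values, and the single annotated sentence are all hypothetical, chosen only to show how annotations on several levels can be stored as graph edges and combined in one query.

```python
from collections import defaultdict


class LinguisticGraph:
    """Toy triple store: (subject, relation, object) edges between linguistic units."""

    def __init__(self):
        # Maps (subject, relation) to the set of objects linked by that relation.
        self._out = defaultdict(set)

    def add(self, subj, rel, obj):
        self._out[(subj, rel)].add(obj)

    def objects(self, subj, rel):
        """All objects reachable from subj via rel."""
        return set(self._out.get((subj, rel), set()))

    def subjects(self, rel, obj):
        """All subjects that have an edge rel pointing to obj."""
        return {s for (s, r), objs in self._out.items() if r == rel and obj in objs}


def build_example():
    """Annotate one illustrative sentence; all tags are assumptions, not corpus data."""
    g = LinguisticGraph()
    # (token id, word form, lemma, part of speech, syntactic role, semantic class)
    tokens = [
        ("t1", "бала", "бала", "NOUN", "subject", "PERSON"),
        ("t2", "китап", "китап", "NOUN", "object", "ARTIFACT"),
        ("t3", "укый", "уку", "VERB", "predicate", "ACTION"),
    ]
    for tid, form, lemma, pos, role, sem in tokens:
        g.add(tid, "form", form)        # surface form
        g.add(tid, "lemma", lemma)      # morphological level
        g.add(tid, "pos", pos)          # morphological level
        g.add(tid, "synt_role", role)   # syntactic level
        g.add(tid, "sem_class", sem)    # semantic level
    return g


def search(g, **constraints):
    """Return sorted token ids whose annotations satisfy every constraint."""
    result = None
    for rel, value in constraints.items():
        matches = g.subjects(rel, value)
        result = matches if result is None else result & matches
    return sorted(result or [])


if __name__ == "__main__":
    g = build_example()
    # Combine a morphological and a syntactic constraint in one query.
    for tid in search(g, pos="NOUN", synt_role="object"):
        print(tid, g.objects(tid, "form"))
```

Each constraint is resolved to a set of token nodes and the sets are intersected, so adding a further level of annotation only requires adding another relation name rather than changing the query logic.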
Electronic corpus, knowledge graph, database management system, linguistic unit, Turkic languages
Short URL: https://sciup.org/170207431
IDR: 170207431 | DOI: 10.18287/2223-9537-2024-14-4-542-554