Using ontologies to contextualize queries to large language models

Free access

Large language models are now widely used in question-answering and dialog systems. To serve this role, a model is pre-trained on prepared text data, which enables it to generate responses that are highly likely to be correct in a dialog with the user. However, answer quality decreases when questions pertain to objects, processes, or phenomena that are poorly covered in the texts used to train the model. In such cases, data that is new to the model is passed to it together with the user query in the form of context, which is usually generated from a vector database of text fragments. The article proposes using a subject-domain ontology as the source of contextual data instead of a vector database. The ontology is supplemented with a lexical representation of its formalized terminology system, which makes it possible to identify the ontology fragment relevant to the user query and to convert it into the natural-language text of the context. This makes it possible to reduce the volume of the response text while improving its semantic alignment with the user query. The article discusses the minimal structural requirements for the lexical representation of an ontology, including natural-language names and their forms for concepts and relations, as well as their lexical meanings. The application of the proposed approach is illustrated with an example of obtaining an answer to a question about scientific articles using a large language model. The advantages and disadvantages of the proposed approach are discussed.
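To make the described pipeline concrete, below is a minimal illustrative sketch (not the authors' implementation) of how an ontology fragment, selected through a lexical representation of its concepts, could be verbalized and passed to a language model as context instead of retrieved text fragments. All data structures, concept names, and function names here are assumptions introduced for illustration only.

```python
# Illustrative sketch only: ontology-based context construction for an LLM query.
# The ontology content, lexicon, and prompt format are hypothetical assumptions.

from dataclasses import dataclass

@dataclass(frozen=True)
class Relation:
    subject: str    # concept name
    predicate: str  # relation name
    obj: str        # concept name or literal

# Toy ontology fragment: relations between concepts of a subject domain.
ONTOLOGY = [
    Relation("Scientific article", "has part", "Abstract"),
    Relation("Scientific article", "has part", "Reference list"),
    Relation("Abstract", "describes", "Research result"),
]

# Lexical representation: natural-language names and word forms per concept.
LEXICON = {
    "Scientific article": {"scientific article", "article", "paper", "papers"},
    "Abstract": {"abstract", "summary"},
    "Reference list": {"reference list", "references", "bibliography"},
    "Research result": {"research result", "results", "findings"},
}

def relevant_fragment(query: str) -> list[Relation]:
    """Select relations whose subject or object is mentioned in the query."""
    q = query.lower()
    mentioned = {c for c, forms in LEXICON.items() if any(f in q for f in forms)}
    return [r for r in ONTOLOGY if r.subject in mentioned or r.obj in mentioned]

def verbalize(fragment: list[Relation]) -> str:
    """Convert the selected ontology fragment into natural-language sentences."""
    return " ".join(f"{r.subject} {r.predicate} {r.obj}." for r in fragment)

def build_prompt(query: str) -> str:
    """Combine the verbalized context with the user query for an LLM."""
    context = verbalize(relevant_fragment(query))
    return f"Context: {context}\n\nQuestion: {query}\nAnswer:"

if __name__ == "__main__":
    # The resulting prompt would be sent to any LLM completion API.
    print(build_prompt("What parts does a scientific article have?"))
```

In this sketch the lexicon plays the role of the lexical representation described in the abstract: it maps surface word forms in the user query to ontology concepts, so only the relevant fragment is verbalized, keeping the context short and semantically aligned with the question.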


Ontology, large language model, query, context, response generation

Short address: https://sciup.org/170209597

IDR: 170209597   |   DOI: 10.18287/2223-9537-2025-15-2-239-248

Research article