Research on the potential of generative artificial intelligence for providing expert-level evaluative feedback in the assessment of open-ended mathematical problems

Authors: Lukoyanova M.A., Danilov A.V., Zaripova R.R., Salekhova L.L., Batrova N.I.

Journal: Science for Education Today @sciforedu

Section: Mathematics and Economics for Education

Issue: No. 6, Vol. 15, 2025.

Open access

Introduction. Modern education faces a contradiction between the active integration of generative artificial intelligence and its underexplored potential for providing evaluative feedback in developing students' mathematical literacy. The purpose of the article is to identify the potential of using a generative language model as a teacher's tool for generating expert-level evaluative feedback when assessing open-ended mathematical problems.

Materials and Methods. The research is based on systemic-activity, criteria-oriented, and comparative approaches. The methods employed included theoretical analysis of scholarly literature, criteria-based assessment combined with prompt engineering techniques, and quantitative and qualitative analysis to determine the agreement between the evaluative feedback generated by the language model and that provided by a human expert. The sample consisted of 51 students.

Results. The research experimentally confirmed the feasibility of using generative artificial intelligence for providing evaluative feedback in mathematics education. An effective strategy for automating the assessment of open-ended mathematical problems was developed and substantiated, based on criteria-based assessment and prompt engineering techniques using the GigaChat Pro language model. Empirical data revealed a moderate agreement between the evaluative feedback generated by GigaChat Pro and that provided by an expert teacher: accuracy reached 73%, Cohen's kappa coefficient (κ) was 0.57, and the semantic similarity of textual comments (BERTScore F1) was 0.614.

Conclusions. The research concludes that generative language models hold significant potential for transforming the assessment practice of open-ended mathematical problems. Key applications include automating and personalizing expert-level evaluative feedback and scaling criteria-based assessment. Feedback quality is enhanced by optimizing assessment prompts, implementing multi-agent verification, and introducing selective assessment.
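The agreement metrics reported above (accuracy and Cohen's kappa between model-generated and expert scores) can be illustrated with a minimal sketch. The score values below are invented for illustration only and are not the study's data; the functions implement the standard definitions of observed agreement and chance-corrected agreement.

```python
# Sketch of inter-rater agreement metrics: accuracy (observed agreement)
# and Cohen's kappa. Illustrative only; not the study's dataset.
from collections import Counter

def accuracy(a, b):
    # Fraction of items on which the two raters assign the same score.
    return sum(x == y for x, y in zip(a, b)) / len(a)

def cohens_kappa(a, b):
    # Kappa = (p_o - p_e) / (1 - p_e), where p_o is observed agreement
    # and p_e is the agreement expected by chance from marginal frequencies.
    n = len(a)
    p_o = accuracy(a, b)
    ca, cb = Counter(a), Counter(b)
    p_e = sum(ca[c] * cb[c] for c in set(a) | set(b)) / (n * n)
    return (p_o - p_e) / (1 - p_e)

# Hypothetical rubric scores (0-2 points) for ten student solutions.
model_scores  = [2, 1, 0, 2, 1, 1, 0, 2, 2, 1]
expert_scores = [2, 1, 1, 2, 1, 0, 0, 2, 1, 1]

print(accuracy(model_scores, expert_scores))      # observed agreement
print(cohens_kappa(model_scores, expert_scores))  # chance-corrected agreement
```

A kappa around 0.4-0.6, as reported in the article (0.57), is conventionally interpreted as moderate agreement; the semantic similarity of free-text comments requires a separate embedding-based metric such as BERTScore.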


Keywords: Evaluative feedback, generative language model, criteria-based assessment, prompt engineering techniques, open-ended problems, mathematical literacy

Short URL: https://sciup.org/147252839

IDR: 147252839   |   UDC: 004.8+51-77+37.031   |   DOI: 10.15293/2658-6762.2506.07