Effectiveness of prompt engineering strategies in generating mathematics educational content: An experimental study

Authors: Danilov A.V., Zaripova R.R., Lukoyanova M.A., Batrova N.I., Salekhova L.L.

Journal: Science for Education Today @sciforedu

Section: Mathematics and Economics for Education

Published in: Vol. 15, No. 4, 2025.

Free access

Introduction. The article presents the results of a study on generating high-quality educational content in mathematical literacy for 5th-grade students using generative AI. The problem stems from the lack of adaptive assignments that meet educational standards and the limitations of AI (hallucinations, non-reproducibility). The aim of the study is to develop, test, and evaluate the effectiveness of an original prompt-engineering strategy for generating pedagogically relevant and age-appropriate math problems.

Materials and Methods. The study employs systemic and activity-based approaches. Methods include analysis of AI applications in education, experimental task generation using a hybrid prompt-engineering strategy (Few-Shot Learning + Chain-of-Thought + Role Prompting) based on ChatGPT-4o, expert evaluation (10 mathematics teachers with ≥12 years of experience), and statistical data processing (Cohen’s κ, mean values µ). Verification involved generating tasks in a new context (airports) and assessing them based on adequacy, student-appropriateness, and complexity criteria.

Results. Key findings demonstrate the successful implementation of the strategy, enabling the generation of structurally consistent tasks (κ = 0.82). The critical role of Chain-of-Thought prompting in creating multi-step problems is emphasized. The authors highlight the dual functionality of tasks (learning and assessment). The experiment confirmed high expert ratings for adequacy (µ = 4.81), format compliance (µ = 4.77), and descriptive completeness (µ = 4.82). A limitation in terminology complexity for some tasks was identified.

Conclusions. The study concludes that the combined prompt-engineering strategy is highly effective for generating standards-aligned tasks and has strong potential for integration into digital learning platforms. Further optimization of linguistic adaptation and the development of a validation pipeline are required for implementation.
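The hybrid strategy named in Materials and Methods (Role Prompting + Few-Shot Learning + Chain-of-Thought) can be illustrated with a minimal Python sketch. The abstract does not give the authors' actual prompts, so everything here is an assumption made for illustration: the `Example` class, the `build_prompt` function, the role wording, and the sample task are all hypothetical. Only the new context (airports) and the target audience (5th-grade students) come from the abstract.

```python
from dataclasses import dataclass


@dataclass
class Example:
    """One worked sample task for few-shot prompting (hypothetical content)."""
    context: str
    task: str
    solution_steps: list[str]


def build_prompt(role: str, examples: list[Example], new_context: str) -> str:
    """Assemble a plain-text prompt from the three components of the strategy."""
    # Role prompting: state who the model should act as.
    parts = [f"Role: {role}"]

    # Few-shot learning: show worked examples in a fixed format.
    for i, ex in enumerate(examples, start=1):
        steps = "\n".join(
            f"  {n}. {s}" for n, s in enumerate(ex.solution_steps, start=1)
        )
        parts.append(
            f"Example {i} (context: {ex.context})\n"
            f"Task: {ex.task}\n"
            f"Solution steps:\n{steps}"
        )

    # Chain-of-thought: ask for explicit intermediate steps in the new task.
    parts.append(
        f"Now write one new task in the context: {new_context}.\n"
        "Think step by step: list the solution steps before the final answer."
    )
    return "\n\n".join(parts)


# Hypothetical sample task; the arithmetic is 2 * 120 + 3 * 60 = 420.
examples = [
    Example(
        context="railway station",
        task=(
            "An adult ticket costs 120 rubles and a child ticket costs half "
            "as much. How much do 2 adults and 3 children pay in total?"
        ),
        solution_steps=[
            "Child ticket: 120 / 2 = 60",
            "Adults: 2 * 120 = 240",
            "Children: 3 * 60 = 180",
            "Total: 240 + 180 = 420",
        ],
    )
]

prompt = build_prompt(
    role="an experienced mathematics teacher writing tasks for 5th-grade students",
    examples=examples,
    new_context="airports",
)
print(prompt)
```

The resulting string would be sent to a chat model as a single message. Keeping the three components as separate parts makes it straightforward to drop one of them and compare the generated tasks, for instance to see what the chain-of-thought instruction contributes to multi-step problems.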

Keywords: Prompt engineering, Educational task generation, Mathematical literacy, Generative AI, Chain-of-Thought, Role prompting

Short URL: https://sciup.org/147251601

IDR: 147251601   |   DOI: 10.15293/2658-6762.2504.05

Research article