AI vs. Human Writing: Developing a Novel Method for Text Authenticity Detection in Education

Authors: Vijay H. Kalmani, Amol C. Adamuthe, Arati Premnath Gondil, Vaishnavi Prashant Patil, Riya Amar Kore, Vaishnavi Mahadev Metkari

Journal: International Journal of Modern Education and Computer Science @ijmecs

Published in issue: 3 vol.17, 2025.

Free access

Rapid progress in generative artificial intelligence (AI) has made it increasingly difficult to distinguish AI-written text from human-written text. This study introduces the Naturalness Score, a composite measure that combines lexical diversity, syntactic complexity, sentiment variability, and grammatical errors. The Naturalness Score is part of a larger machine learning framework and also drives a dedicated classifier, the Naturalness-Based Logistic Regression Classifier (NLRC). The NLRC model was evaluated on a large, diverse corpus of nearly 45,000 text samples, most of which were student essays, articles, and web-scraped content. The proposed model outperformed all existing baseline models, with an accuracy of 96.41%, precision of 0.98, recall of 0.95, and F1 score of 0.96. High areas under the receiver operating characteristic curve (AUC = 1.00) and the precision-recall curve (AUC-PR) further indicate that the model separates AI-generated from human-written text effectively. The proposed approach offers several advantages, including increased detection accuracy, resilience against AI-generated content, cross-domain applicability, and interpretability. The research has implications for deploying such models in schools, and it calls for future work on the rapidly changing landscape of AI-generated content. These findings matter for developing robust, adaptive detection systems that protect the integrity of academic assessments and help prevent the misuse of AI tools.
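
The abstract does not give the formula behind the Naturalness Score or the fitted NLRC coefficients, so the following is only a minimal sketch of the general idea: compute a few stylistic features from a text and pass them through a logistic function. Everything in it is an assumption made for illustration. The function names, the feature definitions (type-token ratio for lexical diversity, sentence length and its spread as a crude stand-in for syntactic complexity), and the constants `WEIGHTS` and `BIAS` are hypothetical and are not taken from the paper. Sentiment variability and grammatical errors are left out because they would need external tools.

```python
import math
import re


def lexical_diversity(text):
    """Type-token ratio: unique words divided by total words."""
    words = re.findall(r"[a-z']+", text.lower())
    return len(set(words)) / len(words) if words else 0.0


def sentence_lengths(text):
    """Word counts of each non-empty sentence, split on ., ! and ?."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return [len(s.split()) for s in sentences]


def mean_sentence_length(text):
    """Crude proxy for syntactic complexity (an assumption, not the paper's metric)."""
    lengths = sentence_lengths(text)
    return sum(lengths) / len(lengths) if lengths else 0.0


def sentence_length_spread(text):
    """Population standard deviation of sentence lengths."""
    lengths = sentence_lengths(text)
    if not lengths:
        return 0.0
    mean = sum(lengths) / len(lengths)
    return math.sqrt(sum((n - mean) ** 2 for n in lengths) / len(lengths))


def features(text):
    """Feature vector fed to the logistic model."""
    return [
        lexical_diversity(text),
        mean_sentence_length(text),
        sentence_length_spread(text),
    ]


# Hypothetical coefficients chosen for illustration only; a real model
# would learn these from labelled human and AI-written samples.
WEIGHTS = [2.0, 0.05, 0.1]
BIAS = -2.0


def human_probability(text):
    """Logistic regression output: estimated probability the text is human-written."""
    z = BIAS + sum(w * x for w, x in zip(WEIGHTS, features(text)))
    return 1.0 / (1.0 + math.exp(-z))


if __name__ == "__main__":
    sample = "I wrote this quickly. It rambles a bit, honestly, but so do I!"
    print(features(sample))
    print(human_probability(sample))
```

One reason a logistic model suits this kind of task is interpretability, which the abstract lists as an advantage: each coefficient shows how strongly, and in which direction, a feature shifts the prediction.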

AI Detection, Text Classification, Machine Learning, Naturalness Score, Logistic Regression, Academic Integrity, Large Language Models

Short URL: https://sciup.org/15019764

IDR: 15019764   |   DOI: 10.5815/ijmecs.2025.03.04

Research article