Solving binary classification problem using machine learning methods

Authors: Danshina A.A., Babenko A.A.

Journal: НБИ технологии @nbi-technologies

Section: Information technologies in security and telecommunications

Published in: No. 3, Vol. 18, 2024.

Free access

This study discusses machine learning models for solving binary classification problems. It presents an algorithm for preparing a dataset for training and testing, together with a comparative analysis of the proposed models, which was used to determine the most suitable model for the stated goal. A structured dataset of 86 records described by seven features was created, with categorical variables such as location, merchant, and gender transformed using one-hot encoding. Correlation analysis was performed to assess the relationships between the numerical features. The dataset was then divided into training (70%) and test (30%) subsets for model evaluation. The machine learning models were compared using the F1 score. All of the models considered solve the binary classification problem. However, the AdaBoost model, used with a single-level decision tree (a decision stump) as the weak learner, proved to be the most suitable in practice. Its high effectiveness is achieved by boosting weak classifiers, which also compensates for the model's tendency to overfit.
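The pipeline outlined in the abstract (one-hot encoding, a 70/30 split, AdaBoost over decision stumps, F1 evaluation) can be illustrated with a minimal sketch. This is not the authors' implementation. The data are synthetic, and the feature names `amount`, `location`, and `gender`, the labeling rule, and the number of boosting rounds are all assumptions made for illustration. Only the record count (86) and the split ratio come from the abstract. The sketch uses the Python standard library only, so the boosting step is visible rather than hidden behind a library call.

```python
import math
import random

def one_hot(values):
    """Map a categorical column to 0/1 indicator columns, one per category."""
    cats = sorted(set(values))
    return [[1.0 if v == c else 0.0 for c in cats] for v in values]

def f1_score(y_true, y_pred):
    """F1 = 2*TP / (2*TP + FP + FN), with +1 as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == -1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == -1)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def stump_predict(stump, row):
    """A decision stump: one feature, one threshold, one orientation."""
    feat, thr, sign = stump
    return sign if row[feat] > thr else -sign

def fit_stump(X, y, w):
    """Exhaustively pick the stump with the lowest weighted error."""
    best, best_err = None, float("inf")
    for feat in range(len(X[0])):
        for thr in sorted(set(row[feat] for row in X)):
            for sign in (1, -1):
                stump = (feat, thr, sign)
                err = sum(wi for row, yi, wi in zip(X, y, w)
                          if stump_predict(stump, row) != yi)
                if err < best_err:
                    best, best_err = stump, err
    return best, best_err

def fit_adaboost(X, y, rounds=20):
    """Discrete AdaBoost with labels in {-1, +1}; rounds=20 is an assumption."""
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        stump, err = fit_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)  # avoid log(0) and division by 0
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, stump))
        # Increase the weight of misclassified rows, decrease the rest.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for row, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def predict(ensemble, row):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(alpha * stump_predict(s, row) for alpha, s in ensemble)
    return 1 if score >= 0 else -1

# Synthetic stand-in for the 86-record dataset (hypothetical features and rule).
random.seed(0)
rows = []
for _ in range(86):
    amount = random.uniform(1, 500)
    location = random.choice(["A", "B", "C"])
    gender = random.choice(["F", "M"])
    label = 1 if amount > 300 or (location == "C" and amount > 150) else -1
    rows.append((amount, location, gender, label))

loc = one_hot([r[1] for r in rows])
gen = one_hot([r[2] for r in rows])
X = [[r[0]] + l + g for r, l, g in zip(rows, loc, gen)]
y = [r[3] for r in rows]

# 70/30 split; the rows are already in random order, so no shuffle is needed.
split = int(0.7 * len(X))
X_train, y_train = X[:split], y[:split]
X_test, y_test = X[split:], y[split:]

model = fit_adaboost(X_train, y_train)
y_pred = [predict(model, row) for row in X_test]
test_f1 = f1_score(y_test, y_pred)
print(f"F1 on the held-out 30%: {test_f1:.3f}")
```

With real data, a library implementation such as scikit-learn's `AdaBoostClassifier` would normally replace the hand-written functions. The hand-written version is used here only to make the reweighting of misclassified records explicit.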


Binary classification, machine learning models, data set, data distribution, correlation

Short URL: https://sciup.org/149147330

IDR: 149147330 | UDC: 519.7 | DOI: 10.15688/NBIT.jvolsu.2024.3.4