Models and algorithms for automatic grouping of objects based on the k-means model

Free access

This paper studies and develops new algorithms for the automatic grouping of objects. The algorithms improve the accuracy and stability of results in practical problems such as identifying homogeneous batches of industrial products. The paper examines the k-means algorithm with the Euclidean, Manhattan, and Mahalanobis distance measures for the automatic grouping of objects with a large number of parameters. A new model for the automatic grouping of industrial products is presented, based on the k-means model with the Mahalanobis distance measure. The model includes a training procedure that computes an averaged estimate of the covariance matrix from a training sample (a sample with pre-labeled data). A new algorithm is proposed for automatic grouping based on the k-means optimization model with the Mahalanobis distance measure and a weighted average covariance matrix calculated from the training sample. The algorithm reduces the proportion of errors (increases the Rand index) when identifying homogeneous production batches from test results. A new approach to designing genetic algorithms for the k-means problem is also presented, in which a single greedy agglomerative heuristic procedure serves as both the crossover operator and the mutation operator. A computational experiment shows that the new mutation procedure is fast and efficient compared with the original mutation of the genetic algorithm, and that the objective function converges quickly. Compared with known automatic grouping algorithms run for the same fixed time, the algorithm yields a statistically significant improvement both in accuracy (the achieved value of the objective function within the chosen mathematical model) and in stability.
The results show that including the new mutation operator in the genetic algorithm significantly improves on the results of the simplest genetic algorithm for the k-means problem.
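
The abstract gives no formulas, so the following is only a minimal sketch of the general idea: k-means in which distances are Mahalanobis distances under a single covariance matrix estimated from a labeled training sample, with the Rand index as the quality measure. The function names (`pooled_covariance`, `mahalanobis_kmeans`, `rand_index`) are hypothetical, weighting each class covariance by class size is an assumption about what "weighted average" means here, and a plain Lloyd-style iteration stands in for the genetic algorithm described in the paper.

```python
import numpy as np


def pooled_covariance(X_train, y_train):
    """Average the per-class covariance matrices, weighted by class size (assumed weighting)."""
    X_train = np.asarray(X_train, dtype=float)
    y_train = np.asarray(y_train)
    d = X_train.shape[1]
    cov = np.zeros((d, d))
    for label in np.unique(y_train):
        members = X_train[y_train == label]
        # bias=True divides by n, i.e. the maximum-likelihood estimate.
        cov += len(members) * np.cov(members, rowvar=False, bias=True).reshape(d, d)
    return cov / len(X_train)


def mahalanobis_kmeans(X, k, cov, n_iter=100, seed=0):
    """Lloyd-style k-means where all distances use one fixed covariance matrix."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    inv_cov = np.linalg.pinv(cov)  # pseudo-inverse tolerates a singular estimate
    centers = X[rng.choice(len(X), size=k, replace=False)]
    labels = np.full(len(X), -1)
    for _ in range(n_iter):
        diff = X[:, None, :] - centers[None, :, :]  # shape (n, k, d)
        dist2 = np.einsum("nkd,de,nke->nk", diff, inv_cov, diff)
        new_labels = dist2.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # assignments stopped changing
        labels = new_labels
        for j in range(k):
            if np.any(labels == j):  # keep the old center if a cluster empties
                centers[j] = X[labels == j].mean(axis=0)
    return labels, centers


def rand_index(a, b):
    """Fraction of object pairs on which two labelings agree."""
    a, b = np.asarray(a), np.asarray(b)
    same_a = a[:, None] == a[None, :]
    same_b = b[:, None] == b[None, :]
    iu = np.triu_indices(len(a), k=1)  # each unordered pair once
    return float(np.mean(same_a[iu] == same_b[iu]))
```

In use, one would estimate `cov` once from the labeled training batches, pass it to `mahalanobis_kmeans` for the unlabeled sample, and compare the resulting labels against known batch membership with `rand_index`. Because the covariance matrix is fixed, the cluster mean remains the minimizer of the within-cluster sum of squared distances, so the usual center update still applies.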

Automatic grouping, k-means, Mahalanobis distance, genetic algorithm

Short URL: https://sciup.org/148321983

IDR: 148321983   |   DOI: 10.31772/2587-6066-2020-21-3-347-354

Research article