Differential evolution in the decision tree learning algorithm
Authors: Mitrofanov S.A., Semenkin E.S.
Journal: Siberian Aerospace Journal @vestnik-sibsau
Section: Informatics, computer technology and management
Article in issue: Vol. 20, No. 3, 2019.
Free access
Decision trees (DT) are among the most effective classification methods. Their main advantage is that the results are simple and easy to interpret. Despite these well-known advantages, the method also has drawbacks; one of them is that training a DT on high-dimensional data is very time-consuming. The paper considers a way to shorten DT training without loss of classification accuracy. Several DT training algorithms exist, the main ones being ID3 and CART. The paper proposes a modification of DT learning algorithms in which the information criterion is optimized for a single selected attribute. This modification avoids optimization by exhaustive search over the entire data set. The attribute is selected with the Separation Measure method, which chooses the attribute whose class-wise averages are most distant from each other. Optimization for the selected attribute is carried out by differential evolution, an evolutionary modeling method designed for multidimensional optimization problems. Differential evolution was self-configured at the population level, based on the probabilities of applying each variant of the mutation operator. Classification problems were solved to compare the standard DT learning algorithms with the modified ones. Algorithm efficiency is defined as the percentage of correctly classified objects in the test sample. A statistical analysis based on Student's t-test was carried out to compare the efficiency of the algorithms. The analysis showed that the proposed modification of the DT learning algorithm makes it possible to speed up training significantly without loss of classification effectiveness.
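The split-selection idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions made only for illustration: the separation score is taken as the largest gap between class-wise means divided by the attribute's range, the information criterion is entropy-based information gain, and differential evolution uses a single fixed DE/rand/1 mutation with a constant factor `F` rather than the self-configuring scheme from the paper. All function names and the toy data are hypothetical.

```python
import math
import random


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return -sum(c / n * math.log2(c / n) for c in counts.values())


def information_gain(values, labels, threshold):
    """Entropy reduction obtained by splitting one attribute at `threshold`."""
    left = [y for v, y in zip(values, labels) if v <= threshold]
    right = [y for v, y in zip(values, labels) if v > threshold]
    n = len(labels)
    weighted = len(left) / n * entropy(left) + len(right) / n * entropy(right)
    return entropy(labels) - weighted


def select_attribute(X, y):
    """Return the index of the attribute whose class-wise means are farthest apart.

    Assumption: the score is (max class mean - min class mean) divided by the
    attribute's range, so that differently scaled attributes are comparable.
    """
    classes = sorted(set(y))
    best_j, best_score = 0, -1.0
    for j in range(len(X[0])):
        col = [row[j] for row in X]
        span = (max(col) - min(col)) or 1.0
        means = []
        for c in classes:
            vals = [v for v, label in zip(col, y) if label == c]
            means.append(sum(vals) / len(vals))
        score = (max(means) - min(means)) / span
        if score > best_score:
            best_j, best_score = j, score
    return best_j


def de_threshold(values, labels, pop_size=10, generations=30, F=0.5, seed=0):
    """Maximize information gain over a single threshold with DE/rand/1.

    With one decision variable, crossover always takes the mutant value,
    so only mutation and greedy selection remain.
    """
    rng = random.Random(seed)
    lo, hi = min(values), max(values)
    pop = [rng.uniform(lo, hi) for _ in range(pop_size)]
    fit = [information_gain(values, labels, t) for t in pop]
    for _ in range(generations):
        for i in range(pop_size):
            others = [k for k in range(pop_size) if k != i]
            a, b, c = rng.sample(others, 3)
            trial = pop[a] + F * (pop[b] - pop[c])
            trial = min(max(trial, lo), hi)  # keep the threshold in range
            f_trial = information_gain(values, labels, trial)
            if f_trial >= fit[i]:  # greedy selection
                pop[i], fit[i] = trial, f_trial
    best = max(range(pop_size), key=fit.__getitem__)
    return pop[best], fit[best]


# Toy data (hypothetical): attribute 0 is noise, attribute 1 separates the classes.
X = [[0.50, 1.0], [0.40, 1.2], [0.60, 0.9], [0.50, 1.1],
     [0.45, 3.0], [0.55, 3.2], [0.50, 2.9], [0.60, 3.1]]
y = [0, 0, 0, 0, 1, 1, 1, 1]

attr = select_attribute(X, y)
threshold, gain = de_threshold([row[attr] for row in X], y)
```

Only one attribute is searched per node, which is the source of the speed-up the abstract claims: the criterion is evaluated a fixed number of times (`pop_size * (generations + 1)`) for that attribute instead of once per candidate threshold of every attribute.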
Keywords: separation measure, population-level dynamic probabilities, success history adaptation, decision tree, classification, optimization, differential evolution
Short URL: https://sciup.org/148321923
IDR: 148321923 | DOI: 10.31772/2587-6066-2019-20-3-312-319
References
- Breiman L., Friedman J. H., Olshen R. A. et al. Classification and Regression Trees. Belmont, California: Wadsworth, 1984. 128 p.
- Hastie T., Tibshirani R., Friedman J. The Elements of Statistical Learning. Springer, 2009. 189 p.
- Quinlan J. R. C4.5: Programs for Machine Learning. Morgan Kaufmann Publishers, 1993. 302 p.
- Quinlan J. R. Induction of decision trees // Machine Learning. 1986. No. 1(1). P. 81-106.
- Davies D. L., Bouldin D. W. A Cluster Separation Measure // IEEE Transactions on Pattern Analysis and Machine Intelligence. 1979. Vol. PAMI-1, Iss. 2. P. 224-227.