Applied classification problems using ridge regression
Authors: Kononova N.V., Mangalova E.S., Stroev A.V., Cherdantsev D.V., Chubarova O.V.
Journal: Siberian Aerospace Journal @vestnik-sibsau
Section: Informatics, computer technology and management
Issue: No. 2, Vol. 20, 2019.
Free access
The rapid development of technical devices and technology makes it possible to monitor the properties of objects of various physical natures at very fine sampling intervals. As a result, one can accumulate large amounts of data that can be used to advantage in managing an object, a multiply connected system, or a technological enterprise. However, regardless of the field of activity, tasks associated with small amounts of data remain. In such cases the rate of data accumulation depends on objective limitations imposed by the external world and the environment. This research concerns high-dimensional data with small sample sizes. This gives rise to the task of selecting informative features, which serves three purposes: improving the quality of the solution by eliminating “junk” features; speeding up decision making, since the running time of algorithms usually depends on the dimension of the feature space; and simplifying data collection, since uninformative data need not be collected. Because the number of features can be large, an exhaustive search over all feature subsets is infeasible. Instead, for the selection of informative features we propose a two-stage random search algorithm based on a genetic algorithm: the first stage searches with a limit on the number of features in a subset, in order to reduce the feature space by eliminating “junk” features; the second stage searches without this limit, but over the reduced feature set. The original problem is a supervised classification task in which the class of an object is determined by an expert. The attribute values of an object vary with its state, which determines the class it belongs to; that is, the feature statistics are shifted between classes. Without loss of generality, a two-class formulation of the supervised classification task was used for the simulation study. Data from medical diagnostics of disease severity were used to generate the training samples.
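The two-stage search described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the fitness function, the genetic operators, or the rule for forming the reduced feature set, so everything below is an assumption made for illustration. The sketch assumes a closed-form ridge-regression classifier on labels -1/+1, cross-validated accuracy as fitness, uniform crossover with bit-flip mutation, and a reduced set made of every feature used by the better half of the final stage-1 population. The function names, the parameter defaults, and the synthetic data (three features with a class-dependent offset among 60) are all hypothetical. The quantile transformation and meta-classifier named in the keywords are omitted.

```python
import numpy as np

def ridge_cv_accuracy(X, y, alpha=1.0, folds=5):
    """Cross-validated accuracy of a ridge-regression classifier (labels -1/+1)."""
    idx = np.arange(len(y))
    correct = 0
    for k in range(folds):
        test = idx[k::folds]
        train = np.setdiff1d(idx, test)
        A = np.c_[np.ones(train.size), X[train]]
        # Closed-form ridge solution; the intercept column is not penalised.
        P = alpha * np.eye(A.shape[1])
        P[0, 0] = 0.0
        w = np.linalg.solve(A.T @ A + P, A.T @ y[train])
        B = np.c_[np.ones(test.size), X[test]]
        correct += np.sum(np.sign(B @ w) == y[test])
    return correct / len(y)

def genetic_search(X, y, candidates, max_features=None,
                   pop_size=30, generations=20, seed=0):
    """Search for a feature subset among `candidates` with a simple genetic algorithm."""
    rng = np.random.default_rng(seed)
    candidates = np.asarray(candidates)
    m = candidates.size

    def repair(bits):
        # Enforce the optional cap on subset size and forbid empty subsets.
        on = np.flatnonzero(bits)
        if max_features is not None and on.size > max_features:
            bits[rng.choice(on, on.size - max_features, replace=False)] = False
        if not bits.any():
            bits[rng.integers(m)] = True
        return bits

    def fitness(bits):
        return ridge_cv_accuracy(X[:, candidates[bits]], y)

    p_on = 0.5 if max_features is None else min(0.5, max_features / m)
    pop = [repair(rng.random(m) < p_on) for _ in range(pop_size)]
    for _ in range(generations):
        scores = np.array([fitness(b) for b in pop])
        order = np.argsort(-scores)
        elite = [pop[i].copy() for i in order[:pop_size // 2]]
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.choice(len(elite), 2, replace=False)
            child = np.where(rng.random(m) < 0.5, elite[a], elite[b])  # uniform crossover
            child ^= rng.random(m) < 1.0 / m                           # bit-flip mutation
            children.append(repair(child))
        pop = elite + children
    scores = np.array([fitness(b) for b in pop])
    best = pop[int(np.argmax(scores))]
    return candidates[best], float(scores.max()), pop, scores

def two_stage_selection(X, y, max_features=5, seed=0):
    all_features = np.arange(X.shape[1])
    # Stage 1: capped subsets; keep features used by the better half of the population
    # (this reduction rule is an assumption of the sketch).
    _, _, pop, scores = genetic_search(X, y, all_features, max_features, seed=seed)
    top = np.argsort(-scores)[:len(pop) // 2]
    reduced = all_features[np.any([pop[i] for i in top], axis=0)]
    # Stage 2: no cap on subset size, but only the reduced feature set is searched.
    selected, score, _, _ = genetic_search(X, y, reduced, None, seed=seed + 1)
    return selected, score, reduced

# Synthetic small-sample, high-dimensional example (hypothetical data).
rng = np.random.default_rng(42)
n, d = 40, 60
y = np.repeat([-1.0, 1.0], n // 2)
X = rng.normal(size=(n, d))
X[:, :3] += 1.5 * y[:, None]  # class-dependent offset in three informative features
selected, score, reduced = two_stage_selection(X, y)
print(sorted(selected.tolist()), round(score, 3), len(reduced))
```

Because the same small sample drives both feature selection and accuracy estimation, the reported score is optimistic; in practice an outer validation loop or a held-out sample would be needed to judge the selected subset.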
Keywords: small samples, supervised classification, ridge regression, quantile transformation, meta-classifier, significance of features, genetic algorithm
Short URL: https://sciup.org/148321905
IDR: 148321905 | DOI: 10.31772/2587-6066-2019-20-2-153-159