Machine learning and data mining tools applied for databases of low number of records
Автор: Anysz Hubert
Журнал: Вестник Донского государственного технического университета @vestnik-donstu
Рубрика: Информатика, вычислительная техника и управление
Статья в выпуске: 4 т.21, 2021 года.
Бесплатный доступ
The use of data mining and machine learning tools is becoming increasingly common. Their usefulness is mainly noticeable in the case of large datasets, when information to be found or new relationships are extracted from information noise. The development of these tools means that datasets with much fewer records are being explored, usually associated with specific phenomena. This specificity most often causes the impossibility of increasing the number of cases, and that can facilitate the search for dependences in the phenomena under study. The paper discusses the features of applying the selected tools to a small set of data. Attempts have been made to present methods of data preparation, methods for calculating the performance of tools, taking into account the specifics of databases with a small number of records. The techniques selected by the author are proposed, which helped to break the deadlock in calculations, i.e., to get results much worse than expected. The need to apply methods to improve the accuracy of forecasts and the accuracy of classification was caused by a small amount of analysed data. This paper is not a review of popular methods of machine learning and data mining; nevertheless, the collected and presented material will help the reader to shorten the path to obtaining satisfactory results when using the described computational methods
Machine learning, data exploration, artificial neural networks, association analysis, automatic classification
Короткий адрес: https://sciup.org/142231895
IDR: 142231895 | DOI: 10.23947/2687-1653-2021-21-4-346-363