Using multilevel data sources to prepare training sets for cyberattack detection
Authors: Dmitry D. Kononov, Sergey V. Isaev
Journal: Программные системы: теория и приложения (Program Systems: Theory and Applications)
Section: Software and Hardware for Distributed and Supercomputer Systems
Issue: No. 4 (67), Vol. 16, 2025.
Free access
Network traffic analysis is an integral part of ensuring security in information and telecommunication systems, and machine learning gives modern approaches higher detection rates for cyber threats. This paper proposes a new approach to generating training datasets that introduces a new aggregation unit, the “session”, and uses signature analysis and multilevel data sources, including heterogeneous ones. A list of requirements for the datasets is formulated: preserving the first packets of each connection, preserving hidden areas of the packets, and including extended information about traffic sources (country and autonomous system number, ASN). This additional information makes it possible to detect attacks of the “hidden communication channel” type. Based on the proposed approach, a software package has been developed for creating training datasets from multilevel sources at the L7, L4, and L3 levels of the OSI model. In contrast to existing work, the approach uses real network activity data and long time intervals. The resulting training sets can be used to build more effective machine learning methods for intrusion detection and prevention.
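The abstract does not define the “session” unit precisely, so the following Python sketch is only an illustration of the general idea, not the authors' software package. It assumes, hypothetically, that a session groups all packets from one source address until that source stays silent longer than an idle timeout. The names `Packet`, `Session`, `build_sessions`, `IDLE_TIMEOUT`, `KEEP_FIRST`, and the `geo` lookup table are all invented for this example. The sketch keeps the first packets of each session and attaches country and ASN to the source, mirroring two of the dataset requirements listed above.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Optional, Set, Tuple


@dataclass(frozen=True)
class Packet:
    ts: float        # capture time in seconds
    src_ip: str
    dst_ip: str
    dst_port: int
    proto: str       # e.g. "tcp" or "udp"
    payload: bytes   # raw bytes kept as-is so unusual fields are not lost


@dataclass
class Session:
    src_ip: str
    start: float
    end: float
    country: Optional[str]   # extended source information
    asn: Optional[int]
    packet_count: int = 0
    byte_count: int = 0
    dst_ports: Set[int] = field(default_factory=set)
    first_packets: List[Packet] = field(default_factory=list)


IDLE_TIMEOUT = 60.0  # assumed value: seconds of silence that close a session
KEEP_FIRST = 3       # assumed value: number of initial packets to preserve


def build_sessions(
    packets: Iterable[Packet],
    geo: Dict[str, Tuple[str, int]],
    idle_timeout: float = IDLE_TIMEOUT,
    keep_first: int = KEEP_FIRST,
) -> List[Session]:
    """Group packets into per-source sessions split by an idle timeout."""
    open_sessions: Dict[str, Session] = {}
    finished: List[Session] = []
    for p in sorted(packets, key=lambda pkt: pkt.ts):
        s = open_sessions.get(p.src_ip)
        if s is None or p.ts - s.end > idle_timeout:
            if s is not None:
                finished.append(s)  # the previous session of this source ended
            country, asn = geo.get(p.src_ip, (None, None))
            s = Session(p.src_ip, p.ts, p.ts, country, asn)
            open_sessions[p.src_ip] = s
        s.end = p.ts
        s.packet_count += 1
        s.byte_count += len(p.payload)
        s.dst_ports.add(p.dst_port)
        if len(s.first_packets) < keep_first:
            s.first_packets.append(p)  # preserve the start of the exchange
    finished.extend(open_sessions.values())
    finished.sort(key=lambda sess: sess.start)
    return finished


# Usage with documentation-range addresses and a made-up geo table.
packets = [
    Packet(0.0, "203.0.113.5", "192.0.2.10", 22, "tcp", b"SSH-2.0"),
    Packet(1.0, "203.0.113.5", "192.0.2.10", 80, "tcp", b"GET /"),
    Packet(2.0, "198.51.100.7", "192.0.2.10", 53, "udp", b"\x00\x01"),
    Packet(500.0, "203.0.113.5", "192.0.2.10", 22, "tcp", b"SSH-2.0"),
]
geo = {"203.0.113.5": ("ZZ", 64500)}  # hypothetical country code and private-use ASN
sessions = build_sessions(packets, geo)
```

With this grouping, one record summarizes a source's activity across several destination ports, which a per-connection flow record would split apart. Whether the paper's sessions are keyed this way is an assumption of the sketch.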
Keywords: Internet, network security, cyber threats, network traffic analysis, datasets, machine learning
Short URL: https://sciup.org/143185204
IDR: 143185204 | UDC: 004.89+004.056 | DOI: 10.25209/2079-3316-2025-16-4-267-285