Building robust malware detection through conditional generative adversarial network-based data augmentation
Author: Baghirov E.
Journal: Program Systems: Theory and Applications (Программные системы: теория и приложения)
Section: Software and hardware for distributed and supercomputer systems
Published in: issue 4 (63), vol. 15, 2024.
Free access
Malware detection is essential in cybersecurity, yet its accuracy is often compromised by class imbalance and limited labeled data. This study uses conditional Generative Adversarial Networks (cGANs) to generate synthetic malware samples, addressing these challenges by augmenting the minority class. The cGAN generates realistic malware samples conditioned on class labels, balancing the dataset without altering the benign class. The approach is applied to the CICMalDroid2020 dataset, and the augmented data are used to train a LightGBM model, which improves detection accuracy, particularly for underrepresented malware classes. The results demonstrate the efficacy of cGANs as a robust data augmentation tool that enhances the performance and reliability of machine learning-based malware detection systems.
Keywords: malware detection, generative adversarial networks, machine learning, cybersecurity, data augmentation
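The balancing step described in the abstract can be illustrated with a minimal sketch. It only plans the augmentation: it counts how many synthetic samples each malware class would need and builds the (noise, one-hot label) pairs that a conditional generator would consume. The function names, the toy class labels, the noise dimension, and the choice to raise every malware class to the size of the largest class are assumptions made for illustration; the paper's actual cGAN architecture and balancing target are not given here.

```python
from collections import Counter
import random


def synthetic_budget(labels, benign_label="benign"):
    """Return how many synthetic samples to request per malware class.

    Assumption: every malware class is raised to the size of the largest
    class in the dataset; the benign class is never augmented.
    """
    counts = Counter(labels)
    target = max(counts.values())
    return {
        cls: target - n
        for cls, n in counts.items()
        if cls != benign_label and n < target
    }


def generator_inputs(budget, classes, noise_dim=8, seed=0):
    """Build (noise, one-hot condition) pairs for a conditional generator.

    The generator itself is not implemented; this only prepares its inputs.
    """
    rng = random.Random(seed)
    index = {cls: i for i, cls in enumerate(classes)}
    batch = []
    for cls, n in budget.items():
        one_hot = [0.0] * len(classes)
        one_hot[index[cls]] = 1.0
        for _ in range(n):
            noise = [rng.gauss(0.0, 1.0) for _ in range(noise_dim)]
            batch.append((noise, list(one_hot)))
    return batch


# Toy, hypothetical label distribution (not taken from CICMalDroid2020).
labels = ["benign"] * 5 + ["adware"] * 2 + ["banking"] * 3
budget = synthetic_budget(labels)
batch = generator_inputs(budget, classes=sorted(set(labels)))
```

In a full pipeline, the trained generator would map each pair to a synthetic feature vector, and the real and synthetic rows together would form the training set for the classifier.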
Short URL: https://sciup.org/143183792
IDR: 143183792 | DOI: 10.25209/2079-3316-2024-15-4-97-110
References for Building robust malware detection through conditional generative adversarial network-based data augmentation
- D. O. Won, Y. N. Jang, S. W. Lee. “PlausMal-GAN: Plausible malware training based on generative adversarial networks for analogous zero-day malware detection”, IEEE Transactions on Emerging Topics in Computing, 11:1 (2023), pp. 82–94. https://doi.org/10.1109/TETC.2022.3170544
- E. Baghirov. “A comprehensive investigation into robust malware detection with explainable AI”, Cyber Security and Applications, 3 (December 2025), id. 100072. https://doi.org/10.1016/j.csa.2024.100072
- E. Baghirov. “Techniques of malware detection: Research review”, 2021 IEEE 15th International Conference on Application of Information and Communication Technologies, AICT 2021 (13–15 October 2021, Baku, Azerbaijan), IEEE, 2021, ISBN 978-1-6654-3641-0/21, pp. 1–6. https://doi.org/10.1109/AICT52784.2021.9620415
- S. Jang, S. Li, Y. Sung. “Generative adversarial network for global image-based local image to improve malware classification using convolutional neural network”, Applied Sciences, 10:21 (2020), id. 7585, 14 pp. https://doi.org/10.3390/app10217585
- C. Reilly, S. O'Shaughnessy, C. Thorpe. “Robustness of image-based malware classification models trained with generative adversarial networks”, EICC’23: Proceedings of the 2023 European Interdisciplinary Cybersecurity Conference (14–15 June 2023, Stavanger, Norway), ACM, New York, 2023, ISBN 978-1-4503-9829-9, pp. 92–99. https://doi.org/10.1145/3590777.3590792
- H. Nguyen, F. Di Troia, G. Ishigaki, M. Stamp. “Generative adversarial networks and image-based malware classification”, Journal of Computer Virology and Hacking Techniques, 19 (2023), pp. 579–595. https://doi.org/10.1007/s11416-023-00465-2
- S. Li, Z. Tang, H. Li, J. Zhang, H. Wang, J. Wang. “GMADV: An android malware variant generation and classification adversarial training framework”, Journal of Information Security and Applications, 84 (August 2024), id. 103800. https://doi.org/10.1016/j.jisa.2024.103800
- I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, Y. Bengio. “Generative adversarial nets”, Advances in Neural Information Processing Systems 27, NIPS 2014 (8–13 December 2014, Montreal, Canada), 2014, ISBN 9781510800410, 9 pp. https://papers.nips.cc/paper_files/paper/2014/file/5ca3e9b122f61f8f06494c97b1afccf3-Paper.pdf
- R. Yuwana, F. Fauziah, A. Heryana, D. Krisnandi, R. S. Kusumo, H. F. Pardede. “Data augmentation using adversarial networks for tea diseases detection”, Jurnal Elektronika dan Telekomunikasi, 20:1 (2020), pp. 29–35. https://doi.org/10.14203/jet.v20.29-35
- M. Mirza, S. Osindero. “Conditional generative adversarial nets”, 2014, 7 pp. arXiv:1411.1784 [cs.LG]
- M. Iwayama, S. Wu, C. Liu, R. Yoshida. “Functional output regression for machine learning in materials science”, Journal of Chemical Information and Modeling, 62:20 (2022), pp. 4837–4851. https://doi.org/10.1021/acs.jcim.2c00626
- S. Mahdavifar, A. F. A. Kadir, R. Fatemi, D. Alhadidi, A. A. Ghorbani. “Dynamic android malware category classification using semi-supervised deep learning”, 2020 IEEE Intl Conf on Dependable, Autonomic and Secure Computing, Intl Conf on Pervasive Intelligence and Computing, Intl Conf on Cloud and Big Data Computing, Intl Conf on Cyber Science and Technology Congress, DASC/PiCom/CBDCom/CyberSciTech (17–22 August 2020, Calgary, AB, Canada), 2020, ISBN 978-1-7281-6609-4, pp. 515–522. https://doi.org/10.1109/DASC-PICom-CBDCom-CyberSciTech49142.2020.00094
- S. Mahdavifar, D. Alhadidi, A. A. Ghorbani. “Effective and efficient hybrid android malware classification using pseudo-label stacked auto-encoder”, Journal of Network and Systems Management, 30 (2022), id. 22, 34 pp. https://doi.org/10.1007/s10922-021-09634-4