Object detection in images using deep neural networks and synthetic data in scenarios of partial object occlusion

Algashev Gennady Andreevich; Kremushchenko Polina Alexandrovna; Lezin Ilya Alexandrovich

doi:10.18287/COJ1879

Authors: Algashev Gennady Andreevich, Kremushchenko Polina Alexandrovna, Lezin Ilya Alexandrovich

Journal: Computer Optics (Компьютерная оптика)

Section: Numerical methods and data analysis

Published in: Vol. 50, No. 2, 2026.

Free access

This research addresses automatic object detection in images under limited-visibility conditions, where objects are partially occluded, the background is complex, and lighting and viewpoints vary widely. The proposed approach combines pretraining on a programmatically generated synthetic dataset of 18,000 images, produced with the Visualization Toolkit (VTK) library, with fine-tuning on a compact real-image dataset of 2,000 annotated photographs (500 per class). Six deep neural network architectures (Faster R-CNN ResNet-50 FPN, SSD MobileNet V3, YOLOv11n, EfficientDet-D7, DETR-DC5, and CenterNet) were evaluated across three training regimes: synthetic-only, real-only, and combined (90% synthetic / 10% real). Hybrid training yielded the largest improvements: YOLOv11n achieved mAP@0.5 = 0.91 and mAP@0.75 = 0.86 (Precision = 0.89, Recall = 0.90, F1 = 0.89, 82 FPS), compared to 0.79 (synthetic-only) and 0.78 (real-only), with reported gains of up to +15 percentage points in mAP@0.5. EfficientDet-D7 reached mAP@0.5 = 0.87 and mAP@0.75 = 0.81, while CenterNet achieved mAP@0.5 = 0.88 at 35 FPS. Robustness analysis under simulated occlusion showed that hybrid-trained models maintain reliable detection even under severe occlusion: YOLOv11n retained mAP@0.5 = 0.78 at 50% occlusion and mAP@0.5 = 0.65 at 25% object visibility, while the degradation in mAP under 75% occlusion did not exceed 20% of the baseline level. The results confirm the viability of synthetic data as a standalone pretraining resource and validate the proposed hybrid pipeline for applications in autonomous driving, video surveillance, and industrial inspection.
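As a reading aid, the quantities named in the abstract can be reproduced with a few lines of Python. This is a minimal sketch, not the authors' code: the function names are hypothetical, boxes are assumed to be `(x_min, y_min, x_max, y_max)` tuples, and the hybrid set is assumed to pool all 18,000 synthetic and 2,000 real images, which matches the stated 90% / 10% split.

```python
def synthetic_share(n_synthetic, n_real):
    """Fraction of a pooled training set that is synthetic."""
    return n_synthetic / (n_synthetic + n_real)


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes.

    Boxes are (x_min, y_min, x_max, y_max). In mAP@0.5 a detection is
    matched to a ground-truth box only if IoU >= 0.5; mAP@0.75 uses the
    stricter threshold 0.75.
    """
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


if __name__ == "__main__":
    # Dataset sizes quoted in the abstract.
    print(synthetic_share(18_000, 2_000))
    # Precision and recall quoted for YOLOv11n under hybrid training.
    print(round(f1_score(0.89, 0.90), 2))
    # A prediction shifted by half the box width overlaps too little
    # to count as a match at the 0.5 threshold.
    print(iou((0, 0, 10, 10), (5, 0, 15, 10)))
```

With the quoted precision and recall, the F1 value rounds to the reported 0.89, and the pooled dataset sizes give the stated 90% synthetic share.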

Computer vision, synthetic data, neural networks, object detection, 3D modeling, dataset generation, deep learning

Short URL: https://sciup.org/140314862

IDR: 140314862 | DOI: 10.18287/COJ1879