A ViT-based Model for Detecting Kidney Stones in Coronal CT Images
Authors: An Cong Tran, Huynh Vo-Thuy
Journal: International Journal of Information Technology and Computer Science @ijitcs
Published in issue: 5, Vol. 17, 2025.
Free access
Detecting kidney stones in coronal CT images remains challenging because stones are small, the surrounding anatomy is complex, and nearby structures introduce noise. To address these challenges, we propose a deep learning architecture that augments a Vision Transformer (ViT) with a pre-processing module. This module combines CSPDarknet for efficient feature extraction, a Feature Pyramid Network (FPN) and a Path Aggregation Network (PANet) for multi-scale context aggregation, and convolutional layers for spatial refinement. Together, these trained components filter out irrelevant background regions and highlight kidney-specific features before classification by the ViT, thereby improving accuracy and efficiency. This design leverages ViT’s global context modeling while mitigating its sensitivity to irrelevant regions and limited data. The proposed model was evaluated on two coronal CT datasets (one public and one private) comprising 6,532 images, under six experimental scenarios with varying training and testing conditions. It achieved 99.3% accuracy, 98.7% F1-score, and 99.4% mAP@0.5, higher than both YOLOv10 and the baseline ViT. The model contains 61.2 million parameters and has a computational cost of 37.3 GFLOPs, placing it between ViT (86.0M parameters, 17.6 GFLOPs) and YOLOv10 (22.4M parameters, 92.0 GFLOPs). Despite having more parameters than YOLOv10, the model achieved a lower inference time, approximately 0.06 seconds per image on an NVIDIA RTX 3060 GPU. These findings suggest the potential of our approach as a foundation for clinical decision-support tools, pending further validation on heterogeneous and challenging clinical data, such as images with small (<2 mm) or low-contrast stones.
Keywords: Kidney Stone, Coronal CT, Vision Transformer, CSPDarknet, FPN-PANet
Short URL: https://sciup.org/15020013
IDR: 15020013 | DOI: 10.5815/ijitcs.2025.05.01
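
The abstract reports F1-score and mAP@0.5. As a reading aid, the sketch below shows the standard definitions behind those figures: a predicted box counts as a true positive when its intersection over union (IoU) with a ground-truth box is at least 0.5, and F1 is the harmonic mean of precision and recall. It is a minimal illustration, not the authors' evaluation code; the function names, the box format (x_min, y_min, x_max, y_max), and the example counts are assumptions made here. Full mAP@0.5 additionally averages precision over recall levels, which is omitted.

```python
from typing import Tuple

# Assumed box format: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred: Box, truth: Box, threshold: float = 0.5) -> bool:
    """Matching rule behind the '@0.5' in mAP@0.5."""
    return iou(pred, truth) >= threshold


def precision_recall_f1(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Precision, recall, and F1 from detection counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Hypothetical boxes and counts, for illustration only.
    print(is_true_positive((0, 0, 10, 10), (0, 0, 10, 8)))
    print(precision_recall_f1(tp=8, fp=2, fn=2))
```

Because small stones occupy only a few pixels, a shift of one or two pixels can move a prediction across the 0.5 IoU threshold, which is one reason the abstract singles out stones under 2 mm for further validation.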