Model-driven approach to creating ID document templates for localization and classification based on a single image
Authors: Matalov D.P., Arlazarov V.V.
Journal: Компьютерная оптика (Computer Optics) @computer-optics
Section: International conference on machine vision
Article in issue: No. 6, Vol. 49, 2025.
Free access
ID document recognition systems are already deeply integrated into human activity, and the pace of integration keeps increasing. The first and most fundamental problems such systems must solve are document image localization and classification. In this field, approaches based on template matching have become widely used. These methods offer industrial precision, require minimal training data, and provide real-time performance on mobile devices. However, they have a significant scalability limitation: every document type is represented by a set of local features that must be stored and processed, which increases the required computing resources. Given the number of different document types that modern industrial recognition systems support, such methods become unusable. To mitigate this drawback, we propose a method that selects a subset of the most "stable" keypoints. To estimate keypoint stability, we synthesize a dataset of images containing various distortions typical of photographing hand-held documents with a smartphone camera in uncontrolled lighting conditions. For the experiments, we use the well-known MIDV datasets, which were designed to benchmark modern ID document recognition. The experiments show that the proposed method improves ID document detection performance when thousands of document types must be supported with limited computing resources.
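The idea of ranking keypoints by stability can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that stability is measured as repeatability, i.e. the fraction of synthesized images in which a template keypoint is re-detected within a pixel tolerance of its projected position. The names `project`, `stability_scores`, `select_stable` and the parameter `tol` are hypothetical, and the image synthesis and keypoint detection steps are left to the caller.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Homography = Sequence[Sequence[float]]  # 3x3 matrix, row-major
Sample = Tuple[Homography, Sequence[Point]]  # (template-to-image map, detections)


def project(h: Homography, p: Point) -> Point:
    """Apply a 3x3 homography to a 2D point."""
    x, y = p
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)


def stability_scores(template_kps: Sequence[Point],
                     samples: Sequence[Sample],
                     tol: float = 3.0) -> List[float]:
    """Assumed stability measure: share of synthesized images in which each
    template keypoint has a detection within `tol` pixels of its projection."""
    if not samples:
        raise ValueError("at least one synthesized sample is required")
    hits = [0] * len(template_kps)
    for h, detected in samples:
        for i, kp in enumerate(template_kps):
            px, py = project(h, kp)
            if any((px - dx) ** 2 + (py - dy) ** 2 <= tol * tol
                   for dx, dy in detected):
                hits[i] += 1
    return [count / len(samples) for count in hits]


def select_stable(template_kps: Sequence[Point],
                  samples: Sequence[Sample],
                  k: int,
                  tol: float = 3.0) -> List[int]:
    """Return the indices of the k highest-scoring keypoints, in index order."""
    scores = stability_scores(template_kps, samples, tol)
    ranked = sorted(range(len(template_kps)),
                    key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])
```

In such a setup, each sample would pair the known distortion applied to the template with the keypoints detected in the resulting synthetic image, and only the descriptors of the selected keypoints would be kept in the stored template.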
One-shot learning, document recognition, document processing, image augmentation, template matching, local features
Short URL: https://sciup.org/140313277
IDR: 140313277 | DOI: 10.18287/2412-6179-CO-1762