Investigation of the applicability of natural language processing methods to problems of searching and matching of machinery drawing images

Author: Figura Konstantin Nikolaevich

Journal: Компьютерная оптика (Computer Optics) @computer-optics

Section: Image processing, pattern recognition

Published in: Issue 4, Vol. 46, 2022.

Free access

This work shows that local feature descriptors, applied in their pure form, are ineffective for searching and matching drawings. The main cause is the large number of identical elements that drawings share (frames, title blocks, extension lines, font elements, etc.). The proposed remedy is the tf-idf (term frequency-inverse document frequency) method, which is widely known in natural language processing. In place of the word vectors used in the original tf-idf technique, the study uses descriptors of image feature points computed with the ORB and BRISK algorithms. The study leads to the following conclusions: 1) The proposed approach is highly effective at finding a copy of the query image in the database: every query image that had an exact copy in the database was matched to it. 2) The identification rate for modified query images depends on the algorithm used to find keypoints and descriptors: the maximum share of identified modified analogs, out of all image analogs in the database, is 60% with ORB and 80% with BRISK. 3) The proposed approach is of limited effectiveness at finding images that belong to the same class as the query image (for example, a drawing of an excavator, a bulldozer, or a truck crane); here the maximum proportion of false identifications reached 60%.
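
The weighting idea in the abstract can be illustrated with a minimal sketch. It is not the author's implementation: the abstract does not say how descriptors are turned into terms, so the sketch assumes each ORB or BRISK descriptor has already been mapped to a discrete "visual word" ID, and it uses the textbook idf form log(N/df). The names `DRAWINGS`, `tfidf_vectors`, `query_vector`, and `cosine` are hypothetical, and the data are toy values chosen only to show how an element shared by every drawing (ID 0, standing in for a frame or title block) receives zero weight.

```python
import math
from collections import Counter

# Toy database (assumed data): each drawing is a list of visual-word IDs.
# ID 0 stands in for an element present in every drawing (frame, title block).
DRAWINGS = {
    "excavator": [0, 0, 0, 1, 2],
    "bulldozer": [0, 0, 0, 3, 4],
    "crane": [0, 0, 0, 5, 6],
}


def tfidf_vectors(docs):
    """Return (per-drawing tf-idf vectors, idf table) for a dict of ID lists."""
    n_docs = len(docs)
    df = Counter()
    for words in docs.values():
        df.update(set(words))  # document frequency: count each ID once per drawing
    idf = {w: math.log(n_docs / df[w]) for w in df}
    vectors = {}
    for name, words in docs.items():
        tf = Counter(words)
        vectors[name] = {w: (c / len(words)) * idf[w] for w, c in tf.items()}
    return vectors, idf


def query_vector(words, idf):
    """Weight a query with the database idf; IDs unseen in the database get 0."""
    tf = Counter(words)
    return {w: (c / len(words)) * idf.get(w, 0.0) for w, c in tf.items()}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(v * b.get(w, 0.0) for w, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


vectors, idf = tfidf_vectors(DRAWINGS)
# A modified copy of the excavator drawing: same content plus one unseen ID (7).
query = query_vector([0, 0, 0, 1, 2, 7], idf)
ranking = sorted(vectors, key=lambda name: cosine(query, vectors[name]), reverse=True)
print(ranking[0])
```

In this toy setup the shared ID 0 contributes nothing to the similarity, so the query is ranked against the database only on the IDs that distinguish one drawing from another. With real binary descriptors, the quantization step and the similarity measure would both need to be chosen and validated separately.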

Keywords: natural language processing, tf-idf method, image retrieval, image analysis, pattern recognition, digital image processing

Short URL: https://sciup.org/140295013

IDR: 140295013   |   DOI: 10.18287/2412-6179-CO-1030

Research article