International conference on machine vision 2023. Section of the journal Computer Optics

Publications in this section (8): International conference on machine vision 2023

Applied aspects of modern non-blind image deconvolution methods

Chaganova O.B., Grigoryev A.S., Nikolaev D.P., Nikolaev I.P.

Research article

This paper studies modern non-blind image deconvolution methods and their application to practical tasks. The aim is to determine the current state of the art in non-blind image deconvolution and to identify the limitations of current approaches, with a focus on practical application details. The paper proposes approaches for examining the influence of various effects on restoration quality, the robustness of models to errors in blur kernel estimation, and the effect of violating the commonly assumed uniform blur model. We developed a benchmark for validating non-blind deconvolution methods, which includes datasets of ground truth images and blur kernels, as well as a test scheme for assessing restoration quality and error robustness. Our experimental results show that neural network models without any pre-optimization, such as quantization or knowledge distillation, fall short of classical methods in several key properties, such as inference speed or the ability to handle different types of blur. Nevertheless, neural network models have made notable progress in robustness to noise and distortions. Based on the results of the study, we provide recommendations for more effective use of modern non-blind image deconvolution methods. We also offer suggestions for improving the robustness, versatility, and performance of the models by incorporating additional practices into the training pipeline.
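
The idea of testing robustness to blur kernel errors can be illustrated with a small sketch. The abstract does not name the methods in the benchmark, so the Wiener filter below is only an assumed stand-in for a classical non-blind method, and the 1-D signal, both kernels, and the regularization constant `k` are hypothetical values chosen for illustration. The sketch blurs a signal with a known kernel under a uniform (circular) blur model, then restores it once with the true kernel and once with a mis-estimated one.

```python
import cmath
import math

def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform; adequate for a toy signal."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [
        sum(x[k] * cmath.exp(sign * 2j * math.pi * f * k / n) for k in range(n))
        for f in range(n)
    ]
    return [v / n for v in out] if inverse else out

def blur(x, kernel):
    """Circular convolution of x with kernel, computed in the frequency domain."""
    X, H = dft(x), dft(kernel)
    return [v.real for v in dft([a * b for a, b in zip(X, H)], inverse=True)]

def wiener_deconvolve(y, kernel, k=1e-3):
    """Wiener-style restoration: X = conj(H) * Y / (|H|^2 + k)."""
    Y, H = dft(y), dft(kernel)
    X = [h.conjugate() * v / (abs(h) ** 2 + k) for h, v in zip(H, Y)]
    return [v.real for v in dft(X, inverse=True)]

def rmse(a, b):
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)) / len(a))

# Hypothetical test data: a box signal and two symmetric 3-tap kernels.
sharp = [0.0] * 4 + [1.0] * 8 + [0.0] * 4
true_kernel = [0.6, 0.2] + [0.0] * 13 + [0.2]
wrong_kernel = [0.4, 0.3] + [0.0] * 13 + [0.3]   # mis-estimated kernel

blurred = blur(sharp, true_kernel)
rmse_true = rmse(wiener_deconvolve(blurred, true_kernel), sharp)
rmse_wrong = rmse(wiener_deconvolve(blurred, wrong_kernel), sharp)
print(f"RMSE with true kernel:  {rmse_true:.4f}")
print(f"RMSE with wrong kernel: {rmse_wrong:.4f}")
```

Comparing the two errors over many images and kernel perturbations is one simple way to quantify how sensitive a restoration method is to kernel estimation errors.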

Free access

Monitored reconstruction improved by post-processing neural network

Yamaev A.V.

Research article

Computed tomography (CT) is widely used for analyzing internal structures, but traditional reconstruction algorithms often require a large number of projections, which restricts their effectiveness in time-critical tasks and in studies of biological objects. Recently, the Monitored reconstruction approach was proposed to reduce the dose load. This paper investigates the advantages of using post-processing neural networks within the Monitored reconstruction approach. Three algorithms, namely FBP, FBPConvNet, and LRFR, are evaluated by the mean number of projections required to achieve a target reconstruction accuracy. A novel training method designed specifically for neural network algorithms within the Monitored reconstruction framework is proposed. It is shown that the LRFR approach achieves both a reduction in the number of measured projections and an improvement in reconstruction accuracy over a certain range of stopping rules. These findings highlight the significant potential of neural networks in the Monitored reconstruction approach.

Free access

Neural network algorithm for optical-SAR image registration based on a uniform grid of points

Volkov V.V., Shvets E.A.

Research article

The paper considers the problem of registering multimodal satellite images, in particular optical and SAR (Synthetic Aperture Radar) images. Such algorithms are used in object detection, change detection, and navigation. The paper considers algorithms for optical-to-SAR image registration under rough image pre-alignment. Optical and SAR images are known to be georeferenced inaccurately, with registration errors of up to 100 pixels at a spatial resolution of 10 m/pixel. This paper presents a neural network algorithm for optical-to-SAR image registration based on descriptors calculated for a uniform grid of points. First, the algorithm finds a uniform grid of points for both images. Next, the neural network calculates a descriptor for each point and finds the descriptor distances between all possible pairs of points from the optical and SAR images. Using these descriptor distances, the points on the optical image are matched to the points on the SAR image. The matches are then used to calculate the geometric transformation between the images using the RANSAC algorithm with a restricted affine transformation model (limited to combinations of translation, rotation, and uniform scaling). The accuracy of the proposed algorithm for optical-to-SAR image registration was evaluated under different rotation and scale distortions.
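
The final step, fitting a translation, rotation, and uniform scaling model with RANSAC, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the descriptor network and the matching step are omitted, the point matches are synthetic, and the function names, iteration count, and inlier tolerance are assumptions. Points are stored as complex numbers, so the restricted model becomes `z -> a*z + b`, where `a` encodes rotation and scale and `b` encodes translation.

```python
import cmath
import math
import random

def fit_similarity(s0, s1, d0, d1):
    """Exact similarity transform z -> a*z + b from two point pairs."""
    a = (d1 - d0) / (s1 - s0)   # rotation and uniform scale
    b = d0 - a * s0             # translation
    return a, b

def ransac_similarity(src, dst, n_iter=200, tol=2.0, seed=0):
    """Return the model with the most inliers and the inlier indices."""
    rng = random.Random(seed)
    idx = range(len(src))
    best_model, best_inliers = None, []
    for _ in range(n_iter):
        i, j = rng.sample(idx, 2)
        if src[i] == src[j]:
            continue  # degenerate sample, cannot define a transform
        a, b = fit_similarity(src[i], src[j], dst[i], dst[j])
        inliers = [k for k in idx if abs(a * src[k] + b - dst[k]) <= tol]
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = (a, b), inliers
    return best_model, best_inliers

# Synthetic example: a uniform 4x4 grid and a known ground-truth transform.
grid = [complex(x, y) for x in range(0, 200, 50) for y in range(0, 200, 50)]
a_true = 1.1 * cmath.exp(1j * math.radians(10))   # scale 1.1, rotation 10 degrees
b_true = complex(30, -20)
matched = [a_true * z + b_true for z in grid]
# Simulate three wrong matches.
matched[3] += 80
matched[7] -= 60j
matched[12] += 45 + 45j

(a, b), inliers = ransac_similarity(grid, matched)
print("scale:", abs(a), "rotation (deg):", math.degrees(cmath.phase(a)))
print("translation:", b, "inliers:", len(inliers))
```

Two matches are enough to define this restricted model, so each RANSAC sample is small and the probability of drawing an outlier-free sample stays high even when many matches are wrong.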

Free access

Neural network recognition system for video transmitted through a binary symmetric channel

Baboshina V.A., Orazaev A.R., Lyakhov P.A., Boyarskaya E.E.

Research article

The demand for transmitting video data is increasing annually, necessitating the use of high-quality equipment for reception and processing. The paper presents a neural network recognition system for videos transmitted via a binary symmetric channel. The presence of digital noise in the data makes it challenging to recognize objects in videos even with advanced neural networks. The proposed system consists of a noise interference detector, a noise removal system based on an adaptive median filter, and a neural network for recognition. The experimental results demonstrate that the proposed system effectively reduces video noise and accurately identifies multiple objects. This versatility makes the system applicable in various fields such as medicine, life safety, physics, and chemistry. Further research may focus on improving the neural network model, enlarging the training dataset, or modeling other types of noise.
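
A binary symmetric channel flips every transmitted bit independently with some probability `p`. The sketch below simulates this on 8-bit pixel values and then applies a plain 3x3 median filter. It is a simplified illustration: the system described above uses a noise detector and an adaptive median filter, which are not reproduced here, and the image size, pixel values, and `p` are hypothetical.

```python
import random
from statistics import median

def bsc(pixels, p, rng):
    """Pass 8-bit values through a binary symmetric channel:
    each bit is flipped independently with probability p."""
    out = []
    for v in pixels:
        for bit in range(8):
            if rng.random() < p:
                v ^= 1 << bit
        out.append(v)
    return out

def median_filter(img, radius=1):
    """Plain (non-adaptive) median filter; borders are handled by clamping."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [
                img[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
                for dy in range(-radius, radius + 1)
                for dx in range(-radius, radius + 1)
            ]
            out[y][x] = int(median(window))
    return out

def count_errors(img, reference):
    """Number of pixels that differ from the reference image."""
    return sum(a != b for ra, rb in zip(img, reference) for a, b in zip(ra, rb))

# Hypothetical 16x16 frame with a constant gray level.
rng = random.Random(0)
clean = [[100] * 16 for _ in range(16)]
noisy = [bsc(row, p=0.01, rng=rng) for row in clean]
restored = median_filter(noisy)
print("corrupted pixels after channel:", count_errors(noisy, clean))
print("corrupted pixels after filter: ", count_errors(restored, clean))
```

Because a flipped high-order bit changes a pixel value drastically, channel errors look like impulse noise, which is the kind of noise median filters are well suited to.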

Free access

Uncertainty-based quantization method for stable training of binary neural networks

Trusov A.V., Putintsev D.N., Limonova E.E.

Research article

Binary neural networks (BNNs) have gained attention due to their computational efficiency. However, training BNNs has proven to be challenging. Existing algorithms either fail to produce stable and high-quality results or are overly complex for practical use. In this paper, we introduce a novel quantizer called UBQ (Uncertainty-based quantizer) for BNNs, which combines the advantages of existing methods, resulting in stable training and high-quality BNNs even with a low number of trainable parameters. We also propose a training method involving gradual network freezing and batch normalization replacement, facilitating a smooth transition from training mode to execution mode for BNNs. To evaluate UBQ, we conducted experiments on the MNIST and CIFAR-10 datasets and compared our method to existing algorithms. The results demonstrate that UBQ outperforms previous methods for smaller networks and achieves comparable results for larger networks.

Free access

Unfolder: fast localization and image rectification of a document with a crease from folding in half

Ershov A.M., Tropin D.V., Limonova E.E., Nikolaev D.P., Arlazarov V.V.

Research article

Folded documents are not uncommon in modern society. Digitizing such documents by capturing them with a smartphone camera can be tricky, since a crease can divide the document contents into separate planes. One could unfold the document by holding its edges, but this may obscure part of it in the captured image. While there are many geometrical rectification methods, they were usually developed for arbitrary bends and folds. We consider such algorithms and propose a novel approach, Unfolder, developed specifically for images of documents with a crease from folding in half. Unfolder is robust to projective distortions of the document image and does not fragment the image in the vicinity of a crease after rectification. A new Folded Document Images dataset was created to investigate the rectification accuracy of folded (2, 3, 4, and 8 folds) documents. The dataset includes 1600 images captured with the document either placed on a table or held in hand. The Unfolder algorithm achieved a recognition error rate of 0.33, which is better than that of the advanced neural network methods DocTr (0.44) and DewarpNet (0.57). The average runtime for Unfolder was only 0.25 s/image on an iPhone XR.

Free access

Verification of color characteristics of document images captured in uncontrolled conditions

Kunina I.A., Padas O.A., Kolomyttseva O.A.

Research article

This paper examines a presentation attack in which a color photo of a gray copy of a document is presented instead of the original color document during remote user identification. To detect such an attack, we propose an algorithm based on comparing the chromaticity histogram of the presented color image of the document with that of the ideal template of this document type. The chromaticity histograms of the original document and the template are expected to be nearly identical, while those of the gray copy and the template are expected to differ. The algorithm was tested on images from the open dataset DLC-2021, which contains color images of synthesized identity documents and color images of their gray copies. The precision of the proposed method was 98.99% and the recall was 84.7%, which gave 8 times fewer errors than the baseline provided by the authors of DLC-2021.
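
The histogram comparison can be sketched as follows. The abstract does not specify the chromaticity space, the bin count, or the similarity measure, so the normalized rg chromaticity, the 8x8 binning, and histogram intersection used here are assumptions, and the pixel values are made up for illustration.

```python
def chromaticity_histogram(pixels, bins=8):
    """Normalized 2-D histogram of rg chromaticity:
    r = R / (R + G + B), g = G / (R + G + B)."""
    hist = [[0.0] * bins for _ in range(bins)]
    n = 0
    for R, G, B in pixels:
        s = R + G + B
        if s == 0:
            continue  # chromaticity is undefined for pure black
        i = min(int(R / s * bins), bins - 1)
        j = min(int(G / s * bins), bins - 1)
        hist[i][j] += 1
        n += 1
    return [[c / n for c in row] for row in hist] if n else hist

def intersection(h1, h2):
    """Histogram intersection: 1.0 for identical histograms, 0.0 for disjoint ones."""
    return sum(min(a, b) for r1, r2 in zip(h1, h2) for a, b in zip(r1, r2))

# Hypothetical pixel sets: a template, a dimmer photo of the original color
# document, and a photo of a gray copy (all colors collapsed to gray levels).
template = [(200, 30, 30)] * 50 + [(30, 30, 200)] * 50 + [(240, 240, 240)] * 100
original = [(180, 27, 27)] * 50 + [(27, 27, 180)] * 50 + [(220, 220, 220)] * 100
gray_copy = [(90, 90, 90)] * 50 + [(60, 60, 60)] * 50 + [(220, 220, 220)] * 100

h_template = chromaticity_histogram(template)
sim_original = intersection(chromaticity_histogram(original), h_template)
sim_gray = intersection(chromaticity_histogram(gray_copy), h_template)
print("original vs template: ", sim_original)
print("gray copy vs template:", sim_gray)
```

Chromaticity discards overall brightness, so a dimmer photo of the genuine document still matches the template, while a gray copy concentrates all of its mass in the neutral bin. A decision threshold on the similarity would have to be tuned on real data.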

Free access

YOLO-Barcode: towards universal real-time barcode detection on mobile devices

Ershova D.M., Gayer A.V., Bezmaternykh P.V., Arlazarov V.V.

Research article

Existing approaches to barcode detection have a number of disadvantages, including being tied to specific types of barcodes, computational complexity, or low detection accuracy. In this paper, we propose YOLO-Barcode, a deep learning model inspired by the You Only Look Once approach that achieves high detection accuracy with real-time performance on mobile devices. The proposed model copes well with a large number of densely spaced barcodes, as well as with highly elongated one-dimensional barcodes. YOLO-Barcode not only detects a wide variety of barcode types, but also classifies them. Compared with the previous universal barcode detector DilatedModel, which is based on semantic segmentation, YOLO-Barcode is 4 times faster and achieves state-of-the-art accuracy on the ZVZ-real public dataset: an F1-score of 98.6% versus 88.9%. An analysis of existing publicly available datasets reveals the absence of many real-life scenarios of mobile barcode reading. To fill this gap, the new “SE-barcode” dataset is presented. The proposed model, used as a baseline, achieves an F1-score of 92.11% on this dataset.

Free access
