Image processing, pattern recognition. Section of the journal Computer Optics

Publications in this section (330): Image processing, pattern recognition

3D generalization of the impulse noise removal method for video data processing

Chervyakov Nikolay Ivanovich, Lyakhov Pavel Alekseevich, Orazaev Anzor Ruslanovich

Research article

The article proposes a generalized method of adaptive median filtering of impulse noise for video data processing. The method is based on the combined use of iterative processing and a transformation of the median filtering result based on the Lorentz distribution. Four different combinations of the method's algorithmic blocks are proposed. The experimental part of the article compares the quality of the proposed method with that of known analogues. The simulation used video corrupted by impulse noise with pixel corruption probabilities from 1 % to 99 % inclusive. Numerical evaluation of the quality of video denoising based on the mean squared error and the structural similarity index showed that the proposed method gives the best processing result in all considered cases compared with known approaches. The results obtained in the article can find wide application in practical digital video processing, for example, for processing visual data in video surveillance systems and in the identification and control of industrial processes.
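The basic idea of impulse noise removal in a video volume can be illustrated with a minimal sketch. It assumes salt-and-pepper noise, where corrupted pixels take the extreme values 0 or 255, and it applies a plain 3×3×3 median only to suspect pixels. It does not include the iterative processing or the Lorentz-distribution transform mentioned in the abstract, and the function name `median_denoise_video` is hypothetical.

```python
from statistics import median_low


def median_denoise_video(video, low=0, high=255):
    """Replace suspected impulse pixels with the median of clean 3x3x3 neighbours.

    video is a nested list indexed as video[t][y][x] with integer intensities.
    Pixels equal to `low` or `high` are treated as corrupted (an assumption).
    """
    n_frames, height, width = len(video), len(video[0]), len(video[0][0])
    out = [[row[:] for row in frame] for frame in video]
    for t in range(n_frames):
        for y in range(height):
            for x in range(width):
                if video[t][y][x] not in (low, high):
                    continue  # pixel looks clean, keep it as is
                neighbours = [
                    video[tt][yy][xx]
                    for tt in range(max(0, t - 1), min(n_frames, t + 2))
                    for yy in range(max(0, y - 1), min(height, y + 2))
                    for xx in range(max(0, x - 1), min(width, x + 2))
                    if video[tt][yy][xx] not in (low, high)
                ]
                if neighbours:  # no clean neighbour: leave the pixel unchanged
                    out[t][y][x] = median_low(neighbours)
    return out
```

Restricting the median to suspect pixels keeps clean pixels untouched, which is the usual motivation for adaptive or switching variants of median filtering.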

Free

A color image encryption algorithm for expert detection system based on composite chaotic sequences

Li Z.

Research article

With the development of the Internet, the amount of information carried in images is gradually increasing, and image encryption algorithms for data transmission have been developed. The conventional composite chaotic sequence encryption algorithm converges too slowly when applied to images, which can lead to a risk of information leakage from the image. To address this issue, this study first applies chaotic attractors to improve composite chaotic sequences and enhance the search domain of their Leia Index. At the same time, the Arnold transform is introduced into the expert monitoring system, and the two systems are fused to generate a fusion algorithm for color image encryption. Finally, the study conducts experiments on the Differ dataset to verify the effectiveness and superiority of the fusion algorithm, comparing it with three other algorithms, such as the artificial fish school algorithm. The image encryption times of the four algorithms are 6 s, 16 s, 29 s, and 33 s respectively, indicating that the fusion algorithm has the highest encryption speed. When facing exhaustive attacks, the image information damage degrees of the four algorithms are 0.014, 0.051, 0.172, and 0.184, respectively. The experimental results show that the proposed algorithm can effectively resist differential attacks and exhaustive attacks, and is suitable for encrypting color images.
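The Arnold transform named in the abstract is a standard pixel-scrambling step, and a minimal sketch may help to show what it does. The code below implements only the classical Arnold cat map on a square image, assuming the image is a list of equal-length rows; it is not the fusion algorithm of the paper, and the function names are hypothetical.

```python
def arnold_scramble(image, iterations=1):
    """Scramble a square image with the Arnold cat map.

    Each pixel at (x, y) moves to ((x + y) mod n, (x + 2y) mod n).
    """
    n = len(image)
    for _ in range(iterations):
        scrambled = [[None] * n for _ in range(n)]
        for y in range(n):
            for x in range(n):
                scrambled[(x + 2 * y) % n][(x + y) % n] = image[y][x]
        image = scrambled
    return image


def arnold_unscramble(image, iterations=1):
    """Invert arnold_scramble by reading each pixel back from its mapped position."""
    n = len(image)
    for _ in range(iterations):
        restored = [[None] * n for _ in range(n)]
        for y in range(n):
            for x in range(n):
                restored[y][x] = image[(x + 2 * y) % n][(x + y) % n]
        image = restored
    return image
```

Because the map matrix has determinant 1, it is a bijection on the pixel grid, so scrambling only permutes pixel positions and is exactly reversible.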

Free

A decade of adversarial examples: a survey on the nature and understanding of neural network non-robustness

Trusov A.V., Limonova E.E., Arlazarov V.V.

Research article

Adversarial examples, in the context of computer vision, are inputs deliberately crafted to deceive or mislead artificial neural networks. These examples exploit vulnerabilities in neural networks, resulting in minimal alterations to the original input that are imperceptible to humans but can significantly impact the network’s output. In this paper, we present a thorough survey of research on adversarial examples, with a primary focus on their impact on neural network classifiers. We closely examine the theoretical capabilities and limitations of artificial neural networks. After that, we explore the discovery and evolution of adversarial examples, starting from basic gradient-based techniques and progressing toward the recent trend of employing generative neural networks for this purpose. We discuss the limited effectiveness of existing countermeasures against adversarial examples. Furthermore, we emphasize that adversarial examples originate from the misalignment between human and neural network decision-making processes, which can be attributed to the current methodology for training neural networks. We also argue that the commonly used term “attack on neural networks” is misleading when discussing adversarial deep learning. Through this paper, our objective is to provide a comprehensive overview of adversarial examples and inspire researchers to develop more robust neural networks. Such networks will align better with human decision-making processes and enhance the security and reliability of computer vision systems in practical applications.
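As an illustration of the basic gradient-based techniques the survey starts from, here is a minimal sketch of a single fast-gradient-sign step. It assumes a toy logistic-regression classifier rather than a neural network so that the input gradient can be written by hand; the weights, the function names and the choice of this particular technique are illustrative assumptions, not taken from the paper.

```python
import math


def predict(weights, bias, x):
    """Probability of class 1 under a logistic-regression model."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def sign(value):
    """Return -1, 0 or 1 according to the sign of value."""
    return (value > 0) - (value < 0)


def fgsm_step(weights, bias, x, label, epsilon):
    """Perturb x by epsilon in the direction that increases the cross-entropy loss."""
    p = predict(weights, bias, x)
    # For logistic regression, d(loss)/d(x_i) = (p - label) * w_i
    grad = [(p - label) * w for w in weights]
    return [xi + epsilon * sign(g) for xi, g in zip(x, grad)]
```

Each input coordinate changes by at most `epsilon`, yet the accumulated effect over all coordinates can be enough to move the input across the decision boundary.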

Free

A framework of reading timestamps for surveillance video

Cheng Jun, Dai Wei

Research article

This paper presents a framework to automatically read timestamps in surveillance video. Reading timestamps from surveillance video is difficult due to challenges such as color variety, font diversity, noise, and low resolution. The proposed algorithm overcomes these challenges by using a deep learning framework. The framework includes: training of both timestamp localization and recognition in a single end-to-end pass, the structure of the recognition CNN, and the geometry of its input layer, which preserves the aspect of the timestamps and adapts its resolution to the data. The proposed method achieves state-of-the-art accuracy in end-to-end timestamp recognition on our datasets, whilst being an order of magnitude faster than competing methods. The framework can improve the market competitiveness of panoramic video surveillance products.

Free

A joint study of deep learning-based methods for identity document image binarization and its influence on attribute recognition

Sánchez-Rivero R., Bezmaternykh P.V., Gayer A.V., Morales-González A., José Silva-Mata F., Bulatov K.B.

Research article

Text recognition has benefited considerably from deep learning research, as have the preprocessing methods included in its workflow. Identity documents are critical in the field of document analysis and should be thoroughly researched in relation to this workflow. We propose to examine the link between deep learning-based binarization and recognition algorithms for this sort of document on the MIDV-500 and MIDV-2020 datasets. We provide a series of experiments to illustrate the relation between the quality of the collected images and the binarization results, as well as the influence of the binarization output on final recognition performance. We show that deep learning-based binarization solutions are affected by the capture quality, which implies that they still need significant improvements. We also show that proper binarization results can improve the performance of many recognition methods. Our retrained U-Net-bin outperformed all other binarization methods, and the best result in recognition was obtained by Paddle Paddle OCR v2.

Free

A methodology for automated labelling a geospatial image dataset of applicable locations for installing a wireless nodal seismic system

Uzdiaev M.Y., Astapova M.A., Ronzhin A.L., Saveliev A.I., Agafonov V.M., Erokhin G.N., Nenashev V.A.

Research article

The developing field of wireless nodal seismic system installation raises an urgent problem of identifying areas suitable for mounting wireless seismic modules. Such areas could be identified using geospatial image analysis methods, which require representative datasets that reflect the surface features relevant to the requirements of seismic module installation. This poses the problem of developing a methodology for labelling such datasets. This work is devoted to developing a methodology for automated labelling of geospatial images using georeference data from OpenStreetMap (OSM), which provides accurate vector georeferences of distinct objects but suffers from class label inconsistency (the same object labelled with multiple classes, labelling mistakes, overlapping objects). The distinctive features of the methodology are a system of surface classes specific to the properties of surfaces applicable for seismic module installation and a procedure for mapping OSM objects to the developed classes based on manual inspection of the OSM objects. Other features of the methodology are data representativeness in terms of geography and acquisition time, as well as consistent lighting conditions. The dataset collected according to the methodology consists of 200 labelled images. The mapping procedure avoids collisions in class labels caused by inconsistency of the OSM class hierarchy. OSM labels cover 90% of the obtained images.

Free

A novel switching bilateral filtering algorithm for depth map

Ruchay Alexey N., Dorofeev Konstantin A., Kalschikov Vsevolod V.

Research article

In this paper, we propose a novel switching bilateral filter for depth maps from an RGB-D sensor. The switching method works as follows: the bilateral filter is applied not at all pixels of the depth map, but only at those where noise and holes are possible, that is, at boundaries and sharp changes. With the help of computer simulation we show that the proposed algorithm can process a depth map effectively and quickly. The presented results show an improvement in the accuracy of 3D object reconstruction using the proposed depth filtering. The performance of the proposed algorithm is compared with that of common successful depth filtering algorithms in terms of the accuracy of 3D object reconstruction and speed.
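The switching idea described in the abstract can be sketched roughly as follows. This is an illustrative simplification, not the authors' algorithm: it assumes that a "sharp change" means the range of depth values in a 3×3 window exceeds a threshold, it does not treat holes specially, and the parameter names and default values are hypothetical.

```python
import math


def switching_bilateral(depth, edge_threshold=10.0, sigma_s=1.0, sigma_r=10.0):
    """Apply a 3x3 bilateral filter only where the local depth range is large."""
    height, width = len(depth), len(depth[0])
    out = [row[:] for row in depth]
    for y in range(height):
        for x in range(width):
            window = [
                (yy, xx)
                for yy in range(max(0, y - 1), min(height, y + 2))
                for xx in range(max(0, x - 1), min(width, x + 2))
            ]
            values = [depth[yy][xx] for yy, xx in window]
            if max(values) - min(values) < edge_threshold:
                continue  # flat region: the filter is switched off
            centre = depth[y][x]
            num = den = 0.0
            for yy, xx in window:
                v = depth[yy][xx]
                # Spatial weight falls off with distance, range weight with depth difference
                w_s = math.exp(-((yy - y) ** 2 + (xx - x) ** 2) / (2 * sigma_s ** 2))
                w_r = math.exp(-((v - centre) ** 2) / (2 * sigma_r ** 2))
                num += w_s * w_r * v
                den += w_s * w_r
            out[y][x] = num / den
    return out
```

Skipping flat regions is what saves time in such a scheme: the comparatively expensive weighted average is computed only for the minority of pixels near depth discontinuities.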

Free

A parallel fusion method of remote sensing image based on NSCT

Xue Xiaorong, Xiang Fang, Wang Hongfu

Research article

Remote sensing image fusion is very important for exploiting the advantages of a variety of remote sensing data. However, remote sensing image fusion is computationally demanding and time-consuming. In this paper, in order to fuse remote sensing images accurately and quickly, a parallel fusion algorithm for remote sensing images based on NSCT (nonsubsampled contourlet transform) is proposed. The method uses two important kinds of remote sensing image, multispectral and panchromatic, and combines the advantages of parallel computing in high performance computing with the advantages of NSCT in information processing. Processes with a large amount of calculation, including the IHS (Intensity, Hue, Saturation) transform, NSCT, inverse NSCT, inverse IHS transform, etc., are performed in parallel. To realize the method, the multispectral image is processed with the IHS transform, and the three components I, H, and S are obtained. The component I and the panchromatic image are decomposed with NSCT...
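To make the role of the IHS step concrete, here is a minimal sketch of plain IHS substitution fusion. It assumes the linear IHS model, in which the intensity is the mean of the three bands; under that model, replacing I with the panchromatic value and inverting the transform is equivalent to adding the difference to every band. The sketch deliberately omits the NSCT decomposition and the parallelization that the abstract describes, and the function names are hypothetical.

```python
def ihs_fuse_pixel(r, g, b, pan):
    """Substitute the intensity of one multispectral pixel with the panchromatic value."""
    intensity = (r + g + b) / 3.0
    delta = pan - intensity  # detail carried by the panchromatic band
    return r + delta, g + delta, b + delta


def ihs_fuse_image(multispectral, panchromatic):
    """Fuse co-registered images given as rows of (r, g, b) tuples and rows of pan values."""
    return [
        [ihs_fuse_pixel(r, g, b, pan) for (r, g, b), pan in zip(ms_row, pan_row)]
        for ms_row, pan_row in zip(multispectral, panchromatic)
    ]
```

After fusion the mean of the three bands equals the panchromatic value while the differences between bands are preserved, which is why the abstract's method refines only the I component.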

Free

A remote sensing and GIS based approach for land use/cover, inundation and vulnerability analysis in Moscow, Russia

Choudhary Komal, Boori Mukesh Singh, Kupriyanov Alexander Victorovich

Research article

Monitoring land use/cover (LULC) change is very important for sustainable development planning. This research aims to understand the natural and environmental situation and its causes, such as intensity, distribution and socio-economic effects, in Moscow, Russia, based on remote sensing and Geographical Information System (GIS) techniques. A model was developed from the following thematic layers: land use/cover, vegetation, soil, geomorphology and geology, in ArcGIS 10.2 software using multi-spectral satellite data obtained from Landsat 7 and 8 for the years 1995, 2005 and 2016. With increasing scientific and political interest in regional aspects of global environmental change, there is a strong stimulus to better understand the patterns, causes and environmental consequences of LULC expansion in the elevation of Moscow state, one of the areas in the nation with fast economic growth and high population density. Inundation land loss scenarios of 70 to 300 m for surface water and sea level rise (SLR) were developed using digital elevation models of the study site topography through remote sensing and GIS techniques with ASTER GDEM and Landsat OLI data. The most severely impacted sectors are expected to be the vegetation, wetland and the natural ecosystem. Improved understanding of the extent and response of SLR will help in preparing for adaptation.

Free

A solution method for image distortion correction model based on bilinear interpolation

Li Jun, Su Jie, Zeng Xiliang

Research article

During image generation, nonlinearities of the imaging system itself or the perspective of the camera operator cause geometric distortion in the generated image. Image distortion is a kind of image degradation; correcting it requires a geometric transform of each pixel position of the distorted image so as to recover the original spatial relationships between pixels and the original grey value relations, and it is one of the important steps of image processing. From the point of view of digital image processing, distortion correction is in fact a process of image restoration for a degraded image. As digital image distortion correction is applied ever more widely, image restoration technology has become a research hotspot. To address image distortion, this paper puts forward an image distortion correction algorithm based on two-step, one-dimensional linear grey level interpolation, which reduces the computational complexity of the bilinear interpolation method. The distorted image is divided into multiple quadrilaterals whose area is associated with the degree of distortion of the image in the given region, and the regional distortion of each quadrilateral is expressed with a bilinear model, so that the parameters of the bilinear model are determined from the positions of the quadrilateral vertices in the target image and the distorted image...
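Grey level interpolation built from one-dimensional linear steps can be illustrated with a short sketch. It shows standard bilinear sampling written as two horizontal linear interpolations followed by one vertical interpolation; whether this matches the exact two-step scheme of the paper is an assumption, and the function names are hypothetical. Coordinates are assumed to lie inside the image.

```python
def lerp(a, b, t):
    """Linear interpolation between a and b for t in [0, 1]."""
    return a + (b - a) * t


def bilinear_sample(image, x, y):
    """Grey level at fractional position (x, y) of an image stored as rows of numbers.

    Assumes 0 <= x <= width - 1 and 0 <= y <= height - 1.
    """
    height, width = len(image), len(image[0])
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, width - 1), min(y0 + 1, height - 1)
    tx, ty = x - x0, y - y0
    top = lerp(image[y0][x0], image[y0][x1], tx)      # step 1: along the upper row
    bottom = lerp(image[y1][x0], image[y1][x1], tx)   # step 1: along the lower row
    return lerp(top, bottom, ty)                      # step 2: between the two rows
```

During correction, each pixel of the target image would be mapped to a fractional position in the distorted image and its grey level read with a function of this kind.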

Free

Acute ischemic stroke lesion segmentation in non-contrast CT images using 3D convolutional neural networks

Dobshik A.V., Verbitskiy S.K., Pestunov I.A., Sherman K.M., Sinyavskiy Yu.N., Tulupov A.A., Berikov V.B.

Research article

In this paper, an automatic algorithm for volumetric segmentation of acute ischemic stroke lesions in non-contrast computed tomography 3D brain images is proposed. Our deep-learning approach is based on the popular 3D U-Net convolutional neural network architecture, which was modified by adding squeeze-and-excitation blocks and residual connections. Robust pre-processing methods were implemented to improve the segmentation accuracy. Moreover, a special patch sampling strategy was used to address the large size of medical images and class imbalance and to stabilize neural network training. All experiments were performed using five-fold cross-validation on a dataset containing non-contrast computed tomography volumetric brain scans of 81 patients diagnosed with acute ischemic stroke. Two radiology experts manually segmented the images independently and then verified the labeling results for inconsistencies. The quantitative results of the proposed algorithm and the obtained segmentation were measured by the Dice similarity coefficient, sensitivity, specificity and precision metrics. The suggested pipeline improves Dice by 12.0 %, sensitivity by 10.2 % and precision by 10.0 % over the baseline and achieves an average Dice of 62.8 ± 3.3 %, sensitivity of 69.9 ± 3.9 %, specificity of 99.7 ± 0.2 % and precision of 61.9 ± 3.6 %, showing promising segmentation results.
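The four metrics named in the abstract have standard definitions in terms of true and false positives and negatives. The following sketch computes them for binary masks, assuming the masks are flattened to sequences of 0 and 1; the function name is hypothetical and the code says nothing about how the paper averaged results across folds.

```python
def segmentation_metrics(pred, truth):
    """Dice, sensitivity, specificity and precision for flat 0/1 label sequences."""
    pairs = list(zip(pred, truth))
    tp = sum(1 for p, t in pairs if p == 1 and t == 1)
    fp = sum(1 for p, t in pairs if p == 1 and t == 0)
    fn = sum(1 for p, t in pairs if p == 0 and t == 1)
    tn = sum(1 for p, t in pairs if p == 0 and t == 0)

    def ratio(num, den):
        # Undefined ratios (empty denominator) are reported as NaN
        return num / den if den else float("nan")

    return {
        "dice": ratio(2 * tp, 2 * tp + fp + fn),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "precision": ratio(tp, tp + fp),
    }
```

Specificity is dominated by the large number of background voxels in a brain scan, which is consistent with it being close to 100 % in the abstract while Dice and precision are much lower.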

Free

Adaptive color space model based on dominant colors for image and video compression performance improvement

Madenda Sarifuddin, Darmayantie Astie

Research article

This paper describes the use of several color spaces in the JPEG image compression algorithm and their impact on image quality and compression ratio, and then proposes adaptive color space models (ACSM) to improve the performance of lossy image compression. The proposed ACSM consists of a dominant color analysis algorithm and the YCoCg color space family. The YCoCg color space family is composed of three color spaces: YCcCr, YCpCg and YCyCb. The dominant color analysis algorithm automatically selects one of the three color space models based on its suitability for the dominant colors contained in an image. Experimental results on sixty test images with varying colors, shapes and textures show that the proposed adaptive color space model improves performance by 3 % to 10 % over the YCbCr, YDbDr, YCoCg and YCgCo-R color spaces. In addition, the YCoCg color space family is a discrete transformation, so its digital electronic implementation requires only two adders and two subtractors, for both forward and inverse conversions.
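The remark about two adders and two subtractors can be illustrated with the well-known reversible YCoCg-R lifting transform. This is a sketch of that standard transform only; the YCcCr, YCpCg and YCyCb variants proposed in the paper are not defined in the abstract and are not reproduced here. The function names are hypothetical.

```python
def rgb_to_ycocg_r(r, g, b):
    """Forward YCoCg-R: two subtractions, two additions and two shifts on integers."""
    co = r - b
    t = b + (co >> 1)
    cg = g - t
    y = t + (cg >> 1)
    return y, co, cg


def ycocg_r_to_rgb(y, co, cg):
    """Inverse YCoCg-R: undoes the lifting steps in reverse order."""
    t = y - (cg >> 1)
    g = cg + t
    b = t - (co >> 1)
    r = b + co
    return r, g, b
```

Because each lifting step is undone exactly by its mirror step, the round trip is lossless on integers, and no multiplier is needed in either direction.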

Free

Adjusting U-net for the aortic abdominal aneurysm CT segmentation case

Epifanov R.U., Nikitin N.A., Rabtsun A.A., Kurdyukov L.N., Karpenko A.A., Mullyadzhanov R.I.

Research article

In this paper, we address the development of a convolutional neural network for the problem of aneurysm segmentation into three classes and explore ways of improving the quality of the final segmentation masks. As a result of our study, the macro Dice score for the classes of interest reaches 83.12% ± 4.27%. We explored different augmentation styles and showed the importance of applying the intensity augmentation style to improve the robustness of the segmentation algorithm under conditions of clinical data diversity. Augmentation with spatial and intensity styles increases the macro Dice score by up to 3%. A comparison of various inference modes indicates that the combination of overlapping inference and segmentation window enlargement improves macro Dice by up to 1.4%. The overall improvement in the quality of segmentation masks, measured by macro Dice score, amounted to up to 6% using a combination of the data-based augmentation style and the advanced inference technique.

Free

Adjusting videoendoscopic 3D reconstruction results using tomographic data

Halavataya Katsiaryna Aliaksandrauna, Kozadaev Konstantin Vladimirovich, Sadau Vasiliy Sergeevich

Research article

Videoendoscopic and tomographic examinations are the two leading medical imaging solutions for detecting, classifying and characterizing a wide array of pathologies and conditions. However, the source information from these types of examination is very different, making it hard to cross-correlate them. The paper proposes a novel algorithm for combining the results of videoendoscopic and tomographic imaging based on 3D surface reconstruction methods. This approach makes it possible to align separate parts of two input 3D surfaces: the surface obtained by applying a bundle adjustment-based 3D surface reconstruction algorithm to the endoscopic video sequence, and the surface reconstructed directly from separate tomographic cross-section slice projections with regular density. The proposed alignment method applies local feature extractor and descriptor algorithms to the normal maps of the source surfaces. This alignment allows both surfaces to be equalized and normalized relative to each other. Results show that such an adjustment reduces noise, corrects reconstruction artifacts and errors, increases the representative quality of the resulting model and establishes the correctness of the reconstruction for hyperparameter tuning.

Free

Advanced Hough-based method for on-device document localization

Tropin Daniil Vyacheslavovich, Ershov Alexandr Mikhailovich, Nikolaev Dmitry Petrovich, Arlazarov Vladimir Viktorovich

Research article

The demand for on-device document recognition systems is increasing along with the emergence of stricter privacy and security requirements. In such systems, no data is transferred from the end device to third-party information processing servers. The response time is vital to the user experience of on-device document recognition. Combined with the unavailability of discrete GPUs, powerful CPUs, or a large RAM capacity on consumer-grade end devices such as smartphones, the time limitations put significant constraints on the computational complexity of the algorithms applied for on-device execution. In this work, we consider document location in an image without prior knowledge of the document content or its internal structure. According to published works, at least 5 systems offer solutions for on-device document location. All these systems use a location method which can be considered Hough-based. The precision of such systems seems to be lower than that of the state-of-the-art solutions, which were not designed to account for limited computational resources. We propose an advanced Hough-based method. In contrast with other approaches, it accounts for the geometric invariants of the central projection model and combines both edge and color features for document boundary detection. The proposed method achieved the second-best result on the SmartDoc dataset in terms of precision, surpassed only by a U-Net-like neural network. When evaluated on the more challenging MIDV-500 dataset, the proposed algorithm achieved the best precision among published methods. Our method retained its applicability to on-device computations.
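For readers unfamiliar with Hough-based localization, a minimal sketch of the classical Hough transform for straight lines is given here. It shows only the basic voting step that such methods build on, assuming edge points are already available as (x, y) pairs; it does not implement the projective invariants or the color features of the proposed method, and the function name is hypothetical.

```python
import math


def hough_strongest_line(points, n_theta=180):
    """Vote in (theta, rho) space and return the strongest line.

    A line is parameterized as x*cos(theta) + y*sin(theta) = rho.
    Returns (theta in degrees, rho rounded to an integer, number of votes).
    """
    accumulator = {}
    for x, y in points:
        for k in range(n_theta):
            theta = math.pi * k / n_theta
            rho = round(x * math.cos(theta) + y * math.sin(theta))
            accumulator[(k, rho)] = accumulator.get((k, rho), 0) + 1
    (k, rho), votes = max(accumulator.items(), key=lambda item: item[1])
    return 180.0 * k / n_theta, rho, votes
```

A document localizer would typically search the accumulator for four strong lines forming a plausible quadrilateral rather than for a single maximum.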

Free

Agricultural plant hyperspectral imaging dataset

Gaidel Andrey Viktorovich, Podlipnov Vladimir Vladimirovich, Ivliev Nikolay Aleksandrovich, Paringer Rustam Alexandrovich, Ishkin Pavel Aleksandrovich, Mashkov Sergey Vladimirovich, Skidanov Roman Vasilyevich

Research article

Detailed automated analysis of crop images is critical to the development of smart agriculture and can significantly improve the quantity and quality of agricultural products. A hyperspectral camera can potentially extract more information about the observed object than a conventional one, so its use can help in solving problems that are difficult to solve with conventional methods. Predictive models that solve such problems often require a large dataset for training. However, sufficiently large datasets of hyperspectral images of agricultural plants are not currently publicly available. Therefore, in this paper we present a new dataset of hyperspectral images of plants. This dataset can be accessed via the URL https://pypi.org/project/HSI-Dataset-API/. It contains 385 hyperspectral images with a spatial resolution of 512 by 512 pixels and a spectral resolution of 237 spectral bands. The images were captured in the summer of 2021 in Samara and Novocherkassk (Russia) using an Offner-based imaging hyperspectrometer of our own production. The article demonstrates some basic approaches to the analysis of hyperspectral images using the dataset and states problems for further study.

Free

An adaptive image inpainting method based on the modified Mumford-Shah model and multiscale parameter estimation

Thanh Dang Ngoc Hoang, Surya Prasath V. B., Son Nguyen Van, Hieu Le Minh

Research article

Image inpainting is the process of filling in missing and damaged parts of an image. Using the Mumford-Shah image model, image inpainting can be formulated as a constrained optimization problem. The Mumford-Shah model is a well-known and effective model for solving the image inpainting problem. In this paper, we propose an adaptive image inpainting method based on multiscale parameter estimation for the modified Mumford-Shah model. In the experiments, we compare the method with other similar inpainting methods to show that combining a classic model such as the modified Mumford-Shah model with multiscale parameter estimation is an effective way to solve the inpainting problem.

Free

An adaptive radial object recognition algorithm for lightweight drones in different environments

Song S., Liu J., Shleimovich M.P., Shakirzyanov R.M., Novikova S.V.

Research article

The paper proposes a group of radial shape object recognition methods capable of finding many different-sized circular objects in an image with high accuracy, in minimum time and under conditions of uneven brightness across frame areas. The methods are not computationally demanding, making them suitable for use in computer vision systems of light unmanned vehicles, which cannot carry powerful computing devices on board. The methods are also suitable for unmanned vehicles traveling at high speed, where image processing must be performed in real time. The proposed algorithms are robust to noise. When combined into a single group, the developed algorithms constitute a customizable set capable of adapting to different imaging conditions and computing power. This property allows the methods to be used for detecting objects of interest in different environments: from the air, from the ground, underwater, and when the vehicle moves between these environments. We propose three methods: a hybrid FRODAS method that combines the FRST and Hough methods to increase accuracy and reduce the time needed to search for circles in the image; a PaRCIS method based on sequential image compression and reconstruction, which increases the speed of searching for multiple circles of different radii and removes noise; and an additional LIPIS modification, which can be used with any of the primary or developed methods to reduce the sensitivity to sharp variations in frame brightness. The paper presents comparative experiments demonstrating the advantages of the developed methods over classical circle recognition methods in accuracy and speed, as well as the advantage in recognizing circles of different brightness. Experiments on recognizing multiple real-world objects in photographs taken on the ground, in the air, and underwater, with complex scenes under distortion and blurring and with different degrees of illumination, demonstrate the effectiveness of the set of methods.

Free

An automated method for finding the optimal parameters of adaptive filters for speckle denoising of SAR images

Pavlov Vitalii, Tuzova Anna, Belov Andrei, Matveev Yurij

Research article

Many different filters can be used to reduce multiplicative speckle noise on radar images. Most of these filters have some parameters whose values influence the result of filtering. Finding optimal values of such parameters may be a non-trivial task. In this paper, a formal automated method for finding optimal parameters of speckle noise reduction filters is proposed. Using a specially designed test image, optimal parameters for the most commonly used filters were found using several image quality assessment metrics, including the Structural Similarity Index (SSIM) and Gradient Magnitude Similarity Deviation (GMSD). The use of filters with optimal parameters allows processing (detection, segmentation, etc.) of radar images with minimal influence of speckle noise.

Free

An efficient U-shaped transformer network for low-light power image denoising

Zhang J., Huang W.X., Lu M.X., Li L.W., Wang X., Shen Y.P., Wang Y.F.

Research article

Unmanned aerial vehicle (UAV) inspection of transmission lines has been widely applied in recent years. However, in low-light weather conditions, random noise often appears in the captured transmission line images due to the combined effects of brightness, electromagnetic interference, and camera sensor limitations. This noise significantly undermines the quality and accuracy of the inspection. To address this challenge, we propose a novel transformer-based image denoising method called EUformer. First, we propose the Global Feature Compensator (GFC) module, which adaptively captures remote pixel dependencies for improved global image modelling. Second, we designed the Mixed-Gated feed-forward network (MG-FFN), to enhance the aggregation of local contextual information. Finally, the loss function is optimized by introducing a new regular term, effectively addressing negative effects such as artefacts in the reconstructed images. To assess the denoising capabilities of the EUformer model proposed in this study for transmission line images, we developed a benchmark dataset specifically for low-light transmission line image denoising. The results of extensive experiments demonstrate that the EUformer model achieves competitive performance while maintaining low complexity.

Free
