Image processing, pattern recognition. Section of the journal Computer Optics

Publications in this section (280): Image processing, pattern recognition

Generation and study of the synthetic brain electron microscopy dataset for segmentation purpose

Sokolov N.A., Vasiliev E.P., Getmanskaya A.A.

Research article

Advanced microscopy technologies such as electron microscopy (EM) have opened up a new field of vision for biomedical researchers. Applying artificial intelligence methods to EM data is difficult largely because so little annotated data is available for training. Therefore, we add synthetic images to an annotated real EM dataset or use a fully synthetic training dataset. In this work, we present an algorithm for synthesizing 6 types of organelles. Based on the EPFL dataset, we generated a training set of 1161 real 256×256 fragments (ORG), a set of 2000 synthetic fragments (SYN), and their combination (MIX). Experiments with models trained for 6-class, 5-class, and binary segmentation showed that, despite the imperfections of the synthetic images, training on the mixed (MIX) dataset increased the Dice metric by about 0.1 for 6-class and 5-class segmentation and gave the same results for binary segmentation. The synthetic data strategy gives annotations for free, but shifts the effort to producing sufficiently realistic images.
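
For reference, the Dice metric cited in the abstract above measures the overlap between a predicted mask A and a ground-truth mask B as 2|A∩B| / (|A| + |B|). The following is a minimal sketch for flat binary masks in plain Python. The function name `dice_coefficient`, the sample masks, and the convention of returning 1.0 for two empty masks are illustrative assumptions, not details taken from the paper.

```python
def dice_coefficient(pred, target):
    """Dice overlap between two equal-length binary masks (sequences of 0/1)."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same length")
    # Count positions where both masks are foreground.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    # Total foreground size of both masks.
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    # Assumed convention: two empty masks count as a perfect match.
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


# Hypothetical 1-D masks; a 2-D mask would be flattened first.
predicted = [1, 1, 0, 0, 1, 0]
reference = [1, 0, 0, 0, 1, 1]
print(dice_coefficient(predicted, reference))
```

For multi-class segmentation, one common choice is to compute this value per class on one-vs-rest masks and average the results; whether the paper averages this way is not stated in the abstract.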

Free

Gradient-based technique for image structural analysis and applications

Asatryan David G.

Research article

This paper is devoted to the application of gradient field characteristics to selected problems of intelligent image analysis and processing. Several approaches and models based on gradient field characteristics are proposed for analysing the properties and structure of an image. The paper considers models based on the Weibull distribution, proposes an algorithm that estimates the dominant direction of an image from the parameters of the scattering ellipse of the gradient field components, and proposes a similarity measure for two images of arbitrary dimensions and orientation. Examples are given of applying these models to estimate the blur and structuredness of an image, to assess the quality of resizing and rotation algorithms, and to detect a specified object in an image delivered by an unmanned aerial vehicle.

Free

Head model reconstruction and animation method using color image with depth information

Kozlova Yu.Kh., Myasnikov V.V.

Research article

The article presents a method for reconstructing and animating a digital model of a human head from a single RGBD image, that is, a color RGB image with depth information. An approach is proposed for optimizing the parametric FLAME model using a point cloud of a face corresponding to a single RGBD image. Experimental studies have shown that the proposed optimization approach yields a head model with more prominent features of the original face than optimization approaches using RGB images or the same approaches generalized to RGBD images.

Free

High-speed recursive-separable image processing filters with variable scanning aperture sizes

Kamenskiy A.V., Kuryachiy M.I., Krasnoperova A.S., Ilyin Yu.V., Akaeva T.M., Boyarkin S.E.

Research article

As computer technologies develop, the number of their application areas naturally grows, and with it the complexity of the tasks to be solved, which calls for new research. Such tasks include digital image filtering in medical technologies and in active-pulse television measuring systems. Many digital filtering methods and algorithms are designed to improve image quality; algorithms that improve quality while reducing computational cost are widely used. The constant growth in the size of generated images places high demands on processing, and modern television systems additionally require real-time operation. Practical problems call for different filter aperture sizes, which improve quality and preserve image details. Solving these problems motivated the emergence of adaptive filters that can change their parameters while processing the received data, without spending additional processing time as the aperture size increases. The paper presents the principles of constructing adaptive image processing filters that take an input parameter specifying the required dimension of a multi-element aperture and build the required aperture. The Laplacian “Truncated Pyramid” filter and the “double pyramid” Laplacian were modified. A feature of these filters is that the multi-element aperture is odd-sized, so the coefficient used to build the mask is always odd. With these filters, two coefficients responsible for increasing the filtration efficiency can be used, since in their original form the Laplacian filters have coefficients that sum to zero. The experiment compares the proposed filters with high-dimensional filters implemented with classical two-dimensional convolution. The next stage of the research will be to apply parallel computing techniques, which will increase the speed of the developed filters.

Free

Image compression and encryption based on wavelet transform and chaos

Gao Haibo, Zeng Wenjuan

Research article

With the rapid development of network technology, more and more digital images are transmitted over networks, and they are gradually becoming an important means for people to access information. The security of image data is an increasingly prominent problem that demands attention. Current image encryption algorithms mostly focus on simple encryption in the frequency domain or the spatial domain, and the related methods have some shortcomings. Based on the characteristics of the wavelet transform and combining the advantages of chaotic mapping, this paper puts forward an image compression and encryption method based on the wavelet transform and chaos. The method introduces chaos and the wavelet transform into the digital image encryption algorithm: it transforms the image from the spatial domain to the wavelet frequency domain and adds hybrid noise to the high-frequency part of the wavelet transform, thus degrading the image and improving encryption security by combining spatial-domain and frequency-domain encryption based on the chaotic sequence and the excellent characteristics of the wavelet transform...

Free

Improvements of programing methods for finding reference lines on X-ray images

Al-Temimi Ammar Mudheher Sadeq, Pilidi Vladimir Stavrovich

Research article

The paper gives an overview of algorithms developed to obtain reference lines and angles on X-ray images. These geometrical characteristics are used in the medical analysis of human joints. We propose modifications of the algorithms based on the analysis of numerous X-ray images. These modifications greatly increased the calculation speed and improved the quality of the final results produced by the corresponding application. They also significantly reduce manual tuning of the program, which is now needed only in the rare cases when the properties of the given images differ significantly from the average.

Free

Integrating landscape ecological risk with ecosystem services in the Republic of Tatarstan, Russia

Boori Mukesh Singh, Choudhary Komal, Kupriyanov Alexander

Research article

Linking landscape ecological risk (LER) and ecosystem services (ESs) is a novel approach to environmental management and sustainable development, since it enables real-time decision-making. This study used 12 natural factors relevant to LER and 11 ES factors to analyze spatiotemporal changes and establish a relationship between them in Tatarstan, Russia, for the years 2010, 2015, and 2020. Statistical tests (Global Moran's I, Getis-Ord Gi*) and analyses of habitat vulnerability and ecological loss in the ArcGIS platform reveal a consistent variance in factor clustering and pattern, as well as the impact of governmental policies in the studied area. According to the findings, 2015 had the best ecological conditions of the three years because landscape ecological risk decreased in 44.79 % of the research area, which increased ecosystem services. Additionally, the results show that both maps have significant spatial disparities and that LER and ESs are negatively impacted by high human socioeconomic activity. Integrating LER and ESs by overlapping both maps provides a significant amount of spatial information for mapping, monitoring, managing, and protecting the fragile environment for sustainable landscape development and management.

Free

MIDV-2020: a comprehensive benchmark dataset for identity document analysis

Bulatov Konstantin Bulatovich, Emelianova Ekaterina Vladimirovna, Tropin Daniil Vyacheslavovich, Skoryukina Natalya Sergeevna, Chernyshova Yulia Sergeevna, Sheshkus Alexander Vladimirovich, Usilin Sergey Alexandrovich, Ming Zuheng, Burie Jean-Christophe, Luqman Muhammad Muzzamil, Arlazarov Vladimir Viktorovich

Research article

Identity document recognition is an important sub-field of document analysis. It deals with robust document detection, type identification, and text field recognition, as well as identity fraud prevention and document authenticity validation, given photos, scans, or video frames of a captured identity document. A significant amount of research has been published on this topic in recent years; however, a chief difficulty for such research is the scarcity of datasets, because the subject matter is protected by security requirements. The few available datasets of identity documents lack diversity of document types, capturing conditions, or variability of document field values. In this paper, we present the MIDV-2020 dataset, which consists of 1000 video clips, 2000 scanned images, and 1000 photos of 1000 unique mock identity documents, each with unique text field values and unique artificially generated faces, with rich annotation. The dataset contains 72409 annotated images in total, making it the largest publicly available identity document dataset as of the date of publication. We describe the structure of the dataset, its content and annotations, and present baseline experimental results to serve as a basis for future research. For the task of document location and identification, content-independent, feature-based, and semantic segmentation-based methods were evaluated. For the task of document text field recognition, the Tesseract system was evaluated at the field and character levels, with grouping by field alphabets and document types. For the task of face detection, the performance of a method based on Multi-Task Cascaded Convolutional Neural Networks was evaluated separately for different types of image input modes. The baseline evaluations show that existing methods of identity document analysis have a lot of room for improvement given modern challenges. We believe that the proposed dataset will prove invaluable for the advancement of the field of document analysis and recognition.

Free

MIDV-500: a dataset for identity document analysis and recognition on mobile devices in video stream

Arlazarov Vladimir Viktorovich, Bulatov Konstantin Bulatovich, Chernov Timofey Sergeevich, Arlazarov Vladimir Lvovich

Research article

A lot of research has been devoted to identity document analysis and recognition on mobile devices. However, no publicly available datasets designed for this particular problem currently exist. A few datasets are useful for associated subtasks, but more specialized datasets are required to facilitate a more comprehensive scientific and technical approach to identity document recognition. In this paper we present the Mobile Identity Document Video dataset (MIDV-500), consisting of 500 video clips for 50 different identity document types with ground truth, which enables research on a wide scope of document analysis problems. The paper presents characteristics of the dataset and evaluation results for existing methods of face detection, text line recognition, and document field data extraction. Since identity documents are sensitive because they contain personal data, all source document images used in MIDV-500 are either in the public domain or distributed under public copyright licenses. The main goal of this paper is to present a dataset. However, in addition and as a baseline, we present evaluation results for existing methods for face detection, text line recognition, and document data extraction, using the presented dataset.

Free

Method for removing haze from images, captured under a wide range of lighting conditions

Filin A.I., Kopylov A.V., Gracheva I.A.

Research article

The presence of haze in images degrades the quality of perception and automatic analysis of scenes. One of the most popular haze removal methods is the dark channel prior method, which is based on the Koschmieder atmospheric scattering model. However, its underlying assumptions are not met at night, since localized light sources make a significant, if not the main, contribution to lighting. We propose to use the degree to which an image element belongs to a localized light source, determined by a one-class classifier, as a measure of confidence in the corresponding element of the estimated transmission map during its rectification based on the gamma-normal model. This makes it possible to increase the accuracy of dehazing for images captured in low-light or nighttime conditions.
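
As background for the abstract above, the Koschmieder model writes a hazy pixel as I = J·t + A·(1 − t), where J is the haze-free scene, A the atmospheric light, and t the transmission. The standard dark channel prior estimates t from the minimum intensity over color channels and a local window. The sketch below shows only that baseline step, not the authors' one-class classifier or gamma-normal rectification. The function names, the nested-list image layout, the window radius, and the value `omega=0.95` (a commonly used setting) are assumptions made for illustration.

```python
def dark_channel(image, radius=1):
    """Dark channel of an image given as rows of (r, g, b) tuples in [0, 1]."""
    height, width = len(image), len(image[0])
    # Per-pixel minimum over the three color channels.
    channel_min = [[min(pixel) for pixel in row] for row in image]
    dark = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Minimum over a square window clipped at the image border.
            window = [
                channel_min[j][i]
                for j in range(max(0, y - radius), min(height, y + radius + 1))
                for i in range(max(0, x - radius), min(width, x + radius + 1))
            ]
            dark[y][x] = min(window)
    return dark


def estimate_transmission(image, airlight, omega=0.95, radius=1):
    """Baseline transmission estimate: t = 1 - omega * dark_channel(I / A)."""
    normalized = [
        [tuple(c / a for c, a in zip(pixel, airlight)) for pixel in row]
        for row in image
    ]
    return [[1.0 - omega * d for d in row] for row in dark_channel(normalized, radius)]


# Hypothetical 1x2 image and white atmospheric light.
tiny_image = [[(0.2, 0.5, 0.9), (0.4, 0.6, 0.8)]]
print(estimate_transmission(tiny_image, airlight=(1.0, 1.0, 1.0), radius=0))
```

Near a bright localized light source the dark channel is large, so this baseline reports low transmission even where there is little haze, which is the nighttime failure the abstract describes.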

Free

Multispectral optoelectronic device for controlling an autonomous mobile platform

Titov Vitaliy Semenovich, Spevakov Alexander Gennadyevich, Primenko Dmitry Vladimirovich

Research article

The paper substantiates the use of multispectral optoelectronic sensors to improve the positioning accuracy of autonomous mobile platforms. A mathematical model of the operation of the developed device is proposed. Its distinctive feature is the joint processing of signals from a lidar and from sensors operating in the ultraviolet, visible, and infrared ranges. It reduces the computational complexity of detecting dynamic and stationary objects within the device's field of view by processing data on the diffuse reflectivity of materials. The paper presents the functional organization of a multispectral optoelectronic device that detects and classifies objects in the working scene in less time than comparable devices. In the course of experimental research, the validity of the mathematical model was evaluated and empirical data were obtained using the proposed hardware and software test stand. The estimated accuracy for a detected object at a distance of up to and including 100 m is within 0.95. At distances of more than 100 m, it decreases because of the operating range of the lidar. The error in determining spatial coordinates is exponential in character and also increases sharply at distances close to 100 m.

Free

Mutual modality learning for video action classification

Komkov S.A., Dzabraev M.D., Petiushko A.A.

Research article

Models for video action classification are progressing rapidly. However, the performance of those models can still be easily improved by ensembling with the same models trained on different modalities (e.g., optical flow). Unfortunately, it is computationally expensive to use several modalities during inference. Recent works examine ways to integrate the advantages of multi-modality into a single RGB model. Yet, there is still room for improvement. In this paper, we explore various methods to embed the ensemble power into a single model. We show that proper initialization, as well as mutual modality learning, enhances single-modality models. As a result, we achieve state-of-the-art results on the Something-Something-v2 benchmark.

Free

Noise reduction and mammography image segmentation optimization with novel QIMFT-SSA method

Soewondo Widiastuti, Haji Salih Omer, Eftekharian Mohsen, Marhoon Haydar A., Dorofeev Aleksei Evgenievich, Jawad Mohammed Abed, Jabbar Abdullah Hasan, Jalil Abduladheem Turki

Research article

Breast cancer is one of the most dreaded diseases affecting women worldwide and has led to many deaths. Early detection of breast masses prolongs life expectancy in women, and an automated system for detecting breast masses therefore supports radiologists in making accurate diagnoses. Computer-aided design techniques aim to provide an optimal approach, with the highest speed and accuracy, for determining the exact area of breast tumors, so that a decision support management system can assist physicians. This study proposes an optimal approach to reducing noise in mammographic images: it identifies salt-and-pepper, Gaussian, Poisson, and impact noise and, after noise reduction, determines the exact mass. It offers a noise reduction method called Quantum Inverse MFT Filtering and a precise mass segmentation method called the Optimal Social Spider Algorithm (SSA) for mammographic images. The hybrid approach, called QIMFT-SSA, is compared with previous methods using the peak signal-to-noise ratio (PSNR) and mean squared error (MSE) for noise reduction and detection accuracy for mass area recognition. The proposed method outperforms state-of-the-art methods in noise reduction and segmentation.
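
The two noise-reduction criteria named in the abstract above have standard definitions: MSE is the mean of squared pixel differences, and PSNR = 10·log10(MAX² / MSE), where MAX is the largest possible pixel value. The sketch below implements these definitions for flat pixel sequences. The function names and the default `max_value=255.0` (8-bit images) are assumptions for illustration and say nothing about how the paper computed its scores.

```python
import math


def mse(reference, processed):
    """Mean squared error between two equal-length pixel sequences."""
    if len(reference) != len(processed) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((r - p) ** 2 for r, p in zip(reference, processed)) / len(reference)


def psnr(reference, processed, max_value=255.0):
    """Peak signal-to-noise ratio in decibels; infinite for identical inputs."""
    error = mse(reference, processed)
    if error == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / error)


# Hypothetical pixel values: a clean signal and a denoised estimate of it.
clean = [52, 55, 61, 59]
denoised = [50, 56, 60, 62]
print(mse(clean, denoised), psnr(clean, denoised))
```

A lower MSE and a higher PSNR both indicate that the filtered image is closer to the reference.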

Free

Novel approach of simplification detected contours on X-ray medical images

Al-Temimi Ammar Mudheher Sadeq, Pilidi Vladimir Stavrovich, Ibraheem Murooj Khalid Ibraheem

Research article

This paper describes a method for reducing the number of points that represent the detected contours of bones on digital X-ray images. This simplification makes it easier to correct the location of these points when the analyzed image is of poor quality, and it reduces the time needed to analyze the image and obtain the reference lines and angles used to diagnose the area under investigation.

Free

On the automation of gestalt perception in remotely sensed data

Michaelsen Eckart

Research article

Gestalt perception, the laws of seeing, and perceptual grouping are rarely addressed in the context of remotely sensed imagery. This paper reviews the state of the art in both machine vision and remote sensing, in particular concerning urban areas. Automatic methods can be separated into three types: 1) knowledge-based inference, which needs machine-readable knowledge, 2) automatic learning methods, which require labeled or unlabeled example images, and 3) perceptual grouping along the lines of the laws of seeing, which should be pre-coded and should work on any kind of imagery, but in particular on urban aerial or satellite data. Perceptual grouping of parts into aggregates is a combinatorial problem, and exhaustive enumeration of all combinations is intractable. The paper presents a constant-false-alarm-rate search rationale. An open problem is the choice of extraction method for the primitive objects to start with. Here, superpixel segmentation is used.

Free

One-shot learning with triplet loss for vegetation classification tasks

Uzhinskiy Alexander Vladimirovich, Ososkov Gennady Alexeevich, Goncharov Pavel Vladimirovich, Nechaevskiy Andrey Vasilevich, Smetanin Artem Alekseevich

Research article

The triplet loss function is one of the options that can significantly improve accuracy in one-shot learning tasks. Since 2015, many projects have used Siamese networks and this kind of loss for face recognition and object classification. In our research, we focused on two tasks related to vegetation. The first is plant disease detection on 25 classes of five crops (grape, cotton, wheat, cucumbers, and corn). This task is motivated by the fact that harvest losses due to diseases are a serious problem for both large farming structures and rural families. The second task is the identification of moss species (5 classes). Mosses are natural bioaccumulators of pollutants; therefore, they are used in environmental monitoring programs. The identification of moss species is an important step in sample preprocessing. In both tasks, we used self-collected image databases. We tried several deep learning architectures and approaches. Our Siamese network architecture with a triplet loss function and MobileNetV2 as the base network showed the most impressive results in both tasks. The average accuracy was over 97.8 % for plant disease detection and 97.6 % for moss species classification.
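
The triplet loss mentioned in the abstract above pulls an anchor embedding toward a positive example of the same class and pushes it away from a negative example, commonly written as max(0, d(a, p) − d(a, n) + margin). The sketch below is one common form that uses plain Euclidean distance on small vectors; some implementations use squared distances instead. The function names, the margin of 1.0, and the sample embeddings are illustrative assumptions, and the abstract does not say which variant the authors trained with.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge on the gap between anchor-positive and anchor-negative distances."""
    gap = euclidean(anchor, positive) - euclidean(anchor, negative)
    # Zero loss once the negative is at least `margin` farther than the positive.
    return max(0.0, gap + margin)


# Hypothetical 2-D embeddings.
anchor = [0.0, 0.0]
same_class = [0.0, 1.0]
other_class = [3.0, 4.0]
print(triplet_loss(anchor, same_class, other_class))
```

In a Siamese setup, the three embeddings would come from the same base network applied to three images, so a single set of weights is updated by this loss.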

Free

Research on foreign body detection in transmission lines based on a multi-UAV cooperative system and YOLOV7

Chang R., Mao Zh., Hu J., Bai H., Zhou Ch., Yang Ya., Gao Sh.

Research article

The unique plateau geography and variable weather of Yunnan, China make transmission lines in this region more susceptible to coverage and damage by various foreign bodies than lines in flat areas. The mountainous terrain also presents great challenges for inspecting and removing such objects. To improve the efficiency and detection accuracy of foreign body inspection on transmission lines, this paper proposes a multi-UAV collaborative system designed specifically for the geographical characteristics of Yunnan's transmission lines. Additionally, the image data of foreign bodies was augmented, and the YOLOv7 target detection model, which offers a more balanced trade-off between precision and speed, was adopted to improve the accuracy and speed of foreign body detection.

Free

Rice growth vegetation index 2 for improving estimation of rice plant phenology in costal ecosystems

Choudhary Komal, Shi Wen-Zhong John, Dong Yanni

Research article

Crop growth is one of the most important parameters of a crop, and knowing it before harvest is essential to farmers, scientists, governments, and agribusiness. This paper provides a novel demonstration of the use of freely available Sentinel-2 data to estimate rice crop growth in a single year. Sentinel-2 data provide frequent and consistent information that facilitates coastal monitoring at field scale. The aim of this study was to modify the rice growth vegetation index to improve the estimation of rice growth phenology in coastal areas. Rice growth vegetation index 2 proved to be the best vegetation index when compared with 11 vegetation indices, plant height, and biomass. The results demonstrate that rice growth vegetation index 2, with a coefficient of 0.83, has the highest correlation with plant height. Rice growth vegetation index 2 is more appropriate for enhancing and obtaining rice phenology information. This study analyses the best spectral vegetation indices for estimating rice growth.

Free

Road images augmentation with synthetic traffic signs using neural networks

Konushin Anton Sergeevich, Faizov Boris Vladimirovich, Shakhuro Vladislav Igorevich

Research article

Traffic sign recognition is a well-researched problem in computer vision. However, state-of-the-art methods work only for frequent sign classes, which are well represented in training datasets. We consider the task of rare traffic sign detection and classification and aim to solve it using synthetic training data. Such training data is obtained by embedding synthetic images of signs in real photos. We propose three methods for making synthetic signs consistent with a scene in appearance. These methods are based on modern generative adversarial network (GAN) architectures. Our proposed methods allow realistic embedding of rare traffic sign classes that are absent in the training set. We adapt a variational autoencoder for sampling plausible locations of new traffic signs in images. We demonstrate that using a mixture of our synthetic data with real data improves the accuracy of both classifier and detector.

Free

Robust hybrid technique for moving object detection and tracking using cartoon features and fast PCP

Jeevith S.H., Lakshmikanth S.

Research article

Moving object detection is an essential step in various computer vision applications. Principal Component Analysis (PCA) techniques are often used for this purpose. However, their performance is degraded by camera shake, hidden moving objects, dynamic background scenes, and/or fluctuating exposure. Robust Principal Component Analysis (RPCA) is a useful approach for reducing stationary background noise, as it can recover low-rank matrices. That is, the moving object is formed by the low power models and the static background of RPCA. This paper proposes a simple alternative minimization algorithm to fix minor discrepancies in the original Principal Component Pursuit (PCP) or RPCA function. A novel hybrid method is presented in which cartoon texture features are used as the data matrix for RPCA, taking into account low-rank and sparse matrices. A new non-convex function is proposed to better control the low-rank properties of the video background. Simulation results demonstrate that the proposed algorithm is capable of giving consistent random estimates and can indeed improve the accuracy of object recognition in comparison with existing methods.

Free
