Image processing, pattern recognition. Section of the journal Computer Optics
Building detection by local region features in SAR images
Research article
Buildings are difficult to detect in SAR images, where their main distinguishing feature is the shadow. SAR shadows take many different forms, so a convolutional neural network cannot be applied to building detection directly. In this article we analyze the properties of SAR shadows for different types of buildings. Each region of interest (ROI) prepared for training the building detector is then corrected according to its own SAR shadow properties. The reconstructed ROIs are fed to a modified YOLO network, which detects buildings with better quality.
Free
Camera parameters estimation from pose detections
Research article
Some computer vision tasks become easier when the camera calibration is known. We propose a method for estimating camera focal length, location and orientation by observing human poses in the scene. The weak requirements on the observed scene make the method applicable to a wide range of scenarios. Our evaluation shows that, even when trained only on a synthetic dataset, the proposed method outperforms the known solution. Our experiments also show that using only human poses as input allows the proposed method to calibrate dynamic visual sensors.
Free
Central Russia heavy metal contamination model based on satellite imagery and machine learning
Research article
Atmospheric heavy metal contamination is a real threat to human health. In this work, we examined several models trained on in situ data and indices derived from satellite images. During 2018-2019, 281 samples of naturally growing mosses were collected in the Vladimir, Yaroslavl, and Moscow regions of Russia. The samples were analyzed using Neutron Activation Analysis to determine the contamination levels of 18 heavy metals. The Google Earth Engine platform was used to calculate indices from satellite images that summarize information about the sampling sites. Statistical and neural models were trained on the in situ data and the indices. We focused on the classification task with 8 levels of contamination and used balancing techniques to extend the training data. Three approaches were tested: variations of gradient boosting, a multilayer perceptron, and Siamese networks. All three produced results with minute differences, making it difficult to judge which is better in terms of accuracy and graphical outputs. Promising results were shown for 9 heavy metals, with an overall accuracy exceeding 89 %. Al, Fe, and Sb contamination was predicted for 3,000 and 12,100 grid nodes on a 500 km² area in the Central Russia region for 2019 and 2020. The results, methods, and perspectives of using satellite data together with machine learning for heavy metal (HM) contamination prediction are presented.
Free
Research article
The number of ultrasound (US) breast exams continues to grow because of the wider endorsement of breast cancer screening programs. When a solid lesion is found during a US exam, the primary task is to decide whether it requires a biopsy. Therefore, our goal was to develop a noninvasive US grayscale image analysis for differentiating benign and malignant solid breast lesions. We used a dataset consisting of 105 ultrasound images with 50 benign and 55 malignant non-cystic lesions. Features were extracted from the source image, the image of the gradient module after applying the Sobel filter, and the image after the Laplace filter. Subsequently, eight gray-level co-occurrence matrices (GLCM) were constructed for each lesion, and 13 Haralick textural features were calculated for each GLCM. Additionally, we computed the differences in feature values at different spatial shifts and the differences in feature values between the inner and outer areas of the lesion. The LASSO method was employed to determine the most significant features for classification. Finally, the lesion classification was carried out by various methods. Out of the 13 features selected by the LASSO method, four described the perilesional tissue, two represented the inner area of the lesion and five described the image of the gradient module. The final model achieved a sensitivity of 98%, specificity of 96%, and accuracy of 97%. Considering the perilesional area, Haralick feature differences, and the image of the gradient module can provide crucial parameters for accurate classification of US images. Features with a low AUC index (less than 0.6 in our case) can also be important for improving the quality of classification.
Free
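To make the texture step in the abstract above more concrete, here is a minimal sketch of building one gray-level co-occurrence matrix and computing three of the Haralick features from it (contrast, angular second moment, inverse difference moment). The function names, the symmetric normalization, and the 4×4 toy image are assumptions made for illustration; the abstract does not specify the quantization, the offsets of the eight matrices, or the implementation used.

```python
def glcm(image, dx, dy, levels):
    """Normalized, symmetric co-occurrence matrix for the pixel offset (dx, dy).

    `image` is a list of rows holding integer gray levels in [0, levels).
    The image must contain at least one pixel pair at the given offset.
    """
    counts = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for y in range(rows):
        for x in range(cols):
            ny, nx = y + dy, x + dx
            if 0 <= ny < rows and 0 <= nx < cols:
                i, j = image[y][x], image[ny][nx]
                counts[i][j] += 1  # pair (i, j)
                counts[j][i] += 1  # mirrored pair makes the matrix symmetric
    total = sum(sum(row) for row in counts)
    return [[c / total for c in row] for row in counts]


def haralick_subset(p):
    """Three of the Haralick texture features of a normalized GLCM `p`."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    return {
        "contrast": sum(v * (i - j) ** 2 for i, j, v in cells),
        "angular_second_moment": sum(v * v for _, _, v in cells),
        "inverse_difference_moment": sum(v / (1 + (i - j) ** 2) for i, j, v in cells),
    }


# Hypothetical 4x4 image with four gray levels (not data from the study).
toy_image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
matrix = glcm(toy_image, dx=1, dy=0, levels=4)
features = haralick_subset(matrix)
```

Repeating the call with other `(dx, dy)` offsets and subtracting the resulting feature values gives differences across spatial shifts of the kind the abstract mentions.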
Research article
Real-time object detection with computer vision on low-power devices is an economically attractive yet technically challenging task. The paper presents benchmark results for popular deep neural network models that are often used for this task. The experiments provide insights into the trade-offs between accuracy, speed, and computational efficiency of the MobileNetV2 SSD, CenterNet MobileNetV2 FPN, EfficientDet, YoloV5, YoloV7, YoloV7 Tiny and YoloV8 neural network models on Raspberry Pi 4B, Raspberry Pi 3B and NVIDIA Jetson Nano with TensorFlow Lite. We fine-tuned the models on our custom dataset prior to benchmarking and used post-training quantization (PTQ) and quantization-aware training (QAT) to optimize the models’ size and speed. The experiments demonstrated that the appropriate algorithm depends on the task requirements. We recommend EfficientDet Lite 512×512 quantized or YoloV7 Tiny for tasks that require around 2 FPS, EfficientDet Lite 320×320 quantized or SSD Mobilenet V2 320×320 for tasks with over 10 FPS, and EfficientDet Lite 320×320 or YoloV5 320×320 with QAT for tasks with intermediate FPS requirements.
Free
Research article
The study is a comparative analysis of two fast methods for detecting the reflection symmetry axis: an algorithm that refines the symmetry axis found with a chain of skeletal primitives, and a boundary method based on the Fourier descriptor. We tested the algorithms on binary raster images of plant leaves (FLAVIA database). The detection quality and performance indicate that both methods can be used to solve applied problems. Neither method demonstrated a significant advantage in accuracy or performance. It is advisable to integrate both methods when solving real-life problems.
Free
Research article
The main aim of this research is to compare the supervised k-nearest neighbor (KNN) classification algorithm with the unsupervised migrating means clustering (MMC) classification method on hyperspectral and multispectral data for spectral land cover classes, and to develop a spectral library of these classes in Samara, Russia. Accuracy assessment of the derived thematic maps was based on the confusion matrix statistics computed for each classified map, using the same set of validation points for consistency. We analyzed and compared Earth Observing-1 (EO-1) Hyperion hyperspectral data with Landsat 8 Operational Land Imager (OLI) and Advanced Land Imager (ALI) multispectral data. Hyperspectral imagers, currently available on airborne platforms, provide higher spectral resolution than existing space-based sensors and can document detailed information on the distribution of land cover classes, sometimes down to species level. The results indicate that KNN (overall accuracy of 95, 94 and 88 and kappa coefficients of 0.91, 0.89 and 0.85 for Hyperion, ALI and OLI, respectively) performs better than the unsupervised classification (overall accuracy of 93, 90 and 84 and kappa coefficients of 0.89, 0.87 and 0.81 for Hyperion, ALI and OLI, respectively). Developing a spectral library for land cover classes is a key component needed to facilitate advanced analytical techniques for monitoring land cover changes. Different land cover classes in Samara were sampled to create a common spectral library for mapping landscapes from remotely sensed data. Such libraries provide a physical basis for interpretation that is less subject to the conditions of specific data sets and facilitate a global approach to applying hyperspectral imagers to landscape mapping. In addition, it is demonstrated that the hyperspectral satellite image provides more accurate classification results than the multispectral satellite images.
The higher classification accuracy of supervised KNN was attributed principally to the ability of this classifier to identify optimally separated classes with low generalization error, thus producing the best possible class separation.
Free
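The accuracy figures quoted in the abstract above are the kind of confusion matrix statistics that can be computed as follows. This is a minimal sketch of overall accuracy and Cohen's kappa coefficient; the function names and the 2×2 matrix are hypothetical and are not taken from the study.

```python
def overall_accuracy(confusion):
    """Share of validation points on the diagonal of a square confusion matrix."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / total


def cohen_kappa(confusion):
    """Cohen's kappa: agreement corrected for the agreement expected by chance."""
    total = sum(sum(row) for row in confusion)
    observed = overall_accuracy(confusion)
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(col) for col in zip(*confusion)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    return (observed - expected) / (1 - expected)


# Hypothetical confusion matrix: rows are reference classes, columns are predictions.
example = [
    [45, 5],
    [10, 40],
]
accuracy = overall_accuracy(example)  # 85 of 100 points are on the diagonal
kappa = cohen_kappa(example)
```

Kappa is lower than the overall accuracy here because half of the agreement in this example would be expected by chance alone.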
Constraints for Jaccard index-based rotational symmetry focus position in binary images
Research article
This study proposes an analytical estimate of the size of the region of a binary raster figure that is guaranteed to contain the rotational symmetry focus. The focus here is the point that maximizes the Jaccard index between the initial figure and its rotated copy. The size of the region is determined by a lower estimate of the intersection area during the rotation of the approximating primitives, taking into account the sizes of the parts of the figure lying inside and outside the primitive. The smallest circumscribed circle or ellipse and sets of concentric circles and ellipses produced by principal component analysis were used as the approximating figures. To verify the hypothesis that the size of the region is insignificant compared to the area of the figure, we numerically simulated the proposed method on test image datasets.
Free
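The definition of the focus in the abstract above can be illustrated with a small brute-force search. The sketch below is restricted to 180-degree rotation so that rotated pixels stay on the integer grid, and it scans every candidate centre in the bounding box; both choices are assumptions made for illustration, whereas the study's contribution is an analytical bound that would shrink the region to be searched. All names are hypothetical.

```python
def jaccard(a, b):
    """Jaccard index |A ∩ B| / |A ∪ B| of two sets of pixel coordinates."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def rotate_180(pixels, cx2, cy2):
    """Rotate pixels by 180 degrees about the centre (cx2 / 2, cy2 / 2).

    The centre is passed in doubled coordinates so that half-pixel
    centres are still represented by integers.
    """
    return {(cx2 - x, cy2 - y) for x, y in pixels}


def find_focus(pixels):
    """Exhaustively search the bounding box for the centre with maximum Jaccard index."""
    xs = [x for x, _ in pixels]
    ys = [y for _, y in pixels]
    best_centre, best_score = None, -1.0
    for cx2 in range(2 * min(xs), 2 * max(xs) + 1):
        for cy2 in range(2 * min(ys), 2 * max(ys) + 1):
            score = jaccard(pixels, rotate_180(pixels, cx2, cy2))
            if score > best_score:
                best_centre, best_score = (cx2 / 2, cy2 / 2), score
    return best_centre, best_score


# Hypothetical S-shaped figure of four pixels with exact 180-degree symmetry.
figure = {(0, 0), (1, 0), (1, 1), (2, 1)}
focus, score = find_focus(figure)
```

The cost of this search grows with the number of candidate centres, which is why a guaranteed small region containing the focus is useful.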
Copy move forgery detection using key point localized super pixel based on texture features
Research article
The main challenge in image forensics is to devise a forgery detection method that can detect a copied region that has undergone rotation, scaling, reflection, compression, or all of these. The traditional SIFT method does not yield good enough results because its matching accuracy is low. To improve the accuracy of copy-move forgery detection, this paper proposes a detection method designed specifically for copy-move attacks, based on Key Point Localized Super Pixel (KLSP). The proposed approach combines super pixel segmentation using Lazy Random Walk (LRW) with key point extraction based on the Scale Invariant Feature Transform (SIFT). The experimental results indicate that the proposed KLSP approach outperforms previous well-known approaches.
Free
Crop growth monitoring through Sentinel and Landsat data based NDVI time-series
Research article
Crop growth monitoring is important for agricultural classification, yield estimation, field management, productivity improvement, irrigation, fertilizer management, sustainable agricultural development and food security, and for understanding how the environment and climate change affect crops, especially in Russia, which has large and diverse agricultural production. In this study, we derived monthly crop phenology from January to December 2018 using NDVI time series obtained from Sentinel and Landsat data of moderate to high spatio-temporal resolution for a cropland field in the Samara airport area, Russia. The results support the potential of NDVI time series derived from Sentinel and Landsat data for accurate crop phenological monitoring across all crop growth stages, such as active tillering, jointing, maturity and harvesting, in agreement with the crop calendar and with reasonable thematic accuracy. This NDVI-based work with satellite data has great potential to support the assessment of crop growth status and the above-mentioned objectives of sustainable agricultural development.
Free
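The NDVI time series mentioned in the abstract above rests on the standard index NDVI = (NIR − Red) / (NIR + Red). The sketch below computes it per observation and averages it by month. The reflectance values are made up for illustration, and the function names are hypothetical; with real data the inputs would typically be the near-infrared and red bands (B8 and B4 for Sentinel-2, B5 and B4 for Landsat 8 OLI).

```python
def ndvi(nir, red):
    """Normalized difference vegetation index; None when both bands are zero."""
    total = nir + red
    if total == 0:
        return None
    return (nir - red) / total


def monthly_mean_ndvi(observations):
    """Average NDVI per month from (month, nir, red) reflectance triples."""
    sums = {}
    for month, nir, red in observations:
        value = ndvi(nir, red)
        if value is None:
            continue  # skip pixels without a valid signal
        running_total, count = sums.get(month, (0.0, 0))
        sums[month] = (running_total + value, count + 1)
    return {month: t / c for month, (t, c) in sorted(sums.items())}


# Hypothetical surface reflectance samples (month, NIR, red); not data from the study.
samples = [
    (1, 0.20, 0.18),
    (4, 0.30, 0.15),
    (7, 0.50, 0.10),
    (7, 0.45, 0.09),
    (10, 0.25, 0.20),
]
series = monthly_mean_ndvi(samples)
peak_month = max(series, key=series.get)  # month with the highest mean NDVI
```

Comparing the position of the peak and the rising and falling parts of such a series with a crop calendar is one simple way to relate NDVI to growth stages.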
Cross-layer optimization technology for wireless network multimedia video
Research article
With the development of communication technology, wireless Internet has become more and more popular. Traditional layered network protocols cannot meet the demands of increasingly rich network services, especially video. This paper briefly introduces cross-layer transmission of video in wireless networks and the cross-layer optimization algorithm used to improve video transmission quality, and then improves the traditional cross-layer algorithm. The two cross-layer algorithms were simulated and analyzed in MATLAB. The results showed that, for the same signal to interference plus noise ratio at the receiving users, the packet delivery rate, peak signal to noise ratio and downlink throughput of the improved cross-layer algorithm were significantly higher than those of the traditional algorithm. As the signal to interference plus noise ratio of the receiving user increased, the packet delivery rate and peak signal to noise ratio of both algorithms increased and then stabilized beyond a certain signal to interference plus noise ratio, while the throughput of both algorithms increased linearly. In a real wireless network, the packet delivery rate, peak signal to noise ratio and throughput of video improved significantly after a cross-layer algorithm was applied, and the network using the improved cross-layer algorithm improved more. In summary, compared with the traditional cross-layer algorithm, the improved cross-layer algorithm better improves the transmission quality of video in wireless networks.
Free
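Two of the metrics compared in the abstract above have simple standard definitions, sketched below: peak signal to noise ratio, PSNR = 10·log10(MAX² / MSE), and packet delivery rate as the share of sent packets that arrive. The function names and the toy frames are hypothetical, and 8-bit samples (MAX = 255) are an assumption; the abstract does not describe how the study measured these quantities.

```python
import math


def mean_squared_error(reference, received):
    """Mean squared difference between two equally sized grayscale frames."""
    total, count = 0.0, 0
    for ref_row, rec_row in zip(reference, received):
        for ref_px, rec_px in zip(ref_row, rec_row):
            total += (ref_px - rec_px) ** 2
            count += 1
    return total / count


def psnr(reference, received, max_value=255):
    """Peak signal to noise ratio in dB; infinite for identical frames."""
    error = mean_squared_error(reference, received)
    if error == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / error)


def packet_delivery_rate(received_packets, sent_packets):
    """Fraction of sent packets that reached the receiver."""
    return received_packets / sent_packets


# Hypothetical 2x2 frames: one pixel is off by 10 gray levels.
sent_frame = [[0, 0], [0, 0]]
received_frame = [[0, 0], [0, 10]]
quality_db = psnr(sent_frame, received_frame)
delivery = packet_delivery_rate(95, 100)
```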
Research article
The purpose of this research is to create a technology for automated personalization of laser treatment of diabetic macular edema. The results are based on the analysis of large semi-structured data and on methods and algorithms for fundus image processing. The technology improves the quality of retinal laser coagulation in the treatment of diabetic macular edema, which is one of the main causes of pronounced vision loss. The proposed technology includes original solutions for establishing the optimal placement of multiple burns by determining the zones exposed to the laser. It also includes the recognition of large amounts of unstructured data on the anatomical and pathological structures in the area of edema and of optical coherence tomography data. As a result, uniform laser application to the pigment epithelium of the affected retina is ensured. This will increase the safety and effectiveness of the treatment, thus avoiding the use of more expensive treatment methods. Assessing the volume and quality of retinal lesions will make it possible to predict the results of laser photocoagulation and will help improve laser surgeons' skills. The architecture of the software complex comprises a number of modules, including image processing methods, algorithms for photocoagulation pattern mapping, and intelligent analysis methods.
Free
Document image analysis and recognition: a survey
Research article
This paper analyzes the problems of document image recognition and the existing solutions. Document recognition algorithms have been studied for a long time, yet the topic remains relevant and research continues, as evidenced by the large number of associated publications and reviews. However, most of these works and reviews are devoted to individual recognition tasks. This review considers the entire set of methods, approaches, and algorithms necessary for document recognition. A preliminary systematization allowed us to distinguish groups of methods for extracting information from documents of different types: single-page and multi-page, with text and handwritten contents, with a fixed template and a flexible structure, and digitized in different ways: scanning, photographing, video recording. We consider methods of document recognition and analysis applied to a wide range of tasks: identification and verification of identity, due diligence, machine learning algorithms, questionnaires, and audits. The groups of methods necessary for recognizing a single page image are examined: classical computer vision algorithms, i.e., keypoints, local feature descriptors, Fast Hough Transforms and image binarization, and modern neural network models for document boundary detection, document classification, document structure analysis (localization of text blocks and tables), extraction and recognition of details, and post-processing of recognition results. The review describes publicly available experimental data packages for training and testing recognition algorithms. Methods for optimizing the performance of document image analysis and recognition are also described.
Free
Efficiency of object identification for binary images
Research article
This paper presents a comparative analysis of the correlation-extreme method, the contour analysis method and the stochastic gradient identification method for identifying objects in a binary image. The results are obtained for the case where the possible deformations of an identified object with respect to a pattern can be reduced to a similarity model, that is, the pattern and the object may differ in scale, orientation angle, shift along the base axes, and additive noise. The identification of an object is understood as the recognition of its image together with an estimate of the deformation parameters relative to the pattern.
Free
Research article
Change detection from synthetic aperture radar images has become a key technique for detecting areas changed by phenomena such as floods and deformation of the earth's surface. This paper proposes a method for change detection from two synthetic aperture radar images based on transfer learning and the Residual Network architecture with 18 layers (ResNet-18). Before the proposed technique is applied, batch denoising with a convolutional neural network is applied to the two input synthetic aperture radar images to reduce speckle noise. To validate the performance of the proposed method, three known synthetic aperture radar datasets (Ottawa, Mexico and Taiwan Shimen) are used. These datasets are important because their ground truth is known, so their use can be regarded as a numerical simulation. The change image obtained by the proposed method is evaluated using two image metrics. The first is the image quality index, which measures the similarity between the obtained image and the ground truth image; the second is the edge preservation index, which measures how well the method preserves edges. Finally, the method is applied to determine the changed area using two Sentinel-1B synthetic aperture radar images of the Eddahbi dam in Morocco.
Free
Research article
This paper presents an experimental study of a prototype of an active-pulse television measuring system, aimed at assessing the accuracy of measuring the distance to objects using depth maps. The main technical characteristics and structure of the prototype are described, along with the multi-zone ranging method used in the experiment. The field tests were carried out using a system for constructing terrain orthophotomaps from an unmanned aerial vehicle and a geodetic measuring instrument, which serves as the reference for building a terrain plan and fixing distances between objects on the ground. The aerial survey technique used to obtain the necessary data array, from which a digital model and an orthophotomap of the area were subsequently built, is described. Conclusions are drawn about the accuracy of digital terrain models built from aerial photography by an unmanned aerial vehicle with a geodetic receiver on board and about the applicability of these data as reference data for testing a prototype of an active-pulse television measuring system.
Free
Face anti-spoofing with joint spoofing medium detection and eye blinking analysis
Research article
Modern biometric systems based on face recognition demonstrate high recognition quality, but they are vulnerable to face presentation attacks, such as photo or replay attacks. Existing face anti-spoofing methods are mostly based on texture analysis and, because of the lack of training data, use either hand-crafted features or fine-tuned pretrained deep models. In this paper we present a novel CNN-based approach to face anti-spoofing that relies on joint analysis of the presence of a spoofing medium and eye blinking. To train our classifiers, we propose a synthetic data generation procedure that allows us to train powerful deep models from scratch. Experimental analysis on challenging datasets (CASIA-FASD, NUAA Imposter) shows that our method can obtain state-of-the-art results.
Free
FaceDetectNet: face detection via fully-convolutional network
Research article
Face detection is one of the most popular computer vision tasks. Many face detection approaches have been proposed, including various CNN-based techniques, but the problem of balancing detection quality against computational speed is still relevant. In this paper we propose a new CNN-based solution for face detection called FaceDetectNet. Our CNN architecture is based on the ideas of the YOLO/DetectNet and GoogleNet architectures, supported by some new tools and implementation details created especially for our face detection application. We propose an original iterative proposal clustering (IPC) algorithm for aggregating the output face proposals formed by the CNN, and a 2-level “weak pyramid” that provides better detection quality on testing sets containing both small and huge images. Our face detection approach is close to the previously proposed SSD-based face detection, but the principal difference is that we use the deep features of the top hidden CNN layer to form face proposals of any size...
Free
Fusion of information from multiple Kinect sensors for 3D object reconstruction
Research article
In this paper, we estimate the accuracy of 3D object reconstruction using multiple Kinect sensors. First, we discuss the calibration of multiple Kinect sensors, and provide an analysis of the accuracy and resolution of the depth data. Next, the precision of coordinate mapping between sensors data for registration of depth and color images is evaluated. We test a proposed system for 3D object reconstruction with four Kinect V2 sensors and present reconstruction accuracy results. Experiments and computer simulation are carried out using Matlab and Kinect V2.
Free
Research article
Accurate detection of air bubble boundaries is of crucial importance for determining the performance of, and for studying, various gas/liquid two-phase flow systems. The main goal of this work is real-time edge extraction of air bubbles rising in a two-phase flow. To accomplish this, a fast algorithm based on local variance is improved and accelerated on the GPU to detect bubble contours. The proposed method is robust to changes in the intensity contrast of edges and gives high detection responses on low-contrast edges. The algorithm runs in two steps: first, the local variance of each pixel is computed using an integral image; then the resulting contours are thinned to generate the final edge map. We implemented the algorithm on an NVIDIA GTX 780 GPU. The parallel implementation gives a speedup factor of 17x for high-resolution images (1024×1024 pixels) compared to the serial implementation. Quantitative and qualitative comparisons of our algorithm with the most common edge detection algorithms from the literature were also performed. Our algorithm achieves remarkable performance in terms of both accuracy and computation time.
Free
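The first step named in the abstract above, local variance computed from an integral image, can be sketched on the CPU as follows. It uses the identity Var = E[x²] − (E[x])² with one summed-area table for the pixel values and one for their squares, so each window costs four lookups per table regardless of its size. This is a serial illustration only: the window shape, the border handling, and all names are assumptions, and neither the GPU kernel nor the thinning step of the study is reproduced.

```python
def integral_image(img):
    """Summed-area table with one extra row and column of zeros."""
    rows, cols = len(img), len(img[0])
    table = [[0.0] * (cols + 1) for _ in range(rows + 1)]
    for y in range(rows):
        row_sum = 0.0
        for x in range(cols):
            row_sum += img[y][x]
            table[y + 1][x + 1] = table[y][x + 1] + row_sum
    return table


def box_sum(table, y0, x0, y1, x1):
    """Sum over the half-open window [y0, y1) x [x0, x1)."""
    return table[y1][x1] - table[y0][x1] - table[y1][x0] + table[y0][x0]


def local_variance(img, radius):
    """Variance of each pixel's square neighbourhood, clipped at the image border."""
    rows, cols = len(img), len(img[0])
    sums = integral_image(img)
    square_sums = integral_image([[v * v for v in row] for row in img])
    out = [[0.0] * cols for _ in range(rows)]
    for y in range(rows):
        y0, y1 = max(0, y - radius), min(rows, y + radius + 1)
        for x in range(cols):
            x0, x1 = max(0, x - radius), min(cols, x + radius + 1)
            n = (y1 - y0) * (x1 - x0)
            mean = box_sum(sums, y0, x0, y1, x1) / n
            mean_sq = box_sum(square_sums, y0, x0, y1, x1) / n
            out[y][x] = max(0.0, mean_sq - mean * mean)  # clamp rounding error
    return out


# Hypothetical one-row image with a step edge between the second and third pixels.
step = [[0, 0, 10, 10]]
variance_map = local_variance(step, radius=1)
```

High values in the variance map mark candidate edge pixels; a threshold followed by thinning would then give a one-pixel-wide contour.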