Journal articles - Computer Optics
All articles: 2382

Three-dimensional model of quantum dots' self-assembly under the action of laser radiation
Research article
This study considers the self-assembly of quantum dots into nanostructure arrays with predefined geometry in an external resonant laser field. We consider the simplest case, the assembly of a stable structure of two particles. The problem was solved numerically using a three-dimensional model of Brownian dynamics. The idea of the method is that the dots attract each other through the interaction of resonantly induced dipole moments and are then captured by the van der Waals force. Using the three-dimensional model, we calculated the average nanoparticle aggregation time as a function of the laser radiation wavelength and estimated the probability of such structures forming, given the calculated average aggregation time and the laser pulse duration used in the experiment. The wavelength of maximum probability was found to be red-shifted from the single-particle resonance wavelength of 525 nm to 535 nm, which is in qualitative agreement with the redshift of the resonance wavelength of interacting particles.
Free

Threshold image target segmentation technology based on intelligent algorithms
Research article
This paper briefly introduces the optimal threshold calculation model and the particle swarm optimization (PSO) algorithm for image segmentation, and then improves the PSO algorithm. The standard and improved PSO algorithms were compared in MATLAB simulations of image segmentation. The results show that the improved PSO algorithm converges faster and reaches a higher fitness value. Subjectively, the improved PSO algorithm gives the better segmentation; in terms of quantitative objective data, the image it produces has higher regional consistency and takes less time to compute. In conclusion, the improved PSO algorithm is effective in image segmentation.
Free
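
A minimal sketch of threshold selection with a standard PSO may help make the idea in the abstract above concrete. It is not the authors' improved algorithm or their MATLAB code: the fitness function (Otsu's between-class variance), the function names `between_class_variance` and `pso_threshold`, and all parameter values (swarm size, inertia weight `w`, coefficients `c1`, `c2`) are assumptions chosen for illustration.

```python
import numpy as np

def between_class_variance(hist, t):
    """Otsu's criterion for splitting a 256-bin histogram at threshold t.

    Assumed fitness function; the abstract does not name the one it uses.
    """
    p = hist / hist.sum()
    levels = np.arange(256)
    w0 = p[:t].sum()          # weight of the class with gray levels < t
    w1 = 1.0 - w0             # weight of the class with gray levels >= t
    if w0 == 0 or w1 == 0:
        return 0.0
    mu0 = (levels[:t] * p[:t]).sum() / w0
    mu1 = (levels[t:] * p[t:]).sum() / w1
    return w0 * w1 * (mu0 - mu1) ** 2

def pso_threshold(hist, n_particles=20, n_iter=50, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Search for the threshold maximizing the fitness with a standard PSO."""
    rng = np.random.default_rng(seed)

    def fitness(x):
        return between_class_variance(hist, int(round(float(x))))

    pos = rng.uniform(1, 255, n_particles)   # candidate thresholds
    vel = np.zeros(n_particles)
    pbest = pos.copy()                       # best position seen by each particle
    pbest_fit = np.array([fitness(x) for x in pos])
    gbest = pbest[pbest_fit.argmax()]        # best position seen by the swarm

    for _ in range(n_iter):
        r1 = rng.random(n_particles)
        r2 = rng.random(n_particles)
        # Velocity update: inertia + pull toward personal and global bests.
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = np.clip(pos + vel, 1, 255)
        fit = np.array([fitness(x) for x in pos])
        improved = fit > pbest_fit
        pbest[improved] = pos[improved]
        pbest_fit[improved] = fit[improved]
        gbest = pbest[pbest_fit.argmax()]

    return int(round(float(gbest)))
```

For an 8-bit grayscale image `img`, the histogram can be built with `np.bincount(img.ravel(), minlength=256)` and the segmentation obtained as `img >= pso_threshold(hist)`. For a single threshold an exhaustive search over 255 values is cheap, so PSO pays off mainly in the multi-threshold case, where the search space grows quickly.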

Research article
In this work a solitary surface plasmon-polariton was obtained by using a frequency-dependent finite difference time-domain method for the TM- and radially polarized light at 532 nm, which was propagated through silver nano-elements (a nano-strip and a nano-ring), placed in an aqueous medium. The device's height and width were equal to 20 nm and 215 nm respectively. The intensity of surface plasmon-polariton was four times higher than that of the incident radiation. The full width at half maximum of the nanojet was 138 nm and 158 nm for the case of using a nano-strip and a nano-ring respectively. The results can be used to design devices that allow capturing and moving particles in water or other biofluids.
Free

Tightly focused laser light with azimuthal polarization and singular phase
Research article
Using simplified Richards-Wolf formulas, we show that laser light with azimuthal polarization and singular phase can produce a smaller focal spot than a laser beam with radial polarization, other conditions being equal. It is shown numerically that focusing an azimuthally polarized laser beam with a phase singularity using a zone plate yields a focal spot 1.3 times smaller than focusing with an aplanatic lens. A spiral phase plate can be replaced with a phase step with a π phase shift. In this case the subwavelength focal spot from a laser beam with azimuthal polarization, which is formed near the zone plate surface, loses circular symmetry, becoming smaller and acquiring an elliptical shape with radii of 0.273λ and 0.314λ (NA=1).
Free

Time-optimal algorithms focused on the search for random pulsed-point sources
Research article
The article describes methods and algorithms related to the analysis of dynamically changing discrete random fields. Time-optimal strategies for the localization of pulsed-point sources having a random spatial distribution and indicating themselves by generating instant delta pulses at random times are proposed. An optimal strategy is a procedure that has a minimum (statistically) average localization time. The search is performed in accordance with the requirements for localization accuracy and is carried out by a system with one or several receiving devices. Along with the predetermined accuracy of localization of a random pulsed-point source, a significant complicating factor of the formulated problem is that the choice of the optimal search procedure is not limited to one-step algorithms that end at the moment of first pulse generation. Moreover, the article shows that even with relatively low requirements for localization accuracy, the time-optimal procedure consists of several steps, and the transition from one step to another occurs at the time of registration of the next pulse by the receiving system. In this case, the situation is acceptable when during the process of optimal search some of the generated pulses are not fixed by the receiving system. The parameters of the optimal search depending on the number of receiving devices and the required accuracy of localization are calculated and described in the paper.
Free

Tiny CNN for feature point description for document analysis: approach and dataset
Research article
In this paper, we study the problem of feature point description in the context of document analysis and template matching. Our study shows that task-specific training data is required, especially if we are to train a lightweight neural network usable on devices with limited computational resources. We construct and provide a dataset of photographed and synthetically generated images, together with a method for generating training patches from it. We demonstrate the effectiveness of this data by training a lightweight neural network and show how it performs in both general and document patch matching. Training on the provided dataset was compared with training on the HPatches training dataset; for testing, we solve the HPatches testing framework tasks and a template matching task on two publicly available datasets with various documents pictured on complex backgrounds: MIDV-500 and MIDV-2019.
Free

Towards a unified framework for identity documents analysis and recognition
Research article
Identity document recognition goes far beyond classical optical character recognition problems. Automated ID document recognition systems are tasked not only with extracting editable and transferable data but also with validating identity and preventing fraud, at an increasingly high cost of error. A significant amount of research is directed at creating ID analysis systems focused on a specific subset of document types or a particular mode of image acquisition. However, one of the challenges of the modern world is the increasing demand for identity document recognition from a wide variety of image sources, such as scans, photos, or video frames, and under virtually uncontrolled capturing conditions. In this paper, we describe the scope and context of the identity document analysis and recognition problem and its challenges; analyze existing work on implementing ID document recognition systems; and set the task of constructing a unified framework for identity document recognition that would be applicable to different types of image sources and capturing conditions and scalable enough to support a large number of identity document types. The aim of the presented framework is to serve as a basis for developing new methods and algorithms for ID document recognition, as well as for the far harder challenges of identity document forensics, fully automated personal authentication, and fraud prevention.
Free

Towards monitored tomographic reconstruction: algorithm-dependence and convergence
Research article
The monitored tomographic reconstruction (MTR) with optimized photon flux technique is a pioneering method for X-ray computed tomography (XCT) that reduces the data acquisition time and the radiation dose. The capturing of projections in the MTR technique is guided by a scanning protocol built on similar experiments to reach a predetermined reconstruction quality. This method achieves an average reconstruction quality similar to that of ordinary tomography while using a lower mean number of projections. In this paper, we, for the first time, systematically study the MTR technique under several conditions: reconstruction algorithm (FBP, SIRT, SIRT-TV, and others), type of tomography setup (micro-XCT and nano-XCT), and objects with different morphology. It was shown that the mean dose reduction for reconstruction with a given quality varies only slightly with the choice of reconstruction algorithm, and reaches up to 12.5 % for micro-XCT and 8.5 % for nano-XCT. The obtained results allow us to conclude that the monitored tomographic reconstruction approach can be universally combined with an algorithm of choice to perform a controlled trade-off between radiation dose and image quality. Validation of the protocol on independent common ground truth demonstrated good convergence of all reconstruction algorithms within the MTR protocol.
Free

Traffic extreme situations detection in video sequences based on integral optical flow
Research article
Road traffic analysis is an important task in many applications, and it can be used in video surveillance systems to prevent many undesirable events. In this paper, we propose a new method based on integral optical flow to analyze car movement in video and to detect extreme traffic flow situations in real-world videos. First, integral optical flow is calculated for video sequences from the optical flow, which eliminates random background motion; second, pixel-level motion maps that describe car movement from different perspectives are created from the integral optical flow; third, region-level indicators are defined and calculated; finally, threshold segmentation is used to identify different types of car movement. We also define and calculate several parameters of the moving car flow, including direction, speed, density, and intensity, without detecting and counting cars. Experimental results show that our method can effectively identify directional movement, divergence, and accumulation of cars.
Free
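
The pipeline in the abstract above (integral optical flow, pixel-level motion map, region-level indicator, threshold segmentation) can be sketched roughly as follows. This is a simplified illustration, not the authors' method: it assumes the per-frame flow fields are already available as arrays of shape `(H, W, 2)`, sums them per pixel without following pixel trajectories, and uses hypothetical function names and thresholds. The point it shows is that zero-mean background jitter tends to cancel under accumulation while consistent motion grows.

```python
import numpy as np

def integral_optical_flow(flows):
    """Accumulate per-frame flow fields of shape (H, W, 2) into one field.

    Simplifying assumption: displacements are summed per pixel position.
    """
    return np.sum(np.stack(flows, axis=0), axis=0)

def motion_map(integral_flow, threshold):
    """Pixel-level map: True where the accumulated displacement is large."""
    magnitude = np.linalg.norm(integral_flow, axis=-1)
    return magnitude > threshold

def region_direction(integral_flow, mask):
    """Region-level indicator: mean accumulated displacement inside the mask."""
    return integral_flow[mask].mean(axis=0)

# Synthetic example: noisy background plus a block moving 1 px/frame along x.
rng = np.random.default_rng(0)
height = width = 32
flows = []
for _ in range(20):
    frame_flow = rng.normal(0.0, 0.3, size=(height, width, 2))
    frame_flow[8:16, 8:16, 0] += 1.0
    flows.append(frame_flow)

accumulated = integral_optical_flow(flows)
moving = motion_map(accumulated, threshold=10.0)
direction = region_direction(accumulated, moving)
```

In practice the per-frame flow fields would come from a dense optical flow estimator, and the threshold would be tuned to the frame rate and scene scale.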

Research article
Three-dimensional perception applications have been growing since Light Detection and Ranging devices became more affordable. Among those applications, navigation and collision avoidance systems stand out for their importance in autonomous vehicles, which are drawing an appreciable amount of attention these days. On-road object classification from three-dimensional information is a solid base for an autonomous vehicle perception system, but several factors make the analysis of the captured information challenging. In these applications, objects are captured from only one side, their shapes are highly variable, and occlusions are common. The biggest challenge, however, comes from low resolution, which leads to a significant performance drop in classification methods. While most classification architectures tend to get bigger to obtain deeper features, we explore the opposite direction, contributing to the implementation of low-cost mobile platforms that could use low-resolution detection and ranging devices. In this paper, we propose an approach for on-road object classification under extremely low-resolution conditions. It directly uses three-dimensional point clouds as sequences in a transformer-convolutional architecture that could be useful on embedded devices. Our proposal reaches an accuracy of 89.74 % when tested on objects represented with only 16 points extracted from the Waymo, Lyft’s Level 5, and KITTI datasets. It runs in real time (22 Hz) on a single-core 2.3 GHz processor.
Free

Tree-serial parametric dynamic programming with flexible prior model for image denoising
Research article
We consider image denoising procedures based on computationally efficient tree-serial parametric dynamic programming, different representations of an image lattice by a set of acyclic graphs, and a new type of non-convex regularization that allows a priori preferences to be set flexibly. Experimental results in image denoising, as well as a comparison with related methods, are provided. The new extended version of the multi-quadratic dynamic programming procedures for image denoising proposed here shows improved accuracy for images of different types.
Free

Research article
A tunable diffraction grating based on an electro-optic X-cut lithium niobate crystal has been manufactured and experimentally analyzed. The period of the electrodes is 290 μm, the electrode width is 117.5 μm, and the electrode thickness is 150-160 nm. The electrodes are made of transparent conducting indium-tin oxide, which serves as an antireflection coating to increase the optical transmission. To prevent crystal polarization switching and electrical breakdown, an optimized electrode topology with an end ellipticity of 1:1 and an increased interelectrode gap is used. The optical diagram of the tunable grating with alternating electrode potentials is analyzed for various gap voltages. The intensity of the zero diffraction order is shown to decrease by 40 % at a voltage of 800 V. At the same time, new diffraction orders are observed to appear at angles ±λ/(2d). Measurement of the forward-bias and reverse-bias regions of the modulation characteristic reveals the absence of hysteresis, which confirms the correctness of the electrode topology design.
Free

Research article
In this paper, we examine the applicability limits of different methods of compensation of the individual properties of self-emitting displays with significant non-uniformity of chromaticity and maximum brightness. The aim of the compensation is to minimize the perceived image non-uniformity. Compensation of the displayed image non-uniformity is based on minimizing the perceived distance between the target (ideally displayed) and the simulated image displayed by the calibrated screen. The S-CIELAB model of the human visual system properties is used to estimate the perceived distance between two images. In this work, we compare the efficiency of the channel-wise and linear (with channel mixing) compensation models depending on the models of variation in the characteristics of display elements (subpixels). It was found that even for a display with uniform chromatic subpixels characteristics, the linear model with channel mixing is superior in terms of compensation accuracy.
Free

U-net-bin: hacking the document image binarization contest
Research article
Image binarization is still a challenging task in a variety of applications. In particular, the Document Image Binarization Contest (DIBCO) is organized regularly to track state-of-the-art techniques for historical document binarization. In this work we present a binarization method that was ranked first in the DIBCO'17 contest. It is a convolutional neural network (CNN) based method that uses the U-Net architecture, originally designed for biomedical image segmentation. We describe our approach to training data preparation and contest ground truth examination and provide multiple insights into its construction (so-called hacking). This led to a more accurate problem statement for historical document binarization with respect to the challenges one could face in open-access datasets. A Docker container with the final network, along with all the supplementary data we used in the training process, has been published on GitHub.
Free

Uncertainty-based quantization method for stable training of binary neural networks
Research article
Binary neural networks (BNNs) have gained attention due to their computational efficiency. However, training BNNs has proven to be challenging. Existing algorithms either fail to produce stable and high-quality results or are overly complex for practical use. In this paper, we introduce a novel quantizer called UBQ (Uncertainty-based quantizer) for BNNs, which combines the advantages of existing methods, resulting in stable training and high-quality BNNs even with a low number of trainable parameters. We also propose a training method involving gradual network freezing and batch normalization replacement, facilitating a smooth transition from training mode to execution mode for BNNs. To evaluate UBQ, we conducted experiments on the MNIST and CIFAR-10 datasets and compared our method to existing algorithms. The results demonstrate that UBQ outperforms previous methods for smaller networks and achieves comparable results for larger networks.
Free

Unfolder: fast localization and image rectification of a document with a crease from folding in half
Research article
Presentation of folded documents is not uncommon in modern society. Digitizing such documents by capturing them with a smartphone camera can be tricky, since a crease can divide the document contents into separate planes. To unfold the document, one could hold it by the edges, potentially obscuring it in the captured image. While there are many geometrical rectification methods, they were usually developed for arbitrary bends and folds. We consider such algorithms and propose a novel approach, Unfolder, developed specifically for images of documents with a crease from folding in half. Unfolder is robust to projective distortions of the document image and does not fragment the image in the vicinity of a crease after rectification. A new Folded Document Images dataset was created to investigate the rectification accuracy of folded (2, 3, 4, and 8 folds) documents. The dataset includes 1600 images captured with the document placed on a table and with the document held in hand. The Unfolder algorithm achieved a recognition error rate of 0.33, which is better than the advanced neural network methods DocTr (0.44) and DewarpNet (0.57). The average runtime for Unfolder was only 0.25 s/image on an iPhone XR.
Free

Unsupervised color texture segmentation based on multi-scale region-level Markov random field models
Research article
In the field of color texture segmentation, the region-level Markov random field (RMRF) model has attracted much attention because of its efficiency in modeling large-range spatial constraints. However, an RMRF defined on a single scale cannot describe the non-stationary nature of the image, which severely limits its robustness. Hence, by combining the wavelet transform and the RMRF model, we present in this paper a multi-scale RMRF (MsRMRF) model in the wavelet domain. In the Bayesian framework, the proposed model seamlessly integrates the multi-scale information stemming from both the original image and the region-level spatial constraints. Therefore, the new model can accurately describe the characteristics of different kinds of texture. Based on MsRMRF, an unsupervised segmentation algorithm is designed for segmenting color texture images. Both synthetic color texture images and remote sensing images are employed in the comparative experiments, and the experimental results show that the proposed method obtains more accurate segmentation results than its competitors.
Free

Vanishing point detection with direct and transposed fast Hough transform inside the neural network
Research article
In this paper, we suggest a new neural network architecture for vanishing point detection in images. The key element is the use of the direct and transposed fast Hough transforms separated by convolutional layer blocks with standard activation functions. It allows us to get the answer in the coordinates of the input image at the output of the network and thus to calculate the coordinates of the vanishing point by simply selecting the maximum. Besides, it was proved that calculation of the transposed fast Hough transform can be performed using the direct one. The use of integral operators enables the neural network to rely on global rectilinear features in the image, and so it is ideal for detecting vanishing points. To demonstrate the effectiveness of the proposed architecture, we use a set of images from a DVR and show its superiority over existing methods. Note, in addition, that the proposed neural network architecture essentially repeats the process of direct and back projection used, for example, in computed tomography.
Free

Vehicle wheel weld detection based on improved YOLO V4 algorithm
Research article
In recent years, vision-based object detection has made great progress across different fields. For instance, in automobile manufacturing, weld detection is a key step of weld inspection in wheel production. The automatic detection and positioning of welded parts on wheels can improve the efficiency of wheel hub production. At present, there are few deep-learning-based methods for detecting vehicle wheel welds. In this paper, a method based on the YOLO v4 algorithm is proposed to detect vehicle wheel welds. The main contributions of the proposed method are the use of k-means to optimize the anchor box size, a Distance-IoU loss to optimize the loss function of YOLO v4, and non-maximum suppression using Distance-IoU to eliminate redundant candidate bounding boxes. These steps improve detection accuracy. The experiments show that the improved method achieves high accuracy in vehicle wheel weld detection (4.92 percentage points higher than the baseline model with respect to AP75 and 2.75 percentage points higher with respect to AP50). We also evaluated the proposed method on the public KITTI dataset. The detection results demonstrate the effectiveness of the improved method.
Free
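
As a rough illustration of the Distance-IoU terms mentioned in the abstract above, the sketch below computes the commonly used definition DIoU = IoU - ρ²/c², where ρ is the distance between the box centers and c is the diagonal of the smallest box enclosing both. It is not the authors' code: the box format `(x1, y1, x2, y2)`, the function names, and the suppression threshold are assumptions, and the greedy suppression loop is a plain-Python stand-in for the NMS step of a real detector.

```python
def diou(box_a, box_b):
    """Distance-IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Plain IoU from the intersection and union areas.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    iou = inter / union if union > 0 else 0.0

    # Squared distance between the box centers.
    rho2 = ((ax1 + ax2) / 2 - (bx1 + bx2) / 2) ** 2 \
         + ((ay1 + ay2) / 2 - (by1 + by2) / 2) ** 2

    # Squared diagonal of the smallest box enclosing both boxes.
    enclose_w = max(ax2, bx2) - min(ax1, bx1)
    enclose_h = max(ay2, by2) - min(ay1, by1)
    c2 = enclose_w ** 2 + enclose_h ** 2

    return iou - rho2 / c2 if c2 > 0 else iou

def diou_loss(pred_box, target_box):
    """Regression loss: zero for a perfect match, up to 2 for distant boxes."""
    return 1.0 - diou(pred_box, target_box)

def diou_nms(boxes, scores, threshold=0.5):
    """Greedy non-maximum suppression using DIoU instead of IoU.

    Returns the indices of the kept boxes, highest score first.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(diou(boxes[i], boxes[j]) <= threshold for j in keep):
            keep.append(i)
    return keep
```

Unlike plain IoU, the center-distance penalty still gives a useful signal when the boxes do not overlap: two disjoint boxes have IoU 0 regardless of how far apart they are, while their DIoU becomes more negative as the centers move apart.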

Veiling glare removal: synthetic dataset generation, metrics and neural network architecture
Research article
In photography, the presence of a bright light source often reduces the quality and readability of the resulting image. Light rays reflect and bounce off camera elements, sensor or diaphragm causing unwanted artifacts. These artifacts are generally known as “lens flare” and may have different influences on the photo: reduce contrast of the image (veiling glare), add circular or circular-like effects (ghosting flare), appear as bright rays spreading from light source (starburst pattern), or cause aberrations. All these effects are generally undesirable, as they reduce legibility and aesthetics of the image. In this paper we address the problem of removing or reducing the effect of veiling glare on the image. There are no available large-scale datasets for this problem and no established metrics, so we start by (i) proposing a simple and fast algorithm of generating synthetic veiling glare images necessary for training and (ii) studying metrics used in related image enhancement tasks (dehazing and underwater image enhancement). We select three such no-reference metrics (UCIQE, UIQM and CCF) and show that their improvement indicates better veil removal. Finally, we experiment on neural network architectures and propose a two-branched architecture and a training procedure utilizing structural similarity measure.
Free