Journal articles - Computer Optics

All articles: 2511

Laser beam characterization by means of diffractive optical correlation filters

Pavelyev V.S., Soifer V.A., Duparre M., Luedge B.

Scientific article

Analysis of the amplitude-phase characteristics of a laser beam is a topical problem in experimental physics and in many laser applications, such as laser material treatment. The task of analyzing the amplitude-phase beam structure may be treated as that of analyzing the modal composition, understood as analyzing both the individual modal powers and the intermode phase shifts. In this paper the problem is tackled using a special diffractive optical element (DOE), called MODAN, matched to a group of laser radiation modes and their special combinations. The reported experimental results indicate that this approach shows promise. Key words: laser beam, Gaussian modes, intermode power distribution, intermode phase shifts.

Free

Localization of mobile robot in prior 3D lidar maps using stereo image sequence

Belkin I.V., Abramenko A.A., Bezuglyi V.D., Yudin D.A.

Scientific article

The paper studies real-time stereo image-based localization of a vehicle in a prior 3D LiDAR map. A novel localization approach for a mobile ground robot is proposed that combines conventional computer vision techniques, neural network-based image analysis, and numerical optimization. It includes matching a noisy depth image with a visible point cloud using a modified Nelder-Mead optimization method. A deep neural network for semantic image segmentation is used to eliminate dynamic obstacles. The visible point cloud is extracted using a 3D mesh map representation. The proposed approach is evaluated on the KITTI dataset and a custom dataset collected from a ClearPath Husky mobile robot. It shows a stable absolute translation error of about 0.11–0.13 m and a rotation error of 0.42–0.62 deg. The standard deviation of the absolute metrics obtained with our method is the smallest among the state-of-the-art approaches compared. Thus, our approach provides more stability in the estimated pose. This is achieved primarily through the use of multiple data frames during the optimization step and the elimination of dynamic obstacles from the depth image. The method's performance is demonstrated on different hardware platforms, including the energy-efficient Nvidia Jetson Xavier AGX. With a parallel code implementation, we achieve an input stereo image processing speed of 14 frames per second on the Xavier AGX.

Free

Losses and orbital part of the Poynting vector of air-core modes in hollow-core fibers

Alagashev Grigory Konstantinovich, Stafeev Sergey Sergeevich, Pryamikov Andrey Dmitrievich

Scientific article

In our earlier works, we investigated the relationship between the formation of vortices in the transverse component of the Poynting vector of core modes and the regimes of strong localization of these modes in solid-core micro-structured optical fibers. In this paper, we consider the behavior of the orbital part of the Poynting vector of fundamental and high-order modes in hollow-core fibers and compare it with the behavior of the fundamental core mode in solid-core micro-structured optical fibers. We then demonstrate the impact of the "negative" curvature of the core-cladding boundary of a hollow-core fiber on the behavior of the orbital part of the Poynting vector of the air-core modes.

Free

MIDV-2020: a comprehensive benchmark dataset for identity document analysis

Bulatov Konstantin Bulatovich, Emelianova Ekaterina Vladimirovna, Tropin Daniil Vyacheslavovich, Skoryukina Natalya Sergeevna, Chernyshova Yulia Sergeevna, Sheshkus Alexander Vladimirovich, Usilin Sergey Alexandrovich, Ming Zuheng, Burie Jean-Christophe, Luqman Muhammad Muzzamil, Arlazarov Vladimir Viktorovich

Scientific article

Identity document recognition is an important sub-field of document analysis. It deals with robust document detection, type identification, and text field recognition, as well as identity fraud prevention and document authenticity validation, given photos, scans, or video frames of a captured identity document. A significant amount of research has been published on this topic in recent years; however, a chief difficulty for such research is the scarcity of datasets, because the subject matter is protected by security requirements. The few available datasets of identity documents lack diversity of document types, capturing conditions, or variability of document field values. In this paper, we present the MIDV-2020 dataset, which consists of 1000 video clips, 2000 scanned images, and 1000 photos of 1000 unique mock identity documents, each with unique text field values and unique artificially generated faces, with rich annotation. The dataset contains 72409 annotated images in total, making it the largest publicly available identity document dataset as of the date of publication. We describe the structure of the dataset, its content and annotations, and present baseline experimental results to serve as a basis for future research. For the task of document location and identification, content-independent, feature-based, and semantic segmentation-based methods were evaluated. For the task of document text field recognition, the Tesseract system was evaluated on the field and character levels, with grouping by field alphabets and document types. For the task of face detection, the performance of a method based on Multi-Task Cascaded Convolutional Neural Networks was evaluated separately for different types of image input modes. The baseline evaluations show that the existing methods of identity document analysis have a lot of room for improvement given modern challenges. We believe that the proposed dataset will prove invaluable for the advancement of the field of document analysis and recognition.

Free

MIDV-500: a dataset for identity document analysis and recognition on mobile devices in video stream

Arlazarov Vladimir Viktorovich, Bulatov Konstantin Bulatovich, Chernov Timofey Sergeevich, Arlazarov Vladimir Lvovich

Scientific article

A lot of research has been devoted to identity document analysis and recognition on mobile devices. However, no publicly available datasets designed for this particular problem currently exist. A few datasets are useful for associated subtasks, but a more comprehensive scientific and technical approach to identity document recognition requires more specialized datasets. In this paper we present a Mobile Identity Document Video dataset (MIDV-500) consisting of 500 video clips for 50 different identity document types, with ground truth that supports research on a wide range of document analysis problems. The paper presents characteristics of the dataset and evaluation results for existing methods of face detection, text line recognition, and document field data extraction. Since an important feature of identity documents is their sensitivity, as they contain personal data, all source document images used in MIDV-500 are either in the public domain or distributed under public copyright licenses. The main goal of this paper is to present the dataset. In addition, as a baseline, we present evaluation results for existing methods of face detection, text line recognition, and document data extraction obtained on the presented dataset.

Free

MIMO communication system capacity in random visible light channel

Parshin A.Y., Parshin Y.N.

Scientific article

As a promising standard, optical information transmission expands the capabilities of communication systems under heavy frequency band load. The efficiency of an indoor optical communication system can be improved by using multi-antenna systems. The aim of this paper is a theoretical study of the capacity of a MIMO Li-Fi communication system. The ergodic capacity of a MIMO optical communication system is calculated for various light propagation scenarios. The receiving and transmitting system is modeled as receivers and transmitters randomly placed in a room, with randomly oriented light-emitting diodes and photodiodes. The matrix of channel parameters is modeled using the corresponding probability density functions and additive Gaussian noise at the receiver inputs. The paper also considers various scenarios of optical signal propagation and their influence on the optical channel capacity. Various methods of distributing power between the eigenmodes of the MIMO optical communication system are compared, together with their influence on capacity. The optimal power distribution between the MIMO system eigenmodes is determined by the maximum capacity criterion.
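
The maximum capacity criterion mentioned in the abstract above can be illustrated with the classical water-filling power allocation over parallel eigenmodes with additive Gaussian noise. This is a minimal sketch and not the authors' procedure: the function names `water_filling` and `capacity`, the example gains, and the unit noise power are assumptions made for illustration, and an intensity-modulated optical channel may impose further constraints that the sketch ignores.

```python
import math


def water_filling(gains, total_power, noise_power=1.0):
    """Split total_power across eigenmodes by classical water-filling.

    gains: positive eigenmode power gains (hypothetical example values,
    e.g. squared singular values of a channel matrix).
    Returns the per-mode powers in the same order as gains.
    """
    if total_power <= 0 or any(g <= 0 for g in gains):
        raise ValueError("total_power and all gains must be positive")
    # Strongest mode first; floors[k] is the noise-to-gain level of rank k.
    order = sorted(range(len(gains)), key=lambda i: gains[i], reverse=True)
    floors = [noise_power / gains[i] for i in order]
    powers = [0.0] * len(gains)
    # Try using the k strongest modes, dropping the weakest until the
    # water level lies above the floor of every mode that is still used.
    for k in range(len(gains), 0, -1):
        level = (total_power + sum(floors[:k])) / k
        if level > floors[k - 1]:
            for rank in range(k):
                powers[order[rank]] = level - floors[rank]
            break
    return powers


def capacity(gains, powers, noise_power=1.0):
    """Sum capacity in bit/s/Hz of independent Gaussian eigenmodes."""
    return sum(math.log2(1.0 + g * p / noise_power)
               for g, p in zip(gains, powers))


if __name__ == "__main__":
    example_gains = [4.0, 1.0, 0.25]  # assumed values, not from the paper
    optimal = water_filling(example_gains, total_power=1.0)
    uniform = [1.0 / 3] * 3
    print("water-filling powers:", optimal)
    print("capacity (water-filling):", capacity(example_gains, optimal))
    print("capacity (equal power):  ", capacity(example_gains, uniform))
```

In this toy example the weakest eigenmode receives no power, which is the typical behavior of water-filling at low total power.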

Free

Machine learning-based voice assistant: optimizing the efficiency of speech conversion for people with speech disorders

Antor M.H., Chudinovskikh N.V., Bachurin M.V., Shurpikov A.A., Khlebnikov N.A., Bredikhin B.A.

Scientific article

An automatic speech recognition system can improve the standard of living of people with disabilities by addressing conditions such as dysarthria, stuttering, and other speech defects. In this paper, we introduce a voice assistant built on speech samples affected by hyperkinetic dysarthria (HD). The work covers the data preprocessing steps and the development of a novel convolutional recurrent network (CRN) model that is built on convolutional neural networks and recurrent neural networks. We implemented data preprocessing methods, including filtering, down-sampling, and splitting, to prevent overfitting and to reduce processing power and time. In addition, Mel Frequency Cepstral Coefficients (MFCC) are used to extract speech characteristics. The proposed model is trained to recognize HD-affected speech using a dataset of 2000 Russian speech samples. The experimental results demonstrate that the proposed method obtains a character error rate (CER) of 14.76 %, which indicates that approximately 85 % of characters are correctly recognized on the test dataset. We have created a Telegram bot that uses our trained model to help people with hyperkinetic dysarthria. This bot is capable of providing assistance independently, without the need for any third-party assistance.
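
As a reminder of how the character error rate quoted in the abstract above is usually defined, here is a minimal sketch: the Levenshtein (edit) distance between the reference and the recognized text, divided by the reference length. The function names are hypothetical, and the paper's exact evaluation protocol (text normalization, averaging over samples) is not given in the abstract, so this shows the common definition only.

```python
def edit_distance(reference: str, hypothesis: str) -> int:
    """Levenshtein distance: minimum number of single-character
    insertions, deletions, and substitutions turning one string
    into the other."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution = previous[j - 1] + (0 if ref_char == hyp_char else 1)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """CER = edit distance / number of characters in the reference."""
    if not reference:
        raise ValueError("reference must be non-empty")
    return edit_distance(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    # One substituted character out of four.
    print(character_error_rate("abcd", "abxd"))
```

Because insertions also count as errors, 1 - CER is only an approximate share of correctly recognized characters, which matches the "approximately 85 %" wording of the abstract.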

Free

Many heads but one brain: FusionBrain - a single multimodal multitask architecture and a competition

Bakshandaeva Daria Dmitrievna, Dimitrov Denis Valerievich, Arkhipkin Vladimir Sergeyevich, Shonenkov Alex Vladimirovich, Potanin Mark Stanislavovich, Karachev Denis Konstantinovich, Kuznetsov Andrey Vladimirovich, Voronov Anton Dmitrievich, Petiushko Aleksandr Alexandrovich, Davydova Vera Fedorovna, Tutubalina Elena Viktorovna

Scientific article

Supporting the current trend in the AI community, we present the AI Journey 2021 Challenge called FusionBrain, the first competition aimed at creating a universal architecture that can process different modalities (in this case, images, texts, and code) and solve multiple tasks for vision and language. The FusionBrain Challenge combines the following specific tasks: Code2code Translation, Handwritten Text Recognition, Zero-shot Object Detection, and Visual Question Answering. We have created datasets for each task to test the participants' submissions. Moreover, we have collected and made publicly available a new handwritten dataset in both English and Russian, which consists of 94,128 pairs of images and texts. We also propose a multimodal and multitask architecture as a baseline solution; it is built around a frozen foundation model and has been trained in Fusion mode as well as in Single-task mode. The proposed Fusion approach proves to be competitive and more energy-efficient than the task-specific one.

Free

Many-parameter m-complementary Golay sequences and transforms

Labunets Valeri Grigorievich, Chasovskih Victor Petrovich, Smetanin Yuri Gennadievich, Ostheimer Rundblad Ekaterina

Scientific article

In this paper, we develop the family of Golay–Rudin–Shapiro (GRS) m-complementary many-parameter sequences and many-parameter Golay transforms. The approach is based on a new generalized iteration generating construction associated with n unitary many-parameter transforms and n arbitrary groups of given fixed order. We intend to use the multi-parameter Golay transform in Intelligent-OFDM-TCS instead of the discrete Fourier transform in order to find the parameter values that optimize PAPR, BER, SER, and the anti-eavesdropping and anti-jamming effects.
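
For orientation, the classical binary special case that the abstract above generalizes can be sketched in a few lines: the Golay–Rudin–Shapiro recursion builds a pair of ±1 sequences whose aperiodic autocorrelations sum to zero at every nonzero shift. This sketch covers only the standard two-sequence construction, not the many-parameter m-complementary families of the paper; the function names are hypothetical.

```python
def golay_pair(iterations: int):
    """Build a binary Golay complementary pair of length 2**iterations
    with the classical concatenation rule:
        a_next = a | b,   b_next = a | (-b)
    """
    a, b = [1], [1]
    for _ in range(iterations):
        a, b = a + b, a + [-x for x in b]
    return a, b


def aperiodic_autocorr(seq, shift: int) -> int:
    """Aperiodic autocorrelation of seq at a non-negative shift."""
    return sum(seq[i] * seq[i + shift] for i in range(len(seq) - shift))


if __name__ == "__main__":
    a, b = golay_pair(3)
    # Complementarity: the two autocorrelations cancel at nonzero shifts.
    for shift in range(len(a)):
        total = aperiodic_autocorr(a, shift) + aperiodic_autocorr(b, shift)
        print(shift, total)
```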

Free

Mapping and evaluating urban density patterns in Moscow, Russia

Choudhary Komal, Boori Mukesh Singh, Kupriyanov Alexander Victorovich

Scientific article

The notion of the 'compact city' as a strategy to reduce urban sprawl, support greater utilization of existing infrastructure and services in more compact areas, and improve the connectivity of employment hubs is actively discussed in urban research. Using urban residential density as a surrogate measure of urban compactness, this paper empirically examines a cadaster database that contains details of every property, with a view to capturing changes in urban residential density patterns across Moscow using geospatial techniques. The policy of densification in pursuit of a more compact city has produced mixed results. The findings of this study indicate that the urban densities across the buffer zones around Moscow city are significantly different. Landsat images from 1995, 2005 and 2016 are classified using the maximum likelihood method to produce land use/cover maps and identify the land cover. Then, the area coverage of all land use/cover types at different points in time is combined with the distance from the city center. After that, urbanization densities from the city center toward the outskirts are calculated for every 1-km distance from 1 to 60 km. The city density at distances of 1 to 35 km is found to be very high in the years 1995 to 2016. As usual, population, traffic conditions, industrialization and government policy are the major factors that influenced the urban expansion.

Free

Master equation averaged over stochastic process realizations for the description of a three-level atom relaxation

Mikhailov Victor Alexandrovich, Troshkin Nikolay Vyacheslavovich

Scientific article

The relaxation of a three-level atom interacting with a photon heat bath and an external stochastic field is investigated. For the reduced density matrix, a master equation averaged over stochastic process realizations is derived. An exact solution is obtained and the radiation line shapes are calculated.

Free

Mathematical modeling of processes in quantum computer elements based on methods of quantum theory to improve their efficiency

Biryukov A.A., Shleenkov M.A.

Scientific article

The paper studies entangled states of two qubits interacting with each other and with an electromagnetic field. The state of the qubits is determined by a statistical density matrix. The degree of entanglement of the state is characterized by the Peres-Gorodeckii (PG) parameter. The statistical density matrix and its evolution are determined in the energy representation within the framework of the path integral formalism. The obtained equations determine the dependence of the PG parameter on the parameters of qubit dipole-dipole interaction and the acting electromagnetic field. The results of numerical calculations are presented in graphs for the PG parameter. It is shown that it is possible to choose parameters corresponding to qubit states with a high degree of entanglement (0.99).

Free

Matrix arithmetic based on Fibonacci matrices

Stakhov Alexey

Article

Free

Method for removing haze from images, captured under a wide range of lighting conditions

Filin A.I., Kopylov A.V., Gracheva I.A.

Scientific article

The presence of haze in images degrades the quality of perception and automatic analysis of scenes. One of the most popular methods of haze removal is the dark channel prior method, which is based on the Koschmieder atmospheric scattering model. However, its underlying assumptions are not met at nighttime, since localized light sources make a significant, if not the main, contribution to lighting. We propose to use the degree to which an image element belongs to a localized light source, determined with a one-class classifier, as a measure of confidence in the corresponding element of the estimated transmission map during its rectification based on the gamma-normal model. This makes it possible to increase the accuracy of dehazing when processing images captured in low-light or nighttime conditions.

Free

Methods, algorithms and programs of computer algebra in problems of registration and analysis of random point structures

Reznik A.L., Soloviev A.A.

Scientific article

An original approach to solving difficult, time-consuming problems of registration and analysis of random point images is described. The approach is based on the development and application of high-performance specialized computer algebra systems. Three software packages have been created specifically for carrying out equivalent analytical transformations on a computer. The first software system is designed to calculate formulas describing the volumes of convex polyhedra with parametrically specified boundaries in n-dimensional space. The second system is based on the calculation of multidimensional integral expressions by the method of cyclic differentiation of the integral with respect to the parameter. The third system is based on the accelerated implementation of complex combinatorial-recursive transformations on a computer. Another distinctive feature of the work is the extension of the classical Catalan numbers to the multidimensional case (they were required to solve a number of intermediate probabilistic-combinatorial problems). The implementation of these computer algebra software systems on a multi-core cluster of Novosibirsk State University, together with the direct use of the explicit form of generalized Catalan numbers, allowed the authors to obtain several previously unknown probabilistic formulas and dependencies required for solving problems in the analysis of random point images.

Free

Modeling of spontaneous emission in presence of cylindrical nanoobjects: the scattering matrix approach

Nikolaev Valentin, Girshova Elizaveta Ilinichna, Kaliteevski Mikhail Alekseevich

Scientific article

We propose a method for analyzing the spontaneous emission of a quantum emitter (an atom, a luminescence center, a quantum dot) inside or in the vicinity of a cylinder. At the core of our method are analytical expressions for the scattering matrix of the cylindrical nanoobject. We propose an approach to electromagnetic field quantization based on the eigenvalues and eigenvectors of the scattering matrix. The method is applicable to the calculation and analysis of spontaneous emission rates and angular dependences of radiation for a range of systems: semiconductor nanowires with quantum dots, plasmonic nanowires, and cylindrical hollows in dielectrics and metals. The relative simplicity of the method makes it possible to obtain analytical and semi-analytical expressions both for radiation into the external medium and for radiation into guided modes.

Free

Modeling the influence of the geometrical unsharpness on the neutron radiography and tomography images of porous materials

Zel I.Y.

Scientific article

Beam divergence is one of the instrument resolution parameters in neutron computed tomography. In pinhole geometry, due to the finite size of the source, geometric unsharpness affects the transmission images and therefore influences the reconstructed data. In this paper, we propose an approach for deterministic simulation of this effect for a voxelized 3D object. The idea behind the proposed approach is to use multiple point sources at a pinhole position and collect transmission images from each of them. The implementation was done using the ASTRA toolbox by calculating cone beam projections from each point source. This approach was applied to a porous phantom. Artifacts associated with beam divergence were identified in the reconstructed data. The influence of beam divergence on the segmentation of pores by binarization of the reconstructed data has been considered.

Free

Modeling the light diffraction by micro-optics elements using the finite element method

Nesterenko D.V., Kotlyar V.V., Wang Y.

Article

Free

Modelling of multilayer dielectric filters based on TiO2 / SiO2 and TiO2 / MgF2 for fluorescence microscopy imaging

Butt Muhammad Ali, Fomchenkov Sergey Alexandrovich, Ullah Anayat, Habib Mohsin, Ali Rabia Zafar

Scientific article

We report a design for multilayer dielectric optical filters based on alternating layers of TiO2 and SiO2/MgF2. We selected titanium dioxide (TiO2) as the high refractive index material (2.5), and silicon dioxide (SiO2) and magnesium fluoride (MgF2) as the low refractive index materials (1.45 and 1.37, respectively). Miniaturized visible spectrometers are useful for quick and mobile characterization of biological samples. Such devices can be fabricated using Fabry-Perot (FP) filters consisting of two highly reflecting mirrors with a central cavity in between. Distributed Bragg Reflectors (DBRs) consisting of alternating high and low refractive index material pairs are the most commonly used mirrors in FP filters, due to their high reflectivity. However, DBRs have high reflectivity only for a selected range of wavelengths known as the stopband of the DBR. This range is usually much smaller than the sensitivity range of the spectrometer. Therefore, bandpass filters are required to block wavelengths outside the stopband of the FP DBRs. The proposed filter shows high quality, with an average transmission of 97 % within the passbands and a transmission of around 3 % outside the passband. Special attention has been given to keeping the thickness of the filters within economical limits. These filters appear exceptionally promising for fluorescence imaging and narrow-band imaging endoscopy.

Free

Monitored reconstruction improved by post-processing neural network

Yamaev A.V.

Scientific article

Computed tomography (CT) is widely used for analyzing internal structures, but traditional reconstruction algorithms often require a large number of projections, which limits their effectiveness in time-critical tasks and in the study of biological objects. The monitored reconstruction approach was recently proposed to reduce the dose load. This paper investigates the advantages of using post-processing neural networks within the monitored reconstruction approach. Three algorithms, namely FBP, FBPConvNet, and LRFR, are evaluated by the mean number of projections required to achieve a target reconstruction accuracy. A novel training method specifically designed for neural network algorithms within the monitored reconstruction framework is proposed. It is shown that the LRFR approach achieves both a reduction in the number of measured projections and an improvement in the reconstruction accuracy over a certain range of stopping rules. These findings highlight the significant potential of neural networks for use in the monitored reconstruction approach.

Free