Journal articles - Computer Optics

All articles: 2553

Improving generalization in classification novel bacterial strains: a multi-headed resnet approach for microscopic image classification

Yachnaya V.O., Mikhalkova M.A., Malashin R.O., Lutsiv V.R., Kraeva L.A., Khamdulayeva G.N., Nazarov V.E., Chelibanov V.P.

Research article

The purpose of this work is to design a system for classifying microscopic images of bacteria that generalizes to new data. In the course of the work, a dataset containing 23 bacterial species was collected. We use a strain-wise method for dividing the dataset into training and test sets. Such splitting (in contrast to random division) makes it possible to evaluate the performance of classifiers on new strains in the presence of intra-species visual variability of bacteria. We propose a “multi-headed” ResNet (ResNet-MH) for the analysis of microscopic images of bacterial colonies. During training, this approach forces the neural network to analyze features at different resolutions, such as the shape of individual bacterial cells and the shape and number of bacterial clusters. Our network achieves 41.6% accuracy at the species level and 64.06% accuracy at the genus level. The proposed method of dataset splitting ensures that the reported accuracy reflects generalization to new, unseen strains, whereas random splitting into training and test sets leads to overfitting of the system (accuracy is over 90%). For the 10 species that are visually stable across strains, the accuracy of the proposed system reaches 83.6% at the species level.
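
The strain-wise division described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the record layout `(image_id, species, strain)`, the function name `strain_wise_split`, the 25% hold-out fraction, and the placeholder species and strain names are all assumptions made for illustration. The only property taken from the abstract is that no strain contributes images to both the training and the test set.

```python
import random
from collections import defaultdict


def strain_wise_split(samples, test_fraction=0.25, seed=0):
    """Split samples so that no strain appears in both train and test.

    `samples` is a list of (image_id, species, strain) tuples; this record
    layout is hypothetical and chosen only for the sketch.
    """
    rng = random.Random(seed)
    strains_by_species = defaultdict(set)
    for _, species, strain in samples:
        strains_by_species[species].add(strain)

    test_strains = set()
    for species, strains in strains_by_species.items():
        ordered = sorted(strains)
        rng.shuffle(ordered)
        # Hold out at least one strain when possible, but always keep
        # at least one strain of every species for training.
        n_test = max(1, round(len(ordered) * test_fraction))
        n_test = min(n_test, len(ordered) - 1)
        test_strains.update((species, s) for s in ordered[:n_test])

    train = [s for s in samples if (s[1], s[2]) not in test_strains]
    test = [s for s in samples if (s[1], s[2]) in test_strains]
    return train, test


# Placeholder data: two species with four and two strains, three images each.
layout = {"species_A": ["s1", "s2", "s3", "s4"], "species_B": ["s1", "s2"]}
samples = [
    (f"img_{species}_{strain}_{i}", species, strain)
    for species, strains in layout.items()
    for strain in strains
    for i in range(3)
]
train, test = strain_wise_split(samples)
```

A random split would instead shuffle individual images, so images of the same strain could land on both sides, which is the situation the abstract associates with accuracy above 90%.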

Free

Improving plot-level growing stock volume estimation using machine learning and remote sensing data fusion

Mirpulatov I., Kedrov A., Illarionova S.

Research article

Forest characteristics estimation is a vital task for ecological monitoring and forest management. Forest owners make decisions based on timber type and quality. Assessing them usually requires field-based observations and measurements, which are time- and labor-intensive, especially in remote and vast areas. Remote sensing technologies address the challenge of large-area monitoring through rapid data acquisition. To automate the data analysis process, machine learning (ML) algorithms are widely applied, particularly in forestry tasks. Forest inventory data are usually used as ground truth values for training ML models. Commonly, these data consist of individual forest stand measurements, which are less precise than sample plot measurements. In this study, we develop an ML-based solution for creating spatially distributed maps of growing stock volume using sample plot measurements as reference data. The proposed pipeline uses freely available medium-resolution Sentinel-2 data. The experiments are conducted in the Perm region, Russia, and show a high capacity of ML for estimating forest growing stock volume from multispectral satellite observations. Gradient boosting achieves the highest quality, with a MAPE of 30.5%. In the future, the proposed solution can be used by forest owners and integrated into advanced systems for ecological monitoring.
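
For readers unfamiliar with the metric quoted above, the following sketch shows how the mean absolute percentage error (MAPE) is commonly computed. The function name and the sample volumes are illustrative assumptions; they are not data from the study.

```python
def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(t == 0 for t in y_true):
        raise ValueError("MAPE is undefined when a reference value is zero")
    total = sum(abs((t - p) / t) for t, p in zip(y_true, y_pred))
    return 100.0 * total / len(y_true)


# Made-up plot volumes (arbitrary units), used only to exercise the function.
measured = [200.0, 100.0, 400.0]
predicted = [150.0, 120.0, 400.0]
error = mape(measured, predicted)  # relative errors 0.25, 0.20, 0.00
```

Because each error is divided by the reference value, plots with small measured volume weigh heavily in the average, which is worth keeping in mind when comparing MAPE values across datasets.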

Free

Improving the quality of building space depths maps using multi-area active-pulse television measuring systems in dynamic scenes

Zabuga S.A., Kapustin V.V., Musikhin I.D.

Research article

The purpose of this work is the software implementation of temporal frame interpolation, the formulation of selection criteria, and the choice of a suitable neural network model based on the obtained practical data. A further goal is to evaluate its efficiency in eliminating the interframe shift effect of dynamic objects on the depth maps of multi-area active-pulse television measuring systems in order to improve the accuracy of map building. As initial data for the experiments, static frames were recorded while moving the test rig along the X and Z axes. The static frames are images of the test rig, averaged 100 times, at a distance of 13 meters; the rig moved along an automated linear guide with a step of 1 mm. As a result of the work, the influence of the interframe shift effect on space depth maps of multi-area active-pulse television measuring systems containing dynamic objects was assessed. The temporal frame interpolation algorithm for suppressing the interframe shift effect of dynamic objects on depth maps was also implemented and tested. The algorithm was implemented in Python using the PyCharm IDE with the SciPy, NumPy, OpenCV, PyTorch, Threading and other libraries. Numerical values of the RMSE, PSNR, and SSIM metrics were obtained before and after eliminating the effect of interframe shift of dynamic objects on depth maps. The use of the temporal frame interpolation algorithm allows more accurate measurement of the distance to a moving object in the field of view of multi-area active-pulse television measuring systems.
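
Two of the metrics named in this abstract, RMSE and PSNR, have simple textbook definitions that can be sketched as follows. This is a plain-Python illustration on nested lists, not the authors' implementation; the function names, the tiny depth maps, and the peak value of 255 are assumptions. SSIM is omitted because it needs windowed statistics that would not fit a short sketch.

```python
import math


def rmse(reference, estimate):
    """Root-mean-square error between two equally sized 2D maps."""
    squared = [
        (r - e) ** 2
        for row_r, row_e in zip(reference, estimate)
        for r, e in zip(row_r, row_e)
    ]
    return math.sqrt(sum(squared) / len(squared))


def psnr(reference, estimate, max_value):
    """Peak signal-to-noise ratio in dB for a given peak value."""
    err = rmse(reference, estimate)
    if err == 0:
        return math.inf  # identical maps
    return 20.0 * math.log10(max_value / err)


# Made-up 2x2 depth maps, used only to exercise the functions.
reference_map = [[10, 10], [10, 10]]
shifted_map = [[10, 12], [10, 8]]
error = rmse(reference_map, shifted_map)
quality = psnr(reference_map, shifted_map, max_value=255)
```

Evaluating both functions before and after interpolation, as the abstract describes, would show a lower RMSE and a higher PSNR if the interframe shift is suppressed.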

Free

Indexing of Computer Optics in the Emerging Sources Citation Index database

Stafeev Sergey S.

Editorial note

Inclusion of the journal Computer Optics in the Emerging Sources Citation Index database is described in this editorial.

Free

Innovative Integration of Residual Networks for Enhanced In-loop Filtering in VVC Using Deep Convolutional Neural Networks

Ibraheem M.K.I., Dvorkovich A.V., Al-Temimi A.M.S.

Research article

This paper explores the integration of Residual Networks (ResNets) into the in-loop filtering (ILF) process of the Versatile Video Coding (VVC) standard, aiming to enhance video compression efficiency and video quality through the application of Deep Convolutional Neural Networks (DCNNs). The study introduces a novel architecture, the Residual Deep Convolutional Neural Network (RDCNN), designed to replace the conventional VVC in-loop filtering modules, including the Deblocking Filter (DBF), Sample Adaptive Offset (SAO), and Adaptive Loop Filter (ALF). By leveraging the Rate Distortion Optimization (RDO) technique, the RDCNN model is applied to every coding unit (CU) to optimize the balance between video quality and bitrate. The proposed methodology involves offline training with specific parameters using the TensorFlow-GPU platform, followed by feature extraction and prediction of optimal filtering decisions for each video frame during the encoding process. The results demonstrate the effectiveness of the proposed RDCNN in significantly reducing the bitrate while maintaining high visual quality, outperforming existing methods in terms of compression efficiency and peak signal-to-noise ratio (PSNR) values across various video files (YUV color space). Specifically, the RDCNN achieved a YUV PSNR of 41.2 dB and a BD-rate reduction of -2.43% for the Y component, -6.96% for the U component, and -9.43% for the V component. These results underscore the potential of deep learning techniques, particularly ResNets, in addressing the complexities of video compression and enhancing the VVC standard. The evaluation across various YUV video files, including Stefan_cif, Soccer, Mobile, Harbour, Crew, and Bus, revealed consistently higher average YUV PSNR values compared to both VTM 22.2 and other related methods. This indicates not only improved compression efficiency but also enhanced visual quality, which is crucial for diverse video processing tasks.

Free

Insight into plasmonics: resurrection of modern-day science (invited)

Butt M.A.

Research article

Plasmonics is a field of research and technology that focuses on plasmons, which arise from the interaction between light and free electrons in a metal structure. The study of plasmonics has gained significant attention in recent years due to its potential for several applications and its ability to manipulate light at nanoscale dimensions. Plasmonics enables the control of light at the nanoscale, far beyond the diffraction limit of conventional optics. This allows for the development of new devices and technologies with enhanced performance and functionality. In this paper, recent advances in plasmonics in medicine, agriculture, environmental monitoring, lasers and solar energy harvesting are reviewed. Despite these promising prospects, plasmonic devices must overcome obstacles such as significant energy losses, complicated production processes, and the need for better material characteristics. Plasmonics will continue to advance because of ongoing work in nanotechnology, material science, and engineering, which will make it a more significant field with a wide range of uses in the future. Finally, the advantages and limitations related to the realization of plasmonic devices in the real world are discussed.

Free

Integrated fiber-based transverse mode converter

Gavrilov Andrey Vadimovich, Pavelyev Vladimir Sergeevich

Research article

A transverse mode converter based on a binary microrelief implemented directly on the end-face of a few-mode fiber was numerically investigated. The results of numerical simulation demonstrated that the converter forms LP-11 and LP-21 modes with high efficiency, providing a mode purity of more than 92%. Transformations of modes excited by fiber microbending were also numerically investigated. The excited beams were shown to retain their mode purity even under strong bending, as the arising parasitic modes were mostly unguided by the fiber. For beams with strong rotational symmetry, the resulting beam power and mode content were also demonstrated to depend on the mutual orientation of the beam and the bend.

Free

Integrating landscape ecological risk with ecosystem services in the Republic of Tatarstan, Russia

Boori Mukesh Singh, Choudhary Komal, Kupriyanov Alexander

Research article

Linking landscape ecological risk (LER) and ecosystem services (ESs) is a novel approach to environmental management and sustainable development, since it enables real-time decision-making. This study used 12 natural factors relevant to LER and 11 ESs factors to analyze spatiotemporal changes and establish a relationship between them in Tatarstan, Russia, for the years 2010, 2015, and 2020. The statistical tests (Global Moran's I, Getis-Ord Gi*), the analysis of habitat vulnerability, and the analysis of ecological loss in the ArcGIS platform reveal a consistent variance in factor clustering and pattern, as well as the impact of governmental policies in the studied area. According to the analysis, 2015 had the best ecological conditions of the three years because 44.79% of the research area had decreased landscape ecological risk, which increased ecosystem services. Additionally, the results show that both maps have significant spatial disparities and that LER and ESs are negatively impacted by high human-socioeconomic activity. The integration of LER and ESs through the overlap of both maps provides a significant amount of spatial information for mapping, monitoring, management, and the protection of the fragile environment for sustainable landscape development and management.
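
The Global Moran's I statistic mentioned above measures spatial autocorrelation. The sketch below implements its standard definition in plain Python; it does not reproduce the ArcGIS tool or the study's data. The function name, the four-cell chain of neighbours, and the values are assumptions chosen to make the behaviour easy to check by hand.

```python
def morans_i(values, weights):
    """Global Moran's I for values x_i and a spatial weight matrix w_ij."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    weight_sum = sum(sum(row) for row in weights)
    cross = sum(
        weights[i][j] * dev[i] * dev[j]
        for i in range(n)
        for j in range(n)
    )
    variance_term = sum(d * d for d in dev)
    return (n / weight_sum) * (cross / variance_term)


# Four cells in a row; each cell neighbours the adjacent ones (binary weights).
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
clustered = morans_i([1, 1, 5, 5], chain)    # similar values sit together
alternating = morans_i([1, 5, 1, 5], chain)  # neighbours always differ
```

Positive values indicate that similar values cluster in space, and negative values indicate that neighbours tend to differ; in this toy layout the clustered pattern gives a positive index and the alternating pattern a negative one.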

Free

Iterative-phase method for diffractively levelling the Gauss beam intensity

Golub M.A., Doskolovich L.L., Kotlyar V.V., Nikolsky I.V., Soifer V.A.

Research article

A phase diffractive optical element that transforms a collimated Gaussian beam into a uniformly illuminated rectangle has been calculated. In computing the phase function, we employed an adaptive iterative algorithm that is a generalization of the Gerchberg-Saxton method. A smooth phase function derived using geometrical-optics methods was used as the initial approximation.
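
To make the underlying idea concrete, here is a minimal one-dimensional sketch of the basic Gerchberg-Saxton iteration. It is not the adaptive generalization used in the paper, and it starts from a flat phase rather than from a geometrical-optics phase; both simplifications, along with the grid size, the beam width, and the function names, are assumptions made to keep the example short. The far field is modelled by a naive discrete Fourier transform.

```python
import cmath
import math


def dft(values, inverse=False):
    """Naive O(n^2) discrete Fourier transform (adequate for small n)."""
    n = len(values)
    sign = 1 if inverse else -1
    result = []
    for k in range(n):
        total = sum(
            v * cmath.exp(sign * 2j * math.pi * k * m / n)
            for m, v in enumerate(values)
        )
        result.append(total / n if inverse else total)
    return result


def gerchberg_saxton(source_amp, target_amp, iterations=30):
    """Find an element phase that maps source_amp towards target_amp."""
    phase = [0.0] * len(source_amp)  # flat start (an assumption of this sketch)
    errors = []
    for _ in range(iterations):
        near = [a * cmath.exp(1j * p) for a, p in zip(source_amp, phase)]
        far = dft(near)
        # Squared amplitude mismatch in the output plane, before correction.
        errors.append(sum((abs(f) - t) ** 2 for f, t in zip(far, target_amp)))
        # Keep the computed far-field phase, impose the target amplitude.
        far = [t * cmath.exp(1j * cmath.phase(f)) for f, t in zip(far, target_amp)]
        back = dft(far, inverse=True)
        # Keep the back-propagated phase; the source amplitude is re-imposed
        # at the start of the next iteration.
        phase = [cmath.phase(b) for b in back]
    return phase, errors


n = 32
source = [math.exp(-((m - n / 2) / (n / 6)) ** 2) for m in range(n)]  # Gaussian
target = [1.0 if min(k, n - k) <= 4 else 0.0 for k in range(n)]       # flat top
# Scale the target so both planes carry the same energy (Parseval's theorem).
scale = math.sqrt(n * sum(a * a for a in source) / sum(t * t for t in target))
target = [t * scale for t in target]
phase, errors = gerchberg_saxton(source, target)
```

In this basic form the recorded mismatch cannot increase from one iteration to the next, but it may stagnate; a better starting phase, such as the smooth geometrical-optics solution mentioned in the abstract, is one common way to reach a lower final error.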

Free

Interferometric testing of steep cylindrical surfaces with on-axis CGHs

Poleshchuk Alexander Grigorievich, Nasyrov Ruslan Kamilyevich, Asfour Jean-Michel

Research article

We present a new approach for testing cylindrical optical surfaces using a Null-test. We suggest using a Computer Generated Hologram (CGH) in combination with a Transmission Sphere. It is shown that in such an optical layout the period of the diffractive structure is larger than in the case of a conventional scheme using a collimated beam. Therefore, this kind of hologram enables the test of cylinder surfaces with higher numerical apertures.

Free

Interpretable graph methods for determining nanoparticles ordering in electron microscopy images

Kurbakov M.Y., Sulimova V.V., Seredin O.S., Kopylov A.V.

Research article

An important step in determining the properties of carbon materials is the analysis of images from a scanning electron microscope (SEM). These images show the material surface after the application of metal nanoparticles. The ordering of these nanoparticles is a key characteristic that affects the material properties. We have previously proposed an approach to formalizing the ordering features based on the identification of lines formed by nanoparticles in the SEM image. This paper proposes a novel approach to line identification that is based on constructing a minimum spanning forest. Additionally, it introduces a set of novel ordering functions that are derived from this approach. The experimental study demonstrates that the combination of these new features and the previously extracted ones improves the recognition quality of SEM images with ordered and disordered nanoparticle arrangements. This approach allows us to gain a better understanding of the nanoparticle arrangement and its effect on the material properties.
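
A minimum spanning forest over particle centres can be built with Kruskal's algorithm, as sketched below. The abstract does not say how the forest is constructed or how lines are read from it, so the edge-length cutoff `max_edge`, the function name, and the sample coordinates are assumptions made purely for illustration.

```python
import math
from itertools import combinations


def minimum_spanning_forest(points, max_edge):
    """Kruskal's algorithm restricted to edges no longer than max_edge.

    Returns a list of (i, j, length) edges. Points that are farther than
    max_edge from every other tree end up in separate trees.
    """
    parent = list(range(len(points)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(len(points)), 2)
    )
    forest = []
    for length, i, j in edges:
        if length > max_edge:
            break  # all remaining edges are longer still
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j
            forest.append((i, j, length))
    return forest


# Hypothetical particle centres forming two horizontal rows.
centres = [(0, 0), (1, 0), (2, 0), (0, 5), (1, 5), (2, 5)]
forest = minimum_spanning_forest(centres, max_edge=1.5)
```

With these coordinates the forest consists of two chains, one per row, which hints at why such a structure is a natural starting point for detecting lines of particles.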

Free

Invariant laser beams - fundamental properties and their investigation by computer simulation and optical experiment

Pavelyev Vladimir S., Duparre Michael, Luedge Barbara, Soifer Victor A., Kowarschik Richard, Golovashkin Dimitriy L.

Research article

Laser light modes are beams in whose cross-section the complex amplitude is described by eigenfunctions of the operator of light propagation in the waveguide medium. The fundamental properties of modes are their orthogonality and their ability to retain their structure during propagation, for example in a lens-like medium, in free space, or in a Fourier stage. Novel Diffractive Optical Elements (DOEs) of MODAN-type [1] open up promising new possibilities for the generation, transformation, superposition and subsequent separation of different laser modes. We present new results obtained by the synthesis and investigation of beams formed by DOEs and consisting of more than one two-dimensional Gaussian laser mode with the same value of the propagation constant (invariant multimode beams). The exploitation of these phenomena could enhance the transfer capacity of fiber optical systems without pulse broadening.

Free

Inverse scattering transform algorithm for the Manakov system

Chernyavsky A.E., Frumin L.L.

Research article

A numerical algorithm is described for solving the inverse spectral scattering problem associated with the Manakov model of the vector nonlinear Schrödinger equation. This model of wave processes simultaneously considers dispersion, nonlinearity and polarization effects. It is in demand in nonlinear physical optics and is especially promising for describing optical radiation propagation through fiber communication lines. In the presented algorithm, the solution to the inverse scattering problem is based on the inversion of a set of nested matrices of the discretized system of Gelfand-Levitan-Marchenko integral equations, using a block version of the Levinson-type Toeplitz bordering algorithm. Numerical tests carried out by comparing calculations with known exact analytical solutions confirm the stability and second-order accuracy of the proposed algorithm. We also give an example of applying the algorithm to simulate the collision of a pair of differently polarized Manakov optical vector solitons.

Free

Investigation of the resolution of phase correcting Fresnel lenses with small values of F/D and subwavelength focus

Minin I.V., Minin O.V., Gagnon N., Petosa A.

Research article

The focusing properties of phase correcting Fresnel lenses with small values of the focal length-to-diameter ratio (F/D) and with focal lengths of two wavelengths or less are investigated. For these lenses, the paraxial approximation for the Rayleigh resolution criterion is no longer valid. For Fresnel lenses designed with F/DF ≤ λ, spatial resolutions of less than 0.5λ are possible, which is finer than what can typically be achieved for conventional (paraxial) designs. The spot beams in these cases are not quite axially symmetric due to the presence of anti-symmetric field components, which vanish for larger values of F/D.
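
One way to see why paraxial formulas break down for such short focal lengths is to compare exact and paraxial zone radii of a Fresnel lens. The sketch below uses the standard half-wave zone geometry for plane-wave illumination, where the n-th zone boundary satisfies sqrt(F^2 + r^2) - F = n λ / 2. It is a side illustration, not the resolution analysis of the paper, and the function names and sample focal lengths are assumptions.

```python
import math


def zone_radius(n, wavelength, focal_length):
    """Exact outer radius of the n-th half-wave zone (plane-wave input)."""
    return math.sqrt(n * wavelength * focal_length + (n * wavelength / 2.0) ** 2)


def zone_radius_paraxial(n, wavelength, focal_length):
    """Paraxial approximation, valid when the focal length far exceeds n * wavelength."""
    return math.sqrt(n * wavelength * focal_length)


def relative_gap(n, wavelength, focal_length):
    """Relative error of the paraxial radius with respect to the exact one."""
    exact = zone_radius(n, wavelength, focal_length)
    return (exact - zone_radius_paraxial(n, wavelength, focal_length)) / exact


# Work in units of the wavelength.
short_focus = relative_gap(1, 1.0, 2.0)    # focal length of two wavelengths
long_focus = relative_gap(1, 1.0, 100.0)   # focal length of 100 wavelengths
```

For a focal length of two wavelengths the first zone radius already differs by several percent between the two formulas, whereas for a focal length of 100 wavelengths the difference is about a tenth of a percent.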

Free

Laser beam characterization by means of diffractive optical correlation filters

Pavelyev V.S., Soifer V.A., Duparre M., Luedge B.

Research article

Analysis of the amplitude-phase characteristics of a laser beam is a topical problem in experimental physics and in a great number of laser applications, such as laser material treatment. The task of analyzing the amplitude-phase beam structure may be treated as that of analyzing the modal composition, understood as analyzing both the individual modal powers and the intermode phase shifts. In this paper the problem is tackled using a special diffractive optical element (DOE), called MODAN, matched to a group of laser radiation modes and their special combinations. The reported experimental results indicate that such an approach shows promise. Key words: laser beam, Gaussian modes, intermode power distribution, intermode phase shifts.

Free

Laser generation thresholds of the cholesteric liquid crystal layer

Malinchenko A.A., Vanyushkin N.A., Bulanov A.V., Gevorgyan A.H.

Research article

The laser thresholds of the eigenmodes in cholesteric liquid crystal (CLC) cells are calculated. The influence of gain on light localization is investigated. The influence of absorption and gain on the light energy density in the CLC layer, for both isotropic and anisotropic absorption and gain, is investigated for the first time. The calculated threshold values are compared with the analytical expression for laser thresholds obtained under the condition ImK<<1/d, where d is the CLC layer thickness and K is the resonance wave vector.

Free

Lightweight neural network-based pipeline for barcode image preprocessing

Zlobin P.K., Karnaushko V.A., Ershova D.M., Sánchez-Rivero R., Bezmaternykh P.V., Nikolaev D.P.

Research article

Barcode scanning, as well as the image processing stages included in its workflow, has greatly benefited from deep learning research. These stages commonly handle pre-processing tasks such as localizing barcode symbols in the input image, identifying their type, and normalizing the found regions. They are especially important when there is no a priori knowledge of the conditions under which the input image was captured. Thus, recognizing multiple barcodes within a single image differs drastically from processing a single barcode in a video stream from a smartphone. We assess how the accuracy of these stages affects the overall accuracy of barcode scanning and propose a lightweight neural network-based pipeline implementing the tasks listed above. To perform this assessment and evaluate the performance of the proposed pipeline elements, we conduct a series of experiments using a set of popular open source scanners, including OpenCV, WeChat, ZBar, ZXing and ZXing-cpp, over the SE-barcode and Dubska datasets. These experiments reveal how the proposed pipeline can be configured for optimum speed and accuracy depending on the objective and the chosen scanner.

Free

Localization of mobile robot in prior 3D lidar maps using stereo image sequence

Belkin I.V., Abramenko A.A., Bezuglyi V.D., Yudin D.A.

Research article

The paper studies real-time stereo image-based localization of a vehicle in a prior 3D LiDAR map. A novel localization approach for a mobile ground robot is proposed, which successfully combines conventional computer vision techniques, neural network-based image analysis and numerical optimization. It includes matching a noisy depth image and a visible point cloud based on a modified Nelder-Mead optimization method. A deep neural network for semantic image segmentation is used to eliminate dynamic obstacles. The visible point cloud is extracted using a 3D mesh map representation. The proposed approach is evaluated on the KITTI dataset and a custom dataset collected from a ClearPath Husky mobile robot. It shows a stable absolute translation error of about 0.11-0.13 m and a rotation error of 0.42-0.62 deg. The standard deviation of the obtained absolute metrics for our method is the smallest among the compared state-of-the-art approaches. Thus, our approach provides more stability in the estimated pose. This is achieved primarily through the use of multiple data frames during the optimization step and the elimination of dynamic obstacles from the depth image. The method’s performance is demonstrated on different hardware platforms, including the energy-efficient Nvidia Jetson Xavier AGX. With a parallel code implementation, we achieve an input stereo image processing speed of 14 frames per second on the Xavier AGX.

Free

Losses and orbital part of the Poynting vector of air-core modes in hollow-core fibers

Alagashev Grigory Konstantinovich, Stafeev Sergey Sergeevich, Pryamikov Andrey Dmitrievich

Research article

In our earlier works, we investigated the relationship between the formation of vortices in the transverse component of the Poynting vector of core modes and the regimes of strong localization of these modes in solid core micro-structured optical fibers. In this paper, we consider the behavior of the orbital part of the Poynting vector of fundamental and high-order modes in hollow-core fibers and compare it with the similar behavior of the fundamental core mode in solid core micro-structured optical fibers. We then demonstrate the impact of the “negative” curvature of the core-cladding boundary of a hollow-core fiber on the behavior of the orbital part of the Poynting vector of the air-core modes.

Free

MIDV-2020: a comprehensive benchmark dataset for identity document analysis

Bulatov Konstantin Bulatovich, Emelianova Ekaterina Vladimirovna, Tropin Daniil Vyacheslavovich, Skoryukina Natalya Sergeevna, Chernyshova Yulia Sergeevna, Sheshkus Alexander Vladimirovich, Usilin Sergey Alexandrovich, Ming Zuheng, Burie Jean-Christophe, Luqman Muhammad Muzzamil, Arlazarov Vladimir Viktorovich

Research article

Identity document recognition is an important sub-field of document analysis, which deals with the tasks of robust document detection, type identification, and text field recognition, as well as identity fraud prevention and document authenticity validation given photos, scans, or video frames of an identity document capture. A significant amount of research has been published on this topic in recent years; however, a chief difficulty for such research is the scarcity of datasets, because the subject matter is protected by security requirements. The few datasets of identity documents that are available lack diversity of document types, capturing conditions, or variability of document field values. In this paper, we present the MIDV-2020 dataset, which consists of 1000 video clips, 2000 scanned images, and 1000 photos of 1000 unique mock identity documents, each with unique text field values and unique artificially generated faces, with rich annotation. The dataset contains 72409 annotated images in total, making it the largest publicly available identity document dataset as of the date of publication. We describe the structure of the dataset, its content and annotations, and present baseline experimental results to serve as a basis for future research. For the task of document location and identification, content-independent, feature-based, and semantic segmentation-based methods were evaluated. For the task of document text field recognition, the Tesseract system was evaluated on field and character levels with grouping by field alphabets and document types. For the task of face detection, the performance of a Multi Task Cascaded Convolutional Neural Networks-based method was evaluated separately for different types of image input modes. The baseline evaluations show that the existing methods of identity document analysis have a lot of room for improvement given modern challenges.
We believe that the proposed dataset will prove invaluable for the advancement of the field of document analysis and recognition.

Free

Journal