Program Systems: Theory and Applications @programmnye-sistemy
Journal articles - Program Systems: Theory and Applications
All articles: 445

Research article
This article is devoted to applying mathematical models to the differential diagnosis of venous diseases based on microwave radiometry data. A modified approach to transforming the feature space of thermometric data is described. After the features are constructed, the multiclass classification problem is solved in several ways: by reduction to binary classification problems using the “one versus rest” and “one versus one” methods, and by building a multivariate logistic regression model. The best classification model achieved an average balanced accuracy score of 0.574. A key feature of the approach is that the classification result can be explained and justified in terms understandable to a diagnostician. The article presents the most significant patterns in thermometric data and the accuracy with which they can identify different classes of diseases.
Free
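
The balanced accuracy score quoted in the abstract above is commonly defined, for a multiclass problem, as the mean of the per-class recalls. The sketch below illustrates that definition only; the class labels and predictions are hypothetical and are not taken from the article, and the article's exact averaging procedure is not stated in the abstract.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall over the classes that occur in y_true."""
    total = defaultdict(int)    # number of samples of each true class
    correct = defaultdict(int)  # correctly predicted samples of each class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


# Hypothetical labels for three diagnostic classes (illustration only).
y_true = ["a", "a", "a", "a", "b", "b", "c", "c"]
y_pred = ["a", "a", "a", "b", "b", "c", "c", "c"]
score = balanced_accuracy(y_true, y_pred)
```

Unlike plain accuracy, this score is not inflated when one class dominates the sample, which is presumably why it suits an imbalanced diagnostic setting.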

Multimodal stock price prediction: a case study of the Russian securities market
Research article
Classical asset price forecasting methods primarily rely on numerical data, such as price time series, trading volumes, limit order book data, and technical analysis indicators. However, the news flow plays a significant role in price formation, making the development of multimodal approaches that combine textual and numerical data for improved prediction accuracy highly relevant. This paper addresses the problem of forecasting financial asset prices using a multimodal approach that combines candlestick time series and textual news flow data. A unique dataset was collected for the study, which includes time series for 176 Russian stocks traded on the Moscow Exchange and 79,555 financial news articles in Russian. For processing textual data, the pre-trained models RuBERT and Vikhr-Qwen2.5-0.5b-Instruct (a large language model) were used, while time series and vectorized text data were processed using an LSTM recurrent neural network. The experiments compared models based on a single modality (time series only) and two modalities, as well as various methods for aggregating text vector representations. Prediction quality was estimated using two key metrics: Accuracy (direction of price movement prediction: up or down) and Mean Absolute Percentage Error (MAPE), which measures the deviation of the predicted price from the true price. The experiments showed that incorporating the textual modality reduced the MAPE value by 55%. The resulting multimodal dataset holds value for the further adaptation of language models in the financial sector. Future research directions include optimizing textual modality parameters, such as the time window, sentiment, and chronological order of news messages.
Free
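
The two metrics named in the abstract above can be sketched as follows. This is a minimal illustration with hypothetical prices. It assumes that the direction of movement is measured relative to the previous closing price, which the abstract does not specify.

```python
def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent."""
    errors = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)


def direction_accuracy(prev_close, actual, predicted):
    """Share of cases where the predicted move (up vs. not up) matches the real one."""
    hits = sum(
        (a > c) == (p > c)
        for c, a, p in zip(prev_close, actual, predicted)
    )
    return hits / len(actual)


# Hypothetical prices (illustration only, not data from the article).
prev_close = [100.0, 100.0, 200.0, 50.0]
actual = [110.0, 90.0, 210.0, 55.0]
predicted = [105.0, 95.0, 190.0, 60.0]

error_pct = mape(actual, predicted)
accuracy = direction_accuracy(prev_close, actual, predicted)
```

The two metrics are complementary: a forecast can have a small MAPE and still miss the direction of the move, as the third sample here does.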

Multiple-precision matrix-vector multiplication on graphics processing units
Research article
We consider a parallel implementation of matrix-vector multiplication (GEMV, Level 2 of the BLAS) for graphics processing units (GPUs) using multiple-precision arithmetic based on the residue number system. In our GEMV implementation, element-wise operations with multiple-precision vectors and matrices consist of several parts, each of which is calculated by a separate CUDA kernel. This feature eliminates branch divergence when performing sequential parts of multiple-precision operations and allows the full utilization of the GPU's resources. An efficient data structure for storing arrays with multiple-precision entries provides a coalesced access pattern to the GPU global memory. We have performed a rounding error analysis and derived error bounds for the proposed GEMV implementation. Experimental results show the high efficiency of the proposed solution compared to existing high-precision packages deployed on GPU.
Free
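
The residue number system mentioned in the abstract above represents a number by its remainders modulo several pairwise coprime moduli, so that addition and multiplication act on each remainder independently and without carries. The sketch below shows only this idea, on small non-negative integers in plain Python. The moduli and function names are illustrative assumptions, and it does not reproduce the authors' floating-point format or their CUDA kernels.

```python
from math import prod

# Illustrative pairwise coprime moduli (an assumption, not the authors' set).
MODULI = (251, 253, 255, 256)
RANGE = prod(MODULI)  # results are exact only while they stay below this bound


def to_rns(x):
    """Encode a non-negative integer as a tuple of residues."""
    return tuple(x % m for m in MODULI)


def rns_add(a, b):
    """Digit-wise addition: every residue is processed independently."""
    return tuple((x + y) % m for x, y, m in zip(a, b, MODULI))


def rns_mul(a, b):
    """Digit-wise multiplication: no carries between residues."""
    return tuple((x * y) % m for x, y, m in zip(a, b, MODULI))


def from_rns(residues):
    """Decode with the Chinese remainder theorem."""
    total = 0
    for r, m in zip(residues, MODULI):
        partial = RANGE // m
        total += r * partial * pow(partial, -1, m)  # modular inverse (Python 3.8+)
    return total % RANGE


def rns_gemv(matrix, vector):
    """Matrix-vector product with all arithmetic done on residues."""
    result = []
    for row in matrix:
        acc = to_rns(0)
        for a, x in zip(row, vector):
            acc = rns_add(acc, rns_mul(to_rns(a), to_rns(x)))
        result.append(from_rns(acc))
    return result


product = rns_gemv([[1, 2, 3], [4, 5, 6]], [7, 8, 9])
```

Because the residues never interact, each one can in principle be handled by its own thread, which is the property that makes the representation attractive on GPUs.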

Nodules detection on computer tomograms using neural networks
Research article
Results of applying neural networks (NN) to the problem of detecting neoplasms on computer tomograms of the lungs with a limited amount of data are presented. Much attention is paid to the analysis and preprocessing of images as a factor that improves NN quality. The problem of NN overfitting and ways to solve it are considered. The results of the presented experiments allow us to conclude that individual NN architectures combined with data preprocessing methods are effective for detection problems even with a limited training set and small detected objects.
Free

On the free Carnot (2, 3, 5, 8) group
Research article
We consider the free nilpotent Lie algebra with 2 generators, of step 4, and the corresponding connected simply connected Lie group 𝐺, with the aim of studying the left-invariant sub-Riemannian structure on 𝐺 defined by the generators of the algebra as an orthonormal frame. We compute two vector field models of the algebra by polynomial vector fields in ℝ⁸, and find an infinitesimal symmetry of the sub-Riemannian structure. Further, we compute explicitly the product rule in 𝐺 and the right-invariant frame on 𝐺.
Free

Parus - a syntactically annotated corpus of the Russian language
Research article
The article presents PaRuS (Parsed Russian Sentences), a new annotated corpus of the Russian language. The corpus contains more than 2.5 billion tokens and is intended for solving computational linguistics problems with machine learning methods. PaRuS consists of sentences of literary Russian. Each sentence carries linguistic annotation: morphological annotation in the MULTEXT-East format and syntactic annotation in the SynTagRus notation. The article discusses the methodology of corpus creation and describes PaRuS_pipe, the hybrid linguistic pipeline developed to generate the annotation. Questions of annotation quality in the PaRuS corpus are discussed, and the morphological analyzer of the PaRuS_pipe pipeline is evaluated following the methodology of the MorphoRuEval-2017 competition.
Free

Recovering text sequences using deep learning models
Research article
This article presents the results of building, training and evaluating models with the Encoder-Decoder and Sequence-To-Sequence (Seq2Seq) architectures for the problem of completing incomplete texts. Problems of this type often arise when restoring the contents of documents from their low-quality images. The study is aimed at the practical problem of producing electronic copies of scanned documents of the «Roskadastr» PLC whose recognition is difficult or impossible with standard means. The models were built and studied in Python using the high-level API of the Keras package. A dataset consisting of several thousand pairs was formed for training and studying the models. Each pair in this set consisted of an incomplete text and the corresponding full text. To evaluate the quality of the models, the values of the loss function and the accuracy, BLEU and ROUGE-L metrics were calculated. Loss and accuracy made it possible to evaluate the effectiveness of the models at the level of predicting individual words. The BLEU and ROUGE-L metrics were used to evaluate the similarity between the full and reconstructed texts. The results showed that both the Encoder-Decoder and Seq2Seq models cope with the task of reconstructing text sequences from a fixed set, but the transformer-based Seq2Seq model achieves better results in terms of training speed and quality.
Free
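
ROUGE-L, one of the similarity metrics named in the abstract above, is based on the longest common subsequence (LCS) of the reference and the reconstructed text. The sketch below is a minimal word-level version with whitespace tokenization and an F-measure with equal weights. Both choices are assumptions, and the example sentences are hypothetical.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, 1):
            if x == y:
                cur.append(prev[j - 1] + 1)
            else:
                cur.append(max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def rouge_l(reference, candidate, beta=1.0):
    """LCS-based F-measure between a full text and its reconstruction."""
    ref, cand = reference.split(), candidate.split()
    lcs = lcs_length(ref, cand)
    if lcs == 0:
        return 0.0
    recall = lcs / len(ref)
    precision = lcs / len(cand)
    return (1 + beta**2) * precision * recall / (recall + beta**2 * precision)


# Hypothetical full text and reconstructed text (illustration only).
full_text = "the land plot is located in the city"
restored = "the land plot located in city"
similarity = rouge_l(full_text, restored)
```

Because the LCS preserves word order but tolerates gaps, the metric rewards a reconstruction that recovers the right words in the right sequence even when some words are still missing.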

Robust algorithmic binding to arbitrary fragment of program code
Research article
When solving a task, a programmer actively interacts with a finite set of code fragments. The information about their locations is important for quick navigation, for other developers, and as a kind of documentation. Integrated development environments (IDEs) provide tools for marking code fragments with labels, displaying lists of labels, and using these labels for quick navigation. However, they often lose the correspondence between the label and the marked place when the code is edited, in particular when changes are made outside the IDE. In previous works, the authors proposed a tool to be integrated into various IDEs for “binding” to large syntactic entities of a program and building a markup that is robust to code editing. The description of the marked element is built on the basis of the abstract syntax tree (AST) of the program. Later it is used to algorithmically search for the element in the edited code. The search has a success rate from 99% to 100%. This article aims at robust algorithmic binding to an arbitrary section of the code. For binding to a single-line code fragment, we propose an extension of the model describing the marked fragment, and an additional search algorithm. We also propose an algorithm for embedding nodes corresponding to multi-line fragments in an AST. We show that the correctness of the AST is not violated by these embeddings. Bindings to randomly selected lines were made in the code of three large C# projects. A manual check of the search results for these lines in the edited code has confirmed that the bindings are robust to code editing.
Free

Simulation of a multifunctional micromechanical gyroscope
Research article
The possibility of constructing a multifunctional inertial navigation device based on a hybrid-type modulation micromechanical gyroscope is considered. A mathematical model of the device ("heavy" gyroscope) as a high-quality three-dimensional oscillatory system is constructed. It is numerically shown that, under certain conditions, the reaction of the system to the motion of an object has, along with precession, the observed nutation, which carries information about the linear motion of the gyroscope base. It is noted that the possibility of measuring linear accelerations is ensured by the presence of a small symmetrical distance between the axes of the elastic suspension relative to the center of mass of the sensing element. The results obtained make it possible to implement a two-component angular velocity meter and a two-component linear acceleration meter in one device.
Free

Simulation of the effect of short optical pulses on graphene
Research article
The interaction of high-frequency pulsed electric fields with graphene is currently the subject of intense research. The paper presents the results of testing a software system for modeling such processes using the example of ultrashort laser pulses of the optical range with different polarizations. The authors develop the system on the basis of a new theoretical approach that relies on the quantum kinetic equation. The approach contains a computational model for a new system of ordinary differential equations whose coefficients depend nonlinearly on time and on the problem parameters. The need to analyze the behavior of solutions of this system of equations as several parameters vary leads to polynomial computational complexity. Because the nature of the parametric dependence of the solutions is not known in advance, several iterations of choosing the covering grids are required. The paper describes the adaptation of this modeling system for use in massively parallel computing systems.
Free

Sufficient relative minimum conditions for discrete-continuous control systems
Research article
In this paper, we derive sufficient relative minimum conditions for discrete-continuous control systems on the basis of a counterpart of Krotov’s sufficient optimality conditions. These conditions can be used as verification conditions for a suggested control mode and enable one to construct new numerical methods.
Free

Research article
This paper proposes decentralized processes for synchronizing the actions of a distributed group of active components (objects) in supercomputers and computer clusters, allowing them to move to specified states or influence the external environment synchronously. The action of an object depends on the current state of the object and of the external environment. The actions should start with the minimum delay after the possibility of their execution is detected. Synchronization is performed by exchanging optical signals over wireless communication channels through an optical signal repeater that combines one group of objects or sequences of groups of objects (layers). Possible changes in the distances between objects are compensated by accurate distance measurement. Group operations accelerate synchronization and simultaneously receive data from a group of distributed objects. Data are processed during their transfer, without increasing the transfer time. The operation time does not depend on the quantity of data processed by the operation. A group operation is performed in a repeater containing no computational means.
Free

The optimal control of two work-stealing deques, moving one after another in a shared memory
Research article
In parallel work-stealing load balancers, each core owns a personal buffer of tasks called a deque. One end of the deque is used by its owner to add and retrieve tasks, while the second end is used by other cores to steal tasks. In the paper, two methods of representing deques are analyzed: the partitioned serial cyclic representation of deques (one of the conventional techniques), and the new approach proposed by our team, in which shared memory is not partitioned in advance and the deques move one after another in a circle. Previously we analyzed these methods for representing FIFO queues in network applications, where the “one after another” way gave the best result for some values of the system parameters. The purpose of this research is to construct and analyze models of the process of working with two circular deques located in shared memory, where they move one after another in a circle. The mathematical model is constructed in the form of a random walk over integer points in a pyramid. The simulation model is constructed using the Monte Carlo method. The work-stealing strategy used is stealing a single element. We propose the mathematical and simulation models of this process and carry out numerical experiments.
Free

The platform approach to research and development using high-performance computing
Research article
In this paper, we analyze the prerequisites for and substantiate the relevance of creating an open Internet platform that employs big data technologies, high-performance computing, and multilateral markets in a unified way. Conceived as an ecosystem for the development and use of applied software (including in the field of design and scientific research), the platform should reduce time/costs and improve the quality of software development for solving analytical problems arising in industrial enterprises, scientific research organizations, state bodies and private individuals. The article presents a working prototype of the platform using supercomputer technologies and desktop virtualization systems.
Free

Research article
The success of using mathematical models that determine the behavior of quantum field systems in parametric spaces critically depends on the level of optimization of the procedure of finding the solution. The paper considers the problem of calculating the density of carriers arising in graphene as a result of the action of a pulsed electric field. The basis of the model is a system of kinetic equations that provide the calculation of the residual distribution function. Its integration over momentum space gives the desired carrier density. The problem lies in the high computational complexity of covering the momentum space with a uniform mesh, which provides an accurate calculation of the density for various parameters of the field momentum. Moreover, the model does not contain criteria for determining satisfactory mesh parameters. The article proposes and implements a procedure for constructing an adaptive mesh in the form of a quadtree having a variable size of covering squares. The procedure is iterative and combined with the process of calculating the values of the distribution function.
Free
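
The idea of a quadtree mesh with squares of variable size, described in the abstract above, can be sketched as follows. This is an illustration under stated assumptions rather than the authors' procedure: a Gaussian peak stands in for the distribution function, a square is split when the function at its corners differs from the value at its center by more than a tolerance, and the integral is estimated by the midpoint rule. All names and parameter values are hypothetical.

```python
import math


def build_quadtree(f, x, y, size, tol, min_depth, max_depth, depth=0):
    """Return leaf squares (x, y, size, f_center) covering [x, x+size] x [y, y+size]."""
    half = size / 2.0
    center = f(x + half, y + half)
    corners = [f(x, y), f(x + size, y), f(x, y + size), f(x + size, y + size)]
    variation = max(abs(c - center) for c in corners)
    # Stop at the depth limit, or once the function is nearly flat on the square.
    if depth >= max_depth or (depth >= min_depth and variation <= tol):
        return [(x, y, size, center)]
    leaves = []
    for dx in (0.0, half):
        for dy in (0.0, half):
            leaves += build_quadtree(
                f, x + dx, y + dy, half, tol, min_depth, max_depth, depth + 1
            )
    return leaves


def integrate(leaves):
    """Midpoint-rule integral: value at the square center times the square area."""
    return sum(value * size * size for _, _, size, value in leaves)


SIGMA = 0.1  # width of the stand-in peak (hypothetical)


def peak(px, py):
    """Stand-in for a sharply localized distribution function."""
    return math.exp(-(px * px + py * py) / (2 * SIGMA**2))


leaves = build_quadtree(peak, -1.0, -1.0, 2.0, tol=1e-3, min_depth=2, max_depth=8)
density = integrate(leaves)
```

Small squares concentrate where the function changes quickly, while flat regions are covered by a few large squares, so the mesh needs far fewer function evaluations than a uniform grid of the finest cell size.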

TimeML for annotating Russian-language texts. Assessing the prospects
Editorial note
The article analyzes the applicability of the TimeML language to annotating temporal expressions and their links to event mentions in Russian-language texts. A number of constructions specific to Russian are identified that require adjustments to the annotator guidelines, and changes to individual guideline items are proposed. It is concluded that applying a refined version of TimeML to Russian-language texts is practical, both as an annotation language and as a format for representing automatically extracted temporal information.
Free

Turnpike solutions in the problem of excitation transfer along a spin chain
Research article
We consider the problem of excitation transfer along a spin chain, which is related to the applied problem of quantum computations. The model of a quantum system of interacting spins based on the Schrödinger equation with unbounded linear control is transformed to an equivalent derived system (known from the theory of degenerate problems), and then approximately to derived systems of higher stages with reduced order. Their investigation, performed analytically or via simple computations, leads at least to approximate solutions and lower estimates of the transfer time, which can be used in subsequent improvement procedures.
Free

Using a convolutional neural network to recognize text elements in poor quality scanned images
Research article
The paper proposes a method for recognizing the content of poor-quality scanned images using convolutional neural networks (CNNs). The method consists of three main stages. At the first stage, the image is preprocessed by identifying the contours of its alphabetic and numeric elements and basic punctuation marks. At the second stage, the content of the image fragments inside the identified contours is sequentially fed to the input of the CNN, which performs multiclass classification. At the third and final stage, the set of CNN responses is post-processed and a text document with the recognition results is formed. An experimental study of all stages was carried out in Python using the Keras deep learning and OpenCV computer vision libraries and showed fairly good results for the main types of deterioration in the quality of a scanned image: geometric distortions, blurring of borders, the appearance of extra lines and spots during scanning, etc.
Free

Using the Mask R-CNN model for segmentation of real estate objects in aerial photographs
Research article
The mass appearance of illegal real estate objects that are not registered in the Unified State Register of Real Estate (USRRE) complicates cadastral registration for many entities at the territorial and administrative levels. Traditional methods of identifying objects of this type, based on manual analysis of geospatial data, are labor-intensive and time-consuming. To improve the efficiency of this process, it is proposed to automate the detection of objects in aerial photographs by solving the instance segmentation problem using the Mask R-CNN deep learning model. The article describes the preparation of a dataset for this model, examines the main quality metrics, and analyzes the results obtained. The practical efficiency of the Mask R-CNN model is demonstrated on the problem of detecting construction projects that are not registered in the USRRE.
Free