Modeling of raw data from a single-molecule sequencer. Part I: simulating detection of emission from a zero-mode waveguide
Authors: E.K. Vasil’eva, I.V. Chubinskiy-Nadezhdin
Journal: Научное приборостроение (Scientific Instrumentation) @nauchnoe-priborostroenie
Section: Instrumentation for physico-chemical biology
Published in issue: 1, 2026.
Free access
This work proposes a stochastic simulation model of the fluorescence detection of labeled nucleotides in four spectral channels during single-molecule sequencing. A multi-factor simulation of the detector signal is implemented. The first part of the article describes an algorithmic template for modeling the response of a camera's photosensitive elements to light arriving from a zero-mode waveguide within a single spectral range during single-molecule DNA sequencing. The concept of a "base image" is introduced, representing the number of photoelectrons in the sensor elements during one exposure time. The simulated area of such an image is a square field measuring from 2 × 2 to 5 × 5 pixels, with a total brightness corresponding to a single trace count in the given channel. Three modes of base image formation are considered: 1) dark current only, 2) background radiation in the absence of nucleotide incorporation, and 3) a fluorescent spot on top of a background level. A step-by-step algorithm for modeling the base image is provided. It includes: defining the size and offset of the spot and the fluorescence and background amplitudes; generating two-dimensional Gaussian distributions for the background and the fluorescence signal and summing them; accounting for signal-dependent noise using experimental data; modeling signal-independent noise with a normal distribution; setting the zero-level offset; and finally summing all noise components to create the image. An algorithm for modeling the background signal that accounts for spectral crosstalk from the nearest signal registration channel is also described. A simulator program for the primary digital data of a single-molecule sequencer has been developed, which allows sequencing results to be simulated in FASTA, trace, and movie formats. The simulator accepts a real DNA sequence from a genomic database as input.
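The base image steps listed in the abstract can be illustrated with a minimal Python sketch. It is not the authors' simulator: the function names (`gaussian_2d`, `base_image`), all default parameter values, and the normalization of each Gaussian so that its pixels sum to the given amplitude are hypothetical choices made for illustration. In particular, the abstract says signal-dependent noise is taken from experimental data; the sketch swaps in a shot-noise-like normal term whose variance equals the mean signal, purely as a stand-in.

```python
import math
import random


def gaussian_2d(size, amplitude, sigma, dx=0.0, dy=0.0):
    """Return a size x size grid shaped as a 2-D Gaussian.

    The peak sits at the field centre shifted by (dx, dy) pixels, and the
    grid is normalized so that its pixels sum to `amplitude` (assumption).
    """
    c = (size - 1) / 2.0
    raw = [
        [
            math.exp(-((x - c - dx) ** 2 + (y - c - dy) ** 2) / (2.0 * sigma ** 2))
            for x in range(size)
        ]
        for y in range(size)
    ]
    total = sum(map(sum, raw))
    return [[amplitude * v / total for v in row] for row in raw]


def base_image(mode, size=3, signal_amp=200.0, background_amp=40.0,
               spot_sigma=0.7, background_sigma=3.0, offset=(0.0, 0.0),
               dark_mean=2.0, read_sigma=1.5, zero_level=100.0, rng=None):
    """Simulate one base image (photoelectrons per pixel, one exposure).

    mode: "dark" (dark current only), "background" (no incorporation),
          or "spot" (fluorescent spot on top of the background).
    All default values are hypothetical placeholders.
    """
    if mode not in ("dark", "background", "spot"):
        raise ValueError("mode must be 'dark', 'background' or 'spot'")
    if not 2 <= size <= 5:
        raise ValueError("the abstract describes fields from 2 x 2 to 5 x 5")
    rng = rng or random.Random()

    # Noise-free mean signal: dark current in every pixel.
    mean = [[dark_mean] * size for _ in range(size)]

    # Add the background and, if present, the fluorescence spot.
    layers = []
    if mode in ("background", "spot"):
        layers.append(gaussian_2d(size, background_amp, background_sigma))
    if mode == "spot":
        layers.append(gaussian_2d(size, signal_amp, spot_sigma, *offset))
    for layer in layers:
        for y in range(size):
            for x in range(size):
                mean[y][x] += layer[y][x]

    image = []
    for row in mean:
        out = []
        for m in row:
            # Stand-in for the experimentally derived signal-dependent noise.
            dependent = rng.gauss(0.0, math.sqrt(m))
            # Signal-independent noise drawn from a normal distribution.
            independent = rng.gauss(0.0, read_sigma)
            out.append(zero_level + m + dependent + independent)
        image.append(out)
    return image
```

Calling `base_image("spot", size=3, rng=random.Random(1))` returns a 3 × 3 list of floats; summing its pixels gives one simulated trace count for the channel, including the zero-level offset of every pixel.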
Sequencing simulator, long reads, simulation modeling, SMRT, zero-mode waveguide
Short URL: https://sciup.org/142247129
IDR: 142247129 | UDC: 519.856, 004.942, 519.876.5, 57.081.23