Big data storage and processing technologies for purposes of training scoring models

Free access

The article reviews current approaches to storing and processing big data for training scoring models that assess credit risk. A model of the data used for training was designed, and the data volumes in the schema were calculated. The research shows that Apache Hadoop and NiFi ecosystem technologies are effective for distributed storage, writing, and reading of data, and that the Apache Spark framework is effective for processing it. An architectural solution was developed to manage the data flows received from source product systems. The solution supports storing large volumes of data, and the Spark framework makes it possible to process that data and train a scoring model for credit risk assessment.

Hadoop, Apache Spark

Short URL: https://sciup.org/170208586

IDR: 170208586   |   DOI: 10.24412/2500-1000-2024-12-3-55-59

Research article