Mixture of probability distributions in the problems of regression and anomaly detection and its applications to PVT properties

Authors: Volkov N.A., Budennyy S.A., Andrianova A.M.

Journal: Proceedings of the Moscow Institute of Physics and Technology (Труды Московского физико-технического института) @trudy-mipt

Section: Informatics and control

Published in issue: No. 3 (47), vol. 12, 2020.

Free access

This paper describes the main mathematical properties of a mixture of probability distributions. Special attention is paid to the multivariate Student distribution and related distributions, for which the properties needed in practical applications are proved. We also introduce an EM algorithm for a mixture of Student distributions in which variational Bayesian inference is applied at the E-step to estimate the parameters. Based on a mixture of Student distributions, we construct a machine learning method in which a single model solves regression problems for any set of features, as well as clustering and anomaly detection. The model can solve each of these problems even when the data contain missing values. The method is tested on data on the PVT properties of reservoir fluids: the model results do not contradict the main physical properties, and in many cases the predictions are more accurate, as measured by the MAPE and RMSPE metrics, than those of widely known machine learning methods.
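
As an illustration of the anomaly detection use case mentioned in the abstract, the following is a minimal sketch, not the authors' implementation. It evaluates the log-density of a mixture of multivariate Student distributions with fixed, hand-picked parameters and flags a point as anomalous when that log-density falls below a threshold. The function names, the parameter values, and the threshold are all assumptions made for this example. The EM fitting with variational Bayesian inference described in the paper is not reproduced here.

```python
import math
import numpy as np


def student_logpdf(x, mean, scale, dof):
    """Log-density of a multivariate Student distribution at one point x."""
    x = np.asarray(x, dtype=float)
    mean = np.asarray(mean, dtype=float)
    scale = np.asarray(scale, dtype=float)
    d = mean.shape[0]
    diff = x - mean
    # Squared Mahalanobis distance with respect to the scale matrix.
    maha = float(diff @ np.linalg.solve(scale, diff))
    _, logdet = np.linalg.slogdet(scale)
    return (
        math.lgamma((dof + d) / 2.0)
        - math.lgamma(dof / 2.0)
        - 0.5 * d * math.log(dof * math.pi)
        - 0.5 * logdet
        - 0.5 * (dof + d) * math.log1p(maha / dof)
    )


def mixture_logpdf(x, weights, means, scales, dofs):
    """Log-density of a finite mixture, using the log-sum-exp trick."""
    terms = np.array([
        math.log(w) + student_logpdf(x, m, s, v)
        for w, m, s, v in zip(weights, means, scales, dofs)
    ])
    top = terms.max()
    return float(top + np.log(np.exp(terms - top).sum()))


def is_anomaly(x, weights, means, scales, dofs, log_threshold):
    """Flag x when its mixture log-density is below log_threshold."""
    return mixture_logpdf(x, weights, means, scales, dofs) < log_threshold


# Hypothetical two-component mixture in two dimensions (illustrative values).
weights = [0.6, 0.4]
means = [np.array([0.0, 0.0]), np.array([5.0, 5.0])]
scales = [np.eye(2), np.eye(2)]
dofs = [4.0, 4.0]

print(is_anomaly([0.0, 0.0], weights, means, scales, dofs, log_threshold=-10.0))
print(is_anomaly([20.0, -20.0], weights, means, scales, dofs, log_threshold=-10.0))
```

In practice the threshold would be chosen from the data, for example as a low quantile of the log-densities of the training points. That choice is also an assumption of this sketch rather than something stated in the abstract.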


Keywords: Student mixture, EM algorithm, variational Bayesian inference, clustering, regression, anomalies, missing values, PVT properties

Short URL: https://sciup.org/142230086

IDR: 142230086

Research article