Automatic selection of ARIMA model parameters to forecast COVID-19 infection and death cases

Free access

This paper explores the use of the ARIMA time-series forecasting model to analyze open data on the spread of coronavirus infection in several regions of the Russian Federation. We consider how existing methods and algorithms available in the R programming language can be applied, and we present algorithms for selecting the parameters of the ARIMA model. We have developed and uploaded an R script that uses the auto.arima function to predict the total number of infections and deaths over a selected period. We show that the model parameters differ for time series of different lengths and for different regions, and that they also change over time. We examine the available R toolkit and show that for some data sets it does not find the model parameters that give the smallest error. We investigate how often the model needs to be retrained and report how the model parameters change for time series of different lengths. The cases in which automatic parameter selection fails are left as a topic for further research. We give a meaningful interpretation of the results and compare forecasts made at the end of October 2020 with actual data for mid-November 2020. The forecasts accurately predicted the total number of infections and deaths 7 to 10 days ahead in any subsequent period.
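The idea of choosing model parameters by smallest forecast error can be illustrated with a minimal sketch. It is not the authors' R script and does not reproduce auto.arima: it assumes a deliberately reduced candidate set, ARIMA(0,d,0) with d in {0, 1, 2}, and picks d by the mean absolute percentage error on a held-out tail of the series. The function names (`forecast_arima_0d0`, `mape`, `select_d`) and the cumulative counts are hypothetical and synthetic, chosen only for illustration.

```python
def forecast_arima_0d0(series, d, horizon):
    """Point forecasts from ARIMA(0,d,0): the d-th difference is treated as noise.

    For d = 0 the forecast is the sample mean; for d >= 1 the forecast of the
    d-th difference is zero and the result is integrated back d times.
    """
    if len(series) <= d:
        raise ValueError("series must be longer than the differencing order")
    if d == 0:
        mean = sum(series) / len(series)
        return [mean] * horizon

    # levels[k] holds the series differenced k times
    levels = [list(series)]
    for _ in range(d):
        prev = levels[-1]
        levels.append([b - a for a, b in zip(prev, prev[1:])])

    # last observed value at each differencing level 0 .. d-1
    lasts = [level[-1] for level in levels[:d]]
    forecasts = []
    for _ in range(horizon):
        step = 0.0  # forecast of the d-th difference
        for k in range(d - 1, -1, -1):
            lasts[k] = lasts[k] + step
            step = lasts[k]
        forecasts.append(lasts[0])
    return forecasts


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    return 100.0 * sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / len(actual)


def select_d(series, horizon, candidates=(0, 1, 2)):
    """Hold out the last `horizon` points and return the d with the smallest MAPE."""
    train, test = series[:-horizon], series[-horizon:]
    errors = {d: mape(test, forecast_arima_0d0(train, d, horizon)) for d in candidates}
    best = min(errors, key=errors.get)
    return best, errors


# Synthetic cumulative counts (made up for illustration, not real COVID-19 data)
cumulative_cases = [100 + 10 * t for t in range(30)]
best_d, holdout_errors = select_d(cumulative_cases, horizon=7)
print("selected d:", best_d)
print("holdout MAPE by d:", holdout_errors)
```

Running `select_d` again on series of different lengths, or on a different region's data, is a simple way to observe whether the selected order stays stable or changes as more observations arrive.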


ARIMA, COVID-19, forecasting, script, parameter selection

Short URL: https://sciup.org/147234292

IDR: 147234292   |   DOI: 10.14529/cmse210202

Research article