Predicting hotel booking cancellation: a comparative analysis of models
Authors: Rusakova E.I., Radionova M.V.
Journal: Perm University Herald. Economy (Вестник Пермского университета. Серия: Экономика)
Section: Mathematical, statistical and instrumental methods in economics
Published in: No. 4, Vol. 16, 2021.
Free access
Booking a hotel room is an integral part of any trip. In recent years, online travel agencies have therefore grown in popularity and demand: they save clients the time and effort of communicating with hotels and allow bookings to be cancelled without fines or charges. Hotel booking cancellations have been rising over the past several years, which adversely affects the financial position and reputation of hotels. To reduce the risks, hotels have to follow strict booking policies and overbooking strategies. The problem is particularly acute today because of the significant decrease in tourist flows caused by the coronavirus pandemic. It can be addressed by developing predictive models of hotel booking cancellation with a high degree of confidence and high prediction accuracy. An overview of existing solutions shows that the following machine learning methods give the best predictive results: Random Forest, neural networks, CatBoost, and XGBoost. The purpose of the research is therefore to develop several machine-learning-based predictive models of hotel booking cancellation and to compare them, in order to justify the choice of the best model, using such metrics as Accuracy, Precision, Recall, F-measure, and the area under the ROC curve. The data source for the research was the Hotel Booking Demand Dataset prepared by N. Antonio, A. de Almeida and L. Nunes and published on the ScienceDirect platform. The research found that the Random Forest model gives the best prediction of hotel booking cancellation. For example, this model correctly classifies 84.5% of the bookings in the test set, and 87.3% is the share of bookings that are actually cancelled and are labelled as cancelled by the classifier. Further research is expected to focus on improving the Random Forest model and other machine learning models by tuning additional hyperparameters that have not yet been accounted for.
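The comparison metrics named in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the labels and scores below are invented toy values (1 = cancelled, 0 = not cancelled), the 0.5 decision threshold is an assumption, and the helper names `classification_metrics` and `roc_auc` are hypothetical. The sketch only shows how Accuracy, Precision, Recall, the F-measure (here F1) and the area under the ROC curve are derived from a classifier's output.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, Precision, Recall and F1 for binary labels (1 = cancelled)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    accuracy = (tp + tn) / total if total else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve, computed as the probability that a random
    positive example scores higher than a random negative one (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (invented for illustration, not taken from the dataset).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]  # predicted cancellation probability
y_pred = [1 if s >= 0.5 else 0 for s in scores]     # assumed threshold of 0.5

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, auc)
```

In a model comparison of the kind described, the same functions would be applied to each candidate model's predictions on a common test set, so that the models are ranked on identical data.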
Keywords: CatBoost classification, XGBoost classification
Short URL: https://sciup.org/147246846
IDR: 147246846 | DOI: 10.17072/1994-9960-2021-4-327-345