Improving the quality of a recommender system algorithm using association analysis methods
Authors: Stubarev Igor Mikhailovich, Alsova Olga Konstantinovna
Journal: Problems of Informatics (Проблемы информатики)
Section: Theoretical and Systems Informatics
Published in issue: 2 (55), 2022.
Free access
As CRM systems evolve, demand is growing for auxiliary systems that implement data mining and machine learning methods and can extract useful knowledge from the large volumes of data accumulated in a CRM. This article presents the results of developing and studying a recommendation service algorithm for a CRM system that uses association analysis methods. The authors previously developed and implemented a baseline version of the algorithm, based on cluster analysis and collaborative filtering [1-2]. The new version additionally uses association analysis to form product (service) recommendations, which raised the accuracy of the recommender system (service) on the F2 metric from 67.98 % to 81.24 % on average, with an insignificant increase in recommendation time (2.47 ms on average). The baseline and modified versions of the algorithm were studied and compared on insurance company data provided by FB Consult.
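For reference, the F2 metric is commonly defined as the F-beta measure with beta = 2, which weights recall more heavily than precision. The sketch below assumes that standard definition, applied to one customer's recommended and relevant product sets; the function name `f_beta` and the product labels are illustrative and do not come from the article.

```python
def f_beta(recommended, relevant, beta=2.0):
    """F-beta score for one customer's recommendation list.

    recommended: products the system suggested.
    relevant: products that should have been suggested.
    beta > 1 weights recall above precision (beta = 2 gives F2).
    """
    recommended, relevant = set(recommended), set(relevant)
    hits = len(recommended & relevant)
    if hits == 0:
        # No correct recommendations: precision and recall are both zero.
        return 0.0
    precision = hits / len(recommended)
    recall = hits / len(relevant)
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical example: four products recommended, two of them relevant.
score = f_beta(["a", "b", "c", "d"], ["a", "b"])
```

With full recall and 50 % precision, F2 stays well above F1, which illustrates why extra recommendations cost less under this metric than missed ones.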
Keywords: recommender system (service), collaborative filtering, cluster analysis, association analysis, Apriori algorithm
Short URL: https://sciup.org/143179385
IDR: 143179385 | UDC: 004.89 | DOI: 10.24412/2073-0667-2022-2-17-26
Improving the quality of a recommender system algorithm using associative analysis methods
FB Consult specializes in the development, implementation, and support of full-featured CRM solutions for banks and for insurance, commercial, industrial, and pharmaceutical companies. A customer relationship management system (CRM system) is an information system designed to collect and process customer data. Data obtained from such a system can be used in a recommendation system that helps managers determine customer needs more accurately. Understanding the diverse insurance needs of the population and matching them with the products that insurance companies offer makes insurance more effective and insurance companies more successful. Earlier, FB Consult developed an analytical platform that includes services for recommendations and time series analysis. The objective of this study is to test the impact of affinity analysis on the F2-score of a recommendation algorithm based on collaborative filtering and cluster analysis. The article describes the developed algorithm, which consists of two stages. At the training stage, which takes a long time but is carried out only when customer data change significantly, a recommendation model is created. First, customers are divided into clusters based on metadata using the EM algorithm, and a list of the most popular products is generated for each cluster; this is needed to solve the cold start problem. Customers are also divided into clusters according to their shopping lists to speed up the collaborative filtering algorithm, since customers from another cluster will not be close to the customer for whom the recommendation is calculated. Finally, association rules are calculated using the Apriori algorithm.
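The rule-mining step of the training stage can be illustrated with a minimal, unoptimized Apriori sketch. It is a textbook version and not the authors' implementation; the thresholds `min_support` and `min_confidence`, the function names, and the insurance product labels in the toy data are assumptions made for illustration.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every frequent itemset."""
    baskets = [frozenset(t) for t in transactions]
    n = len(baskets)

    def support(itemset):
        return sum(itemset <= b for b in baskets) / n

    # Level 1: frequent single products.
    items = {item for b in baskets for item in b}
    level = {}
    for item in items:
        s = support(frozenset([item]))
        if s >= min_support:
            level[frozenset([item])] = s
    frequent = dict(level)

    k = 2
    while level:
        # Join: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a, b in combinations(list(level), 2)
                      if len(a | b) == k}
        # Prune: every (k-1)-subset of a candidate must be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in level
                             for s in combinations(c, k - 1))}
        level = {}
        for c in candidates:
            s = support(c)
            if s >= min_support:
                level[c] = s
        frequent.update(level)
        k += 1
    return frequent


def association_rules(frequent, min_confidence):
    """Return (antecedent, consequent, confidence) triples."""
    rules = []
    for itemset, sup in frequent.items():
        for r in range(1, len(itemset)):
            for ante in map(frozenset, combinations(itemset, r)):
                conf = sup / frequent[ante]
                if conf >= min_confidence:
                    rules.append((ante, itemset - ante, conf))
    return rules


# Hypothetical shopping lists of four customers.
purchases = [{"auto", "home"}, {"auto", "home", "life"},
             {"auto", "life"}, {"home"}]
frequent = apriori(purchases, min_support=0.5)
rules = association_rules(frequent, min_confidence=0.9)
```

On this toy data the only rule that survives the confidence threshold is "life implies auto": every customer who holds the hypothetical `life` product also holds `auto`.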
As a result, the model consists of a list of the most popular products for each cluster, a customer classifier by metadata, a customer classifier by shopping lists, lists of customers grouped by shopping-list cluster, and a list of the association rules found. The recommendation stage is run for each customer and therefore must be fast. If the customer has not purchased any products yet, the customer is classified by metadata and receives a recommendation from the list of popular products for that cluster. Otherwise, the customer is classified according to the shopping list; then, using collaborative filtering, the closest customers are found among the customers of the same cluster, and recommendations are formed on the basis of their purchases. In addition, if the customer's purchased products contain the antecedent (cause) of a previously found association rule, the rule's consequent (effect) is recommended along with the recommendations based on purchases of similar customers. The developed algorithm was tested and its effectiveness analyzed on data from an insurance company. The data cover 30 thousand customers and 21 types of products from 2010 to 2020. Testing showed that the proportion of correctly recommended products among the products that needed to be recommended increased, but so did the proportion of recommended products that were not expected in the recommendations (that is, products that had not been removed from the customer's list during testing). It should be taken into account that these could be products that should be recommended to customers but that they have not purchased yet. This article studied the impact of affinity analysis on the recommendation algorithm. The main result of this work is an improvement in the F2-score compared with the basic implementation of the recommendation algorithm. Affinity analysis can generate not only positive but also negative association rules.
In future work, the authors plan to investigate the use of such negative rules to reduce the likelihood of recommending products that appear in the consequent (effect) of these rules, thereby increasing the accuracy of the system.
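The recommendation stage described in this abstract (cold start by metadata cluster, otherwise nearest customers within the shopping-list cluster plus association rule consequents) can be sketched as follows. This is a simplified illustration and not the authors' code: the model layout, all dictionary keys, the use of Jaccard similarity, and the stand-in classifiers are assumptions.

```python
def jaccard(a, b):
    """Jaccard similarity of two product sets (assumed similarity measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(customer, model, k_neighbours=2, top_n=3):
    """Recommend products for one customer from a pre-trained model.

    The model layout and all key names here are hypothetical.
    """
    owned = customer["products"]
    if not owned:
        # Cold start: popular products of the customer's metadata cluster.
        cluster = model["classify_by_metadata"](customer["metadata"])
        return model["popular_by_cluster"][cluster][:top_n]

    # Collaborative filtering restricted to the shopping-list cluster.
    cluster = model["classify_by_basket"](owned)
    peers = model["baskets_by_cluster"][cluster]
    nearest = sorted(peers, key=lambda p: jaccard(owned, p),
                     reverse=True)[:k_neighbours]
    scores = {}
    for peer in nearest:
        for product in peer - owned:
            scores[product] = scores.get(product, 0) + 1
    ranked = sorted(scores, key=lambda p: (-scores[p], p))[:top_n]

    # Add consequents of rules whose antecedent the customer already owns.
    for antecedent, consequent in model["rules"]:
        if antecedent <= owned:
            for product in sorted(consequent - owned):
                if product not in ranked:
                    ranked.append(product)
    return ranked


# Toy model with stand-in classifiers and hypothetical product names.
model = {
    "classify_by_metadata": lambda m: "young" if m["age"] < 30 else "senior",
    "popular_by_cluster": {"young": ["travel", "auto"],
                           "senior": ["health", "home"]},
    "classify_by_basket": lambda basket: 0,
    "baskets_by_cluster": {0: [frozenset({"auto", "home"}),
                               frozenset({"auto", "life"}),
                               frozenset({"health"})]},
    "rules": [(frozenset({"auto"}), frozenset({"accident"}))],
}
new_customer = {"metadata": {"age": 25}, "products": frozenset()}
existing = {"metadata": {"age": 40}, "products": frozenset({"auto"})}
```

Calling `recommend(new_customer, model)` takes the cold-start branch, while `recommend(existing, model)` combines neighbour-based suggestions with the consequent of the matching rule.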
References
- Stubarev I. M., Belov A. I., Alsova O. K. Development of the analytical platform for CRM-system // Actual problems of electronic instrument engineering (APEIE 2018). Novosibirsk: NSTU, 2018. P. 546-551.
- Stubarev I. M., Alsova O. K. Recommendation service based on a CRM system: Certificate of state registration of a computer program No. 2019617387. 2019.
- Soh H., Sanner S., White M., Jamieson G. Deep sequential recommendation for personalized adaptive user interfaces // IUI ACM. 2017. P. 589-593.
- Yu W., He X., Qin Z., Chen X., Zhang H., Xiong L. Aesthetic-based clothing recommendation // Proceedings of the 2018 world wide web conference. 2018. P. 649-658.
- Liang D., Krishnan R. G., Hofman M. D., Jebara T. Variational autoencoders for collaborative filtering // Proceedings of the 2018 world wide web conference. 2018. P. 689-698.
- Lin W., Alvarez S. A., Ruiz C. Efficient adaptive-support association rule mining for recommender systems // Data Min. Knowl. Discov. 2002. P. 83-105.
- Lin W., Alvarez S. A., Ruiz C. Collaborative recommendation via adaptive association rule mining // Data Min. Knowl. Discov. 2000. P. 83-105.
- Sasaki Y. The truth of the F-measure // Teach Tutor Mater. 2007. P. 1-5.
- Agrawal R., Srikant R. Fast Discovery of Association Rules // Proc. of the 20th International Conference on VLDB. 1994.
- Agrawal R., Imielinski T., Swami A. Mining Associations between Sets of Items in Massive Databases // Proc. of the 1993 ACM-SIGMOD Intl Conf. on Management of Data. 1993. P. 207-216.
- Han J., Kamber M. Data mining: concepts and techniques // Burlington: Morgan Kaufmann Publishers. 2012.
- Bagui S., Dhar P. C. Positive and negative association rule mining in Hadoop's MapReduce environment // Journal of Big Data. 2019. Vol. 6. No. 1. P. 1-16.