Overview of the uncertainty estimation methods in offline reinforcement learning
Author: Nikulin A.P.
Journal: Труды Московского физико-технического института (Proceedings of Moscow Institute of Physics and Technology)
Section: Informatics and Control
Issue: No. 3 (67), Vol. 17, 2025.
Free access
Offline reinforcement learning involves training an agent exclusively on pre-collected trajectories without any further interaction with the environment, a setup that is especially advantageous for real-world applications. However, foregoing interactive exploration introduces a new class of challenges related to distributional shift: the behaviors encountered during deployment may differ substantially from those represented in the training dataset. This paper presents a survey of key uncertainty-estimation methods in offline RL designed to mitigate distributional shift. We distinguish between two fundamental types of uncertainty: epistemic uncertainty, which arises from limited data, and aleatoric uncertainty, which stems from the stochastic nature of the environment. Uncertainty-estimation techniques are categorized into three primary areas: uncertainty in modeling environment dynamics, uncertainty in value-function estimation, and uncertainty in policy optimization.
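To make the epistemic/aleatoric distinction concrete, the sketch below is an illustration under our own assumptions, not code from the surveyed paper: the GaussianDynamics module, the dimensions, and the coefficient values are all hypothetical. It shows a common ensemble-based recipe in which several probabilistic dynamics models are trained, aleatoric uncertainty is read from the average predicted variance, and epistemic uncertainty from the disagreement between the members' mean predictions.

```python
# Illustrative sketch (hypothetical, not from the paper): separating
# epistemic and aleatoric uncertainty with an ensemble of probabilistic
# dynamics models, following the usual law-of-total-variance decomposition.
import torch
import torch.nn as nn

class GaussianDynamics(nn.Module):
    """Predicts a Gaussian over the next state: mean and log-variance."""
    def __init__(self, state_dim: int, action_dim: int, hidden: int = 64):
        super().__init__()
        self.body = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden), nn.ReLU(),
        )
        self.mean_head = nn.Linear(hidden, state_dim)
        self.logvar_head = nn.Linear(hidden, state_dim)

    def forward(self, state, action):
        h = self.body(torch.cat([state, action], dim=-1))
        return self.mean_head(h), self.logvar_head(h)

state_dim, action_dim, n_ensemble = 4, 2, 5  # hypothetical sizes
ensemble = [GaussianDynamics(state_dim, action_dim) for _ in range(n_ensemble)]

state = torch.randn(32, state_dim)
action = torch.randn(32, action_dim)

means, logvars = zip(*[m(state, action) for m in ensemble])
means = torch.stack(means)              # (n_ensemble, batch, state_dim)
variances = torch.stack(logvars).exp()  # (n_ensemble, batch, state_dim)

# Aleatoric uncertainty: average predicted noise, i.e. the irreducible
# stochasticity of the environment itself.
aleatoric = variances.mean(dim=0)

# Epistemic uncertainty: disagreement between ensemble members, which
# shrinks as the dataset covers more of the state-action space.
epistemic = means.var(dim=0)
```

Model-based offline RL methods in the first category typically penalize the reward with a dynamics-uncertainty estimate of this kind, while value-based methods apply the analogous disagreement penalty to an ensemble of Q-functions instead.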
Keywords: offline reinforcement learning, uncertainty estimation, survey
Short address: https://sciup.org/142245838
IDR: 142245838 | UDC: 004.89