Overview of the uncertainty estimation methods in offline reinforcement learning

Free access

Offline reinforcement learning involves training an agent exclusively on pre-collected trajectories without any further interaction with the environment, a setup that is especially advantageous for real-world applications. However, foregoing interactive exploration introduces a new class of challenges related to distributional shift: the behaviors encountered during deployment may differ substantially from those represented in the training dataset. This paper presents a survey of key uncertainty-estimation methods in offline RL designed to mitigate distributional-shift issues. We distinguish between two fundamental types of uncertainty—epistemic uncertainty, which arises from limited data, and aleatoric uncertainty, which stems from the stochastic nature of the environment. Uncertainty-estimation techniques are categorized into three primary areas: uncertainty in modeling environment dynamics, uncertainty in value-function estimation, and uncertainty in policy optimization.
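To make the epistemic/aleatoric distinction concrete, below is a minimal illustrative sketch (not drawn from the surveyed methods themselves) of one common approach in the value-function category: a bootstrapped ensemble whose disagreement serves as an epistemic-uncertainty signal, combined with a pessimistic lower confidence bound. All names and numbers are hypothetical.

```python
import numpy as np

# Illustrative sketch: epistemic uncertainty via a bootstrapped value ensemble,
# used pessimistically through a lower confidence bound (LCB).

rng = np.random.default_rng(0)

# Toy offline dataset: state-action features -> observed returns.
# The 0.1 noise term plays the role of aleatoric (irreducible) noise.
n_samples, feat_dim = 200, 4
X = rng.normal(size=(n_samples, feat_dim))
true_w = rng.normal(size=feat_dim)
y = X @ true_w + 0.1 * rng.normal(size=n_samples)

def fit_linear(X, y):
    """Least-squares fit; stands in for training one value network."""
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w

# Bootstrapped ensemble: each member sees a different resample of the data,
# so members disagree most where the dataset gives little coverage
# (epistemic uncertainty), not where the environment is merely noisy.
ensemble = []
for _ in range(10):
    idx = rng.integers(0, n_samples, size=n_samples)
    ensemble.append(fit_linear(X[idx], y[idx]))

def value_lcb(features, beta=1.0):
    """Pessimistic value estimate: ensemble mean minus beta * ensemble std."""
    preds = np.array([features @ w for w in ensemble])
    return preds.mean(axis=0) - beta * preds.std(axis=0)

# A query far from the data yields larger disagreement, so the LCB
# penalizes it, discouraging out-of-distribution actions.
in_dist = X[:1]
out_of_dist = 5.0 * rng.normal(size=(1, feat_dim))
print("LCB in-distribution:     ", value_lcb(in_dist))
print("LCB out-of-distribution: ", value_lcb(out_of_dist))
```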


Offline reinforcement learning, uncertainty estimation, survey

Short address: https://sciup.org/142245838

IDR: 142245838   |   UDC: 004.89