Training long-term memory by predicting high uncertainty events

Authors: Sorokin A. Y., Pugachev L. P., Burtsev M. S.

Journal: Труды Московского физико-технического института (Proceedings of MIPT) @trudy-mipt

Section: Computer science and control

Published in issue: 4 (52), vol. 13, 2021.

Free access

In many environments, a reinforcement learning agent needs to remember relevant events from the distant past to solve a task. These important events might be observed thousands or even millions of time steps before a decision point. Unfortunately, a straightforward application of backpropagation to train such a long-term memory requires activations to be stored for every step of the forward computation, potentially over thousands or millions of steps. However, if the crucial moments at which memory is needed are known beforehand, these computational constraints can be avoided. We extend the neural architecture of an agent with a memory subnetwork trained to predict the outcome of critical decisions characterized by high uncertainty. This predictive memory architecture is tested in simple yet challenging T-Maze environments as well as in ViZDoom settings. The experiments demonstrate that our method learns faster and more stably than the baselines when given input sequences of similar length.
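The idea of training memory only at high-uncertainty decision points can be illustrated with a minimal sketch. The abstract does not state how uncertainty is measured, so the code below assumes it is the Shannon entropy of the policy's action distribution; the function names `entropy` and `high_uncertainty_steps` and the toy episode are hypothetical and are not taken from the paper.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a discrete action distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def high_uncertainty_steps(policy_probs, k):
    """Return the indices of the k most uncertain time steps, in chronological order.

    Assumption (not from the paper): uncertainty is the entropy of the
    policy's action distribution at each step.
    """
    scored = [(entropy(p), t) for t, p in enumerate(policy_probs)]
    # Highest entropy first; ties are broken by the earlier time step.
    top = sorted(scored, key=lambda item: (-item[0], item[1]))[:k]
    return sorted(t for _, t in top)


# Hypothetical T-Maze-like episode with four actions per step: the policy is
# confident along the corridor and nearly undecided at the final junction.
episode = [
    [0.97, 0.01, 0.01, 0.01],
    [0.90, 0.05, 0.03, 0.02],
    [0.97, 0.01, 0.01, 0.01],
    [0.02, 0.02, 0.48, 0.48],
]

selected = high_uncertainty_steps(episode, k=1)
print(selected)
```

In a setup like this, only the selected steps would serve as prediction targets for the memory subnetwork, so a training signal would not have to be backpropagated through every intermediate step of the episode.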


Keywords: reinforcement learning, deep learning, artificial neural networks, partially observable environments

Short URL: https://sciup.org/142231497

IDR: 142231497   |   DOI: 10.53815/20726759_2021_13_4_39

Research article