ABSTRACT
Reinforcement learning agents learn by exploring the environment and then exploiting what they have learned. This frees the human trainers from having to know the preferred action or intrinsic value of each encountered state. The cost of this freedom is that reinforcement learning is slower and less stable than supervised learning. We explore the possibility that ensemble methods can remedy these shortcomings, and do so by investigating a novel technique that harnesses the wisdom of the crowd by bagging Q-function approximator estimates.
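To make the idea concrete, the listing below is a minimal sketch of one way bagging could be applied to Q-function estimates; the BaggedQEnsemble class, the make_q factory, the member count, and the fit/predict interface are illustrative assumptions for this sketch, not the paper's implementation.

    import numpy as np

    class BaggedQEnsemble:
        """Sketch: an ensemble of Q-function approximators, each fit on a
        bootstrap resample of the training data, with estimates averaged.
        (Hypothetical interface; not the paper's exact method.)"""

        def __init__(self, make_q, n_members=5, rng=None):
            # make_q: assumed factory returning an estimator with fit/predict.
            self.members = [make_q() for _ in range(n_members)]
            self.rng = rng or np.random.default_rng()

        def fit(self, states, targets):
            n = len(states)
            for q in self.members:
                # Bootstrap: sample n transitions with replacement per member.
                idx = self.rng.integers(0, n, size=n)
                q.fit(states[idx], targets[idx])

        def q_values(self, state):
            # Aggregate by averaging member estimates ("wisdom of the crowd").
            return np.mean([q.predict(state) for q in self.members], axis=0)

        def act(self, state):
            # Greedy action with respect to the bagged Q-estimate.
            return int(np.argmax(self.q_values(state)))

Under these assumptions, the bootstrap resampling decorrelates the members' estimation errors, so averaging their Q-values reduces the variance of the combined estimate, which is the stabilizing effect the abstract appeals to.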
Our results show that the proposed approach improves performance on all three tasks and all reinforcement learning approaches attempted. We are able to demonstrate that this is a direct result of the increased stability.