
Reinforcement Learning
DarrenXf
SAC: Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor
Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor (arXiv:1801.01290). Paper: https://arxiv.org/abs/1801.01290. Personal translation, not authoritative. Authors: Tuomas Haarnoja, Aurick Zhou, Pieter Abbeel, Sergey Levine. Abstract: Model-free deep reinforcement learning (RL) algorithms ... Translation · 2021-05-22 22:00:51 · 1827 views · 0 comments
PPO: Proximal Policy Optimization Algorithms
Proximal Policy Optimization Algorithms. Paper: https://arxiv.org/abs/1707.06347. Personal translation, not authoritative. Authors: John Schulman, Filip Wolski, Prafulla Dhariwal, Alec Radford, Oleg Klimov (OpenAI), {joschu, filip, prafulla, alec, oleg}@openai.com. Abstract: We propose a new reinforcement learning policy gradient ... Translation · 2021-03-14 11:02:15 · 2131 views · 0 comments
DDPG: CONTINUOUS CONTROL WITH DEEP REINFORCEMENT LEARNING
CONTINUOUS CONTROL WITH DEEP REINFORCEMENT LEARNING. Paper: https://arxiv.org/abs/1509.02971. Personal translation, not authoritative. Authors: Timothy P. Lillicrap, Jonathan J. Hunt, Alexander Pritzel, Nicolas Heess, Tom Erez, Yuval Tassa, David Silver & Daan Wierstra (Google DeepMind, London, UK), {count ... Translation · 2021-02-28 17:18:27 · 1435 views · 0 comments
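DQN: Playing Atari with Deep Reinforcement Learning
Playing Atari with Deep Reinforcement Learning. Paper: https://www.cs.toronto.edu/~vmnih/docs/dqn.pdf. Personal translation, not authoritative. Abstract: We present the first deep learning model to successfully learn control policies directly from high-dimensional sensory input using reinforcement learning. The model is a convolutional neural network, trained with a variant of Q-learning, whose input is raw pixels and whose output is a value function estimating future rewards. We apply our method to seven Atari 2600 games from the Arcade Learning Environment ... Translation · 2021-02-23 23:23:09 · 1173 views · 0 comments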
Key Papers in Deep RL
key_papers_in_deep_rl: What follows is a list of papers in deep RL that are worth reading. This is far from comprehensive, but should provide a useful starting point for someone looking to do research ... Original · 2020-03-16 20:35:18 · 882 views · 0 comments