DQN Overview

Human-Level Control Through Deep Reinforcement Learning

[Publication] https://deepmind.com/research/publications/human-level-control-through-deep-reinforcement-learning/

[code] https://github.com/qfettes/DeepRL-Tutorials/blob/master/01.DQN.ipynb
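
A minimal sketch of DQN's temporal-difference target (toy values and function name are mine, not the notebook's implementation). DQN regresses the online network's Q(s, a) toward y = r + γ · max_a' Q_target(s', a'), where Q_target is a periodically synced copy of the online network:

```python
def dqn_target(reward, gamma, next_q_values, done):
    """TD target for one transition; next_q_values come from the target network."""
    if done:
        return reward
    return reward + gamma * max(next_q_values)

# Hypothetical transition: reward 1.0, discount 0.99, target-net Q-values for s'.
y = dqn_target(1.0, 0.99, [0.5, 2.0, 1.0], done=False)
print(y)  # 1.0 + 0.99 * 2.0 = 2.98
```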

Multi-Step Learning (from Reinforcement Learning: An Introduction, Chapter 7)

[Publication] http://incompleteideas.net/book/the-book-2nd.html

[code] https://github.com/qfettes/DeepRL-Tutorials/blob/master/02.NStep_DQN.ipynb
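
The n-step variant bootstraps after n rewards instead of one: G = Σ_{k=0}^{n-1} γ^k r_{t+k} + γ^n V(s_{t+n}). A minimal sketch (function name and toy numbers are mine):

```python
def n_step_return(rewards, gamma, bootstrap_value):
    """Backward accumulation of the n-step return; bootstrap_value estimates s_{t+n}."""
    g = bootstrap_value
    for r in reversed(rewards):
        g = r + gamma * g
    return g

# Hypothetical 3-step segment: rewards [1, 1, 1], gamma 0.5, bootstrap 4.0.
print(n_step_return([1.0, 1.0, 1.0], 0.5, 4.0))  # 1 + 0.5 + 0.25 + 0.125*4 = 2.25
```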

Deep Reinforcement Learning with Double Q-learning

[Publication] https://arxiv.org/abs/1509.06461

[code] https://github.com/qfettes/DeepRL-Tutorials/blob/master/03.Double_DQN.ipynb
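
Double DQN decouples action selection from evaluation to curb maximization bias: the online network picks argmax_a' Q_online(s', a'), and the target network scores that action. A minimal sketch (names and toy values are mine):

```python
def double_dqn_target(reward, gamma, online_next_q, target_next_q, done):
    """Select with the online net, evaluate with the target net."""
    if done:
        return reward
    a_star = max(range(len(online_next_q)), key=lambda a: online_next_q[a])
    return reward + gamma * target_next_q[a_star]

# The online net prefers action 1, but the target net rates it only 0.5,
# so the target is smaller than plain DQN's max over target_next_q would give.
print(double_dqn_target(0.0, 1.0, [1.0, 3.0], [2.0, 0.5], done=False))  # 0.5
```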

Dueling Network Architectures for Deep Reinforcement Learning

[Publication] https://arxiv.org/abs/1511.06581

[code] https://github.com/qfettes/DeepRL-Tutorials/blob/master/04.Dueling_DQN.ipynb
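
The dueling head splits Q into a state value V(s) and per-action advantages A(s, a), recombined with the mean-advantage baseline for identifiability: Q(s, a) = V(s) + A(s, a) − mean_a' A(s, a'). A minimal sketch (toy values are mine):

```python
def dueling_q(value, advantages):
    """Combine the value and advantage streams of a dueling head."""
    mean_adv = sum(advantages) / len(advantages)
    return [value + a - mean_adv for a in advantages]

# V(s) = 1.0, A(s, .) = [0, 2, 4]; subtracting the mean advantage (2) centers them.
print(dueling_q(1.0, [0.0, 2.0, 4.0]))  # [-1.0, 1.0, 3.0]
```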

Noisy Networks for Exploration

[Publication] https://arxiv.org/abs/1706.10295

[code] https://github.com/qfettes/DeepRL-Tutorials/blob/master/05.DQN-NoisyNets.ipynb
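
NoisyNets replaces epsilon-greedy exploration with learned parametric noise in the linear layers: each weight becomes w = μ + σ · ε, with μ and σ learned and ε resampled; the factorised variant shapes Gaussian samples with f(x) = sign(x)·√|x|. A minimal sketch (helper names are mine, not the notebook's):

```python
import math

def signed_sqrt(x):
    """The f(x) = sign(x) * sqrt(|x|) transform used for factorised noise."""
    return math.copysign(math.sqrt(abs(x)), x)

def noisy_weight(mu, sigma, eps):
    """Perturbed weight: learned mean mu, learned scale sigma, sampled noise eps."""
    return mu + sigma * eps

print(signed_sqrt(-9.0))            # -3.0
print(noisy_weight(1.0, 0.5, 2.0))  # 2.0
```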

Prioritized Experience Replay

[Publication] https://arxiv.org/abs/1511.05952

[code] https://github.com/qfettes/DeepRL-Tutorials/blob/master/06.DQN_PriorityReplay.ipynb
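
Prioritized replay samples transition i with probability p_i^α / Σ_k p_k^α, where the priority p_i is the absolute TD error plus a small ε so no transition has zero probability. A minimal sketch of the sampling distribution (function name and constants are mine):

```python
def per_probabilities(td_errors, alpha=0.6, eps=1e-6):
    """Sampling probabilities proportional to (|TD error| + eps) ** alpha."""
    priorities = [(abs(d) + eps) ** alpha for d in td_errors]
    total = sum(priorities)
    return [p / total for p in priorities]

# A transition with a large TD error is sampled far more often than a stale one.
probs = per_probabilities([1.0, 0.0])
print(probs[0] > probs[1])  # True
```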

A Distributional Perspective on Reinforcement Learning

[Publication] https://arxiv.org/abs/1707.06887

[code] https://github.com/qfettes/DeepRL-Tutorials/blob/master/07.Categorical-DQN.ipynb
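
C51 models the full return distribution as a categorical distribution over a fixed support of atoms z_i; greedy action selection still reduces each distribution to its expectation Q(s, a) = Σ_i z_i p_i. A minimal sketch of that reduction (the projection step of the distributional Bellman update is omitted; names and values are mine):

```python
def expected_q(support, probs):
    """Expected return of a categorical distribution over fixed atoms."""
    return sum(z * p for z, p in zip(support, probs))

# Hypothetical 3-atom distribution centered on a return of 1.
print(expected_q([0.0, 1.0, 2.0], [0.25, 0.5, 0.25]))  # 1.0
```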

Rainbow: Combining Improvements in Deep Reinforcement Learning

[Publication] https://arxiv.org/abs/1710.02298

[code] https://github.com/qfettes/DeepRL-Tutorials/blob/master/08.Rainbow.ipynb

Distributional Reinforcement Learning with Quantile Regression

[Publication] https://arxiv.org/abs/1710.10044

[code] https://github.com/qfettes/DeepRL-Tutorials/blob/master/09.QuantileRegression-DQN.ipynb
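
QR-DQN replaces C51's fixed support with N learned quantile estimates at the midpoints τ̂_i = (2i + 1) / (2N); the action value is the mean of the quantile estimates. A minimal sketch (function names are mine):

```python
def quantile_midpoints(n):
    """Quantile targets tau_hat_i = (2i + 1) / (2N) for i = 0..N-1."""
    return [(2 * i + 1) / (2 * n) for i in range(n)]

def quantile_q(thetas):
    """Action value as the mean of the learned quantile estimates."""
    return sum(thetas) / len(thetas)

print(quantile_midpoints(2))      # [0.25, 0.75]
print(quantile_q([1.0, 3.0]))     # 2.0
```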

Rainbow with Quantile Regression

[code] https://github.com/qfettes/DeepRL-Tutorials/blob/master/10.Quantile-Rainbow.ipynb

Deep Recurrent Q-Learning for Partially Observable MDPs

[Publication] https://arxiv.org/abs/1507.06527

[code] https://github.com/qfettes/DeepRL-Tutorials/blob/master/11.DRQN.ipynb

Advantage Actor Critic (A2C)

[Publication1] https://arxiv.org/abs/1602.01783

[Publication2] https://blog.openai.com/baselines-acktr-a2c/

[code] https://github.com/qfettes/DeepRL-Tutorials/blob/master/12.A2C.ipynb
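
A2C optimizes a combined objective: a policy-gradient term −log π(a|s) · A(s, a), a value-regression term, and an entropy bonus for exploration. A minimal sketch of the per-sample loss (coefficient values and names are mine, roughly following common defaults):

```python
def a2c_loss(log_prob, advantage, value, ret, entropy, vf_coef=0.5, ent_coef=0.01):
    """Policy-gradient loss + weighted value loss - weighted entropy bonus."""
    policy_loss = -log_prob * advantage
    value_loss = vf_coef * (ret - value) ** 2
    return policy_loss + value_loss - ent_coef * entropy

# Hypothetical sample: log pi = -1, advantage 2, value 0 vs return 1, entropy 0.5.
print(a2c_loss(-1.0, 2.0, 0.0, 1.0, 0.5))  # 2.0 + 0.5 - 0.005 = 2.495
```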

High-Dimensional Continuous Control Using Generalized Advantage Estimation

[Publication] https://arxiv.org/abs/1506.02438

[code] https://github.com/qfettes/DeepRL-Tutorials/blob/master/13.GAE.ipynb
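
GAE forms the advantage as an exponentially weighted sum of TD residuals δ_t = r_t + γV(s_{t+1}) − V(s_t), with λ trading bias against variance (λ = 0 gives one-step TD, λ = 1 gives Monte Carlo). A minimal backward-pass sketch (function name and toy values are mine):

```python
def gae(rewards, values, gamma, lam):
    """Generalized advantage estimates; values has len(rewards) + 1 entries
    so values[-1] bootstraps the final state."""
    advantages, g = [], 0.0
    for t in reversed(range(len(rewards))):
        delta = rewards[t] + gamma * values[t + 1] - values[t]
        g = delta + gamma * lam * g
        advantages.append(g)
    return advantages[::-1]

# With lam = 0 each advantage collapses to its one-step TD residual.
print(gae([1.0, 1.0], [0.0, 0.0, 0.0], gamma=1.0, lam=0.0))  # [1.0, 1.0]
```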

Proximal Policy Optimization Algorithms

[Publication] https://arxiv.org/abs/1707.06347

[code] https://github.com/qfettes/DeepRL-Tutorials/blob/master/14.PPO.ipynb
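
PPO's clipped surrogate objective takes the pessimistic minimum of the unclipped importance-weighted advantage and a version with the probability ratio clipped to [1 − ε, 1 + ε], preventing destructively large policy updates. A minimal per-sample sketch (function name and values are mine):

```python
def ppo_clip_objective(ratio, advantage, clip=0.2):
    """min(r * A, clip(r, 1 - eps, 1 + eps) * A) for one sample."""
    unclipped = ratio * advantage
    clipped = max(min(ratio, 1.0 + clip), 1.0 - clip) * advantage
    return min(unclipped, clipped)

# A ratio of 1.5 is clipped to 1.2, capping the objective for this sample.
print(ppo_clip_objective(1.5, 1.0))  # 1.2
```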
