Reinforcement Learning
Charel_CHEN
Articles in this column
- OpenAI Gym official website examples
  Tags: openAI_gym. Creating an environment, rendering it, and choosing random actions. Of course, this is only one of gym's games; there are others, such as MountainCar-v0, MsPacman-v0 (requires the Atari dependency), or Hopper-v1 (requires the MuJoCo dependencies). Environments all descend fr…
  Original · 2017-11-23 08:46:23 · 7492 views · 0 comments
- FeUdal Networks for Hierarchical Reinforcement Learning: reading notes
  Tags: paper notes, reinforcement learning algorithms. Contents: Abstract, Introduction, Model, Learning, Transition Policy Gradients, Arch…
  Original · 2017-11-23 09:44:36 · 5368 views · 2 comments
- DARLA: Improving Zero-Shot Transfer in Reinforcement Learning: reading notes
  Tags: paper notes, reinforcement learning algorithms. Contents: purpose and significance; training domain and application domain (source domain and target domain); algorithm details. This paper mainly discusses…
  Original · 2017-11-23 10:11:28 · 1705 views · 0 comments
- DARLA source code walkthrough
  Tags: reinforcement learning algorithms, source code. '''Implementation of DARLA preprocessing, as found in DARLA: Improving Zero-Shot Transfer in Reinforcement Learning by Higgins and Pal et al (https://arxiv.org/pdf/1707.08…
  Original · 2017-11-23 11:07:44 · 463 views · 0 comments
- Dynamic Programming - reinforcement learning
  Tags: Reinforcement Learning: An Introduction. Contents: Dynamic Programming, Policy Evaluation, Policy Improvement, Policy Iteration, Value Iteration, Asynchronous Dynamic P…
  Original · 2017-11-21 23:37:22 · 756 views · 0 comments
- A2C Advantage Actor-Critic source code
  A2C Advantage Actor-Critic (discrete space). Tags: reinforcement learning algorithms, source code. import numpy as np; import tensorflow as tf; import gym; np.random.seed(2); tf.set_random_seed(2) # reproducible; # Superparameters; OUTPUT_GRAPH = False #…
  Original · 2017-11-23 13:49:18 · 4605 views · 0 comments
- Asynchronous Methods for Deep Reinforcement Learning: reading notes
  Tags: reinforcement learning algorithms, paper notes. The contribution of this paper is an asynchronous learning scheme, applied to algorithms such as A2C and Q-learning. The authors propose asynchronous training (Asynchronous Methods) and apply it to a range of reinforcement learning algorithms (Sarsa, one-step Q-learning, n-ste…
  Original · 2017-11-23 16:49:30 · 3068 views · 0 comments