fo-in-CSDN Blog

Contents: Heuristic Search; Rollout Algorithm; Monte Carlo Tree Search; References. Heuristic Search: Decision-time planning methods, collectively known as heuristic search, are classical state-space planning methods in AI. The approximate value function is applied

2021-08-02 14:56:43 258

[Original] Planning and Learning with Tabular Methods: Part 2

Contents: Expected vs. Sample Updates; Is computation better devoted to a few expected updates or to b times as many sample updates?; Trajectory Sampling; Real-time Dynamic Programming; Planning at Decision Time. Expected vs. Sample Updates: Sample updates can in many cases

2021-07-24 16:14:25 296

[Original] Planning and Learning with Tabular Methods: Part 1

Contents: Models and Planning; Planning and learning methods; Random-sample one-step tabular Q-planning; Dyna: Integrated Planning, Acting, and Learning; The reason to put forward Dyna-Q; Model learning and direct RL; Dyna-Q. Methods of reinforcement learning fall into t

2021-07-18 17:12:52 627

[Original] n-step Bootstrapping: Part 1

Contents: Level 1 heading; Level 2 heading; Level 3 heading; Level 1 heading; Level 2 heading; Level 3 heading

2021-07-17 19:20:33 254

[Repost] How to fix git clone errors of the form fatal: unable to access ‘https://github ×××

Change http or https in the command line to git and run the command again, as shown in the figure below:

2021-06-15 20:23:33 3106

[Original] Windy Gridworld: A simple implementation of Tabular RL (Sarsa and Q-learning)

Contents: Description of the problem. Description of the problem: The windy gridworld is a simple example in the textbook. Reproducing and solving this problem in code is how I began to really understand Sarsa and Q-learning. Now I introduce it to you, and

2021-04-27 21:23:19 1741 2

[Original] Q-learning, Expected Sarsa, Double Learning

Contents: Level 1 heading; Level 2 heading; Level 3 heading; Level 1 heading; Level 2 heading; Level 3 heading

2021-04-21 23:58:49 874

[Original] What to do when Google Colab disconnects automatically?

What to do

2021-04-14 23:58:11 20335 7

[Original] Sarsa: One of the classical algorithms of RL

Sarsa: One of the classical algorithms of RL. What is TD learning?; On-policy and Off-policy; A brief introduction to Sarsa; A simple implementation. What is TD learning? “TD learning” means “temporal-difference learning”, which is a combination of Monte Carlo ideas (MC

2021-04-10 21:53:18 2306 13

[Original] Getting Started: The Opening Post

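Getting Started: The Opening Post. Contents: Preface; Topics; Essays; PTA Exercises; Reinforcement Learning; Function shortcuts; Creating headings properly helps generate the table of contents; How to change text styles; Inserting links and images; How to insert a nice code snippet; Generating a list that suits you; Creating a table; Setting content to center, left, or right alignment; SmartyPants; Creating a custom list; How to create a footnote; Comments are also essential; KaTeX math formulas; The new Gantt chart feature to enrich your articles; UML diagrams; FLowchart flow diagrams; Export and import; Export; Import. Preface: Hello, friend! Whatever led you to open this post, I am honored. As I write this, I am a third-year undergraduate student, communications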

2021-04-10 21:37:46 210 1

WZX_Hello's Blog

[Original] Enabling the microphone in the computer's settings

[Original] How to fix the git clone error “××× Could not resolve host:github.com”

[Original] Gram-Schmidt Orthogonalization

[Original] A few things about orthogonal matrices

[Original] DOA Algorithm 3: Matrix Pencil

[Original] Singular Value Decomposition of a Matrix

[Original] DOA Algorithm 2: The ESPRIT Algorithm

[Original] DOA Algorithm 1: The MUSIC Algorithm (Part 2)

[Original] DOA Algorithm 1: The MUSIC Algorithm (Part 1)

[Original] Planning and Learning with Tabular Methods: Part 3