Markov Decision


Author: leejieleejie


Markov Decision Processes, Reinforcement Learning, Deep Reinforcement Learning, and Q-learning: An Easy-to-Follow Introduction

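**Markov property:** also called memorylessness. The next state depends only on the current state, not on any earlier states.

**Markov process:** a kind of stochastic process. It can be understood as the process of moving from state to state when the Markov property holds. It consists only of states and state transition probabilities; no actions are involved.

**Markov decision process:** a Markov process that also takes an action policy into account. The system's next state depends not only on the current state but also on the action taken in that state.

**Reinforcement learning:** learning driven by the rewards and penalties the environment hands back, so the corresponding Markov decision process also includes reward values.

For the differences among these, see: https://zhuanlan.zhihu.c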
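The difference between a Markov process and a Markov decision process can be shown with a small sketch. Everything in it is an illustrative assumption rather than something from the original post: the state names, the actions, the probabilities, the rewards, and the helper names `sample`, `step_chain`, and `step_mdp`. The chain is indexed by state alone, while the decision process is indexed by a (state, action) pair and additionally returns a reward.

```python
import random

# Markov process: only states and transition probabilities, no actions.
# P[s][s2] is the probability of moving from s to s2 (made-up numbers).
P = {
    "sunny": {"sunny": 0.8, "rainy": 0.2},
    "rainy": {"sunny": 0.4, "rainy": 0.6},
}

# Markov decision process: the next-state distribution depends on the
# (state, action) pair, and each pair also carries a reward (made-up numbers).
T = {
    ("low", "wait"): {"low": 1.0},
    ("low", "charge"): {"high": 0.9, "low": 0.1},
    ("high", "wait"): {"high": 1.0},
    ("high", "work"): {"high": 0.7, "low": 0.3},
}
R = {
    ("low", "wait"): 0.0,
    ("low", "charge"): -1.0,
    ("high", "wait"): 0.0,
    ("high", "work"): 2.0,
}


def sample(dist, rng):
    """Draw one outcome from a {outcome: probability} dictionary."""
    outcomes = list(dist)
    weights = [dist[o] for o in outcomes]
    return rng.choices(outcomes, weights=weights, k=1)[0]


def step_chain(state, rng):
    """Markov process step: the next state depends on the current state only."""
    return sample(P[state], rng)


def step_mdp(state, action, rng):
    """MDP step: the next state depends on state and action; a reward is returned."""
    return sample(T[(state, action)], rng), R[(state, action)]


if __name__ == "__main__":
    rng = random.Random(0)
    print(step_chain("sunny", rng))
    print(step_mdp("high", "work", rng))
```

Neither step function looks at any history beyond the current state, which is the Markov property in code form. A reinforcement learning agent would interact with something like `step_mdp` and use the returned rewards to improve its choice of action.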

Original · 2020-09-04 17:13:06 · 1554 views · 0 comments