Machine Learning - Lecture 16

Reinforcement Learning (R.L.)

① MDPs (Markov Decision Processes)

② Value Functions

③ Value Iteration

④ Policy Iteration

(both ③ and ④ are algorithms for solving R.L. problems)

 

Supervised Learning: we have the training set in which we were given sort of the right answer of every training example and it was the just a drop of the learning algorithms to replicate more of the right answers.

Unsupervised Learning: we had just a bunch of unlabeled data just the x's and it was the job in the learning alogrithm to discover so-called structure in the data and several algorithms like cluster analysis K-means, a mixture of all the sort PCA, ICA and so on.

 

Today we just talk about a different class of learning algorithms between supervised and unsupervised — R.L.

there's a helicopter experiment performed by Andrew Ng at Stanford University(you could see the video and the details of that experiment on the Internet), which is a unmanned helicopter controlld by R.L. algorithms.

It's different from Supervised Learning, because usually we actually do not konw

转载于:https://www.cnblogs.com/dimsumboy/p/6166044.html

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值