PRML Reading Notes: Chapter 1

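Reinforcement learning is about choosing actions in a given environment so as to maximize a reward. It involves balancing exploration of actions whose effects are unknown against exploitation of actions already known to yield a high reward. Linear models are fit to data by minimizing an error function, which determines the coefficients. The root-mean-square (RMS) error measures the misfit between the fitted function and the training data points.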


Reinforcement learning

Finding suitable actions to take in a given situation in order to maximize a reward.
A general feature of reinforcement learning is the trade-off between exploration, in which the system tries out new kinds of actions to see how effective they are, and exploitation, in which the system makes use of actions that are known to yield a high reward.
Too strong a focus on either exploration or exploitation will yield poor results.
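The trade-off can be illustrated with a small sketch. The example below is not from PRML: it assumes a hypothetical multi-armed bandit with Gaussian rewards and uses an epsilon-greedy rule, where `epsilon` is the probability of exploring a random action instead of exploiting the best estimate so far. The function name and parameters are made up for illustration.

```python
import random


def epsilon_greedy_bandit(true_means, epsilon, steps, seed=0):
    """Run an epsilon-greedy agent on a Gaussian multi-armed bandit.

    true_means: hypothetical mean reward of each action (unknown to the agent).
    epsilon:    probability of exploring a uniformly random action.
    Returns (estimated value of each action, average reward per step).
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms
    estimates = [0.0] * n_arms
    total_reward = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            # Exploration: try an action regardless of its current estimate.
            arm = rng.randrange(n_arms)
        else:
            # Exploitation: pick the action with the highest estimated reward.
            arm = max(range(n_arms), key=lambda a: estimates[a])
        reward = rng.gauss(true_means[arm], 1.0)
        counts[arm] += 1
        # Incremental update of the sample mean for the chosen action.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward
    return estimates, total_reward / steps


means = [0.0, 1.0, 2.0]
_, avg_balanced = epsilon_greedy_bandit(means, epsilon=0.1, steps=5000)
_, avg_explore_only = epsilon_greedy_bandit(means, epsilon=1.0, steps=5000)
print("mostly exploiting:", avg_balanced)
print("always exploring: ", avg_explore_only)
```

Setting `epsilon=1.0` never uses what has been learned, while `epsilon=0.0` may lock onto whichever action happened to look good first; a small positive value is one simple compromise.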

Linear models

Functions, such as the polynomial, that are linear in the unknown parameters have important properties and are called linear models.
For instance:

$$y(x, W) = w_0 + w_1 x + w_2 x^2 + w_3 x^3 + \dots + w_m x^m$$
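"Linear" here refers to the coefficients W, not to x. A minimal sketch of this point, with `poly` as an illustrative helper name (not from the book): evaluating the polynomial at a fixed x commutes with linear combinations of coefficient vectors, but not with scaling x.

```python
def poly(x, w):
    """Evaluate y(x, W) = w0 + w1*x + ... + wm*x**m for coefficients w."""
    return sum(w_j * x**j for j, w_j in enumerate(w))


w_a = [1, 2, 3]
w_b = [0, -1, 4]
x = 2

# Linear in the parameters: y(x, 2*w_a + 3*w_b) == 2*y(x, w_a) + 3*y(x, w_b).
w_mix = [2 * a + 3 * b for a, b in zip(w_a, w_b)]
print(poly(x, w_mix) == 2 * poly(x, w_a) + 3 * poly(x, w_b))

# Not linear in the input: y(2*x, w_a) generally differs from 2*y(x, w_a).
print(poly(2 * x, w_a) == 2 * poly(x, w_a))
```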

Error function

The values of the coefficients are determined by fitting the polynomial to the training data. This can be done by minimizing an error function that measures the misfit between the function y(x, W), for any given value of W, and the training set data points.
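As a rough sketch of such a fit, the code below assumes the sum-of-squares error used in PRML, E(W) = 1/2 Σ (y(x_n, W) - t_n)^2, and minimizes it by solving the normal equations with plain Gaussian elimination. The helper names (`fit_polynomial`, `solve_linear_system`, `sum_of_squares_error`) are illustrative; for real work a numerical library would be a better choice, since the normal equations become ill-conditioned for high polynomial orders.

```python
def solve_linear_system(a, b):
    """Solve a @ w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b_i] for row, b_i in zip(a, b)]
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution on the upper-triangular system.
    w = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (aug[i][n] - s) / aug[i][i]
    return w


def fit_polynomial(xs, ts, m):
    """Return the order-m coefficients minimizing the sum-of-squares error."""
    rows = [[x**j for j in range(m + 1)] for x in xs]
    # Normal equations: (A^T A) w = A^T t, where A holds powers of x.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(m + 1)]
           for i in range(m + 1)]
    att = [sum(r[i] * t for r, t in zip(rows, ts)) for i in range(m + 1)]
    return solve_linear_system(ata, att)


def sum_of_squares_error(xs, ts, w):
    """E(W) = 1/2 * sum over n of (y(x_n, W) - t_n)**2."""
    def y(x):
        return sum(w_j * x**j for j, w_j in enumerate(w))
    return 0.5 * sum((y(x) - t) ** 2 for x, t in zip(xs, ts))


# Toy data generated from a known quadratic, so the fit should recover it.
xs = [0.0, 0.25, 0.5, 0.75, 1.0]
ts = [1 + 2 * x - 3 * x**2 for x in xs]
w = fit_polynomial(xs, ts, m=2)
print(w, sum_of_squares_error(xs, ts, w))
```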
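Root-mean-square error

The root-mean-square (RMS) error is defined in PRML (Eq. 1.3) by

$$E_{RMS} = \sqrt{2E(W^{\star})/N}$$

where E(W) is the sum-of-squares error (one half of the sum of squared differences between y(x_n, W) and the targets t_n), W^⋆ is the coefficient vector that minimizes it, and N is the number of training points.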
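A minimal sketch of computing the RMS error, assuming the sum-of-squares error E(W) = 1/2 Σ (y(x_n, W) - t_n)^2 from PRML. Dividing by the number of points N lets data sets of different sizes be compared, and the square root puts the result on the same scale as the targets. The function name `rms_error` and the toy data are illustrative.

```python
import math


def rms_error(xs, ts, w):
    """Return sqrt(2 * E(W) / N) for polynomial coefficients w."""
    def y(x):
        return sum(w_j * x**j for j, w_j in enumerate(w))
    # Sum-of-squares error E(W), including the conventional factor 1/2.
    e = 0.5 * sum((y(x) - t) ** 2 for x, t in zip(xs, ts))
    return math.sqrt(2 * e / len(xs))


xs = [0, 1, 2]
ts = [1, 3, 5]                      # generated by t = 1 + 2x
print(rms_error(xs, ts, [1, 2]))    # coefficients that fit the data exactly
print(rms_error(xs, ts, [0, 2]))    # every prediction is off by one
```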
