A Classic Explanation of Q-Learning

This post uses a worked example to explain the basic principles and implementation steps of Q-learning, including how the Q-table is initialized and updated, and how the Bellman equation is applied in reinforcement learning.


This is a repost of a classic article:

Diving deeper into Reinforcement Learning with Q-Learning

1、Q-learning

Step 1: We initialize our Q-table

(Figure: the initialized Q-table)
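Below is a minimal sketch of this step in Python. The number of states and the action set are assumptions for illustration (the original figures are not reproduced here); the common choice is to start every Q-value at zero.

```python
import numpy as np

# Hypothetical dimensions: a small grid world with 6 states and 4 actions.
# These numbers are assumptions; the key point is that all Q-values start at 0.
n_states = 6
actions = ["left", "right", "up", "down"]

# One row per state, one column per action, everything initialized to zero
Q = np.zeros((n_states, len(actions)))
print(Q)
```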

Step 2: Choose an action
From the starting position, you can choose between going right or down. Because our epsilon rate is still large (we don’t know anything about the environment yet), we choose randomly. For example… move right.

(Figure: we move at random; for instance, right)
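A minimal sketch of this epsilon-greedy choice, reusing the Q-table shape from the previous snippet (the state and epsilon values are assumptions):

```python
import random
import numpy as np

# The Q-table from the previous sketch (6 hypothetical states, 4 actions)
Q = np.zeros((6, 4))

def choose_action(Q, state, epsilon):
    """Epsilon-greedy: explore with probability epsilon, otherwise exploit."""
    if random.random() < epsilon:
        return random.randrange(Q.shape[1])  # explore: pick a random action
    return int(np.argmax(Q[state]))          # exploit: pick the best-known action

# Early in training epsilon is high (1.0 here), so the move is random
action = choose_action(Q, state=0, epsilon=1.0)
```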

We found a piece of cheese (+1), so we can now update the Q-value for being at the start and going right. We do this by using the Bellman equation.
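Written out, the Q-learning form of this update is the following, where α is the learning rate and γ the discount factor (standard notation; the symbols are not named in the text above):

$$Q(s, a) \leftarrow Q(s, a) + \alpha \left[ R(s, a) + \gamma \max_{a'} Q(s', a') - Q(s, a) \right]$$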

Steps 4–5: Update the Q-function

  • First, we calculate the change in Q-value, ΔQ(start, right)
  • Then we add ΔQ(start, right), multiplied by a learning rate, to the initial Q-value (see the sketch after this list)
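A sketch of these two steps in Python, with assumed values for the learning rate (alpha) and discount factor (gamma):

```python
import numpy as np

def update_q(Q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """One Bellman update; alpha and gamma here are assumed example values."""
    # Step 1: the change in Q-value (the temporal-difference error)
    delta = reward + gamma * np.max(Q[next_state]) - Q[state, action]
    # Step 2: add the change, scaled by the learning rate, to the old value
    Q[state, action] += alpha * delta
    return Q
```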

Think of the learning rate as a measure of how quickly the agent abandons its former value estimate for the new one. If the learning rate is 1, the new estimate simply becomes the new Q-value.
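As a quick worked example with assumed values α = 0.1 and γ = 0.9: before finding the cheese, Q(start, right) = 0. The cheese gives R = +1, and every Q-value in the next state is still 0, so ΔQ(start, right) = 1 + 0.9 × 0 − 0 = 1, and the new value is 0 + 0.1 × 1 = 0.1. With a learning rate of 1, the value would instead have jumped straight to 1.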

(Figure: the updated Q-table)

Good! We’ve just updated our first Q-value. Now we need to do that again and again until learning is stopped.
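Here is a runnable sketch of that loop on a toy one-dimensional corridor (states 0 to 5, cheese at state 5). The environment, episode count, and epsilon decay schedule are all assumptions for illustration; the original article’s maze is not reproduced here.

```python
import random
import numpy as np

# Toy environment (an assumption): a corridor of 6 states with the
# cheese (+1) at state 5. Actions: 0 = left, 1 = right.
n_states, n_actions = 6, 2
Q = np.zeros((n_states, n_actions))

def step(state, action):
    next_state = min(n_states - 1, max(0, state + (1 if action == 1 else -1)))
    reward = 1.0 if next_state == n_states - 1 else 0.0
    done = next_state == n_states - 1
    return next_state, reward, done

alpha, gamma, epsilon = 0.1, 0.9, 1.0
for episode in range(500):
    state, done = 0, False
    while not done:
        # Epsilon-greedy choice, then one Bellman update per move
        if random.random() < epsilon:
            action = random.randrange(n_actions)
        else:
            action = int(np.argmax(Q[state]))
        next_state, reward, done = step(state, action)
        Q[state, action] += alpha * (
            reward + gamma * np.max(Q[next_state]) - Q[state, action]
        )
        state = next_state
    epsilon = max(0.05, epsilon * 0.99)  # slowly shift from exploring to exploiting

print(Q)  # after training, "right" (column 1) should dominate in every state
```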

2、Explanation of the Bellman Equation

Markov Decision Processes: the Bellman Equation - Zhihu
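For reference, one standard form of the Bellman equation (the optimality equation for action values; the linked article may use different notation) is:

$$Q^{*}(s, a) = R(s, a) + \gamma \sum_{s'} P(s' \mid s, a) \max_{a'} Q^{*}(s', a')$$

The update rule in section 1 is a sample-based approximation of this equation: instead of summing over the transition probabilities P, it nudges Q(s, a) toward the one-step target observed from a single transition.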
