References
Main text
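A good way to understand reinforcement learning is to consider some of the examples and possible applications that have guided its development.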
A master chess player makes a move. The choice is informed both by planning (anticipating possible replies and counterreplies) and by immediate, intuitive judgments of the desirability of particular positions and moves.
- Deciding which move to make in a game of chess.
An adaptive controller adjusts parameters of a petroleum refinery’s operation in real time. The controller optimizes the yield/cost/quality trade-off on the basis of specified marginal costs without sticking strictly to the set points originally suggested by engineers.
- An adaptive controller adjusts a petroleum refinery's operating parameters in real time.
A mobile robot decides whether it should enter a new room in search