How to learn reinforcement learning (answered by Sergio Valcarcel Macua on Quora)

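This post recommends core books and papers on reinforcement learning, covering topics from the basics to the research frontier, such as information representation, inverse reinforcement learning, algorithms, and large-scale applications. It highlights several practical books that are well suited to real-world problems.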

link:

https://www.quora.com/What-are-the-best-books-about-reinforcement-learning

 

The main RL problems are related to:
- Information representation: from POMDPs to predictive state representations to deep learning to TD-networks
- Inverse RL: how to learn the reward?
- Algorithms
  + Off-policy
  + Large scale: linear and nonlinear approximations of the value function
  + Policy search vs. Q-learning based
- Beyond MDP
  + Policy search for black-box optimization with global performance guarantees
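
To make the "off-policy" and "Q-learning based" items above concrete, here is a minimal sketch of tabular Q-learning. It is not taken from the original answer: the chain environment, the function name `q_learning_chain`, and all parameter values are assumptions chosen only for illustration. The behaviour policy is uniformly random, while the update bootstraps from the greedy action, which is what makes the method off-policy.

```python
import random


def q_learning_chain(n_states=5, episodes=500, alpha=0.5, gamma=0.9, seed=0):
    """Tabular Q-learning on a hypothetical deterministic chain.

    States are 0..n_states-1; action 0 moves left, action 1 moves right.
    Entering the last state yields reward 1 and ends the episode.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]
    goal = n_states - 1
    for _ in range(episodes):
        s = 0
        while s != goal:
            # Behaviour policy: uniformly random, independent of q.
            a = rng.randrange(2)
            s_next = max(0, s - 1) if a == 0 else s + 1
            r = 1.0 if s_next == goal else 0.0
            # Off-policy target: bootstrap from the greedy action in s_next.
            # The terminal state has value 0.
            target = r if s_next == goal else r + gamma * max(q[s_next])
            q[s][a] += alpha * (target - q[s][a])
            s = s_next
    return q


q = q_learning_chain()
# Greedy policy for the non-terminal states, read off the learned table.
greedy = [max((0, 1), key=lambda a: row[a]) for row in q[:-1]]
```

Even though the agent only ever acts at random, the greedy policy read from `q` should move right in every state of this toy chain. A policy search method would instead parameterize the policy directly and optimize its return without learning `q`.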

 

Recommended books:

* Algorithms for Reinforcement Learning: Csaba Szepesvári. A nice compendium of ready-to-implement algorithms.

* Reinforcement Learning and Dynamic Programming using Function Approximators: Lucian Busoniu, Robert Babuska, Bart De Schutter, Damien Ernst (2010). This is a very practical book that explains some state-of-the-art algorithms (i.e., ones useful for real-world problems) such as fitted Q-iteration and its variations.

* Reinforcement Learning: State-of-the-Art. Vol. 12 of Adaptation, Learning, and Optimization. Wiering, M., van Otterlo, M. (Eds.), 2012. Springer, Berlin. In Sutton's words: "This book is a valuable resource for students wanting to go beyond the older textbooks and for researchers wanting to easily catch up with recent developments".

* Optimal Adaptive Control and Differential Games by Reinforcement Learning Principles: Draguna Vrabie, Kyriakos G. Vamvoudakis, Frank L. Lewis. I am not familiar with this one, but I have seen it recommended.

* Markov Decision Processes in Artificial Intelligence: O. Sigaud and O. Buffet (Eds.), ISTE Ltd. and John Wiley & Sons Inc., 2010.

There are also several good specialized monographs and surveys on the topic. Some of these are:

+ "From Bandits to Monte-Carlo Tree Search: The Optimistic Principle Applied to Optimization and Planning" by Rémi Munos (Foundations and Trends in Machine Learning). This monograph covers important optimistic methods for nonconvex optimization that can be applied to policy search.

+ "Reinforcement Learning in Robotics: A Survey" by J. Kober, J. A. Bagnell and J. Peters. 

+ "A Tutorial on Linear Function Approximators for Dynamic Programming and Reinforcement Learning" by A. Geramifard, T. J. Walsh, S. Tellex, G. Chowdhary, N. Roy and J. P. How (Foundations and Trends in Machine Learning).

+ "A Survey on Policy Search for Robotics" by M. P. Deisenroth, G. Neumann and J. Peters (Foundations and Trends in Robotics).

Reposted from: https://www.cnblogs.com/cxxszz/p/6959594.html
