
MADRL
Coop_Multi-Agent_DRL
The whole world can be modeled as multi-agent
Articles in this column
Paper Reading Weird Words
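Contents: heuristically; aka. heuristically: without valid theoretical groundings. A heuristic algorithm is one built from intuition or experience that, at an acceptable cost (in computing time and space), gives a feasible solution for every instance of the combinatorial optimization problem at hand; how far that feasible solution deviates from the optimal one generally cannot be predicted. aka.: also known as ...
Original · 2020-05-07 21:38:49 · 480 views · 0 comments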
一些术语
Contents: decentralised execution; suboptimal policies (add a chance of free exploration to avoid local optima). decentralised execution: each agent can select its action based only on its own factor. suboptimal policies (add a chance of free exploration to avoid local optima): Single ag...
Original · 2020-05-07 21:38:26 · 340 views · 0 comments
Cooperative Deep MARL
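Contents: Abstract. Abstract: The world is one large-scale multi-agent system, and cooperation among large numbers of agents is the right path toward AGI.
Original · 2020-04-29 11:30:33 · 283 views · 0 comments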
Partial Observable State-of-art
Contents: Inducing Cooperation via Team Regret Minimization based Multi-Agent Deep Reinforcement Learning. Inducing Cooperation via Team Regret Minimization based Multi-Agent Deep Reinforcement Learning: paper link...
Original · 2020-01-21 14:06:59 · 178 views · 0 comments
Q_TRAN
Value function factorization methods have been proposed to efficiently handle a joint action-value function whose complexity grows exponentially with the number of agents. the additivity and mono...
Original · 2020-01-02 12:05:35 · 329 views · 0 comments
QMIX
Contents: constraint. We can learn a fully centralised state-action value function Q_tot and then use it to guide the optimisation of decentralised policies in an actor-critic framework. QMIX consists of agen...
Original · 2019-12-27 19:32:43 · 2113 views · 0 comments
QMIX_paperReview
Contents: decentralised policies. decentralised policies: Decentralised policies also naturally attenuate the problem that joint action spaces grow exponentially with the number of agents, often rendering the...
Original · 2019-12-09 23:29:08 · 212 views · 0 comments
MA_Policies
Contents: decentralised policies. decentralised policies: However, RL methods designed for single agents typically fare poorly on such tasks, since the joint action space of the agents grows exponentially wit...
Original · 2019-11-30 16:42:13 · 159 views · 0 comments
AC_PolicyOpti_in_PartObserv_MAEnvi
Contents: terminal history; prefix (non-terminal history). terminal history. prefix (non-terminal history).
Original · 2019-11-27 19:25:58 · 135 views · 0 comments
MAML_ReinforcementApproach
Contents: 4 Minimax. 4 Minimax.
Original · 2019-11-24 23:29:52 · 252 views · 0 comments
COMA
Contents: credit assignment. credit assignment: Since all agents are exploring and learning at the same time, it is difficult for any given agent to estimate the impact of their action on the overall return...
Original · 2019-11-24 21:07:19 · 869 views · 0 comments