Today I read "Human-level control through deep reinforcement learning" and its code. I studied the convolution kernels: they are 8×8, 4×4, and 3×3, with 32, 64, and 64 output channels respectively.
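To see how these three kernels shrink the feature maps, here is a minimal shape walk-through in plain Python. Only the kernel sizes and channel counts come from the note above. The 84×84 input with 4 stacked frames, the strides of 4, 2, and 1, and the absence of padding are assumptions taken from my reading of the paper, and `conv_out` and `trace_shapes` are hypothetical helper names for illustration.

```python
# Shape walk-through for the three convolutional layers mentioned above.
# Assumptions (not stated in the note): 84x84 input with 4 stacked frames,
# strides of 4, 2, 1, and no padding.


def conv_out(size, kernel, stride):
    """Output side length of a convolution with no padding."""
    return (size - kernel) // stride + 1


# (kernel size, stride, output channels)
LAYERS = [(8, 4, 32), (4, 2, 64), (3, 1, 64)]


def trace_shapes(side=84, channels=4):
    """Return the (height, width, channels) shape before and after each layer."""
    shapes = [(side, side, channels)]
    for kernel, stride, out_channels in LAYERS:
        side = conv_out(side, kernel, stride)
        channels = out_channels
        shapes.append((side, side, channels))
    return shapes


if __name__ == "__main__":
    for shape in trace_shapes():
        print(shape)
```

Under those assumptions the last feature map is 7×7 with 64 channels, so flattening it gives 7 × 7 × 64 = 3136 values for the layers that follow.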