Dynamic Programming: Predict the Winner

A dynamic programming algorithm for predicting the winner

License: CC 4.0 BY-SA

Original link: http://www.cnblogs.com/TIMHY/p/8909041.html

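This article presents a dynamic programming algorithm that predicts whether the first player can win the game. The recurrence dp[i][j] = Math.max(nums[i] - dp[i + 1][j], nums[j] - dp[i][j - 1]) computes the score difference between the first player A and the second player B, and the sign of that difference decides whether the first player can win.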

2018-04-22 19:19:47

Problem description:

Given an array of scores that are non-negative integers. Player 1 picks one of the numbers from either end of the array followed by the player 2 and then player 1 and so on. Each time a player picks a number, that number will not be available for the next player. This continues until all the scores have been chosen. The player with the maximum score wins.

Given an array of scores, predict whether player 1 is the winner. You can assume each player plays to maximize his score.

Example 1:

Input: [1, 5, 2]
Output: False
Explanation: Initially, player 1 can choose between 1 and 2. 
If he chooses 2 (or 1), then player 2 can choose from 1 (or 2) and 5. If player 2 chooses 5, then player 1 will be left with 1 (or 2). 
So, final score of player 1 is 1 + 2 = 3, and player 2 is 5. 
Hence, player 1 will never be the winner and you need to return False.

Example 2:

Input: [1, 5, 233, 7]
Output: True
Explanation: Player 1 first chooses 1. Then player 2 has to choose between 5 and 7. No matter which number player 2 chooses, player 1 can choose 233.
Finally, player 1 has a higher score (234) than player 2 (12), so you need to return True, representing that player 1 can win.

Solution:

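First, brute-force enumeration runs into overlapping subproblems. For example, the sequence "A takes left, B takes left, A takes right, B takes right" leaves exactly the same remaining numbers as "A takes right, B takes right, A takes left, B takes left". Dynamic programming is therefore a good fit. The remaining question is how to set up the recurrence. The DP formulation for this problem is rather tricky, so it is worth walking through here.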
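The overlap described above can be made concrete by counting how often each remaining range is reached. The following is a minimal sketch, not part of the original solution: the class `OverlapDemo` and its members are hypothetical names introduced for illustration, and the sketch ignores the scores entirely and only tracks which index range `[i, j]` is left after each pick.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper for illustration only: enumerates every pick sequence
// and records which remaining ranges [i, j] are reached along the way.
class OverlapDemo {
    static long visits = 0;                             // total ranges reached, with repeats
    static final Set<Long> distinct = new HashSet<>();  // each range counted once

    static void enumerate(int i, int j) {
        visits++;
        distinct.add((long) i * 1_000_000L + j);  // encode the pair (i, j) as one key
        if (i == j) return;                       // one number left: no choice remains
        enumerate(i + 1, j);                      // current player takes the left end
        enumerate(i, j - 1);                      // current player takes the right end
    }

    public static void main(String[] args) {
        enumerate(0, 3);  // an array of length 4, such as [1, 5, 233, 7]
        System.out.println(visits + " visits, " + distinct.size() + " distinct ranges");
    }
}
```

For a length-4 array this reaches 15 ranges in total but only 10 distinct ones. In general the enumeration grows as 2^n - 1 while the number of distinct ranges is only n(n + 1) / 2, which is the gap that a table indexed by `(i, j)` exploits.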

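dp[i][j]: the best score that the player moving first on the subarray nums[i..j] (call this player A) can obtain, minus the best score of the other player (B) on that same subarray.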

Base case: when i == j, dp[i][j] = nums[i], which corresponds to a subarray of length one.

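Recurrence: dp[i][j] = Math.max(nums[i] - dp[i + 1][j], nums[j] - dp[i][j - 1]). The player to move can take either the front number or the back number, which shrinks the range left to the opponent. Because the table stores differences, the opponent then becomes the first mover on the smaller range and leads by dp[i + 1][j] (or dp[i][j - 1]) there, so the current player's net lead is the number taken minus that value. This gives the recurrence above.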
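The recurrence can also be read directly as a top-down recursion with memoization. The following is a sketch under stated assumptions: the class name `WinnerMemo` and the method `diff` are hypothetical names that do not appear in the original, and the input array is assumed to be non-empty.

```java
// Hypothetical top-down version of the recurrence, for illustration only.
class WinnerMemo {
    // Best score difference (player to move minus opponent) on nums[i..j].
    static int diff(int[] nums, int i, int j, Integer[][] memo) {
        if (i == j) return nums[i];                // base case: one number left
        if (memo[i][j] != null) return memo[i][j]; // range already solved
        int takeLeft = nums[i] - diff(nums, i + 1, j, memo);   // opponent moves on [i+1, j]
        int takeRight = nums[j] - diff(nums, i, j - 1, memo);  // opponent moves on [i, j-1]
        memo[i][j] = Math.max(takeLeft, takeRight);
        return memo[i][j];
    }

    // Convenience overload; assumes nums is non-empty.
    static int diff(int[] nums) {
        int n = nums.length;
        return diff(nums, 0, n - 1, new Integer[n][n]);
    }

    public static void main(String[] args) {
        System.out.println(diff(new int[] {1, 5, 2}));       // Example 1
        System.out.println(diff(new int[] {1, 5, 233, 7}));  // Example 2
    }
}
```

On Example 1 the result is -2, matching the final scores of 3 and 5. On Example 2 it is 222, matching 234 and 12.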

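    // Returns true if player 1 can finish with a score at least as high as player 2's.
    public boolean PredictTheWinner(int[] nums) {
        int n = nums.length;
        // dp[i][j]: best score difference (player to move minus opponent) on nums[i..j].
        int[][] dp = new int[n][n];
        // Base case: a single remaining number goes to the player to move.
        for (int i = 0; i < n; i++) dp[i][i] = nums[i];
        // Fill by increasing length so both shorter ranges are already computed.
        for (int len = 2; len <= n; len++) {
            for (int i = 0; i <= n - len; i++) {
                int j = i + len - 1;
                dp[i][j] = Math.max(nums[i] - dp[i + 1][j], nums[j] - dp[i][j - 1]);
            }
        }
        // A difference of 0 (a tie) is treated as a win for player 1.
        return dp[0][n - 1] >= 0;
    }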
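Since each table entry depends only on entries for ranges one element shorter, the two-dimensional table can be collapsed into a single array. The following is one possible sketch of that space optimization rather than the original author's code: the class `WinnerRolling` and the method `predict` are hypothetical names, and the input is assumed to be non-empty.

```java
// Hypothetical space-optimized variant, for illustration only.
class WinnerRolling {
    static boolean predict(int[] nums) {
        int n = nums.length;         // assumed to be at least 1
        int[] dp = nums.clone();     // dp[i] starts as the length-1 answer for range [i, i]
        for (int len = 2; len <= n; len++) {
            for (int i = 0; i <= n - len; i++) {
                int j = i + len - 1;
                // Before this update, dp[i] holds the answer for [i, j-1] and
                // dp[i + 1] still holds the answer for [i+1, j] from the previous pass.
                dp[i] = Math.max(nums[i] - dp[i + 1], nums[j] - dp[i]);
            }
        }
        return dp[0] >= 0;           // as in the 2D version, a tie counts as a win
    }

    public static void main(String[] args) {
        System.out.println(predict(new int[] {1, 5, 2}));       // Example 1
        System.out.println(predict(new int[] {1, 5, 233, 7}));  // Example 2
    }
}
```

Iterating `i` upward matters here: `dp[i + 1]` is read before it is overwritten in the same pass, so it still refers to the shorter range. This keeps the O(n^2) running time while reducing the extra space from O(n^2) to O(n).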