Leetcode question bank: application of dynamic programming in stone guessing game (Python language)

dynamic programming

There are a lot of tutorials and concepts on the Internet. The premise of reading this article is that you have a preliminary understanding of dynamic planning.

To sum up, dynamic planning has four steps:

  1. Define the status.
  2. Find the state transition equation
  3. State compression (if possible)
  4. Get results

Stone guessing game on LeetCode

LeetCode has many stone guessing games, most of which can play the best level. It belongs to the category of game. Here are three stone guessing problems that can be solved by dynamic programming, which are easy to use and difficult.

1. Leetcode 486: predicting winners

Title Description

Thinking process

1. It can be seen from the title that players can only take numbers (stones) from the head and tail of the array at a time, but it is optional to take the head or tail.

After the array is taken, two people will have a score. The difference between player 1's score and player 2's score can be used as the basis for judging who becomes the winner.

2. The definition status dp[i][j] is the current array interval i,j The maximum value of the score difference between two people after taking stones. Then the player has only two choices, either take the number i (head) or the number j (tail). The following are discussed separately:

  • When the player takes the number I, the interval left for the next player is [i+1,j] (closed interval), which just corresponds to the state of dp[i+1][j], so the difference is num [i] - dp[i+1][j].
  • When the player takes the number j, the same reason is that the difference is num [j] - DP [i] [j-1]

To sum up, the state transition equation is:

dp[i][j] = max(nums[i]-dp[i+1][j],nums[j]-dp[i][j-1])

It is worth noting that the state of dp[i][j] needs to use the state of dp[i+1][j], so when traversing the interval of the array, it should be traversed in reverse order.

class Solution:
    def PredictTheWinner(self, nums: List[int]) -> bool:
        dp=[[0]*n for _ in range(n)]  #dp[i][j] indicates the score difference when the interval is [i,j]
        for i in range(n):
            dp[i][i]=nums[i] # State initialization. When the interval length is only 1, the difference is the value
        for i in range(n-2,-1,-1):
            for j in range(i+1,n):
        return dp[0][n-1]>=0

2. LeetCode877, stone game

Title Description


1. This problem is almost the same as that of the previous prediction player, or more precisely, it is a special case of the previous problem, that is, the array length is even, and the array sum is odd. After proof, it will be found that this is a problem that must be won first. Just return to this problem. Prove the process strategy.

return True

2. But think about it carefully. If the next question is what is the maximum difference between the scores obtained by two people after the game, it can't be written like this. It happens that there is such a question behind it.

3. The code of this question is exactly the same as that of the previous question, and the thinking process of dynamic planning is the same.

class Solution:
    def stoneGame(self, piles: List[int]) -> bool:
        dp=[[0]*n for _ in range(n)]  #dp[i][j] indicates the score difference when the interval is [i,j]
        for i in range(n):
        for i in range(n-2,-1,-1):
            for j in range(i+1,n):
        return dp[0][n-1]>=0

Right, exactly the same.

3. Leetcode 1690, stone game VII

Title Description

Thinking process

1. In fact, this question is also very similar to the previous one. The only difference is that the score is the sum of the current number and the remaining numbers.

It involves frequently calculating the sum of an interval, and the numbers in the interval do not change. It is natural to think of prefixes and arrays.

If an array is [1,2,3,5,8], then its prefix and array are [0,1,3,6,11,19], which is one bit longer than the length of the original array. To calculate the sum between nums[i:j], you only need to use presum[j+1]-presum[i].

For example, to calculate the sum of num [1] + num [2] + num [3], you only need to use presum[4]-presum[1]=10.

The specific principle of prefix and is not repeated here. Prefix and differential array will be written in detail in the future.

2. The state definition of this question is the same as before. We need to change the state transition equation and change the score from num [i] to sum([i+1:j]).

dp[i][j] = max(sum(stones[i+1:j])-dp[i+1][j],sum(stones[i:j-1])-dp[i][j-1])

The process of calculating sum needs no prefix and will waste a lot of extra time.

3. Go directly to the code, which is exactly the same as the code in the first two questions, except for the prefix and

class Solution:
    def stoneGameVII(self, stones: List[int]) -> int:
        n = len(stones)
        presum = [0] + list(itertools.accumulate(stones))  # To calculate the prefix sum, you can also traverse the array at one time
        # presum = [0]*(n+1)
        # for i in range(1,n+1):
        #     presum[i] = presum[i-1]+stones[i-1]
        dp = [[0]*n for _ in range(n)]
        for i in range(n-2,-1,-1): # Because the state of i is related to i+1, it should be flashed back and smaller than j
            for j in range(i+1,n):
                l = presum[j+1] - presum[i+1]
                r = presum[j] - presum[i]
                dp[i][j] = max(l-dp[i+1][j],r-dp[i][j-1])
        return dp[0][n-1]

This is the application of dynamic programming in some stone guessing games. Of course, there must be some stone guessing games I didn't write about.

The most important thing I think is that these three questions provide me with an idea, that is, dp[i][j] is defined as a state in an interval, and then the state is updated by constantly updating the interval, and finally we get the desired result.

Tags: Programming Algorithm leetcode Dynamic Programming

Posted by xeross on Sun, 01 May 2022 15:41:16 +0300