The Sprague-Grundy theory of impartial games

最新推荐文章于 2019-12-10 16:21:33 发布

touzani

最新推荐文章于 2019-12-10 16:21:33 发布

阅读量2k

点赞数

分类专栏：数据结构与算法文章标签： function properties each token application equals

数据结构与算法专栏收录该内容

27 篇文章

订阅专栏

本文介绍了一种名为Sprague-Grundy函数的数学工具，用于分析无偏博弈中的最优策略。文章详细阐述了如何通过该函数确定游戏状态的赢输性质，并以Nim游戏为例进行了说明。

摘要生成于 C知道，由 DeepSeek-R1 满血版支持，前往体验 >

An impartial game is a two-player game in which both players have complete information, no chance is involved, and the legal moves from each position are the same for both players. We will deal with the normal play rule, in which the last player to move is the winner.

An impartial game can be abstractly represented by a directed acyclic graph, in which each vertex corresponds to a position, and each directed edge represents a legal move from one position to another.

We can imagine that a token is initially placed on some vertex, and the two players take turns moving the token from its current vertex to one of its direct followers. The game ends when the token reaches a sink, which is a vertex with no outgoing edges, and then the last player to have moved is the winner.

P- and N-positions

We can classify each position in the game according to whether it is a first- or a second-player win, if both players play optimally starting from that position. A first-player-win position is known as an N-position (because the next player is to win), while a second-player-win position is known as a P-position (because the previous player is to win).

The P- and N-positions can be characterized inductively as follows:

A vertex v is a P-position if and only if all its direct followers are N-positions.
A vertex v is an N-position if and only if it has some P-position follower.

The induction starts at the sinks, which are P-positions because they vacuously satisfy the P-position requirement.

Knowledge of the P- and N-positions of a game provides the winning strategy for it: If it is our turn and the game is in an N-position, we should move into a P-position. Then our opponent will be forced to move into an N-position, and so on. Eventually we will move into a sink and win.

Sums of games

If G₁ and G₂ are impartial games, then their sum G₁ + G₂ is another impartial game, which is played as follows: On each turn, a player chooses one of G₁, G₂ (whichever he wants) and plays on it, leaving the other game untouched. The game ends when no moves are possible on G₁ nor on G₂.

Formally, if G₁ = (V₁, E₁) and G₂ = (V₂, E₂) are game-graphs, then their sum G_sum = (V_sum, E_sum) is given by:

V_sum = V₁ × V₂,
E_sum = {(v₁v₂, w₁v₂) | (v₁, w₁) ∈ E₁} ∪ {(v₁v₂, v₁w₂) | (v₂, w₂) ∈ E₂}.

Now, suppose we are given two games G₁ and G₂. Can we correctly play their sum G₁ + G₂ if we just know the P- and N-positions of the individual games? It turns out that the answer is no. It is not hard to see that the sum of two P-positions is always a P-position, and the sum of a P-position and an N-position is always an N-position. But the sum of two N-positions can either be a P- or an N-position. Therefore, just knowing the P- and N-positions of the individual games is not enough.

To play sums of games correctly we need a generalization of the notion of P- and N-positions, which is known as the Sprague-Grundy function (or Grundy function for short).

The Sprague-Grundy function

Let N = {0, 1, 2, 3, ...} be the set of natural numbers. The Sprague-Grundy function assigns to each position in a game a natural number. The Grundy value of a vertex v equals the smallest natural number that does not appear among the Grundy values of v's direct followers.

Formally: Given a finite subset S ⊂ N, let mex S (the minimum excluded value) be the smallest natural number not in S:

mex S = min (N / S).

Now, given a game-graph G = (V, E), its Sprague-Grundy function g: V → N is defined inductively by

g(v) = mex {g(w) | (v, w) ∈ E}.

This induction starts at the sinks of G, which receive a Grundy value of 0.

The Sprague-Grundy function satisfies two important properties:

A vertex v is a P-position if and only if g(v) = 0.
If G = G₁ + G₂ and v = v₁v₂ is a position in G, then g(v) is the bitwise XOR of the binary representations of g(v₁) and g(v₂):
g(v) = g(v₁) ⊕ g(v₂).

The operation ⊕ is also called nim sum. For example, 3 ⊕ 5 = 011₂ ⊕ 101₂ = 110₂ = 6. Similarly, 3 ⊕ 6 = 5 and 5 ⊕ 6 = 3.

It is not hard to prove the above two properties by induction.

From these properties it follows that v = v₁v₂ is a P-position if and only if g(v₁) = g(v₂), since this is the only way the nim sum can come out 0.

Clearly, the sum of games is a commutative and associative operation, as is the nim sum operation. Therefore, we can correctly play the sum of any number of games by knowing the Grundy function of each individual game.

Our strategy is as follows: If it is our turn and the games' Grundy values give a nonzero nim-sum, then there must exist a move in some component game that causes the nim sum to become 0. We should make that move, and then our opponent will be forced to make the nim sum nonzero again. Eventually, we will be the ones to make the last move in the last game, bringing the nim sum to 0 for the final time.

The game of Nim

The most fundamental impartial game is the Nim pile. A Nim pile consists of a certain number of tokens. On each turn, a player removes from the pile any number of tokens between one token and the entire pile. The player who empties the pile wins.

This game by itself is of course trivial: The first player can just take all the tokens and win immediately! But if we add together Nim piles of various sizes, we get the famous game of Nim.

The Grundy value of a Nim pile of size n is n. Therefore, the Grundy value of a position in Nim is the nim sum of its pile sizes.

Games that decompose into sums of themselves

The most natural application of the Sprague-Grundy theory is for games that naturally decompose into sums of themselves.

Consider the following game: There is a checkerboard of size m × n, and there is an unlimited supply of polyominoes of a certain shape. On each turn, a player places a polyomino on an empty place on the board, and the player who cannot make a move is the loser:

During the course of the game, the board will gradually split into separate regions, for which we can calculate their Grundy values independently:

For another example consider Grundy's game. A position in this game consists of a number of piles of tokens of various sizes, and a move consists of taking one pile and splitting it into two unequal piles. The game ends when there are only piles of sizes 1 and 2 left, which cannot be split further.

Let g(n) be the Grundy value of a single pile of size n. The sequence g(n) starts as follows:

n:	1	2	3	4	5	6	7	8	9	10	11	12	13	14	15	16	17	18	19	20	...
g(n):	0	0	1	0	2	1	0	2	1	0	2	1	3	2	1	3	2	4	3	0

Although these values have been calculated to around n = 2³⁵, this sequence is still not completely understood. It is possible that the sequence will eventually become periodic, although this has not been seen to happen yet. See the following page by Achim Flammenkamp for more information.

The theory of impartial games was discovered independently by Roland P. Sprague in 1936 and P. M. Grundy in 1939.

width="728" scrolling="no" height="90" frameborder="0" align="middle" marginwidth="0" marginheight="0" src="http://download1.youkuaiyun.com/down3/20070601/01184120111.htm">