AI Fundamentals: Adversarial Search II

AND-OR search trees and optimizing the complexity of game decisions

Non-deterministic Transitions

AND-OR Search Trees

• In deterministic environments, branching occurs only due to the agent's choices (OR nodes)
• In non-deterministic environments, the environment's choices must also be taken into account (AND nodes)
• Solution is a subtree of the AND-OR tree that:
— Has a goal node at every leaf
— Specifies an action at each OR node
— Includes every outcome branch of its AND nodes
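The definition above can be sketched as the standard AND-OR graph search: OR nodes try the agent's actions one at a time, while AND nodes must find a sub-plan for every outcome the environment might produce. The callbacks `goal_test`, `actions`, and `results` are assumed to be supplied by the problem definition; this is a minimal sketch, not a full implementation.

```python
def and_or_search(state, goal_test, actions, results):
    """Return a conditional plan for a non-deterministic problem, or None.

    actions(s)    -> the agent's choices at s (OR node)
    results(s, a) -> every possible outcome state of action a (AND node)
    """
    def or_search(state, path):
        if goal_test(state):
            return []                      # empty plan: already at a goal
        if state in path:                  # cycle on this branch: fail
            return None
        for action in actions(state):
            plan = and_search(results(state, action), path + [state])
            if plan is not None:
                return [action, plan]      # first action that works
        return None

    def and_search(states, path):
        # Must handle EVERY outcome of the environment's choice.
        plans = {}
        for s in states:
            plan = or_search(s, path)
            if plan is None:
                return None                # one unsolvable outcome kills the plan
            plans[s] = plan
        return plans

    return or_search(state, [])
```

A returned plan like `['go', {'B': [], 'C': []}]` reads as a contingency plan: take action `go`, then act according to whichever outcome state actually occurs.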

AND-OR Graph Search

Adversarial Optimal Decisions

• Time complexity: O(b^m)
• Space complexity: O(bm)
• Chess, on average: b = 30, m = 40

Reducing Complexity

• Reducing the O(b^m) complexity:
— Reduce the branching factor (b)?
— Reduce the maximum search depth (m)?
— Search in a graph rather than a tree? In a tree, states are connected in strict layers; in a graph, states may be connected in arbitrary ways, so the same state can be reached along different paths.

Reducing Branching Factor

• Alpha-Beta Pruning
— Evaluate which nodes/branches would not affect MIN/MAX’s decision
— Based on keeping track of two parameters:
◦ α - value of the best (highest) choice we have in MAX’s path
◦ β - value of the best (lowest) choice we have in MIN’s path
• These values are updated as the search proceeds down the tree
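The bookkeeping described above can be sketched as follows. The game is abstracted behind two placeholder callbacks, `children(state)` and `evaluate(state)`, which a concrete game implementation would supply; the pruning tests on α and β are the standard ones.

```python
import math

def alphabeta(state, depth, alpha, beta, maximizing, children, evaluate):
    """Minimax with alpha-beta pruning (a sketch).

    children(state) -> list of successor states (empty at terminals)
    evaluate(state) -> heuristic value of a leaf/cutoff state
    """
    kids = children(state)
    if depth == 0 or not kids:
        return evaluate(state)
    if maximizing:
        value = -math.inf
        for child in kids:
            value = max(value, alphabeta(child, depth - 1,
                                         alpha, beta, False, children, evaluate))
            alpha = max(alpha, value)
            if alpha >= beta:     # beta cutoff: MIN would never allow this branch
                break
        return value
    else:
        value = math.inf
        for child in kids:
            value = min(value, alphabeta(child, depth - 1,
                                         alpha, beta, True, children, evaluate))
            beta = min(beta, value)
            if beta <= alpha:     # alpha cutoff: MAX already has something better
                break
        return value
```

For example, encoding a two-ply tree as nested lists, `alphabeta([[3, 12, 8], [2, 4, 6], [14, 5, 2]], 2, -math.inf, math.inf, True, ...)` returns 3, pruning some leaves of the later MIN nodes along the way.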

Move Ordering

• Pruning is strongly affected by the ordering of the moves in the tree
— A good ordering would enable us to prune many nodes
• Move ordering is often game-dependent knowledge (heuristic)
• Dynamic move ordering (the killer-move heuristic) can reuse information about moves that have already caused effective cutoffs elsewhere in the search tree

Reducing Depth - Killer Move

• Dynamic heuristic to determine a “good” ordering
• Search two plies ahead until Max (alt. Min) causes a beta (alt. alpha) cutoff
• The move that caused the cutoff is the killer move

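A minimal sketch of the killer-move bookkeeping: a table maps search depth to the moves that most recently caused cutoffs there, and the move generator tries those moves first. The table structure and the two-slot replacement scheme are common conventions, assumed here for illustration.

```python
def record_killer(killer_table, depth, move, keep=2):
    """On a beta cutoff, remember the move that caused it at this depth."""
    slot = killer_table.setdefault(depth, [])
    if move not in slot:
        slot.insert(0, move)   # most recent killer first
        del slot[keep:]        # keep only the last `keep` killers

def order_moves(moves, killer_table, depth):
    """Try killer moves recorded at this depth before all other moves."""
    killers = killer_table.get(depth, [])
    # sorted() is stable, so non-killer moves keep their original order
    return sorted(moves, key=lambda m: 0 if m in killers else 1)
```

Usage: after a cutoff at depth d, call `record_killer(table, d, move)`; when expanding a sibling node at the same depth, call `order_moves(moves, table, d)` so the likely cutoff move is searched first.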

Reducing m - Evaluation Functions

Weighted linear function over features of a state

Example: chess. Current state: pieces and their positions (structure).

Example: Magic: The Gathering (card game). Current state: life totals, cards in play, and cards in hand.
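A weighted linear evaluation function has the form Eval(s) = Σᵢ wᵢ·fᵢ(s). The chess-like material features and weight values below are illustrative assumptions, not from the slides:

```python
def linear_eval(state, weights, features):
    """Weighted linear evaluation: Eval(s) = sum_i w_i * f_i(s)."""
    return sum(w * f(state) for w, f in zip(weights, features))

# Illustrative material features (state encoding is an assumption):
features = [
    lambda s: s["my_queens"] - s["their_queens"],
    lambda s: s["my_pawns"] - s["their_pawns"],
]
weights = [9.0, 1.0]  # conventional rough material values: queen = 9, pawn = 1

state = {"my_queens": 1, "their_queens": 0, "my_pawns": 6, "their_pawns": 8}
score = linear_eval(state, weights, features)  # 9*1 + 1*(-2) = 7.0
```

Searching to a shallow depth and applying such a function at the cutoff replaces the exact game-theoretic value with a cheap estimate, which is what lets m be reduced.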

Graph Search

• As in non-adversarial search, many states will be revisited, since different paths can lead to the same state
• However, recording only the visited states is not enough (since MIN can deviate in the future)
• Need to store the actual loop paths (memory intensive)
— Requires a “caching” strategy
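The simplest form of such caching is a transposition table: memoize the value of each position already searched. Real engines key entries on a position hash plus depth and value bounds; the simplified key `(state, depth, maximizing)` below is an assumption for this sketch.

```python
def minimax_tt(state, depth, maximizing, children, evaluate, table=None):
    """Minimax over a graph, caching values of revisited states (a sketch)."""
    if table is None:
        table = {}
    key = (state, depth, maximizing)
    if key in table:
        return table[key]               # state already searched: reuse its value
    kids = children(state)
    if depth == 0 or not kids:
        value = evaluate(state)
    else:
        values = (minimax_tt(c, depth - 1, not maximizing,
                             children, evaluate, table) for c in kids)
        value = max(values) if maximizing else min(values)
    table[key] = value
    return value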

Stochastic Games

• Outcome of agent choices is not deterministic
— Games must take into account multiple outcomes for the player
• Solution: weight outcomes by their probability
— Expected value

Expectiminimax

浙江大学人工智能课程课件,内容有: Introduction Problem-solving by search( 4 weeks) Uninformed Search and Informed (Heuristic) Search (1 week) Adversarial Search: Minimax Search, Evaluation Functions, Alpha-Beta Search, Stochastic Search Adversarial Search: Multi-armed bandits, Upper Confidence Bound (UCB),Upper Confidence Bounds on Trees, Monte-Carlo Tree Search(MCTS) Statistical learning and modeling (5 weeks) Probability Theory, Model selection, The curse of Dimensionality, Decision Theory, Information Theory Probability distribution: The Gaussian Distribution, Conditional Gaussian distributions, Marginal Gaussian distributions, Bayes’ theorem for Gaussian variables, Maximum likelihood for the Gaussian, Mixtures of Gaussians, Nonparametric Methods Linear model for regression: Linear basis function models; The Bias-Variance Decomposition Linear model for classification : Basic Concepts; Discriminant Functions (nonprobabilistic methods); Probabilistic Generative Models; Probabilistic Discriminative Models K-means Clustering and GMM & Expectation–Maximization (EM) algorithm, BoostingThe Course Syllabus Deep Learning (4 weeks) Stochastic Gradient Descent, Backpropagation Feedforward Neural Network Convolutional Neural Networks Recurrent Neural Network (LSTM, GRU) Generative adversarial network (GAN) Deep learning in NLP (word2vec), CV (localization) and VQA(cross-media) Reinforcement learning (1 weeks) Reinforcement learning: introduction
评论
成就一亿技术人!
拼手气红包6.0元
还能输入1000个字符
 
红包 添加红包
表情包 插入表情
 条评论被折叠 查看
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值