Reasoning in the Dark: ISMCTS for Imperfect Information
The Problem with AlphaGo's Brain
Standard Monte Carlo Tree Search (MCTS) is famous as the search engine behind superhuman Go and Chess play. It works by building a search tree of future states: you simulate a move, observe the outcome, and backpropagate the result up the tree.
But MCTS relies on a critical assumption: Perfect Information. The agent knows the exact state of the board.
In card games like Poker or Callbreak, this assumption breaks down. I don't know my opponents' cards. If I construct a search tree assuming they hold one specific hand, my strategy will be brittle: if they actually hold a different hand, my computed "optimal" move might be a blunder.
Enter ISMCTS (Information Set MCTS)
To solve this, we use Information Set MCTS.
An "Information Set" is the collection of all possible world states that are consistent with my current knowledge. For example, if I hold the Ace of Spades, the "Information Set" contains all permutations of the remaining deck distributed among opponents.
The Algorithm: Determinization
The core idea of ISMCTS is Determinization. The search process looks like this (a runnable sketch follows the list):
- Determinize: At the start of each simulation, we sample one specific world from the Information Set by randomly dealing the unseen cards to the opponents.
- Search: We run standard MCTS on this fully visible, specific world.
- Repeat: We run thousands of these simulations, re-dealing the opponents' hidden cards every time.
- Aggregate: The final action is chosen based on the aggregated statistics across all these different "possible worlds."
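Here is a minimal sketch of that loop in Python on a toy one-trick game. A single random rollout per candidate action stands in for the full MCTS inside each determinized world, and every name, rank encoding, and hand size is illustrative:

```python
import random
from collections import defaultdict

RANKS = list(range(2, 15))  # 2..14, Ace high

def rollout(my_card, opp_hand):
    """Stand-in "search" in one determinized world: the opponent
    plays a random card, and the higher rank wins the trick."""
    return 1 if my_card > random.choice(opp_hand) else 0

def ismcts_choice(my_hand, unseen, opp_hand_size, n_worlds=2000):
    wins = defaultdict(int)
    for _ in range(n_worlds):
        # 1. Determinize: deal the opponent a random consistent hand.
        opp_hand = random.sample(unseen, opp_hand_size)
        # 2. Search the now fully visible world (trivial rollout here).
        for card in my_hand:
            wins[card] += rollout(card, opp_hand)
    # 3./4. Repeat and aggregate: best action on average across worlds.
    return max(my_hand, key=lambda card: wins[card])

my_hand = [11, 9, 3]  # Jack, Nine, Three
unseen = [r for r in RANKS if r not in my_hand]
print(ismcts_choice(my_hand, unseen, opp_hand_size=3))  # almost always 11
```

A production engine would run a proper UCT search inside each sampled world, or share a single tree across all determinizations as single-observer ISMCTS does; the aggregation step stays the same.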
Why it works
By averaging across thousands of determinizations, the agent converges on a strategy that is robust to the hidden information: it favors moves that are good on average, regardless of exactly which cards the opponents hold.
Limitations: Strategy Fusion
While ISMCTS is powerful, it suffers from a theoretical flaw called Strategy Fusion. Each determinized search sees the hidden cards, so it can plan a different response in each possible world; averaging those values implicitly assumes the agent can react to hidden information before it is actually revealed. For example, the search may credit itself with winning a finesse in exactly the worlds where the finesse works, a distinction a real player cannot make at decision time.
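A toy numeric illustration of the resulting over-optimism (pure illustration, not drawn from any paper):

```python
import random

random.seed(0)
worlds = [random.random() for _ in range(10_000)]  # one hidden state per world

def payoff(action, w):
    """Action "a" wins exactly when the hidden state is below 0.5;
    action "b" wins otherwise."""
    return 1.0 if (action == "a") == (w < 0.5) else 0.0

# Fused (determinized) value: the best action is picked per world,
# as if the agent could peek at the hidden state before acting.
fused = sum(max(payoff("a", w), payoff("b", w)) for w in worlds) / len(worlds)

# Achievable value: one action must be fixed across all worlds.
achievable = max(
    sum(payoff(a, w) for w in worlds) / len(worlds) for a in ("a", "b")
)
print(fused, achievable)  # 1.0 vs. roughly 0.5
```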
However, in practice, and specifically in the game engines I developed, ISMCTS provides a superhuman baseline for trick-taking games, balancing computational cost against strategic depth far better than deep RL in low-data regimes.