The weakness of AI Go programs and what it means for the future of AI
We need a bridge between intuition and logic
AI can play the game Go far better than any human, but oddly it has some weaknesses which humans can exploit to defeat it. Patching over these weaknesses is very difficult, and the attempt teaches interesting lessons about what AI, traditional software, and we humans are good and bad at. Here's an example position showing the AI losing a big region after getting trapped:
For those of you not familiar with the game, Go is all about connectivity between stones. When a region of stones loses all connection to empty space, as the red-marked one in the above position just did, it dies. When a group surrounds two separate empty regions it can never be captured, because the opponent only places one stone at a time and hence can't fill both at once. Such a group is said to have 'two eyes' and be 'alive'.
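To make 'connectivity' concrete, here is a minimal Python sketch, with a board representation that's my own assumption rather than any engine's, which flood-fills a group of stones and counts its liberties (the empty points adjacent to it). A group whose liberties drop to zero is captured.

```python
# Minimal sketch of Go connectivity: flood-fill a group and count its
# liberties. The board representation (a dict mapping (x, y) coordinates
# to 'B' or 'W', with empty points absent) is an assumption for
# illustration, not any engine's actual data structure.

def group_and_liberties(board, start, size=19):
    color = board[start]
    group, liberties, frontier = {start}, set(), [start]
    while frontier:
        x, y = frontier.pop()
        for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if not (0 <= nx < size and 0 <= ny < size):
                continue  # off the board
            neighbor = board.get((nx, ny))
            if neighbor is None:
                liberties.add((nx, ny))  # empty point: a liberty
            elif neighbor == color and (nx, ny) not in group:
                group.add((nx, ny))  # same color and connected: same group
                frontier.append((nx, ny))
    return group, liberties  # no liberties means the group is captured
```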
The above game reached a very unusual position in which both the now-dead black group and the white one it surrounds had only one eye each and not much potential for another. The AI may have realized that this was the case, but optically the position looks fine, so it kept playing elsewhere on the board until that important region was lost.
Explaining what's failing here, and why humans can do better, requires some background. Board games have two components: 'tactics', which encompasses what happens when you do an immediate look-ahead from the position at hand, and 'position', a more amorphous concept encompassing everything else you can glean about how good a position is from looking at it and using your instincts. There's no bright line between the two, but for games with a large enough board and enough moves per game a complete exhaustive search of the whole game is computationally infeasible, so the distinction is meaningful.
There's a bizarre contrast between how classical software and AI work. Traditional software is insanely, ludicrously good at logic, but has no intuition whatsoever. It can verify even the most complex formal math proof almost immediately, and search in an instant through more possibilities than a human could consider in a lifetime, but if any step is missing it has no way of filling it in. Any such filling in has to follow heuristics which humans painstakingly created by hand, and usually leans very heavily on trying an immense number of possibilities in the hope that something works.
AI is the exact opposite. It has (for some things) ludicrously, insanely good intuition. For some tasks, like guessing at protein folding, it's far better than any human ever possibly could be. At evaluating Chess or Go positions, only a handful of humans at most can beat an AI running purely on instinct. What AI lacks is logic. People get excited when it demonstrates any ability to do reasoning at all.
Board games are special in that the purely logical component of them is extremely well defined and can be evaluated exactly. When Chess computers first overtook the best humans it was by throwing raw computational power at the problem, with a fairly hokey positional evaluation underneath which had been designed by hand by a strong human player and optimized more for speed than correctness. Chess is more about tactics than position, so this worked well. Go's balance tilts more towards position, so this approach didn't work there until better positional evaluation via AI was invented. Both Chess and Go strike a balance between tactics and position because we humans find both of those interesting. It's possible that sentient beings of the future will favor games which are much more heavily about position, because tactics are more about who spends more computational resources evaluating the position than about who's better at the game.
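To give a flavor of how hokey that evaluation was, here's a sketch in the spirit of those early hand-tuned Chess evaluators: a bare material count using the classic textbook piece values, built for speed rather than correctness. Real engines of the era layered many more hand-written heuristics on top; this is purely illustrative.

```python
# A hand-tuned, speed-first positional evaluation of the early-chess-engine
# variety: just count material, in centipawns, using the textbook values.
# The board encoding (an iterable of piece letters, uppercase for white,
# lowercase for black) is assumed for illustration.

PIECE_VALUE = {'P': 100, 'N': 320, 'B': 330, 'R': 500, 'Q': 900, 'K': 0}

def evaluate_material(pieces):
    score = 0
    for piece in pieces:
        value = PIECE_VALUE.get(piece.upper(), 0)
        score += value if piece.isupper() else -value
    return score  # positive favors white
```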
In some sense doing lookahead in board games (the technical term is 'minimax search', usually sped up with 'alpha-beta pruning') is a very special form of hand-tuned logic. But by definition it perfectly emulates exactly what's happening in a board game, so it can be mixed with a good positional evaluator to get almost the best of both worlds. Tactics are covered by the lookahead, and position is covered by the AI.
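Here's a minimal sketch of that mix, written as negamax with alpha-beta pruning. The game interface (legal_moves, play/undo, evaluate) is hypothetical, and evaluate stands in for the positional evaluator, neural or otherwise, scoring from the perspective of the side to move.

```python
# Alpha-beta lookahead (negamax form) with a pluggable positional evaluator
# at the leaves: the search supplies the tactics, the evaluator supplies
# the positional judgment. The `game` interface is an assumption for
# illustration, not any engine's real API.

def alphabeta(game, depth, alpha=float('-inf'), beta=float('inf')):
    if depth == 0 or game.is_over():
        return game.evaluate()  # intuition: positional evaluation at the leaf
    best = float('-inf')
    for move in game.legal_moves():
        game.play(move)
        # Logic: exact lookahead; negate because the opponent moves next.
        score = -alphabeta(game, depth - 1, -beta, -alpha)
        game.undo()
        best = max(best, score)
        alpha = max(alpha, score)
        if alpha >= beta:
            break  # prune: the opponent would never allow this line
    return best
```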
But that doesn't quite cover it. The problem is that this approach keeps the logic and the AI completely separate, with no bridge between them. That matters particularly in Go, where there are lots of local patterns which will have to get played out eventually, and you can get some idea of what will happen then by working out the local tactics now. Humans are entirely capable of looking at a position, working through some tactics, and updating their positional evaluation based on what they find. The current generation of Go AI programs don't even have hooks to make that possible. They can still beat humans anyway, because their positional evaluation alone is comparable to, if not better than, what a human reaches with that feedback, and their ability to work out immediate tactics is ludicrously better than ours.

But the above game is an exception. Something extremely unusual happened in it: the immediate optics of the position were misleading, and the focus of play was kept off that part of the board long enough that the AI never brought its near-term tactical skill to bear on the danger it was in. A human can count up the effective amount of space of the groups battling it out in the above example by working out the local tactics, and gets a further boost because what matters is the sum total of those local results rather than the order in which they're played out. Simple tactical evaluation doesn't recognize this independence and has to work through exponentially more cases, as the toy calculation below illustrates.
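Suppose a capturing race splits into three independent local fights: a human resolves each fight locally and adds up the results, while a naive global search considers every way of interleaving the moves. The move counts here are made up purely for illustration.

```python
# Toy illustration (numbers assumed): summing independent local fights
# versus searching over all interleavings of their moves.
from math import factorial

local_moves = [4, 5, 3]         # moves remaining in three independent fights
summed = sum(local_moves)       # human-style: solve each fight, add the counts

# A naive global search distinguishes move orders across fights: the number
# of distinct interleavings is the multinomial coefficient 12!/(4! * 5! * 3!).
interleavings = factorial(sum(local_moves))
for n in local_moves:
    interleavings //= factorial(n)

print(summed, interleavings)    # 12 versus 27720
```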
The human-executable exploits of the current generation of Go AIs are not a single silly little problem which can be patched over. They are particularly bad examples of systemic limitations in how those AIs operate. It may be possible to tweak them enough that humans can't get away with such shenanigans any more, but the fact remains that they are far from perfect, and some better type of AI which does a better job of bridging between instinct and logic could probably play vastly better than they do now while using far fewer resources.
This post only covers the most insidious attack, but the others are interesting as well. It turns out that the AIs think in Japanese scoring despite being trained exclusively on Chinese scoring, so they'll sometimes pass in positions where the opponent passing in response results in them losing, when they should play on. This can be fixed easily by always looking ahead after a pass to see who actually wins if the opponent immediately passes in response. Online sites get around the issue by making 'pass' not really mean pass but rather 'I think we can come to agreement about what's alive and dead here', and if that agreement doesn't happen the game reverts to a forced playout with Chinese scoring and pass meaning pass.
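That easy fix might look something like the following sketch, with a hypothetical game interface: before letting the engine pass, check who would win the forced Chinese-scoring playout if the opponent answered the pass with a pass.

```python
# Sketch of the pass check described above. The `game` interface
# (score_if_both_pass, best_non_pass_move) is hypothetical.

def vet_pass(game, proposed_move):
    if proposed_move != 'pass':
        return proposed_move
    # Two consecutive passes end the game; score the board under
    # Chinese rules as it would then stand.
    if game.score_if_both_pass() > 0:  # positive means we win
        return 'pass'
    return game.best_non_pass_move()   # passing loses: keep playing
```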
The other attack is 'gift', where a situation can be set up in which the AI can't do something it would like to because of the ko rules, and it winds up giving away stones in a way which makes it strictly worse off. Humans easily recognize this and don't make moves which leave their position strictly worse. Arguably the problem is that the AI's positional evaluator doesn't have good access to which positions have already occurred, but it isn't clear how to provide that well. It could probably also be patched around by making the search ding the evaluation when it finds itself trying and failing to repeat a position, though that needs to be able to handle ko battles as well. Maybe it's also a good heuristic for ko battles.
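A sketch of what that patch might look like, with an assumed interface and a made-up penalty constant: keep a history of positions already seen and ding the evaluation in proportion to how many otherwise-attractive moves the ko/superko rule blocks.

```python
# Sketch of penalizing positions where ko rules bite. Everything here
# (the interface, the penalty size) is an assumption for illustration.

KO_PENALTY = 5  # hypothetical points of score per blocked repetition

def evaluate_with_ko_penalty(game, seen_positions):
    score = game.evaluate()
    # Count moves that would be legal except that they recreate a
    # position already in the game's history (a superko violation).
    blocked = sum(1 for move in game.moves_ignoring_ko()
                  if game.position_after(move) in seen_positions)
    return score - KO_PENALTY * blocked  # worse off when ko constrains us
```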
Both of the above raise interesting questions about which tweaks to a game-playing algorithm are bespoke and hence violate the 'zero' principle that an engine should work for any game rather than being customized to a particular one. Arguably the ko rule is a violation of the schema of a simple turn-based game, so it's okay to make exceptions for it.