Now computers can bluff.
This should be a terrifying thought, but it’s a breakthrough moment, watched by thousands of people across 151 countries on the Twitch network, all of them witness to the first time an artificial intelligence (AI), Libratus, soundly beat four of the world’s best poker players at heads-up no-limit Texas Hold ’Em.
“At a high level, this means now that we have proven the ability for an AI to do strategic reasoning in imperfect situations has surpassed that of humans,” said Carnegie Mellon University Professor of Computer Science Tuomas Sandholm on Tuesday during a press conference following the AI’s victory in the “Brains Vs. Artificial Intelligence: Upping the Ante” tournament at Rivers Casino in Pittsburgh, Pennsylvania.
Backed by 600 nodes of the Pittsburgh Supercomputing Center’s Bridges computer, Libratus played 200 hours of poker over 20 days against four poker pros: Dong Kim, Jimmy Chou, Daniel McAulay and Jason Les. Nearly two years after its predecessor, Claudico, lost almost three-quarters of a million dollars in chips in a similar competition, Libratus won decisively by more than $1.7 million in chips (no actual payouts were made during this competition; the four human players split a $200,000 purse).
Winning aside, the accomplishment may be remembered as a turning point in the annals of AI.
While previous AIs have already beaten human pros in games like chess and, more recently, Go, these games are essentially tests of an AI’s raw computing power. Chess has many possible outcomes, and Go has even more. But poker is a different animal.
To master no-limit Texas Hold ’Em, Libratus needed to approach hands the way a human does, bluffing or betting in a way that belies the actual quality of its hand. It’s the essence of fuzzy logic, where people’s actions and intents do not necessarily align.
Chess and Go are perfect information games, Libratus programmer and PhD student Noam Brown told me in an interview after Libratus’ win. The AI searches through all the reachable states to find the optimal path in the game tree, the best possible move.
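To make that idea concrete, here is a minimal sketch of exhaustive game-tree search on a toy perfect-information game. The game is an assumption chosen for brevity, not something from the tournament: players alternately take one to three stones from a pile, and whoever takes the last stone wins. The function names are hypothetical, and real chess and Go programs rely on far more elaborate search than this.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def best_outcome(stones: int) -> int:
    """Return +1 if the player to move can force a win, otherwise -1."""
    if stones == 0:
        # The previous player took the last stone, so the player to move has lost.
        return -1
    # Try every legal move; the opponent's best outcome is our worst, hence the negation.
    return max(-best_outcome(stones - take) for take in (1, 2, 3) if take <= stones)


def best_move(stones: int) -> int:
    """Return how many stones to take to reach the best reachable outcome."""
    legal = [take for take in (1, 2, 3) if take <= stones]
    return max(legal, key=lambda take: -best_outcome(stones - take))


if __name__ == "__main__":
    for pile in range(1, 9):
        print(pile, best_outcome(pile), best_move(pile))
```

Because every state is fully visible, the search always knows exactly which position it is evaluating. That is the assumption poker breaks.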
Poker, on the other hand, is a game of incomplete information, since you don’t know what cards your opponent has, or, in Texas Hold ’Em, what community cards will be drawn later in the hand. “You no longer know what state you’re in,” says Brown. “You know you’re in some number of states.”
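A rough sense of how many states that can be comes from simple counting. The sketch below assumes a standard 52-card deck and a single opponent holding two hidden cards; the helper name is hypothetical, and it counts only the opponent’s possible hands, not the cards still to be dealt.

```python
from math import comb

DECK_SIZE = 52  # standard deck, an assumption of this sketch


def opponent_hand_count(cards_you_can_see: int) -> int:
    """Count the two-card hands an opponent could hold, given the cards visible to you."""
    unseen = DECK_SIZE - cards_you_can_see
    return comb(unseen, 2)


if __name__ == "__main__":
    print(opponent_hand_count(2))  # before the flop: only your own two cards are visible
    print(opponent_hand_count(5))  # after the flop: your two cards plus three community cards
```

Before any community cards appear, that is already 1,225 possible opponent hands, each of which corresponds to a different underlying state of the game.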
The possible outcomes depend not only on the hidden and still-to-be-drawn cards but also on how your opponent has played, so the idea of searching through a game tree no longer applies.
Libratus is also the first game-playing AI that did not observe human players to learn its game. “We gave it the rules of the game and we had the bot play against itself, starting from scratch,” Brown said. Ultimately Libratus played trillions of hands before facing its first human opponent.
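The article does not describe Libratus’s training algorithm, so the following is only an illustration of the general idea of learning from self-play given nothing but the rules. It uses regret matching, a generic technique swapped in here for illustration, on rock-paper-scissors rather than poker. The starting bias toward rock and all names are assumptions of the sketch.

```python
ACTIONS = 3  # rock, paper, scissors

# PAYOFF[a][b] is the payoff to a player choosing a against an opponent choosing b.
PAYOFF = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]


def strategy_from_regrets(regrets):
    """Play each action in proportion to its positive regret (uniform if none)."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    return [1.0 / ACTIONS] * ACTIONS


def self_play(iterations=20000):
    """Run two regret-matching players against each other; return their average strategies."""
    # Hypothetical starting bias: player 0 begins with a preference for rock.
    regrets = [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
    strategy_sums = [[0.0] * ACTIONS for _ in range(2)]
    for _ in range(iterations):
        strategies = [strategy_from_regrets(r) for r in regrets]
        for player in range(2):
            opponent = strategies[1 - player]
            # Expected payoff of each pure action against the opponent's current mix.
            values = [sum(PAYOFF[a][b] * opponent[b] for b in range(ACTIONS))
                      for a in range(ACTIONS)]
            expected = sum(strategies[player][a] * values[a] for a in range(ACTIONS))
            for a in range(ACTIONS):
                # Regret: how much better action a would have done than the mix played.
                regrets[player][a] += values[a] - expected
                strategy_sums[player][a] += strategies[player][a]
    return [[s / iterations for s in sums] for sums in strategy_sums]


if __name__ == "__main__":
    for player, average in enumerate(self_play()):
        print(player, [round(p, 3) for p in average])
```

Neither player is shown any human play, yet the average strategies drift toward the balanced one-third mix that cannot be exploited. Poker is vastly larger, but the flavor is similar: a strategy emerges from the rules and repeated play alone.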
During that time, Brown said, Libratus had to come up with a new strategy for finding hidden information and hiding its own information to its advantage. The AI isn’t exactly bluffing, though. “Libratus is always going to do what it thinks is going to make the most money possible,” he said. Thus, some of its actions could look like bluffing. “You could see the bot making really innovative moves like betting huge amounts of money on small pots,” he said. “Humans are afraid to do these kinds of actions; the bot is fearless.”
Going into the tournament, however, Libratus’s victory was far from assured. Brown said on Tuesday that he hoped the pros wouldn’t try as hard as they did. “They were very good at finding any weakness in the bot,” he said. Most of the pros said they thought they would do better. Chou called it one of the most challenging experiences in his life.
“Halfway through the challenge, we really thought we were going to win and we were beaten very soundly,” said McAulay, echoing other players who called the almost daily losses “demoralizing.”
What may be even more remarkable about Libratus’s historic win is that the computer behind it wasn’t even running at full capacity. With over 19 million core hours of computing and 2,600 terabytes of generated data, the tournament only used 46 percent of Bridges’s computational capacity, said Nick Nystrom, PSC’s senior director of research, on Tuesday. While Libratus was besting four poker pros, the rest of the supercomputer was working on other problems like finding new cures for cancer and investigating next-gen nuclear power.
Libratus only competed heads-up with the poker champs, so it’s not clear if the AI would come out on top at a full table of nine people. However, Brown says it wouldn’t be that much of a stretch if there were more players and that he focused on two-player matchups because it was easier to measure performance.
Even though Libratus is a poker-playing AI, it’s not necessarily a “Poker AI.” “The research wasn’t focused on poker — the research used poker as a test bed for research,” Brown told me.
“The algorithms are actually game-independent,” said Professor Sandholm, adding that the AI’s ability to take any imperfect situation and output a strategy has implications for everything from negotiation and bargaining to military uses and some forms of finance.
Brown explained that while the team has yet to look at any specific areas where they can apply the AI capabilities, he has no doubt it can be used across a wide set of applications. “Any of them can be modeled as games of imperfect information and the algorithms can be applied pretty much out of the box.”