Google’s DeepMind lab has built an artificially intelligent program that taught itself to become one of the world’s most dominant Go players. Google says the program, AlphaGo Zero, endowed itself with “superhuman abilities,” learning strategies previously unknown to humans.
AlphaGo Zero started out with no clue how to win the game Go — a 2,500-year-old Chinese game in which two players place black and white stones to capture more territory than their opponent.
It took AlphaGo Zero just three days to beat an earlier AI program (AlphaGo Lee), which had resoundingly beaten world champion Lee Sedol in 2016. After 21 days of play, AlphaGo Zero defeated AlphaGo Master, the version known for beating 60 top professionals online and another world champion in 2017. By day 40, AlphaGo Zero had defeated every previous version of AlphaGo.
And it achieved all these victories without any human-provided strategies or game-playing knowledge. Google published its results this week in the journal Nature.
“The most important idea in AlphaGo Zero is that it learns completely tabula rasa — that means it starts from a blank slate and figures out for itself, only from self-play, without any human knowledge, any human data, without any human examples or features or intervention from humans,” said lead AlphaGo researcher David Silver in a Nature interview.
Silver and his team first watched their machine rediscover human strategies, then saw AlphaGo Zero move beyond them to superhuman play.
“So what we started to see is that AlphaGo Zero not only discovered the common patterns and openings that humans tend to play… it also learned them, discovered them, and ultimately discarded them in preference for its own variants which humans don’t even know about or play at the moment,” explained Silver.
Image: In May 2017, professional Chinese Go player Ke Jie plays against Google’s artificial intelligence program AlphaGo. (VCG via Getty Images)
Google’s researchers used a “reinforcement learning” scheme to let AlphaGo Zero teach itself. Using a deep neural network — a layered statistical model loosely inspired by how the brain relates ideas and predicts outcomes — AlphaGo Zero predicted its own moves and each game’s likely winner, then learned from its errors.
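The real system pairs a deep neural network with tree search, but the core loop — start from a blank slate, play against yourself, and update your predictions toward the actual outcome — can be illustrated on a toy scale. The sketch below (not DeepMind’s method; a minimal, hypothetical example) applies tabular self-play reinforcement learning to single-pile Nim, where players alternately remove 1–3 stones and whoever takes the last stone wins. The agent starts with all values at zero and, through self-play alone, discovers the classic winning strategy of leaving the opponent a multiple of four.

```python
import random

random.seed(0)
ACTIONS = (1, 2, 3)      # stones a player may remove per turn
PILE = 10                # starting pile size
ALPHA, EPS = 0.5, 0.3    # learning rate, exploration rate

# Q[s][a]: estimated value of removing `a` stones from a pile of `s`,
# from the perspective of the player about to move. All zeros: tabula rasa.
Q = {s: {a: 0.0 for a in ACTIONS if a <= s} for s in range(1, PILE + 1)}

def greedy(s):
    """Best known move from pile size s."""
    return max(Q[s], key=Q[s].get)

for episode in range(20000):
    s = PILE
    while s > 0:
        # Explore occasionally; otherwise play the current best move.
        a = random.choice(list(Q[s])) if random.random() < EPS else greedy(s)
        s2 = s - a
        if s2 == 0:
            target = 1.0                  # taking the last stone wins
        else:
            target = -max(Q[s2].values()) # zero-sum: opponent moves next
        Q[s][a] += ALPHA * (target - Q[s][a])
        s = s2

# Optimal play leaves the opponent a multiple of 4: from 10, remove 2.
print(greedy(10))
```

The negated-maximum update is what makes self-play work here: a position is exactly as good for you as it is bad for the player who moves after you, so one value table serves both sides. AlphaGo Zero applies the same self-play principle, with a neural network standing in for the table.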
Over the course of some 30 million games, AlphaGo Zero made an immense number of moves. This required around $25 million in computer hardware, according to Google DeepMind chief executive Demis Hassabis.
Now that AlphaGo Zero has dominated its world competition, Google thinks this unprecedented self-learning ability can be applied to other problems, without having to spend time and resources teaching the machine.
“If you can achieve tabula rasa learning, you really have an agent that can be transplanted from the game of Go to any other domain. You untie yourself from the specifics of the domain you’re in and you come up with an algorithm that is so general that it can be applied anywhere,” said Silver.
If the AlphaGo experiments are any clue, this sort of AI innovation could lead to “superhuman” thought being applied to other realms of existence — perhaps medicine or self-driving cars.
But according to DeepMind’s Silver, the aim is not to outpace humans; it’s for these intelligent machines to contribute to the sum of human knowledge.
“For us, the idea of AlphaGo is not to go out and defeat humans, but… for a program to be able to learn for itself what knowledge is,” he said.
from Mashable! http://on.mash.to/2gSc6KZ