8 years later: A world Go champion's reflections on AlphaGo
blog.google> Go is a deeply complex strategic game — famously far more complicated than chess, with 1,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000 possible board configurations.
The correct number of legal Go positions is actually more than twice that; to be exact [1]:
208168199381979984699478633344862770286522453884530548425639456820927419612738015378525648451698519643907259916015628128546089888314427129715319317557736620397247064840935
That is indeed far larger than the ~4.8 x 10^44 legal chess positions [2], a figure that itself falls between the number of legal 9x9 and 10x10 Go positions.
All these digits only make it more obscure. In orders of magnitude, it's ~10^44 for chess versus ~10^170 for Go; thus, Go is 10^126 times more complex than chess.
For reference, the estimated number of individual atoms in the universe is thought to be between a mere 10^80 and 10^83.
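If you want to sanity-check those orders of magnitude yourself, here's a minimal Python sketch (the long digit string is the exact count of legal Go positions quoted above; the chess figure is the ~4.8 x 10^44 estimate from [2]):

    # Sanity check of the orders of magnitude (plain Python 3).
    legal_go = "208168199381979984699478633344862770286522453884530548425639456820927419612738015378525648451698519643907259916015628128546089888314427129715319317557736620397247064840935"
    go_exp = len(legal_go) - 1      # 170, i.e. ~2.08 x 10^170 legal Go positions
    chess_exp = 44                  # ~4.8 x 10^44 legal chess positions [2]
    print(go_exp, chess_exp, go_exp - chess_exp)   # 170 44 126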
> All these digits only make it more obscure.
I think that using these numbers as stand-ins for difficulty is itself a form of obfuscation.
The truth is that, despite the massive number of potential board states, chess and Go are some of the easier games to solve, thanks to their nature (perfect information, zero randomness, alternating turns where each player plays exactly one move). And trying to use board states as a proxy for complexity, and complexity as a proxy for difficulty, doesn't generalize to other categories of games. Compared to Go, what's the complexity of Sid Meier's Civilization? If I devise a game of Candyland with 10^180 squares, is it harder to find an optimal strategy for than Go just because it has more board states?
The reason we're still using board states as a proxy for difficulty is that historically our metric of "this is difficult for a computer to play" was based on the size of the decision tree, and thus the feasibility of locally searching it to a given depth. In the age of machine learning, surely we can come up with a more interesting metric?
We have more theory for deterministic games. That doesn't mean that it is harder to solve them to a human level.
Computers became better than humans at both Go and poker in 2016. The difference is that https://www.deepstack.ai/ was achievable by academics with normal research grants, while the training for the final version of AlphaGo is estimated at about $35 million. And who knows how many other versions were created?
Yes, actually solving either Go or Civilization is impossible. But I would be shocked if playing Civilization at a human level were too hard to reach with current machine learning techniques.
Number of possible game states is a poor measure of complexity. How many game states does soccer or basketball have, when you consider flight of ball and movement of players? Does that tell us anything about whether basketball is more complex than Go?
Every measure of complexity is limited in some way. So they are all poor in various ways. That doesn't make them not worthwhile.
Total game states is one measure. What does it take to solve the game?
You can also look at the branching factor: how many moves there are to choose from on average. For chess, that average is around 35 (there are only 20 moves in the opening position); for Go, it's more like 250.
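As a quick illustration of the chess side, here is a minimal sketch; it assumes the third-party python-chess package ("pip install chess"):

    import chess  # third-party python-chess library

    board = chess.Board()
    print(len(list(board.legal_moves)))  # 20 legal moves in the initial position

    board.push_san("e4")
    board.push_san("e5")
    print(len(list(board.legal_moves)))  # more options open up as the game develops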
You can also look at how long it takes from a move to seeing its negative consequences. A bad move in chess is frequently visible in very few moves. You lose a piece. You lose an exchange. Often these sequences are virtually forced. By contrast, the consequences of a bad move in Go are usually not visible for 50 moves or more. And there is nothing forced about the sequence that gets there.
You can also look at how likely good players are to play similar games. Many chess games have been played over and over again. People sometimes play half the game out of a memorized opening sequence. By contrast, it is plausible that no Go game has ever been played twice on a 9x9 board. It is very unlikely that any Go game has been played twice on a 13x13 or 19x19 board.
You can also look at how big the skill gaps between humans get. In my experience, a 1-stone difference in Go is roughly a similar skill gap to 200 points of Elo in chess. A rank beginner who barely knows how the pieces move may have a 400 rating; no human has ever reached a 2900 rating. That's 12.5 levels. By contrast, Go has 30 levels of amateur ranks (kyu), another 9 for serious players (dan), and then the skill range among professionals spans about another 3. That's 42 levels of fairly recognizable skill differences between humans, which speaks to how much more there is to learn about Go than chess. (Even more so when you realize how much of advancing in chess is a matter of making fewer mistakes, whereas advancing in Go is much more about integrating better principles.)
No matter how you look at it, Go is much more complex than chess.
One good measure is how many resources a computer program needs in order to play optimally. This is hard to measure since we don't have optimal programs, but maybe "how many resources are needed to play better than the best human" is a sensible measure as well. Go wins on this one, but only marginally.
> For reference, the estimated number of individual atoms in the universe is thought to be between a mere 10^80 and 10^83.
Yes, but what are the estimated number of states of all these atoms?
is it even countable?
Or one could say that Go is about as complex as 4 simultaneous games of chess, since its number of positions is roughly the 4th power of chess's.
Complexity of a game has nothing to do with the number of legal positions. It's very easy to design a game with an arbitrary number of positions that is nonetheless very simple. While Go might be more complex than chess by some more reasonable measure, this argument has been used to argue nonsense in scientific papers in the past (that some poker games are more complicated than chess because they have more possible states).
I have a further nitpick regarding terminology.
Wording like "the game is more complex" overall seems incorrect. A game is not complex by itself (for example, Go's rules are extremely simple); all the difficulty and challenge depends on the skill of your opponent. The game only allows the opponent to demonstrate that skill.
Beyond rule complexity, there are at least 5 measures of game complexity in Combinatorial game theory [1]:
State-space complexity, game tree size, decision complexity, game-tree complexity, and computational complexity.
[1] https://en.wikipedia.org/wiki/Game_complexity
Those are all meaningful measures of complexity, but it's worth noting that all of them are a function of the number of legal positions (among other things as well).
That would be quite a strange function. For example, Tic-Tac-Toe has 26830 possible games on 5478 possible positions, while 2x2 Go has 386356909593 possible games on only 57 possible positions.
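For anyone curious where the Tic-Tac-Toe numbers come from, a brute-force enumeration is tiny. A sketch: without merging rotations/reflections it counts 255,168 complete games and 5,478 reachable positions, so the 26,830 figure above is presumably the symmetry-reduced game count.

    # Enumerate every Tic-Tac-Toe game, stopping when someone wins or the board fills.
    LINES = [(0,1,2),(3,4,5),(6,7,8),(0,3,6),(1,4,7),(2,5,8),(0,4,8),(2,4,6)]

    def winner(b):
        return any(b[i] != "." and b[i] == b[j] == b[k] for i, j, k in LINES)

    positions = set()   # every distinct board state seen during play

    def count_games(board="." * 9, player="X"):
        positions.add(board)
        if winner(board) or "." not in board:
            return 1                      # one finished game
        total = 0
        for i, cell in enumerate(board):
            if cell == ".":
                nxt = board[:i] + player + board[i + 1:]
                total += count_games(nxt, "O" if player == "X" else "X")
        return total

    print(count_games(), len(positions))  # 255168 5478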
The major difference of course being that Go allows stones to be captured.
I don't see what's strange about that, different functions grow at different rates. That's like saying it's strange that exp(x) is a billion for x = 21 while sqrt(x) is merely ~4 when x = 21.
Go's complexity grows much faster than tic tac toe's complexity as a function of legal positions, but the complexity is still a function of the number of legal positions (among other things, as I pointed out).
Are you sure that the complexity of a game has absolutely nothing to do with the number of legal positions?
I mean, I am open to hearing the justification for this, but I was fairly certain that all measures of game complexity are a function of the number of legal positions. Now certainly there are other factors, namely the cost of computing the transition from one legal position to another: a simple game might have a very low-cost transition function while a complex game has a very complex one. But I can't conceive of a game where the number of legal positions bears no weight on the game's complexity.
Take a game where you get to pick a single number between one and a billion. If you pick 10 you win. This has a billion states, but it's trivial. I can increase the bound above a billion, it doesn't matter.
State count gives an upper bound, though, to how complex a game can be, for sure.
Those position counts don't account for blunders or bad tactics, which the players will take advantage of in order to simplify and win. In Go, any move that lets your opponent cut you off is very likely a blunder, so counting every move other than the obvious one you have to play in order not to lose is pretty useless. You can also call it a bad move if you are not under an immediate threat and you are not invading (this is a guess; I am not a Go player), so counting the rest of the moves developing in your own territory is also useless.
Yes, for example no-limit Rock Paper Scissors, where you bet an arbitrary amount and your opponent might call or fold, is a game that:
1) has infinitely many legal states/positions
2) has imperfect information (another property often used to argue that a game is more complex)
And is:
3) dead simple to play optimally
It's also easy to design a board game with arbitrary number of legal positions that is dead simple to play optimally.
Go is played on a bigger board though, and has this kind of recursive nature where a subset of a Go game is also a Go game, while chess is more ad hoc.
For extra fun, use the "Listen to article" feature on that paragraph...
To my knowledge AlphaGo models never became meaningfully available to the public, but 8 years later the KataGo project has open source, superhuman Go AI models freely available and under ongoing development [1]. The open source projects that developed in the wake of AlphaGo and AlphaZero are a huge success story in my mind.
I haven't played Go in a while, but I'm kind of excited to try going back to use the KataGo-based analysis/training tools that exist now.
Google's documentary on AlphaGo https://www.youtube.com/watch?v=WXuK6gekU1Y
Truly a must watch! (just look at the video comments to be convinced)
+1
What stuck with me is Lee Sedol's strong emotional reaction, leading him to leave professional Go playing.
It's understandable he didn't expect AlphaGo to be that strong. Or that (for him) losing to a machine took the 'soul' out of the game.
But come on... I've been cornered by Pac-Man ghosts many times. That doesn't make Pac-Man less fun to play.
Nor does losing to the crude 'AI' steering those ghosts. Instead, you play, aim for a high score, see how long you can survive, how many levels you can complete, or how many fruits & ghosts you can eat in a game.
And (if you care) compare how those 'metrics' stack up against other players.
If a machine with superhuman Go-playing ability isn't fun or challenging, then stick to human opponents.
Of course these are his views and choices, and I respect that. But other than providing an extremely challenging opponent, I don't see how a human-beating machine would take the fun out of a game. Rather the opposite: new tactics, new insights, a raised upper bound for a Go player's strength (human or otherwise), etc.
Some people connect their ego tightly to their accomplishments. Others are emotionally fragile to what they perceive as negative experiences. Some achieve great things by what they perceive as great sacrifices, and have a lot of difficulty maintaining what they have achieved. Some at or near the top of a field perceive their accomplishment primarily as being "the best", and are consumed with angst when they are no longer the best. The public explanation they give may not really convey how they differ from an outsider trying to understand them.
Open up how you frame personality types and life experiences, and you can think of possibilities beyond "I don't see how".
There's a plausible explanation even without ego/emotions:
In the world of go, there was an obsession with finding "the perfect move". This was a significant motivation for the players.
That is now completely gone: if you want to find the perfect move, ask a computer.
He wanted to advance our knowledge of the game, to find better moves. Now that he no longer can, why play?
I know there are Go channels in Korea for watching professional matches... are there any for watching AIs face off against each other?
Virtually every public Go server at the moment, unfortunately.
Michael Redmond, to my knowledge the highest ranked English-speaking Go player, does youtube analyses of Go matches, including Pro vs AI games, which are very insightful.
I can't believe it's been 8 years.
Great PR for Google from Lee! It totally isn't mostly for advancing Google's commercial interests, the bottom line being:
"I believe that humans can partner with AI and make great progress. As long as we can set clear principles and standards for it, I am quite optimistic about the future of AI technology in our daily lives."
I hope he got paid well.
> I thought it would be an easy victory
> I ... ended up only winning one out of our five games
It's interesting how an expert in a field can be unaware of how AI is taking over, and how a few years later no human can compete anymore. I think we are in a similar situation in multiple professions today, for example with self-driving.
Musk recently said that other car manufacturers are not much interested in talks about licensing FSD because they don't think it can work.
In ten years, probably no human can compete with AI drivers anymore.
> In ten years, probably no human can compete with AI drivers anymore.
That's what they said 10 years ago. Sooner or later people will say it and be right, but the last few percent of any problem is a lot harder than people give it credit for. It may not be that hard to stay in a lane or write a little code, and that may look like it's doing most of the job, but those common tasks are just the easy part.
10 years ago, a lot of FSD was still manually written code. Manually written code gets harder and harder to improve, the larger the codebase.
Now it is all NNs, and therefore it will scale with more data and more compute, which are increasing exponentially. So far it seems like they are not hitting diminishing returns.
Yeah, and plain Q-learning is able to iteratively improve a policy in any environment. Every single loop leads to improvement, and yet it hasn't really solved much of anything (just some toy problems).
My point being, we don't know where the asymptote lies. Computers have had self-improving algorithms since the 60s, and people have been making the same bold claim you're making, that because an iterative process for improvement has been discovered we must be close to superhuman AI, since the 60s too.
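For anyone unfamiliar, "plain Q-learning" is just the tabular update below; a minimal sketch on a made-up 5-state corridor (the environment and constants are purely illustrative):

    import random

    N, ACTIONS = 5, (-1, +1)          # states 0..4; reward only for reaching state 4
    ALPHA, GAMMA, EPS = 0.1, 0.9, 0.1
    Q = {(s, a): 0.0 for s in range(N) for a in ACTIONS}

    def step(s, a):
        nxt = max(0, min(N - 1, s + a))
        return nxt, (1.0 if nxt == N - 1 else 0.0), nxt == N - 1

    def greedy(s):
        best = max(Q[(s, a)] for a in ACTIONS)
        return random.choice([a for a in ACTIONS if Q[(s, a)] == best])

    for _ in range(300):                           # each episode nudges Q a bit further
        s, done = 0, False
        while not done:
            a = random.choice(ACTIONS) if random.random() < EPS else greedy(s)
            s2, r, done = step(s, a)
            target = r if done else r + GAMMA * max(Q[(s2, b)] for b in ACTIONS)
            Q[(s, a)] += ALPHA * (target - Q[(s, a)])   # the Q-learning update
            s = s2

    print([greedy(s) for s in range(N - 1)])       # typically [1, 1, 1, 1]: walk right

The guarantee is only that each update moves Q toward a locally consistent estimate; as the parent says, that by itself never got anywhere near AlphaGo-level play.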
“Driving” is solved. Driving with humans on the road - doing unpredictable human things - is far off still.
My guess is industrial and home robotics will solve a lot of the “doing things around humans” problems in the next ten years.
Why the hell people decided to automate giant death machines before perfecting small things never made sense to me.
Fwiw, I work in home robotics but have no experience in self-driving. My halfway-naive belief is that self-driving is easier than getting useful home robots; in fact, I feel it's not even a close comparison. Some reasons:
- The home is a very unstructured environment, whereas roads have at least _some structure_, and perhaps ~70% of the most useful roads even have clear lane markings and other signs.
- People already know that roads are dangerous, and there’s an expectation that babies won’t suddenly crawl in front of cars. This doesn’t exist in the home
- People are more comfortable being recorded on roads and highways than in their own homes, so you can get training data more easily for self driving.
- to do something useful in the home, imo you need to solve navigation _and_ complicated manipulation problems. For self driving, you only need to solve the navigation problem.
- (this is speculation on my part) Customers will happily pay 10k-20k extra for a self-driving car, and there are industries in which even more cost makes sense. Customers are less likely to pay that for a robot that does your chores
Would be very interested to hear the perspective of someone that works on self-driving
>to do something useful in the home, imo you need to solve navigation _and_ complicated manipulation problems. For self driving, you only need to solve the navigation problem.
Right. It can be challenging to figure out how fast, what lane, should I brake, etc. in many cities. But there are really only a few things the car can control. And its objectives are pretty simple: Obey the law, don't hit anything (and avoid being hit), and get to point B.
By contrast, think of all the different types of manipulation you need to clean up around the house, and the 100 judgements you make to decide what needs to be cleaned, which will vary by person.
>Customers will happily pay 10k-20k extra for a self-driving car, and there are industries in which even more cost makes sense. Customers are less likely to pay that for a robot that does your chores
It would be at least an upper-middle-class purchase at that level, but it depends on how generally useful it was. People pay thousands of dollars a year for a housekeeper to come by.
Yes, this is my point: the home is a hard place to operate in, but with less potential for lethal outcomes. If we can solve home robotics, I think cars would be easier.
Also, a robot that replaces a housekeeper would have a huge market. I’d pay a handsome sum to have perfectly cleaned kitchen and bathrooms every day when I wake up.
For clarity, I’ll call out the areas where I think we disagree:
> “the home … [has] less potential for lethal outcomes.”
I don’t think this is true. Roads already have systems in place to make them safer, and people are aware of the dangers. This isn’t the case at home, and useful home robots certainly have the ability to cause serious injuries/deaths
> “If we can solve home robotics I think cars would be easier”
I also think cars are easier. However, I think this is _why_ we’ve made more progress towards solving self driving.
> “I’d pay a handsome sum to have perfectly cleaned kitchen and bathrooms every day when I wake up.”
When you say "perfectly cleaned rooms", I think "better than you can get with a 90th-percentile hired cleaner". I suspect useful home robots might be 10 years out, but I'm doubtful we'll get "perfectly cleaned rooms" from a commercial home robot, by the above criteria, within even the next 50 years. Maybe controversial, but I think AGI might be easier, lol.
My main thing with road safety is the presence of giant dangerous SUV which one has no control over. At least I can control what is or isn’t in my home, on the roads some asshole driving their Cybertruck at 40 mph over the limit will annihilate my hatchback. Point taken regardless, but I still worry more about cars than anything in my home.
Otherwise I have a small child in the house, so I’d be grateful for 1 percentile capability at the moment. ;-)
Thanks for your thoughts though, I think we can agree the future seems interesting at the least.
> Customers are less likely to pay that for a robot that does your chores
They will if you can get them 2% financing like I can get on a new Honda HR-V.
> “Driving” is solved. Driving with humans on the road - doing unpredictable human things - is far off still.
Plus there's serious questions about liability with self driving cars which are still unresolved in most of the world - if the goal is to have vehicles operate themselves with no human supervision, who goes to jail when they kill someone? Despite all of the progress that's been made with AI it's mostly been in low-stakes problems where failure isn't a big deal, so we don't have a consensus on what we're supposed to do when a neural network negligently obliterates a person because some logistics company wanted to save a few bucks on driver salaries.
The answer almost certainly has to be the manufacturer. I'm sure not responsible if my properly maintained and used self-driving car kills someone. That said, it's a novel area that doesn't have a clear analog to other products today.
There's also the question of incident response, if a human driver "malfunctions" you take them out of service and the rest of the world keeps going, but if a self-driving model malfunctions there are potentially millions of vehicles running the same software ready to make exactly the same mistake until the issue is isolated and fixed. Should we ground the entire fleet of vehicles running that software until the issue is resolved and software re-certified, if the software is demonstrably dangerous? How much would that cost?
"Driving" is not solved unless you mean perfectly paved streets in perfect weather, empty of traffic and pedestrians. A competent solution like Waymo can handle significantly more complex cases at real-world levels of complexity, but it is still unclear how comprehensive and robust that really is across the massive complexity of reality, even without other cars on the road. There is simply not enough data, and no independent audits yet.
It is prudent to remain cautiously optimistic that the evidence will bear out in time, but not assert unsupported claims.
Precisely my point. I'm talking about the DARPA Grand Challenge era of "look, this car drives itself" being the "solved" part. If you cleared all the roads and left the street signs and stoplights, I'm sure most self-driving cars would be fine.
People got way overconfident once the grand challenges were accomplished.
I think your guess that home robotics will be solving problems before self-driving cars git gud will be disproven (industrial robotics have been delivering value for five decades at least).
Home robotics has to solve two problems: the robot and operating the robot ~perfectly. Self-driving cars already have cars, which are waldos, if you squint. What sort of sensors should be added is up for debate but the actuation mechanism is a solved problem, and a very simple one, cars have three linear inputs and two binary ones for the turn signals. Technically a few more but none of them are any less trivial.
There's less risk of a fatality when Rosie Robot knocks over the vase you inherited from your grandmother, but people are no more tolerant of that kind of failure in home robots than they are in cars.
And cleaning a house isn't one task. It's a whole slew of different tasks which, given some basic instructions, my housekeeper can handle easily without supervision. And there's quite a bit of common sense required.
Driving the giant death machines costs us billions of manhours every day. Hundreds of billions of manhours per year.
Which small thing puts a similar burden on mankind?
Phones.
I think Waymo does better than a vast percentage of human drivers. It’s here already.
But the average hour is not driven by the average driver; better drivers drive more hours, so it has to be better than the average driver in order to result in fewer incidents.
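A toy version of that argument, with entirely made-up numbers just to show the exposure-weighting effect:

    # Hypothetical groups: (fraction of drivers, fraction of hours driven, incidents per 100k hours)
    groups = [(0.5, 0.8, 1.0),    # frequent, better drivers
              (0.5, 0.2, 10.0)]   # infrequent, worse drivers

    avg_driver = sum(d * rate for d, _, rate in groups)   # 5.5 per 100k hours
    avg_hour   = sum(h * rate for _, h, rate in groups)   # 2.8 per 100k hours
    print(avg_driver, avg_hour)

An AV that merely matches the "average driver" (5.5 here) would still cause more incidents per hour than the traffic it replaces (2.8 here), because the better drivers supply most of the hours.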
There’s no reason better drivers drive more hours. You can be a shit semi driver. That being said, it’s an irrelevant comment. If it’s at average now this is the worst it can get and it’s already way above average.
I don’t disagree that self driving is good and will eventually get there. Just pointing out a flaw in your reasoning.
Yes, such a person can exist, but a hypothetical counterexample doesn't disprove the general statement. I think it's safe to say that, in general, the more time someone spends driving the better a driver they are.
> It's interesting, how an expert in a field can be unaware of how AI is taking over.
I think the interesting thing is how an expert in a field is wholly unprepared for predicting how the future will develop.
You mention what Musk has said about FSD and how it will completely take over in just ten years, but I feel compelled to point out that Musk has said that it's just right around the corner with only small challenges left, for many years.
I wouldn't place any faith in anything Musk says.
Musk wasn't wrong; he just wasn't the person to deliver it. Waymo, AFAICT, works astonishingly well in San Francisco. I think you can argue that it might not work in the snow, but that's pretty much it.
Musk was wrong. He said specifically and unequivocally that Tesla would be delivering fully autonomous vehicles to customers within a year, every year for the last 8 years.
Just to get ahead of anybody claiming it was not a firm promise, in 2019: "I think we will be feature-complete full self-driving this year, meaning the car will be able to find you in a parking lot, pick you up, take you all the way to your destination without an intervention — this year. I would say that I am certain of that. That is not a question mark." [1]
See that part where he says: “I would say that I am certain of that. That is not a question mark.” That is called a firm promise.
[1] https://www.businessinsider.com/elon-musk-doubles-down-on-cl...
If Musk was right we would've been there almost a decade ago.
And we're still not really there yet. They work great during some conditions and in certain areas, but they're still nowhere close to making human drivers obsolete.
At the time, I think everybody was unaware of this. Everybody followed the development of machine chess, but it was widely assumed that machine go was an entirely different category of difficulty. Chess engines gradually encroached on the very top grandmasters. AlphaGo came out of nowhere.
> It's interesting, how an expert in a field can be unaware of how AI is taking over. And a few years later, no human can compete anymore.
Easy to see that in hindsight, but when the game was actually played it was earlier in the development of AI and less apparent how good it had become.
At the time, the only public results were demonstration games against a much weaker professional. The actual strength of the machine was only known privately within DeepMind.
You do realize this was 8 years ago, and no Go engine came even close to what AlphaGo was able to do, right? AFAIK there weren't even any competitive engines, period. It basically came out of nowhere.
Not entirely right. Remi Coulom's Monte Carlo Tree Search, in 2006, was the first really big discovery. It didn't make engines good enough to beat the best humans, but it steadily made them good enough to beat 99% of Go-playing humans, playing at up to 6-7 dan level. It was still part of AlphaGo, too (though as I recall AlphaGo Zero did away with it).
I think AlphaZero does use MCTS?
The thing AlphaZero did away with, AFAIK, is supervision with expert games. Instead it just knows the rules and tries to win.
That's what I meant.
That's how it usually goes with technological progress.
In any field.
Progress is minimal for a few years and then jumps up very suddenly.
So to predict what's coming, you can't just extrapolate the progress of recent years. You have to account for it being exponential with a very uneven distribution of sudden jumps.
I'm not sure one can, looking back from today, really understand how huge a leap AI made at this moment.
Even going back to the closest analogue, chess: there were good chess engines for a long time prior to Kasparov losing to Deep Blue in '97. Even before Kasparov lost, chess engines were pretty good; just look at the 1996 match, which Kasparov won. A grandmaster would still need to put some thought into how he played.
In Go, however, even the best engines couldn't hold a candle to a professional player, let alone someone who was the equivalent of a chess grandmaster. Hell, even as a lowly amateur player I was able to trounce some of the most powerful AIs at the time. Looking at some of the pro vs. AI games from the early 2010s, it's almost painful how bad the engines were.
It's hard to communicate just how huge of a leap this was, and just how shocking it was to the whole Go community. It would be like a child being unable to speak one day and reciting Shakespeare the very next.
AlphaGo took many AI researchers by surprise. An even bigger surprise came next year, with AlphaZero:
"AlphaZero was a reinforcement learning system that was able to master three different perfect information games - chess, shogi (Japanese chess), and Go - at superhuman levels by just learning from self-play, without using any human expert games or domain knowledge crafted by programmers.
Its predecessor AlphaGo, which defeated the world champion Go player in 2016, was revolutionary but relied on human expert games and domain-specific rules coded by the DeepMind researchers.
AlphaZero started from random play and used a general-purpose reinforcement learning algorithm to iteratively improve its gameplay through self-play, ending up with superior performance compared to the best human players and previous game-specific AI systems.
Many experts were stunned that a general algorithm could rediscover from scratch the millennia-old principles and strategies for these highly complex games, often discovering novel and counterintuitive moves along the way."
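For those wondering how "just learning from self-play" actually picks moves: the published AlphaGo Zero / AlphaZero papers describe a PUCT rule inside the tree search that trades off the current value estimate against the network's policy prior. A minimal sketch of that selection score (c_puct is a tunable constant):

    import math

    def puct_score(q, prior, parent_visits, child_visits, c_puct=1.5):
        # q:     current value estimate Q(s,a) for the move (exploitation)
        # prior: policy-network probability P(s,a) for the move
        # The second term favors moves the network likes but which have few visits yet.
        return q + c_puct * prior * math.sqrt(parent_visits) / (1 + child_visits)

During search the move with the highest score is followed; the resulting visit counts then serve as the improved policy the network is trained to imitate, which is the self-play improvement loop described above.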
> In any field.
That would be pretty strange. For a trivial counterexample, you can look up the history of integrated circuits from invention to today.
FSD will never work because it concentrates defendants into a single juicy target.
When your neighbor Bob (who is still paying his mortgage, whose wife is battling cancer, and who occasionally babysits your kids) runs over your cat, you don't sue him. But you would sue Tesla.
Mark my words, in 10 years ex-programmers will be throwing shelter cats under FSD cars just to earn a living.
It will have 100 cameras capturing surround footage of ex-coders throwing cats at the vehicle. Case dismissed.