Player of Games

arxiv.org

364 points by vatueil 4 years ago · 241 comments

captn3m0 4 years ago

If you are interested in this, I maintain a list of boardgame-solving related research at https://github.com/captn3m0/boardgame-research, with sections for specific games.

This looks really interesting. It would be a good project to hook this up to a general card-playing framework, making it easy to evaluate on a variety of imperfect-information games based on playing cards.

  • fho 4 years ago

    I tried my hand once or twice at (re-)implementing board games [0], so that I could run some common "AI" algorithms on the game trees.

    What tripped me up every time is that most board games have a lot of "if this happens, there is this specific rule that applies". Even relatively simple games (like Homeworlds) are pretty hard to nail down perfectly due to all the special cases.

    Do you, or somebody else, have any recommendations on how to handle this?

    [0] Dominion, Homeworlds and the battle part of Eclipse iirc.

    • anonymoushn 4 years ago

      Dominion and Homeworlds are pretty complicated! Maybe you can start with a simpler game like Splendor.

      In my 2-player Splendor rules engine, the following actions are possible:

      1. Purchase a holding. (90 possible actions, one for each holding)

      2. If you do not have 3 reserved cards, reserve a card and take a gold chip if possible. (93 possible actions, one for each holding and one for each deck of facedown cards)

      3. If there are 4 chips of the same color in a pile, take 2 chips of that color. (5 possible actions)

      4. Take 3 chips of different colors, or 2 chips of different colors if only 2 are available, or 1 chip if only 1 is available. (25 possible actions)

      5. If after any action you have at least 11 chips, return 1 chip. (6 possible actions which are never legal at the same time as any other actions)

      This still doesn't correctly implement the rules, though. In the actual game, you'd be allowed to spend gold chips even when you don't need to, so purchasing a holding would involve an extra decision after you pick which holding to purchase: which chips you'd like to keep.
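      For concreteness, here's a rough Python sketch of how a flat action space like that could be enumerated (the encoding and action names are my own illustration, not from an actual engine); the counts match the list above:

```python
from itertools import combinations

COLORS = ["white", "blue", "green", "red", "black"]
NUM_HOLDINGS = 90  # every distinct card in the game gets its own action id
NUM_DECKS = 3

actions = []
# 1. Purchase a holding (90 actions).
actions += [("purchase", i) for i in range(NUM_HOLDINGS)]
# 2. Reserve a face-up holding or the top of a facedown deck (90 + 3 = 93).
actions += [("reserve_card", i) for i in range(NUM_HOLDINGS)]
actions += [("reserve_deck", d) for d in range(NUM_DECKS)]
# 3. Take 2 chips of one color (5 actions).
actions += [("take_two", c) for c in COLORS]
# 4. Take 3 different colors, or 2, or 1: C(5,3) + C(5,2) + C(5,1) = 25.
for k in (3, 2, 1):
    actions += [("take_diff", combo) for combo in combinations(COLORS, k)]
# 5. Return 1 chip when over the chip limit (5 colors + gold = 6).
actions += [("return_chip", c) for c in COLORS + ["gold"]]

print(len(actions))  # 90 + 93 + 5 + 25 + 6 = 219
```

      A fixed, enumerable action space like this is exactly what makes a game friendly to tree-search and RL agents: legality checks become a mask over the 219 ids.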

      • fho 4 years ago

        I actually played Splendor for the first (three) time(s) some time ago and honestly didn't really like it. It's a very simple game, true. I feel like there are not many decision points for me as a player and therefore there is not much strategy involved. But maybe that is just my view after very few games.

        (At the same time that probably makes it a good choice for a game implementation)

        Thing is that for all my examples above I had a "good" reason to implement that specific game:

        1. Dominion (shortly after it came out): to evaluate strategies to best my friends (obviously).

        2. Eclipse: has a nice rock-paper-scissors type of ship combat, where you can counter every enemy build (if you have enough time and resources). Calculating the odds of winning would be interesting.

        3. Homeworlds: seems to be a very fascinating game. But without any players to compete with [0] ... AI to the rescue ;-)

        [0] I am aware of SDG where I could play online, but that site is in decay mode. Getting an account involved mailing the maintainer and those times I tried to start a game no players showed up.

        • henshao 4 years ago

          I think Splendor gets more interesting if your opponents are also trying to be strategic. You can see what color chips they are picking up, which lets you know what they are aiming for, which influences what card you want to aim for or reserve. Mid-game, you can see what colors other people are missing and try to corner colors to give you room to breathe and pick up cards. You can also see the set of colors people are holding to see which of the 4 final bonus point cards are being fought over.

          I like the game for what it is. I'd say, surprisingly strategic.

          • piyh 4 years ago

            It's like if you turned tuning Magic mana bases into a standalone game.

    • captn3m0 4 years ago

      +1 to boardgame.io. It provides very good abstractions for turns, phases, players, and partial information. I’ve implemented small games with a few hours of effort, and that includes a UI.

      • penteract 4 years ago

        It's a good set of abstractions, but I've found that the system used for immutability (immerjs) carries noticeable performance costs (a factor of more than 2), to the point that it was faster to make a mutable copy of almost all the gamestate at the start of the 'apply move' code.
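        As an illustrative sketch of that pattern (in Python rather than the actual boardgame.io/immer code): copy the whole state once on entry, then mutate freely, so you pay one bulk copy instead of per-write structural-sharing overhead:

```python
import copy

def apply_move(state: dict, move) -> dict:
    # One O(n) deep copy up front, then cheap in-place updates, instead of
    # paying an immutability tax on every nested write.
    next_state = copy.deepcopy(state)
    next_state["history"].append(move)
    next_state["turn"] += 1
    return next_state  # the caller's original `state` is untouched

s0 = {"turn": 0, "history": []}
s1 = apply_move(s0, "e4")
print(s0["turn"], s1["turn"])  # 0 1
```

        The caller still gets value semantics (old state unchanged, new state returned), which is all most game-tree code actually relies on.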

        • mathgladiator 4 years ago

          For Adama ( http://www.adama-lang.org/ ), I am using a mutable tree with two copies and then built transactions such that I can emit deltas on the parts that change. I have all the benefits of immutability without the cost PLUS I have a cheap undo/redo log.
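          A minimal sketch of that transaction idea (my own illustration, not Adama code): apply writes in place while recording a delta of what changed and an undo entry for each change:

```python
def apply_tx(state: dict, updates: dict):
    """Mutate `state` in place; return (delta, undo) for a redo/undo log."""
    delta, undo = {}, {}
    for key, new_value in updates.items():
        if state.get(key) != new_value:
            undo[key] = state.get(key)  # old value (None if key was absent)
            delta[key] = new_value
            state[key] = new_value      # cheap in-place write
    return delta, undo

game = {"turn": 1, "phase": "action"}
delta, undo = apply_tx(game, {"turn": 2, "phase": "action"})
print(delta)  # only the parts that changed: {'turn': 2}
print(undo)   # enough to roll back: {'turn': 1}
```

          Emitting only the changed keys is what makes network sync cheap, and replaying `undo` dicts in reverse order gives the undo log essentially for free.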

    • iwd 4 years ago

      If you’re doing it for fun, one option is to start with a simplified version of the game. It’s faster to implement and faster to run. And you’ll get insights you can apply to the full game.

      That’s what I did when I applied RL to Dominion, because the complexity of the game depends heavily on the cards you include! See part 3 of https://ianwdavis.com/dominion.html

    • LeifCarrotson 4 years ago

      > What tripped me up every time is that most board games have a lot of "if this happens, there is this specific rule that applies". Even relatively simple games (like Homeworlds) are pretty hard to nail down perfectly due to all the special cases.

      The key is to build a data-driven state machine, rather than writing logic with a bunch of 'if' statements.
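      A minimal sketch of the idea in Python (the phase and event names are made up): the special-case rules live in a lookup table rather than in branching logic:

```python
# Transition table: (phase, event) -> next phase. Adding a special case
# means adding a row of data, not another nested `if`.
TRANSITIONS = {
    ("action", "play_card"): "resolve_effects",
    ("resolve_effects", "effects_done"): "buy",
    ("buy", "buy_done"): "cleanup",
    ("cleanup", "cleanup_done"): "action",
}

def step(phase: str, event: str) -> str:
    nxt = TRANSITIONS.get((phase, event))
    if nxt is None:
        raise ValueError(f"event {event!r} is illegal in phase {phase!r}")
    return nxt

print(step("action", "play_card"))  # resolve_effects
```

      The same table also answers "which moves are legal right now?" by filtering its keys on the current phase, which is handy for driving both a UI and an AI.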

      • mathgladiator 4 years ago

        You are correct, but some games can yield exceptionally complicated state machines.

        I designed a language for solving this: http://www.adama-lang.org/

        I get all the benefits of a data driven state machine with the simplicity of a language that supports asynchronously asking users for updates.

      • fho 4 years ago

        I am "camp Haskell", so my approach was pretty much data-driven. But what is a state machine if not a big nest of if-else statements? :-)

    • nicolodavis 4 years ago

      You could consider using a library like boardgame.io for this.

    • mathgladiator 4 years ago

      I'd appreciate you checking out my language and providing feedback. An element that helps is building a stateful server and using streams where the people behave like servers:

      http://www.adama-lang.org/

  • JoeDaDude 4 years ago

    Thank you for posting! Maybe you can include the game of Arimaa [1]. Arimaa was designed to be hard(er) for computers and level the playing field for humans. Algorithms were developed eventually, though I have not kept up to know where that stands today.

    [1] https://en.wikipedia.org/wiki/Arimaa

  • mathgladiator 4 years ago

    Thanks for this! I'm currently designing a language for complex board games like Battlestar Galactica: http://www.adama-lang.org/

    Something I found amazing was that inverting the flow control, so the server asks players questions along with a list of possible choices, simplifies agent design tremendously. As I'm looking to retire to work on this project, I can generate the agent code and then hand-craft an AI. However, some AIs are soooo hard to even conceptualize.

  • majani 4 years ago

    Imperfect information games will always have a luck element that gives casual players an edge. That's basically the appeal of card games over board games.

  • alper111 4 years ago

    This looks very good, thanks.

sdenton4 4 years ago

This is clearly part of DeepMind's long-game plan to achieve world domination through board game mastery. Naming the new algorithm after the book is a real tip of their hand...

https://en.wikipedia.org/wiki/The_Player_of_Games

  • sillysaurusx 4 years ago

    The abbreviation is PoG too. I bet that was totally on purpose. At least one person in Brain is a dota player, so you better believe they watch twitch.

    Funny that most of the comments are about the name. What an excellent choice.

  • chrisweekly 4 years ago

    PSA: The "Culture" novels by Iain M Banks are fantastic and can be read in any order. "Player of Games" was the 1st one I read and still probably my favorite.

    • bduerst 4 years ago

      Player of Games is the second book, and the one I recommend people start The Culture series with.

      The first book Consider Phlebas isn't bad, but it isn't as well developed as the rest of the series IMO.

      • hesperiidae 4 years ago

        It's a great starting point, since not only is the story both fun and interesting, but it also shows what the Culture's values and methods are in a very satisfying way by juxtaposing them against the Empire through the tournaments of the latter's own game.

    • bewaretheirs 4 years ago

      I keep hearing recommendations for the Culture books so I tried reading it recently and it just didn't work for me -- I gave up on it halfway through, which is rare for me.

      • Jordanpomeroy 4 years ago

        They are a slow burn, but with those novels the payoff always justifies it. If you really did make it halfway, I'd encourage you to go back and finish before passing judgment.

        • bewaretheirs 4 years ago

          Not going to pick it up again.

          My reaction is perhaps summarized by the old quip: "This is not a novel to be lightly tossed aside. It should be thrown with great force.".

          It wasn't any one thing but I eventually hit a point where continuing just wasn't worth it for me. There's too much else on my to-read shelf which is less likely to seriously annoy me.

      • mrslave 4 years ago

        Me too with Consider Phlebas. Then I hit the Alastair Reynolds novels pretty hard and now I'm stuck for new material. Dune is en vogue so perhaps that's the right read next? I really enjoyed Vernor Vinge's A Deepness in the Sky but couldn't quite get into A Fire Upon the Deep but it still sits on my shelf taunting me.

        • CRConrad 4 years ago

          > read next?

          Stephen R. Donaldson: He's more known for The Chronicles of Thomas Covenant, but you might prefer The Gap Cycle; it's more typical SF.

          Also, the Night's Dawn trilogy by Peter F. Hamilton. Perhaps Dan Simmons? And of course anything by Charlie Stross.

      • pault 4 years ago

        Which one? They each have a unique feel and setting.

    • kmtrowbr 4 years ago

      Yes! I love this one. It's my favorite too.

  • 7thaccount 4 years ago

    Pretty amazing book. I wish I could play a board game like that as well.

    • automatic6131 4 years ago

      I always imagine the board game as essentially being Sid Meier's Civilization but really, really good in an indescribable way - with some card games in between.

      • 0_gravitas 4 years ago

        I believe Banks himself said that he used to play Civ and took some inspiration from it

        • bduerst 4 years ago

          Definitely for his later books, but Player of Games came out three years before the first Sid Meier's Civ game.

          What's cool is that Paradox's Stellaris, a civ-in-space game, definitely takes pages from Iain's Culture series.

    • stavros 4 years ago

      I second this, it was excellent. I've only read a few Banks books, but this was my favorite.

      • arvinsim 4 years ago

        I started with Consider Phlebas but stopped because it seems too slow for me.

        Does it get better in the later chapters?

        • vermilingua 4 years ago

          It does, but IMO it's probably worth reading The Player of Games or Use of Weapons before it anyway. With the exception of perhaps Surface Detail, none of the Culture books rely on any others. Consider Phlebas gives a good view of The Culture from "outside" (the perspective of the Idirans) but is quite slow.

          • DylanSp 4 years ago

            I mean, Player of Games has a pretty slow start too. I love that book, but the initial pacing is IMO its biggest flaw.

            I know Use of Weapons doesn't depend on any of the other books for its plot, but is it a decent intro to the setting? If it is, that's where I'd recommend starting.

            • hesperiidae 4 years ago

              Yeah, it does take its time to build up, but in my opinion that just lays a stronger foundation for the latter half(-ish) of the book.

              Use of Weapons is more than a decent intro, but I'd still personally recommend The Player of Games since it isn't as deep and heavy in comparison, and the narrative structure is simpler.

              Of course, YMMV, but I started with TPoG, and reading UoW right after it was absolutely fantastic. I guess my biggest concern with recommending UoW as the starter would be that it might diminish TPoG, which I'm fond of, but I don't know if it actually would, since they're connected pretty much only by the setting.

            • avemg 4 years ago

              I've read (in order) Consider Phlebas, Player of Games, Use of Weapons, and Excession thus far. Use of Weapons was the toughest one for me to get through. I started and stopped it a few times over several years and just couldn't get past the halfway point. I eventually got over the hump and devoured the last half of the book over a couple of days (which is fast for me). So for my money, Use of Weapons is a bad starting point.

              My favorite by far is Excession but I don't know that I'd start there. I think the payoff of getting a story from the perspective of the Minds is better appreciated after you've heard about them and their capabilities from a distance in the preceding books.

              My pick would be to start with Player of Games. That's the one that was a page turner for me nearly from the jump.

        • dgritsko 4 years ago

          I started with Consider Phlebas because I wanted to see why people raved about the Culture series. I found it kind of tedious and slow, and although I finished it, I wondered what all the hype was about. I'm thankful that I picked up the second book (Player of Games), though - because I couldn't put it down; it was fantastic. I've stuck with the series since then (Excession was another highlight). I'd like to revisit Consider Phlebas at some point; I think I might enjoy it more now that I have more context for the story.

        • gpderetta 4 years ago

          I enjoyed Consider Phlebas a lot, but it is very different from the rest of the series.

        • gman83 4 years ago

          I couldn't get through the cannibal part of Consider Phlebas, was soo weird.

          • wiredfool 4 years ago

            There's one scene or so in each one of his books that's just too much for me. I just don't need to donate brainspace to that sort of thing. (Use of Weapons has one, A Song of Stone too.)

            I like 80% of his work, 15% is a pointless depressing slog, and the other 5% is just too much for me.

            • bewaretheirs 4 years ago

              That confirms my decision to abandon reading the series after I hit a spot like that in Player.

          • wishinghand 4 years ago

            I guess I just took it as another possibility in a society of nearly infinite ones. I did use material from that encounter in running a horror RPG, so in a way I'm kind of thankful for it.

          • DylanSp 4 years ago

            Yeah, that was just...out-of-nowhere gruesomeness.

        • sidibe 4 years ago

          Use of Weapons and Consider Phlebas are the worst of the series IMO. I powered through Consider Phlebas just because I knew people loved the series, but there's really no reason to start there.

          • rishav_sharan 4 years ago

            Sacrilege! Taste is subjective, but Use of Weapons, imo, is Banks' best work. I personally consider it one of the best sci-fi novels, period. It's been years since I last read it, but that ending still gives me shivers whenever I think of it.

          • NoGravitas 4 years ago

            Use of Weapons is generally considered one of the best, but it has a complex narrative structure that makes it a harder read, and it's probably not a good place to start.

          • arethuza 4 years ago

            Personally, I think Use of Weapons is by far the best of the Culture series - although I admit it took a few readings for me to get to that view...

            • User23 4 years ago

              He does a really good job exploring the theme "what is a weapon?"

              • hesperiidae 4 years ago

                Yes! And also, "why is a weapon?", "how is a weapon?" and "what isn't a weapon?"

                Really, Culture is a great setting for discussing a lot of interesting philosophical questions and topics.

        • apetersonBFI 4 years ago

          Consider Phlebas wasn't the best in my opinion. Player of Games is my favorite, but some of the other later ones are good.

          I enjoy the settings and the concept of the Culture more than the plotlines.

          • hesperiidae 4 years ago

            The Player of Games is my favourite exactly for the same reason: it explores the Culture in contrast to the Empire, and even the drama is just an expression of the clash between the two structures.

        • baq 4 years ago

          Consider Phlebas is easily the worst of the series. My top 3 in no particular order are Use of Weapons, Player of Games and Excession.

          • DylanSp 4 years ago

            Curious that you put Consider Phlebas behind Matter (my least favorite, by far). My favorite is probably Look to Windward, closely followed by Player of Games and Use of Weapons.

        • smiley1437 4 years ago

          If you find Consider Phlebas too slow, try Excession before giving up on Banks.

          • speed_spread 4 years ago

            Eeh, Excession is very good but still a bit hermetic for an introduction to the Culture. Along with Consider Phlebas, it's the only other book I wouldn't recommend as a first.

  • WithinReason 4 years ago

    "In 2015, two SpaceX autonomous spaceport drone ships—Just Read the Instructions and Of Course I Still Love You—were named after ships in the book, as a posthumous tribute to Banks by Elon Musk"

  • 6510 4 years ago

    The end game is pinball and we are the balls.

sfkgtbor 4 years ago

I really like seeing references to the Culture series when naming things:

https://en.m.wikipedia.org/wiki/The_Player_of_Games

  • CobrastanJorji 4 years ago

    Allusions are fun and all, but I disagree. These are important problems that a lot of people have put their whole careers into researching. Silly names like these lack gravitas.

    • sjg1729 4 years ago

      Always sad to see these projects suffer from A Shortfall of Gravitas

    • moritonal 4 years ago

      Sorry, to explain the joke. The ships name themselves, and when they pick jokey names they're often mocked by the humans (who are in every way essentially ants to the spaceships) for not having enough gravitas. So the ships start naming themselves things like the "Death-ray 9000 super-killer deluxe", essentially to take the piss.

      Funnily enough, you can see the exact same effect among principal game engineers or in computer hacking.

      • robbie-c 4 years ago

        I believe the user you are replying to was also joking, given that many of Banks' ship names reference the g-word

        Edit: if not that's even more amusing

        • marvin 4 years ago

          I'll make a minor contribution to the discussion by mentioning the Culture ship normally referred to as the Mistake Not..., which is shorthand for

          "Mistake Not My Current State Of Joshing Gentle Peevishness For The Awesome And Terrible Majesty Of The Towering Seas Of Ire That Are Themselves The Milquetoast Shallows Fringing My Vast Oceans Of Wrath".

          Unsure if this name is also a sarcastic stab at the lack of gravitas in ships' names, but regardless, it's very sad that Banks died young :(

          • hesperiidae 4 years ago

            Yeah, it's making fun of the human desire for gravitas when it comes to ship names, since it's just exponentially more and more over the top.

    • ZeroGravitas 4 years ago

      Very little Gravitas Indeed.

    • 0_gravitas 4 years ago

      indeed

    • gremloni 4 years ago

      If anything, the caliber and lore of the series gives the project an incredible amount of gravitas. Plus the naming scheme is just plain beautiful, in my opinion.

  • doctor_eval 4 years ago

    I suppose it's better than "Use of Weapons".

    • OneTimePetes 4 years ago

      Why not have a seat, take that chair over there.

      • _0ffh 4 years ago

        One of the best, and executed to perfection! You can sort-of-see the point coming for a long, long time in the book, as he gradually builds the suspicion by dropping the occasional hint here and there, but it's always so that it must remain a highly uncertain speculation until he drops the reveal. Just the right balance between "How should I have suspected that?" and "Those hints were too much on the nose!".

        • OneTimePetes 4 years ago

          It's such a crime - of war and all else. It's like a blind spot of imagination: that a man would do such a thing - to what is essentially family - as tactics... the horror...

  • dane-pgp 4 years ago

    I think it is also a reference to "PogChamp", although it's disappointing that PoG apparently wasn't evaluated against the Arcade Learning Environment (ALE) corpus of Atari 2600 games.

    • abledon 4 years ago

      much more refined to think a spam of "POG!" stands for Player of Games when reading twitch chat

  • Borrible 4 years ago

    Banks should have named one of the Culture's General Systems Vehicles 'Don't Be Evil'.

    https://theculture.fandom.com/wiki/List_of_spacecraft

  • hoseja 4 years ago

    Kinda ironic since in the novel, a human player is better than the strong AI (albeit a little inexplicably).

    • pharmakom 4 years ago

      No he is not, but AIs are not allowed in the competition the story centers around.

      • hoseja 4 years ago

        Near the end of the competition, as he is deep in his analysis, the light craft AI gives up on helping him since it gets overwhelmed. Granted it's not a full Culture Mind (kinda hazy, been a while) but still a point for the meatbag.

        • arlort 4 years ago

          I always interpreted the end reveal as showing that control was highly confident of both the outcome of the game and of how Gurgeh got to that outcome.

          It's been a while but I am pretty sure that the ship lied when saying that it got overwhelmed and did so only because it was confident he was on the right path but needed to get there in a specific way which wouldn't have worked quite the same if the ship intervened

          • bduerst 4 years ago

            Yep, basically the nebulous, unknown minds of Control predicted the main character would win, and set up as many conditions as possible to push him to do so. Including bluffing about help from the AI.

            It was part of an even bigger game but I'm not going to get into spoilers.

          • hesperiidae 4 years ago

            Yeah, he wouldn't have reached such a good solution with help, and that was also originally taken into account by the Culture when they sent him out in the first place, since they knew him that thoroughly.

        • pharmakom 4 years ago

          I think the main character can be so strong at the game by the end because of his immersion in the Empire's culture. The ship's AI would likely be at least as strong with the same experiences. Plus, as you mention, the ship's AI is not the smartest AI around.

    • 7thaccount 4 years ago

      I thought the protagonist wasn't nearly as talented as the culture AIs (even the ones that are not all that powerful)?

      • thom 4 years ago

        Is that clear from the text? Gurgeh supposedly perceives the result of the last game before the AIs so we’re led to believe he’s seeing deeper. Obviously he could have been wrong and still won. The AIs lied to and manipulated him the entire time so it’s hard to know, but it would seem a very odd weakness for an AI to have. I think Banks pretty quickly recanted on the subject of the Culture’s ‘referrers’ but I don’t think he plays a full Mind, so it’s not a clear cut conversation.

        • joshuamorton 4 years ago

          My recollection is that by the end of the novel it's clear that Gurgeh was never competitive with the ship, although he might have been competitive with his security drone (although even that isn't clear, since <spoilers> imply that the security drone is a better game player than it pretends to be).

          To me it felt like the whole point of the novel was that Gurgeh was a piece in an even larger game and he didn't even realize it. So the idea that the people playing the "bigger" game couldn't compete in the smaller game seems silly, and I think they mention that they used Gurgeh instead of an AI to make it appear fair to the inhabitants of the planet.

          • hesperiidae 4 years ago

            I agree with you that Gurgeh was just a piece getting manipulated and that that was the point, but Gurgeh was still the best piece that they could use for the job.

            The Culture is (in this story) pretty much only bound by their own constraints. They chose Gurgeh for the role, since he had enough skill and talent to actually accomplish the Culture's (or the SC's, winkwink) objectives without the whole thing being taken over by an AI.

            The Culture worked very much like the PoG this thread is about: it minimised potential loss and considered the constraints it had to get the best possible outcome.

            The Culture is mostly constrained only by ethical rules, which, admittedly, can get flexible, especially with regards to SC. The practical restrictions, like it being easier to send one capable human than to conquer a small galaxy, are in my mind lesser in comparison.

            As such, I think they got the most out of the operation, just by being confident in their assessment of a single human who played games good. And there's absolutely no reason to believe that the overminds that guide the Culture can't model human behaviour down to the smallest variable, especially considering how augmented humans are in the Culture.

            I'm also 100% onboard the idea that all the drones could outplay Gurgeh in a blink in any game, intuition be damned.

            • randomswede 4 years ago

              A non-drone player was necessary, as Azad would never have acknowledged defeat by a drone. But defeat by a human accomplished the Culture's goal, and Gurgeh was the best bet (or at least the best available one) for the desired outcome.

              I see their selection as mostly being about Gurgeh having enough pride to accept the small cheat, coupled with enough skill to not actually need the AI assistance. It's been a while since I read TPoG last, but my recollection is that there were a handful of players at essentially the same level as Gurgeh and from a pure skill and intuition level, I suspect any of them would have worked, but only Gurgeh fell for the entrapment.

          • sdenton4 4 years ago

            Yeah, I thought it was clear from the beginning of the book that no humans were even remotely competitive with any AI (including the main character) but that human game players were sort of an aesthetic throwback, like dog-racing in an era of F1 cars.

            • 7thaccount 4 years ago

              This was my understanding as well, but I might have read into it. The Culture Minds are in freaking hyperspace to get around lightspeed limitations on computation. He for sure can't beat that, but he could beat someone on another planet at their own game, which he literally just learned in the year it took to get there. A game that permeates every aspect of their civilization.

              I do assume his drone could beat him as well, but I'm not sure.

              • randomswede 4 years ago

                One of the reasons that Contact (and Special Circumstances) have (some) humans[] around is for intuitive leaps. I can't say I recall which Culture book this is mentioned in, but it is, in one of them.

                [] let's go with "human" as a general term for Culture biological citizens, it is probably a bit incorrect, but gets the point across.

                • thom 4 years ago

                  The 'referers' are in Consider Phlebas, a small group of humans among trillions who are able to reliably predict the future better than Minds. This would be one argument in Gurgeh's favour, but Banks later admitted how flimsy the idea was.

        • OneTimePetes 4 years ago

          Remember the "ambassador", aka Zakalwe? They had boots on the ground and even set up interactions to prevent him getting too friendly/embedded with the "host" culture.

          The whole thing was about to blow up anyway, so they brought in the Player of Games, to do it in a style that would prevent any recovery. Gurgeh was not there to defeat the empire, he was there to defeat the whole idea of the game being "holy".

          He was the Jesse Owens shipped to Hitler's Olympics.

          • 7thaccount 4 years ago

            Yeah, of course the Culture could just obliterate the empire militarily. I think only a few civilizations (pre-ascension) could hold their own for a while.

            You bring up a good point about shattering the view of the game though.

      • hoseja 4 years ago

        I don't think a full Culture Mind is present but he outstrips his spacecraft's ability to help him with preparation in later stages of the competition. I clearly remember this.

        • macmac 4 years ago

          At least that is what the ship (SC) wants him to think.

          • WJW 4 years ago

            Indeed. (spoiler following) The plot basically revolves around SC manipulating both Gurgeh and the Empire of Azad in an even bigger and more complex game than the one in the book. Given how Banks describes the Minds in other books, it would be extremely curious if they wouldn't crush any biological player in any normal game, the same way chess computers crush humans these days. But it is possible that a more limited mind like the security drone could be outstripped by Gurgeh. In one of the other books they do mention that "smaller" machines like environmental suits and small drones get more limited minds than full starships, as it would be cruel to put a fully capable Mind in such a limited body.

            • stavros 4 years ago

              How did I miss this plot point? It's been a while, but I remember focusing on the game Gurgeh played. Maybe I just don't remember it now.

              • WJW 4 years ago

                The last page of the book gives it away: (MASSIVE SPOILER OBV) The security drone who came with Gurgeh to Azad was the same drone he meets during the introduction chapters who was "rejected" from SC and offers to let him cheat (though it was wearing a disguise at the time). Then, after he cheats he basically gets blackmailed into going to Azad and conveniently this "non-SC" drone comes with him in a very "non-SC" ship that claims to have its weapons removed but doesn't. At some crucial points the security drone influences Gurgeh to play the best he can, such as when he takes him on a tour of the slums and the Culture-educated Gurgeh gets so furious at the mistreatment he witnesses that he absolutely crushes his opponent in the next match.

                They mention in one of the final chapters that the minds wanted the Azad empire to become a better place since it was really shitty to its citizens. However, they couldn't just invade and impose laws because they're the Culture, and the Azad empire kept claiming moral superiority because they had this one thing (the Game) that they thought the Culture couldn't match. The Minds knew Gurgeh was talented enough to get far enough in the tournament that the Azad Empire would be seriously shaken, because if this single foreigner can beat so many of the best and brightest in the Empire at the thing it claims to do best then what could the entire Culture do? This turns out to have been correct, at the end of the book the Azad empire starts to collapse because they no longer trust their leadership, who have been proven to be incompetent at the very thing they claim to do best. Beaten by a human btw, not even by one of the god-machines that the Culture also has. Having predicted that this would happen, the Minds set out to manipulate Gurgeh into going to Azad to play the Game and by doing so bring about regime change. The Minds and/or SC were playing a much higher level game than Gurgeh all along, he was merely one of the pieces they used to play.

                • stavros 4 years ago

                  Ahh, thank you! Now that you recount it, it all comes back to me. I should read more Banks, he's a fantastic writer.

                • 7thaccount 4 years ago

                  Yes, at the end you start to question just who the player of games actually was.

                  • baq 4 years ago

                    that was a beautiful feeling after completing the book: who has been played here? i, the reader, certainly was.

fxtentacle 4 years ago

This is a great result, but you can see that it's more of a theoretical case because of this: "converging to perfect play as available computation time and approximation capacity increases." That is true for pretty much all current deep reinforcement learning algorithms.

The practical question is: How much computation do you need to get useful results? AlphaGo Zero is impressive mathematics, but who is willing to spend $1M daily for months to train it? IMPALA (another Google one) can learn almost all Atari games, but you need a head node with 256 TPU cores and 1000+ evaluation workers to replicate the timings from the paper.

  • sillysaurusx 4 years ago

    You often don't need anywhere near the amount of compute in these papers to get similar performance.

    Suppose you're a business that needs to play games. Most people seem to think that it's a matter of plugging in the settings from the paper, buying the same hardware, then clicking a button and waiting.

    It's not. The specific settings matter a lot.

    But my main point is that you'll get most of your performance pretty rapidly. The only reason to leave it running for so long is to get that last N%, which is nice for benchmarks but not for business.

    DeepMind overspends. Actually, they don't; they're not paying anywhere close to the price of a 256 core TPU. (Many external companies aren't, either, and you can get a good deal by negotiating with the Cloud TPU team.)

    But you don't need a 256 core TPU. Lots of times, these algorithms simply do not require the amount of compute that people throw at the problem.

    On the other hand, you can also usually get access to that kind of compute. A 256 core TPU isn't beyond reach. I'm pretty sure I could create one right now. It's free, thanks to TFRC, and you yourself can apply (and be approved). I was. https://sites.research.google/trc/

    It kills me that it's so hard to replicate these papers, which is most of the motivation for my comment here. Ultimately, you're right: "How much compute?" is a big unknown. But the lower bound is much lower than most people (and most researchers) realize.

    • fxtentacle 4 years ago

      My personal experience was the opposite. I'm currently trying different approaches for building a Bomberman AI for the Bomberland competition that was discussed here on HN a few weeks ago.

      "IMPALA with 1 learner takes only around 10 hours to reach the same performance that A3C approaches after 7.5 days." says the paper. I can run A3C on a cheap CPU-only server, but to get that IMPALA timing I need to spend a lot of money. My biggest roadblock so far, though, is that I need compute far exceeding what the papers claim.

      The diagrams for IMPALA show good performance starting at 1e8 environment frames and excellent performance at 1e9 frames. By now, I'm at 2.5e9 frames and performance is still bad. In my opinion, the reason is that the sequence lengths for Bomberland are quite long. To clear a path, you place a bomb, wait 5 ticks for it to become detonatable, then detonate it, then wait 10 ticks for the fire to clear. With 7 possible actions per tick, the chance of randomly executing this 17-tick sequence is (1/7)^17 ≈ 4e-15. If I calculate optimistically that 5 of the 7 moves are acceptable while we wait, then I can get up to (1/7)*(5/7)^5*(1/7)*(5/7)^10 ≈ 1e-4. But that still means that at 1e8 env steps (about 6e6 non-overlapping 17-tick windows), I only have around 1000 successful executions to learn from.
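
      The arithmetic above can be sanity-checked directly (a sketch; the 7-action, 17-tick setup is taken from the description above):

```python
# Sanity-checking the back-of-the-envelope exploration numbers
# (the 7-action, 17-tick Bomberland sequence is from the comment text).

def p_exact_sequence(n_actions, length):
    """Chance of uniformly sampling one specific action sequence."""
    return (1.0 / n_actions) ** length

def p_optimistic(n_actions, wait1, wait2):
    """The 'place bomb' and 'detonate' ticks must be exact (1/7 each);
    during the waits, any of 5 'safe' moves is acceptable (5/7 each)."""
    return (1 / n_actions) * (5 / n_actions) ** wait1 \
         * (1 / n_actions) * (5 / n_actions) ** wait2

print(p_exact_sequence(7, 17))   # ~4.3e-15
print(p_optimistic(7, 5, 10))    # ~1.3e-4

# Expected lucky executions in 1e8 env frames,
# counting non-overlapping 17-tick windows:
print((1e8 / 17) * p_optimistic(7, 5, 10))  # ~770, on the order of 1000
```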

      • Javantea_ 4 years ago

        I don't have a lot of experience with IMPALA, but the sequence of events you describe should be very easy for an end-to-end system. Assuming you don't have an end-to-end system, just getting a gradient would result in rapid learning of that sequence. I'm surprised that you're not done at 2.5e9 frames; perhaps there is a hyperparameter issue. Sorry I can't help, but it sounds like you're in the same place I am with my ML project. Good luck.

      • iwd 4 years ago

        Not an expert, but I believe many papers on other video games make a single decision for the next X frames at once, possibly including a delay factor that governs exactly when to act. I think OpenAI’s Dota2 agent does this.

        • fxtentacle 4 years ago

          I have experimented with that, too, but in my case it also multiplies the number of potential actions. If I have 7 actions per timestep, grouping them into 3-timestep blocks means I now have 7*7*7 = 343 possibilities to choose from.
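
          A small sketch of why grouping alone doesn't fix the exploration problem: the macro-action space grows as 7^k, while the chance of randomly hitting a fixed sequence stays exactly the same (using an 18-tick sequence here so every block size divides it evenly):

```python
# Grouping k ticks into one macro-action: the action space grows as 7^k,
# but the probability of randomly executing a fixed 18-tick sequence is
# unchanged, since (1/7^k)^(18/k) == (1/7)^18 for every k.
n_actions, seq_len = 7, 18
for k in (1, 2, 3, 6):
    macro_actions = n_actions ** k     # 7, 49, 343, 117649
    decisions = seq_len // k           # fewer decisions, but harder ones
    print(k, macro_actions, (1 / macro_actions) ** decisions)
```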

          From what I understand, the OpenAI Dota 2 AI has a long-term strategy module which was mostly trained by imitating 60,000+ replays played by human professional teams. My problem with doing that for the Bomberland competition is that I don't have any data source for replays of someone playing the game really well. You control 3 units simultaneously and it's 2 teams against each other, so I'd need 6 dedicated volunteers playing the game for many hours to create a reasonably-sized corpus of human replays. And who says that those people are good at it?

      • ericd 4 years ago

        Hm not an expert in this, but would something with a world model help, rather than depending on stochastic random action choices? It seems like it should be possible to learn that a frame sequence where you've been next to a bomb for 6 ticks is rapidly decreasing your expected score, and that your score would be significantly better if you weren't in line with the bomb pretty soon.

        • fxtentacle 4 years ago

          I'm in the process of attempting just that, with limited success. In my case, I trained a classifier that takes the current surroundings of the player unit and tries to predict that we'll gain an advantage in this segment of the game. I split the game into segments based on when the HP relationships between teams change. And gaining an advantage then means that you take more HP from the enemy team than what you and your teammates lost.

          The classifier has on average 90% accuracy which seems good. I then use the likelihood predicted by this classifier to compute the weight with which I want to train each action and if I want to train it positively (by pulling its likelihood of being chosen up) or negatively (pushing the likelihood of that action down).

          However, what this model cannot correctly represent is the fact that whether or not a given situation will turn out to be good or bad in the long term is highly dependent on how you play. So if I train this with replay data, I will score the situations in relation to how well those (outdated) AIs could take advantage of them.

          Next up, I'll try to fix this issue by introducing a graph-like stochastic structure. The basic idea is that I encode "from this state S if I take action A, then I can reach state T with P percent likelihood" into yet another neural network. If I then identify a state which is really beneficial in the sense that I can reliably convert it into an advantage, then I can use this graph to back-propagate that knowledge so that I get "from this state S, action A takes me to state T, then action B takes me to state U, and U is great".

          That should allow me to train with historical data to identify which transitions are possible, and then I can combine that with realtime data about the desirability of each state. So basically I'd do A* pathfinding over the graph of possible states to identify which actions are needed to bring me from my current situation into the closest "I will surely win" situation. Except that the graph is memorized by an AI because the real state-space is huge: 15x15 fields with 6 units + 5 environment states => roughly 11^(15*15) states
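
          In tiny, tabular form, the back-propagation-over-a-state-graph idea looks something like this (all states, actions, and probabilities below are made up for illustration; the real state space is exactly why a neural network has to memorize the graph instead of a table):

```python
# Toy version of the plan above: record transition probabilities
# P(next_state | state, action), mark "surely winning" states, and
# back-propagate their value (plain value iteration) so a greedy
# path toward a win can be read off.

# transitions[state][action] = list of (next_state, probability)
transitions = {
    "start": {"A": [("mid", 0.9), ("start", 0.1)],
              "B": [("trap", 1.0)]},
    "mid":   {"A": [("win", 0.8), ("mid", 0.2)]},
    "trap":  {},   # terminal: no way to convert this into an advantage
    "win":   {},   # terminal: "I will surely win"
}

value = {s: 0.0 for s in transitions}
value["win"] = 1.0

# Backward value propagation (undiscounted, for brevity)
for _ in range(50):
    for s, acts in transitions.items():
        if acts:
            value[s] = max(sum(p * value[t] for t, p in outs)
                           for outs in acts.values())

def best_action(s):
    acts = transitions[s]
    if not acts:
        return None
    return max(acts, key=lambda a: sum(p * value[t] for t, p in acts[a]))

print(value["start"])        # close to 1.0: "start" can be converted to a win
print(best_action("start"))  # "A"
```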

          • ericd 4 years ago

            Ah yeah the hp thing sounds a bit like what the OpenAI dota team did with their team scoring differential thing.

            I’ve not built what you’re describing, just read related research papers, so I can’t really evaluate your plan, but I can wish you good luck!

    • loxias 4 years ago

      My thoughts, not being in the field, are parallel to the parent post. "It's nice and all that we're achieving better and better computer performance at things that used to require the human brain, but it seems we're doing so by building larger and larger computers." Not to detract from that achievement, I love large computers in their own right!

      I'm a dabbler in Go, and "somewhere below professional" at the game of poker. I've followed the advances in the latter for more than a decade, eagerly reading every paper the CPRG publishes. They use a LOT of compute power!

      I know from experience that "The specific settings matter a lot.". For several years, I made my living "implementing papers for hire". It's real work, no argument there. Sometimes the settings are the solution, and heck, sometimes the published algorithm is outright wrong, and you only discover so when trying to implement it.

      But the second part of your point, that it's not simply achieving more performance by throwing more transistors at it, I don't have experience with, and I sorta don't believe you. :)

      Your comment is quite well written, making me (irrationally?) predisposed to suspect you're correct on factual matters, or at least more of a domain expert than I. Can you cite sources, or simply elaborate more?

      • fault1 4 years ago

        > "The specific settings matter a lot.".

        Yes, and in the case of deep RL, the ability to get a "lucky" random initialization seems to (still) matter a lot.

        I work in real time control systems, which are roughly decision making under uncertainty problems. A lot of the RL research has become noise buoyed with large marketing budgets.

  • gwern 4 years ago

    > That is true for pretty much all current deep reinforcement learning algorithms.

    Is that true? I was unaware that PPO, SAC, DQN, Impala, MuZero/AlphaZero etc would all automatically Just Work™ for hidden information games. Straight MCTS-inspired algorithms seem like they'd fail for reasons discussed in the paper, and while PPO/Impala work reasonably well in DoTA2/SC2, it's not obvious they'd converge to perfect play.

    • fxtentacle 4 years ago

      You can mathematically prove for a lot of different algorithms (including PPO, DQN, IMPALA) that given enough experience with the game world, they will eventually converge to the optimal policy. It's just that the "enough experience" part might be so large that it's practically useless.

      If I remember correctly, the DeepMind x UCL RL Lecture Series proves the underlying Bellman equation in this video: https://www.youtube.com/watch?v=zSOMeug_i_M

      As for "hidden information" games, I thought the trick was to concatenate the current state with all past states and treat that as the new state, thereby making it an MDP.
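
      For intuition, the convergence result is easy to demonstrate in the tabular setting, where the classical proofs actually apply (a toy sketch; whether it carries over to deep function approximators is another matter):

```python
import random

# Tabular Q-learning on a tiny deterministic MDP converges to the exact
# Bellman fixed point -- the classical setting the convergence proofs cover.
random.seed(0)
gamma, alpha = 0.9, 0.1

def step(s, a):
    # Two states, two actions: action a moves you to state a.
    # Reward 1 only for taking action 1 while in state 1.
    return a, (1.0 if (s == 1 and a == 1) else 0.0)

Q = [[0.0, 0.0], [0.0, 0.0]]
s = 0
for _ in range(20000):
    a = random.randrange(2)  # pure exploration: "enough experience"
    s2, r = step(s, a)
    Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
    s = s2

# Exact solution: Q*(1,1) = 1/(1-gamma) = 10, Q*(0,1) = 9, others 8.1
print(Q)
```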

      • gwern 4 years ago

        I don't think you can prove that (forgive me if I don't sit through a 2h video). Those all are susceptible to the deadly triad, and AFAIK there are no convergence proofs of any kind for the big model-free DL algs, and it would've been big news if someone had proved that a real-world version of PPO/DQN/IMPALA does in fact converge in the limit. Sutton's book and earlier proofs only cover cases where you drop the nonlinear approximator or something.

        (History stacking may turn POMDPs into MDPs, but I don't know if they handle the specially adversarial nature of games like poker. That's quite different from stacking ALE frames.)

        • cygaril 4 years ago

          Standard RL algorithms will converge to optimal play versus a fixed opponent, but will not find an optimal policy via self play.

          One intuitive way to see this is that a sequence of improving pure policies A < B < C < etc. will converge to optimal play in a perfect information game like chess, but not necessarily in an imperfect information game like rock/paper/scissors where Rock < Paper < Scissors < Rock, etc
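
          The cycling is easy to see concretely (a toy sketch):

```python
# Iterated pure best responses in rock-paper-scissors cycle forever and
# never reach the optimal mixed strategy (1/3, 1/3, 1/3).
BEATS = {"rock": "paper", "paper": "scissors", "scissors": "rock"}

policy = "rock"
history = [policy]
for _ in range(6):
    policy = BEATS[policy]  # a strict improvement over the previous policy
    history.append(policy)

print(history)
# ['rock', 'paper', 'scissors', 'rock', 'paper', 'scissors', 'rock']
```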

tsbinz 4 years ago

Comparing against Stockfish 8 in a paper released today and labeling it as "Stockfish" is bordering on being dishonest. The current stockfish version (14) would make AlphaZero look bad, so they don't include it ...

  • dontreact 4 years ago

    The name of the game here is generality. For a really general agent, they are looking to have superhuman performance, not get state of the art on every individual task. Beating stockfish 8 convinces me that it would be superhuman at chess.

    • remram 4 years ago

      They could still be honest that it's Stockfish 8, not the Stockfish everyone has. Your product having genuine value does not excuse lying about that value.

      • Skyy93 4 years ago

        I've observed this kind of behavior in many papers nowadays. It's extremely painful for research, because better candidates can be overlooked, and FAANG publishes the majority of papers in the ML section. It's a mess.

      • ShamelessC 4 years ago

        They were? They say they use Stockfish 8 the very first time they mention it.

        • remram 4 years ago

          First time they mention it is page 10:

          > one of the strongest and most widely-used programs is Stockfish [81].

          Here's the citation, note the date:

          > [81] The Stockfish Development Team. Stockfish: Open source chess engine, 2021. https://stockfishchess.org/.

          They mention the version number only once, further down, and don't point out that it's out of date since February 2018. All other 11 mentions of it don't have the version number, like in that sentence:

          > In Chess, PoG(60000,10) is stronger than Stockfish using 4 threads and one second of search time.

          • hesperiidae 4 years ago

            >First time they mention it is page 10:

            Yeah, so it is! I guess I ran into the same weirdness as ShamelessC, since when I first Ctrl-F:ed the PDF, hit 1/11 was on page 11. Now that I try my damndest to reproduce it, I get 12 hits and the first is that one on page 10.

        • hesperiidae 4 years ago

          Yup, "In chess, we evaluated PoG against Stockfish 8, level 20 [81] and AlphaZero."

  • ShamelessC 4 years ago

    The first mention says "Stockfish 8, level 20" in the paper. This isn't a blog post that you can skim, you need to read the whole thing before critiquing.

    • karpierz 4 years ago

      That's actually the second mention, the first is when they introduce the games in section 4:

      > Today, computer-playing programs remain consistently super-human, and one of the strongest and most widely-used programs is Stockfish.

      They also go back to referring to it as Stockfish for the rest of the paper.

      An analogous situation in my mind would be if AMD released a new CPU and benchmarked it against an Intel CPU, only mentioning once, somewhere in the middle of the paper, that it was a Pentium 4.

      • Vetch 4 years ago

        This sort of evasiveness about method limitations, downplaying or de-emphasizing related work while boosting the senior authors' previous work, is standard academic fare. It's partly a strategy against novelty nitpickers, and it results in a net negative for all.

        I also suspect part of the reason they chose Stockfish 8 was as a basis of comparison with AlphaZero. Their baselines for Go and poker are also pretty weak so their emphasis is clearly on displaying generality and reduced domain specialized input, not supremacy.

        A single algorithm to play perfect and imperfect information games is difficult to achieve. Standard depth limited solvers and self-play RL result in highly exploitable agents. PoG appears to be very strong at Chess, decently strong at Go and decent at Poker (Facebook AI's ReBeL, the strongest prior work in this area, performed better against slumbot). What's unique about PoG is its ability to also play an imperfect information game (Scotland Yard) that has many rounds and a relatively long horizon (although it still has scaling issues).

      • ska 4 years ago

        > An analogous situation

        It really isn't though. Technical papers have conventions, and they follow them reasonably here. You expect the methods description to be specific, the abstract not to be hyperbolic, and the conclusions to be balanced. The general discussion parts are just that: general.

        In the methods area they discuss the exact versions and parameters used, and how they compared them.

        In the conclusions:

  In the perfect information games of chess and Go, PoG performs at the level of human experts or professionals, but can be significantly weaker than specialized algorithms for this class of games, like AlphaZero, when given the same resources.
        
        It would have perhaps been interesting to include a more recent stockfish, but it wouldn't really impact the paper.

      • ShamelessC 4 years ago

        > Today, computer-playing programs remain consistently super-human, and one of the strongest and most widely-used programs is Stockfish.

        This is just a general effort to describe the present state of things. When they explicitly describe their evaluation process, they are sure to use the version number. They then _immediately_ drop the version number in subsequent usage which is culturally standard in research papers so they don't concern themselves with minute details of every single thing they find themselves redescribing. Believe me, you don't want to read the verbose version of this paragraph.

        > In chess, we evaluated PoG against Stockfish 8, level 20 [81] and AlphaZero. PoG(800, 1) was run in training for 3M training steps. During evaluation, Stockfish uses various search controls: number of threads, and time per search. We evaluate AlphaZero and PoG up to 60000 simulations. A tournament between all of the agents was played at 200 games per pair of agents (100 games as white, 100 games as black). Table 1a shows the relative Elo comparison obtained by this tournament, where a baseline of 0 is chosen for Stockfish(threads=1, time=0.1s).

      • ahefner 4 years ago

        I'd be interested to see that benchmark. A ~3 GHz Pentium 4 sounds like a good reference point for single threaded performance since it's a reasonably modern OoO microarchitecture and reflects the moment that clock scaling stopped.

        • littlestymaar 4 years ago

          With a smaller cache, a less efficient branch predictor and only SSE for SIMD, I'd be curious to see the benchmark too but I'd be surprised if it was close.

          I don't know if the RAM bandwidth being much lower would have an impact on CPU benchmark though.

    • tsbinz 4 years ago

      I obviously read it, otherwise I wouldn't have known which version they are using. They are banking on others, that do just skim the figures and tables, not noticing their usage of outdated baselines.

      • dontreact 4 years ago

        I honestly don’t care what version of stockfish they used and neither does most of their intended audience, for the reasons I stated.

  • david_draco 4 years ago

    Isn't the point comparing traditional heuristic techniques against DNN-learned techniques? I understand the latest Stockfish is edging quite close to AlphaZero's techniques, but maybe I am wrong.

    • tsbinz 4 years ago

      It does have the option to use a neural network (nnue) in its evaluation, but it is very different from what AlphaZero/Lc0 do. You can choose not to use it, so you still could have a "traditional" evaluation (which would still blow Stockfish 8 out of the water). Also, Stockfish 8 isn't the last version before they merged nnue ...

  • moondistance 4 years ago

    The abstract clearly states that the best chess and Go bots are not beaten: "Player of Games reaches strong performance in chess and Go, beats the strongest openly available agent in heads-up no-limit Texas hold’em poker (Slumbot)..."

  • nixed 4 years ago

    The same goes for Slumbot in poker: it's super old (from 2013), the game is played completely differently now, and current bots would destroy it.

    • bluecalm 4 years ago

      The problem with poker is that there is money to be made from having a strong AI so there is 0 incentive to release it. What's publicly available are solvers (which solve game abstractions similar to the full game but don't play themselves) and shitty bots.

    • scrozart 4 years ago

      As a commenter above noted, this work is about generality, being able to play every game, and not being the best at every game.

      • seoaeu 4 years ago

        The abstract claims they beat the "strongest openly available agent in heads-up no-limit Texas hold'em poker". To a non-expert that certainly sounds like they're claiming to be the best

        • antonvs 4 years ago

          "Openly available" is a strong constraint that's mentioned explicitly.

      • Skyy93 4 years ago

        As noted before, the reason for including old tech is to look better. Why not mention the current state of the art and show that a general player can come close to those results?

        This is just benchmark cherry picking and does not reflect real performance or comparison.

hervature 4 years ago

I think this is a good step forward that generalizes an algorithm to play both perfect and imperfect information games. However, table 9 shows (I believe it shows, it is not the most intuitive form), that other AIs (Deepstack, ReBeL, and Supremus) eat its lunch at poker. It also performs worse than AlphaZero at perfect information games. So, while a nice generalizing framework, probably will not be what you use in practice.

SuoDuanDao 4 years ago

I didn't even know about the book until I read the comments here, I thought it was a reference to the Grimes song. Funny coincidence the song and the engine would appear so close in time to one another.

  • Severian 4 years ago

    The Grimes song is a reference to the book too. She also has Marain subtitles in her video for "Idoru", which is the language used in The Culture. A weird mix of two authors' works (Idoru being William Gibson's), to be sure.

ArtWomb 4 years ago

This seems like a significant milestone in AI. I mean what can't an agent with mastery of "guided search, learning, and game-theoretic reasoning" accomplish?

  • ausbah 4 years ago

    modeling every task as a game seems like a big hurdle, or even just getting a working "environment"

WilliamDampier 4 years ago

so this is what Grimes latest song is about?

pixelpoet 4 years ago

Anyone else surprised to see that Demis Hassabis didn't have a hand in this research? Given his background as a player of many games, and involvement in a lot of their research.

BeenChilling 4 years ago

I want to see deepmind make a bot to play team based first person shooters like csgo and rainbow6 siege, to stack up five of them against a team of professional players.

  • fho 4 years ago

    Honestly, that probably won't be too interesting, as (a) one AI could perfectly control several agents (i.e. perfect coordination of global strategies) and (b) an AI has little to no reaction time and perfect aim (aimbots already have that), so I would expect it would quickly turn into a slaughterfest.

    • arlort 4 years ago

      What would be interesting would be 5 independent AIs (even just different instances of the same AI of course) using the same interface as human players, so the same controls and the same video output

      I am pretty sure aimbots access the internals of the game rather than reading the video output to identify the silhouette of the enemy.

    • ausbah 4 years ago

      IIRC multi-agent domains are in their own category specifically because a single agent posing as "multiple agents" usually can't solve such environments, you need multiple agents with varying degrees of dependence

    • gverrilla 4 years ago

      Same applies to dota2, and it was very interesting what they did there. But yeah first they would need to simulate how human players react and aim, or it would be impossible to play against.

    • LudwigNagasena 4 years ago

      (a) make them independent (b) add 100-200ms delay

    • arethuza 4 years ago

      "...such consummate skill, such ability, such adaptability, such numbing ruthlessness, such a use of weapons when anything could become weapon..."

  • ausbah 4 years ago

    that's what OpenAI did a couple years ago with Dota 2

    https://openai.com/five/

  • mensetmanusman 4 years ago

    They probably won’t for publicity reasons.

skinner_ 4 years ago

It would be awesome to have two interacting communities: AI experts building open source general game playing engines, and gaming fans writing pluggable rule specifications and UIs for popular games.

A bit of googling shows that there is a General Game Playing AI community with their own Game Description Language. I never really encountered them before, and the DeepMind paper does not cite them, either.

  • dpflug 4 years ago

    Last I looked, the GGP community is focused on perfect information games currently. I had the same thought, though.

cab404 4 years ago

SCP-like name for SCP-like neural network.

"SCP-29123 Player Of Games"

wiz21c 4 years ago

Couldn't resist :

https://www.youtube.com/watch?v=-1F7vaNP9w0

antonpuz 4 years ago

Anyone knows whether the agent is publicly available?

simonebrunozzi 4 years ago

Can this be realistically used by game companies to provide a much better AI experience for strategy games?

bkartal 4 years ago

Impressive work! Most authors, if not all, are from DeepMind Edmonton office.

crhutchins 4 years ago

I'll try to look into a brighter light into this one.

RivieraKid 4 years ago

Wow, it can beat a good poker bot, that is impressive.

loxias 4 years ago

Psh, wake me when it can play Mao. ;)

wly_cdgr 4 years ago

The future is so depressing

  • wetpaws 4 years ago

    Fun fact: The consensus between professional go and chess players is that all new AI systems (alphago, etc) have really revitalised the game and introduced incredible amount of new strategies and depth.

    • loxias 4 years ago

      I wish alphago was more "democratized" -- that is to say, I have many questions and experiments I'd love to run on it (a friend of mine and I have frequently pondered Go played in various different topological spaces, and I'd love to see an AI's result, for example).

      • kadoban 4 years ago

        Look into KataGo. It's an open source AI in the same general style as AlphaGo, with an emphasis on training speed. On 9x9 you can get up to superhuman really quickly on just a decent home machine (I think hours to days; I can't remember exactly, and it's probably improved since I looked).

        • elefantastisch 4 years ago

          You can also just download pre-trained models. Get those set up and then install Sabaki (https://sabaki.yichuanshen.de/) and connect it to your KataGo... instant (ok, a few hours probably if it's your first time setting it up) superhuman Go AI. There's even an npm package you can use to process SGF files and automatically score moves as good/questionable/bad + generate variations that were better choices: https://github.com/9beach/analyze-sgf/blob/master/README.en-...

          (Edit: Misread what the other poster was trying to do, but I'm leaving this here as a reference for anyone else who just wants to use KataGo on their own machine on their own Go games.)

        • loxias 4 years ago

          Awesome! Thanks! Checking it out now.

      • franknstein 4 years ago

        Fun idea. Did you reach any interesting conclusions?

    • jart 4 years ago

      Sad fact: Lee Sedol retired after AlphaGo defeated him.

      • jm547ster 4 years ago

        3 years after...

        • jart 4 years ago

          Here's what Lee Sedol said when he retired:

          > With the debut of AI in Go games, I've realized that I'm not at the top even if I become the number one through frantic efforts. https://en.yna.co.kr/view/AEN20191127004800315

          He'd been playing Go professionally for 24 years. I never said he ragequit. He's too great a man to do something like that. Lee instead apologized for his losses, stating "I misjudged the capabilities of AlphaGo and felt powerless" while emphasizing that the defeat was his own and "not a defeat of mankind". I imagine being the Hector of humanity is quite a burden to bear. His professional ratings then took a dive for a few years https://www.goratings.org/en/history/ before he announced his retirement. To this day he remains the only human being who's ever won a single game against AlphaGo.

          • Buttons840 4 years ago

            He's the last human to ever beat the strongest Go AI. I don't know if he's happy about it, but he'll have a special place in the history books because of that. And like Chess, the game of Go will continue to be played and loved.

        • dane-pgp 4 years ago

          I don't want to pull back the curtain too much, but surely DeepMind foresaw the possibility of AlphaGo winning and then Lee Sedol losing confidence or interest in the game, which would generate a load of bad publicity for them.

          So it would make sense for DeepMind's contract with him to contain a clause requiring him to continue playing go professionally for a few years (but not necessarily put much effort into it), as well as the standard non-disparagement clauses.

          In fact, I wouldn't be surprised if AlphaGo was programmed to throw the fourth game after securing the win with the first three of the five games. That gives Lee some bragging rights, and makes for a more hopeful story than "Computer stomps likeable human".

        • newswasboring 4 years ago

      • visarga 4 years ago

        Caching out at the height of his fame.

mudlus 4 years ago

Yawn, show me a computer that can make fun games

  • TaupeRanger 4 years ago

    You're getting downvotes but honestly I agree. Who cares about board games? We should've moved on from this once we "solved" chess and Go. There are more important things and it's not remotely surprising that a computer can beat a human when there's a simple, abstract optimization problem to throw computing power at. Make it creative...now that's a challenge worthy of the top AI talent.

  • Buttons840 4 years ago

    Solving the game comes before solving for fun. If we create an AI that can win, then we can hamper the AI in fun ways, or give it an altered objective function that maximizes the player's fun.

  • mbrodersen 4 years ago

    Yes indeed. AI research will only take a real step forward when it learns how to be creative instead of just very good at optimising simple formal systems like board games.

  • baq 4 years ago

    if making games is a game...
