Guess the daily Wordle in one try using the tweet distribution

kaggle.com

564 points by benhamner 4 years ago · 121 comments (118 loaded)

jeffchien 4 years ago

Waiting for a bot that guesses based on Google Trends:

* Wordle 218: https://i.imgur.com/PbYfLm6.jpg
* Wordle 221: https://i.imgur.com/pTPbquL.jpg

  • avipars 4 years ago

    We'll need to do Wordle-SEO in 2022 ;)

    But I hope the game is increasing people's literacy... They should make a version for SAT words.

srcreigh 4 years ago

My favourite part of this is accounting for some fake grids.

Kudos! I have been so curious lately as to whether this was possible.

EDIT: The next question is which (if any) of these signals can be removed and still get it in 1 guess. Or if there are any other signals. Or how many tweets are needed (is 50 enough? 10? or 1000? 10k?)

  • benhamnerOP 4 years ago

    This uses ~2-3k tweets per day for most days, which seems to be more than enough. According to https://twitter.com/WordleStats/status/1486021209015963649 there are about 250k daily tweets per Wordle right now, so this is about a 1% sample coming from whatever the Twitter search API returned when I ran that query.

    The simulated distributions it's comparing to are based on 1000 runs per 5-letter word.

    Anecdotally, 250 tweets were enough to get it working against those simulated distributions; at 100 and below it became increasingly noisy. A higher N would be nice, but I didn't spend more time optimizing the performance of the simulation code beyond what was needed to get this working.
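The comparison described above (observed tweet rows against per-word simulated distributions) can be sketched roughly as follows. This is not the article's code: the random-guess player model, the cosine similarity measure, the toy `WORDS` list, and the names `simulate`, `cosine`, and `best_match` are all assumptions made for illustration. Only the Y/M/N alphabet, the 1000 runs per word, and the 250-row sample size come from the thread.

```python
import math
import random
from collections import Counter


def score(answer: str, guess: str) -> str:
    """Y = right place, M = elsewhere in the answer, N = absent or used up."""
    out = ["Y" if a == g else "N" for a, g in zip(answer, guess)]
    # Answer letters not matched exactly are the only ones that can earn an M.
    spare = Counter(a for a, g in zip(answer, guess) if a != g)
    for i, g in enumerate(guess):
        if out[i] == "N" and spare[g] > 0:
            out[i] = "M"
            spare[g] -= 1
    return "".join(out)


def simulate(answer, guesses, runs, rng):
    """Pattern frequencies from `runs` uniformly random guesses at one answer."""
    counts = Counter(score(answer, rng.choice(guesses)) for _ in range(runs))
    return {pattern: n / runs for pattern, n in counts.items()}


def cosine(p, q):
    """Cosine similarity between two sparse frequency dictionaries."""
    dot = sum(v * q.get(k, 0.0) for k, v in p.items())
    norm = math.sqrt(sum(v * v for v in p.values())) * math.sqrt(
        sum(v * v for v in q.values()))
    return dot / norm if norm else 0.0


def best_match(observed, candidates, guesses, runs=1000, seed=0):
    """Return the candidate whose simulated distribution is closest to `observed`."""
    rng = random.Random(seed)
    return max(candidates,
               key=lambda w: cosine(observed, simulate(w, guesses, runs, rng)))


WORDS = ["crane", "slate", "pious", "mount", "light", "bread"]  # toy list
# Stand-in for ~250 parsed tweet rows; a real run would count rows from tweets.
observed = simulate("crane", WORDS, 250, random.Random(1))
print(best_match(observed, WORDS, WORDS))
```

With a list this small the per-word distributions differ a lot, so the hidden word should be recovered easily. How quickly the match degrades as the sample shrinks would need to be measured on real data.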

    • bscphil 4 years ago

      This is a cool project, but I wanted to tell you that your evaluate_guess function is wrong.

          evaluate_guess(answer="crest", guess="erase")
          "MYNYM"
      
      Many people misunderstand this but it's not how the rules actually work. Correct here would be MYNYN, because there is only one E in the correct answer. There must be a 1-1 correspondence between any 'M' letter in the guess and the letter in the answer. This is similar to the rules for the game "Mastermind".
      • waterproof 4 years ago

        Right, I wonder how many of the “fake/invalid” tweets that OP observed are actually this bug in the analysis code.

        EDIT: actually it looks like it’s correct - evaluate_guess_char() only returns “M” if there’s an instance of the guess letter that’s not accounted for.

        • bscphil 4 years ago

          It's not correct, I pasted the code from the article directly into ipython.

          It filters out cases where the corresponding character in the answer is correct (a 'Y'), but not cases where that answer letter has already been used for another maybe (an 'M'). The latter requires keeping track of state in a way that this doesn't.

          For example:

              evaluate_guess(answer="crest", guess="erase")
              'MYNYM'
          
          Which is wrong, as stated above.

              evaluate_guess(answer="crest", guess="erese")
              'NYYYN'
          
          Which is right, even though we only changed the middle letter of the guess, not either of the broken letters. In this case the filtering works correctly.
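One way to keep the state bscphil describes is a two-pass scorer: mark exact matches first, then hand out 'M' only while unmatched copies of that letter remain in the answer. The sketch below reuses the name `evaluate_guess` and the Y/M/N codes from the thread, but it is an independent illustration, not the article's code.

```python
from collections import Counter


def evaluate_guess(answer: str, guess: str) -> str:
    """Score a guess: Y (right place), M (elsewhere), N (absent or used up)."""
    result = ["N"] * len(guess)
    unmatched = Counter()
    # Pass 1: mark exact matches and count the answer letters left over.
    for i, (a, g) in enumerate(zip(answer, guess)):
        if a == g:
            result[i] = "Y"
        else:
            unmatched[a] += 1
    # Pass 2: give out 'M' left to right while unmatched copies remain.
    for i, g in enumerate(guess):
        if result[i] != "Y" and unmatched[g] > 0:
            result[i] = "M"
            unmatched[g] -= 1
    return "".join(result)


print(evaluate_guess(answer="crest", guess="erase"))  # MYNYN
print(evaluate_guess(answer="crest", guess="erese"))  # NYYYN
```

Each 'M' consumes one unmatched answer letter, which gives the 1-1 correspondence mentioned above.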
    • Scaevolus 4 years ago

      If you want to get even more tweets, you could use twitter's streaming API with the keyword "Wordle": http://adilmoujahid.com/posts/2014/07/twitter-analytics/

      It should allow capturing a significant fraction of the 250k daily wordle tweets.

    • gojomo 4 years ago

      Besides eliminating the superficially-impossible rows (like `YYYYM`), does it do anything against more-sophisticated chaffing, like one or more accounts posting possible-but-inaccurate hint grids pointing at an alternate answer?

  • not2b 4 years ago

    As the article explains, a grid containing, say, GGYGG, is fake. Finding more complex fakery is more difficult.

    (Edit: drat, HN filters out the Unicode colored-block characters).

del_operator 4 years ago

Two or three guesses with Wordle using the ETAOIN SHRDLU I learned doing cryptopals has been very effective at reaching a solution.

I usually have a first guess like SAINT, then something like SCARE, CORED, etc., eliminating vowels and frequent consonants while also considering the most likely sequencing of matched characters or remaining characters.

Also eliminating S, T, C really reveals there’s no TH, SH, SP, CK, etc and is one factor that gets me suspicious of repeated chars or rarer k, g and x combos.

rogerallen 4 years ago

SPOILER ALERT: shows today's answer!

nilay 4 years ago

Or take all the fun away and just get it through browser console ¯\_(ツ)_/¯

  JSON.parse(localStorage.gameState).solution
  • dag11 4 years ago

    Or you can look at the source and see the list of words. They're sequential, so you know every future word. The "Wordle 222" is actually just the 222 index into the array.

    • theiasson 4 years ago

      That's a real bummer. Is there any way the author can prevent this? Can he generate random indexes that don't repeat while keeping this random number generator code public?

      • grenoire 4 years ago

        The only way to address this would be validating the answers server-side. Any information you leak in the locally executed code can be discovered without much effort.

        I think Wordle doesn't server-side validate because of volume, and also because it's a fun little game and cheating brings you nothing of value.

    • tomjakubowski 4 years ago

      Wait, did Wordle start counting from 0?
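The date-to-index lookup described in this subthread can be sketched as below. The epoch date, the zero-based numbering, and the placeholder `ANSWERS` list are assumptions for illustration. The real list and start date live in the game's JavaScript and are not reproduced here.

```python
from datetime import date

# Assumption: an illustrative epoch on which puzzle number 0 is published.
EPOCH = date(2021, 6, 19)
# Placeholder words, not the real answer list.
ANSWERS = ["alpha", "bravo", "delta"]


def puzzle_number(day: date, epoch: date = EPOCH) -> int:
    """Days elapsed since the epoch, so the first puzzle is number 0."""
    return (day - epoch).days


def answer_for(day: date, answers=ANSWERS, epoch: date = EPOCH) -> str:
    """Look up the answer by index, wrapping around when the list runs out."""
    return answers[puzzle_number(day, epoch) % len(answers)]


print(puzzle_number(date(2022, 1, 27)))  # 222 under the assumed epoch
```

Because the client can compute this index from nothing but the date, anyone who reads the source can do the same for future dates.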

beepbooptheory 4 years ago

This is brilliant, and something I had the intuition was possible but couldn't put together myself. What was missing from my thought process, I think, was taking into account how commonly words occur in English in general. Plus how to deal with static.

Just so cool someone put this together, major props.

Karunamon 4 years ago

Very cool!

One minor improvement here; if the user has toggled colorblind mode on, then their tweeted result will also have altered color blocks. Orange for right letter right place, and blue for right letter wrong place.

  • evandale 4 years ago

    That's a really neat attention to detail! I haven't seen the colourblind boxes.
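Handling both palettes could be as small as a lookup table that maps every square to the same Y/M/N code before any counting happens. The sketch below assumes the shared squares are the standard large-square emoji; the names `SQUARES` and `parse_grid` are made up for this example.

```python
# Map both the default and the colorblind palette onto Y/M/N.
SQUARES = {
    "\U0001F7E9": "Y",  # green square: right letter, right place
    "\U0001F7E7": "Y",  # orange square: colorblind equivalent of green
    "\U0001F7E8": "M",  # yellow square: right letter, wrong place
    "\U0001F7E6": "M",  # blue square: colorblind equivalent of yellow
    "\u2B1B": "N",      # black large square (dark theme)
    "\u2B1C": "N",      # white large square (light theme)
}


def parse_grid(text: str) -> list[str]:
    """Extract the five-square rows from a shared result, ignoring other text."""
    rows = ["".join(SQUARES[ch] for ch in line if ch in SQUARES)
            for line in text.splitlines()]
    return [row for row in rows if len(row) == 5]


tweet = "Wordle 222 3/6\n\n\u2B1B\U0001F7E6\u2B1B\u2B1B\U0001F7E7\n\U0001F7E9\U0001F7E9\U0001F7E8\u2B1C\u2B1B\n"
print(parse_grid(tweet))  # ['NMNNY', 'YYMNN']
```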

csours 4 years ago

My metagame is guessing my friend's guesses.

  • aidenn0 4 years ago

    I always lead with STOAE so people don't have issues guessing my first one at least. I also tend to follow with UNLID if STOAE has zero or one hits.

    • csours 4 years ago

      ARTSY MODEL CHUNK here. It's not optimal, but it is pleasing to me in an aesthetic sense.

      • balls187 4 years ago

        I vary up my starting words to keep things interesting.

        Common ones for me are: MEATY, BISON, CHUMP, GROUP

        From yesterday's post on the state of the art, I tried SALET, but still took me 4 tries to get today's wordle.

        • jameshart 4 years ago

          If you stick with one starting word, and that word is in the set of possible Wordle answers, then some day, one day, if you keep playing forever, you are guaranteed to get that magical 1/6.

          This is not quite the same as the lottery-player's fear that they change their lucky numbers and then those numbers come up the next week... the lottery has no memory, so it really doesn't change your odds when you change your numbers. But Wordle's drawing words from a finite pool.

          Of course, if your go-to starting word is NOT in the set (looking at YOU overly optimized people who play crazy words like STOAE that are almost certainly not in the answer set...) then by sticking with that you're guaranteeing you'll never do better than 2/6...

        • gs17 4 years ago

          I turned on hard mode, and it's really forced me to change how I play. You can only really have one "starting word" unless you match zero letters.

          • function_seven 4 years ago

            After seeing an asterisk in one of my friends' shares, I'm now forced to play on hard mode as well. I can't risk the peer-shame of skating on easy mode anymore :)

            So that being said, I'm bracing myself for the curses-of-early-success this will lead me on. Right now I sometimes toss out a completely different word just to cover the search space. It has led me to quickly narrow down options. Am I screwed if my first guess matches on 2 letters? (Say, "___ES").

            I guess that's why it's called "Hard Mode" to begin with.

        • sterlind 4 years ago

          I've had fun with MUSTH.

      • nathancahill 4 years ago

        STEAM HOUND has been a winner for me.

      • dumbfounder 4 years ago

        STARE, CHIMP, BLOND or BOUND depending on how many vowels I hit. Sometimes FLUNK.

      • amrrs 4 years ago

        Given the situation we are in I start with VIRUS, PEACH always

      • cheeze 4 years ago

        SALTY URINE

    • __d 4 years ago

      AROSE for me. 5 of the top 6 most-frequent letters. But hard mode, so next depends.

  • tedd4u 4 years ago

    Glad to hear I am not the only one squandering my time doing this :) Might be a fun program to write. I find it’s often hard to guess more than the line prior to the win.

Boom_Rang 4 years ago

I like that it's robust to adversarial tweets!

I did something similar last week using the Twitter Stream API: https://github.com/basile-henry/twitter-wordle

It's not resistant to adversarial tweets, but it usually collects enough tweets to have an answer in around 1 minute, so it's not too bad to restart if some bad tweets were sampled.

Maybe I should try to use your wordle-tweets dataset to make it work offline as well. :)

rkimb 4 years ago

This is a really cool approach, definitely did not think of trying this! If you'd prefer to play without the crowdsourced data, I spent a couple hours on the following dictionary search algo yesterday which can typically solve puzzles in 3-4 guesses: https://github.com/rgkimball/wordlebot

  • keredson 4 years ago

    nice! i did similar, but used character frequencies in the remaining word sets to rank: https://github.com/keredson/wordle_solver

    • rkimb 4 years ago

      I tried yours out, nice work yourself! Seems we took a similar approach in recalculating the letter distributions based on remaining words - both our algos solved it in 4 turns today.

      If I may make two small suggestions as a user, I noticed you have a dictionary with nearly 13k words which often results in invalid suggestions like 'clery' and 'meryl'. In testing I found the Scrabble dictionary to be much more likely to yield valid Wordle words (found here: https://github.com/redbo/scrabble), though the official Wordle answers tend to be an even smaller set of ~2,500 common words.

      Second, though the implementation is very clean in code (much more concise than mine!), I found the use of the green/gray/yellow methods to be a bit cumbersome when adding constraints. You could wrap these three in a method like guess(word, reply) where your response encodes the feedback as something like [g]=green, [b]=black, [y]=yellow:

      Given: [('arose', 27122), ('aeros', 27122), ('seria', 27095), ('riesa', 27095)]

      >>> w.guess('arose', 'bybby')

      vs.

      >>> w.gray('aos')
      >>> w.yellow('r', 2)
      >>> w.yellow('e', 5)

      You could even have the guess method trigger a new round of suggestions since the response implies that we've advanced a turn.
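A minimal sketch of that `guess(word, reply)` idea, combined with ranking by letter frequency over the remaining words, might look like this. It is not code from either linked repository: the `Solver` class, its methods, and the toy word list are hypothetical, and the reply string uses the g/b/y codes proposed above.

```python
from collections import Counter


def feedback(answer: str, guess: str) -> str:
    """Return g/y/b codes for a guess, handling repeated letters in two passes."""
    codes = ["g" if a == g else "b" for a, g in zip(answer, guess)]
    spare = Counter(a for a, g in zip(answer, guess) if a != g)
    for i, g in enumerate(guess):
        if codes[i] == "b" and spare[g] > 0:
            codes[i] = "y"
            spare[g] -= 1
    return "".join(codes)


class Solver:
    def __init__(self, words):
        self.remaining = list(words)

    def suggest(self, n=3):
        """Rank remaining words by how many remaining words share their letters."""
        freq = Counter(ch for w in self.remaining for ch in set(w))
        return sorted(self.remaining,
                      key=lambda w: sum(freq[ch] for ch in set(w)),
                      reverse=True)[:n]

    def guess(self, word, reply):
        """Keep only words that would have produced `reply`, then re-suggest."""
        self.remaining = [w for w in self.remaining
                          if feedback(w, word) == reply]
        return self.suggest()


w = Solver(["arose", "fried", "tired", "cider", "wider", "liver"])
print(w.guess("arose", "bybby"))
print(w.remaining)
```

Filtering by "would this candidate have produced the same reply" replaces the separate gray/yellow/green constraints with a single check, and it gets repeated letters right as long as `feedback` does.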

  • Txoko 4 years ago

    Hyperwordle works for me. It defines the usable letters right on your keyboard.

  • Txoko 4 years ago

    Hyperwordle defined the usable letters right on your keyboard. Thanks!

freeslave 4 years ago

Cool but don't read all the way if you haven't done today's Wordle!

  • benhamnerOP 4 years ago

    Sorry about that! Just updated so today's guess is hidden by default (and you can click-to-unveil)

siruva07 4 years ago

This is the HN I’m here for. Brilliant.

swatts999 4 years ago

This is super smart. I wonder how many tweets this approach needs each day to converge to the correct answer? It would be interesting to see some plots vs. num tweets

jonny_eh 4 years ago

This is your regular reminder that today's word, and all the upcoming ones, are located in the Wordle minified JS.

  • lkbm 4 years ago

    Yes, we know. We can also easily "solve" a crossword puzzle by waiting a day and just copying down the published answers.

    People are having fun solving puzzles in clever ways. This post is an exceptionally clever way of solving a puzzle in an unexpected way, using forensic data analysis, which is itself something of interest to a lot of us.

  • yen223 4 years ago

    It's too easy to cheat at Wordle, even if you don't know what html/javascript is. Just open a new browser, solve it there, and enter the solution in your main browser.

    In the age of intrusive anti-cheat software and byzantine security measures, the fact that Wordle doesn't attempt to prevent cheating is something I find weirdly charming.

  • thechriswalker 4 years ago

    Shameless plug, but I built a site to do this the other way around.

    Enter a 5 letter word and it'll tell you when it will next be the Wordle solution.

    https://atom.7r.pm/whendl/

  • databased 4 years ago

    Yep, you can try https://wordhoot.com if knowing that the answers are accessible somehow decreases your enjoyment of the game.

  • deadbunny 4 years ago

    You can also just run a command in the console to get the answer. But where is the fun in that?

codeflo 4 years ago

I wonder what percentage of these Twitter posts are fake (“OMG so lucky LOL”).

jsmith99 4 years ago

If you know someone always starts with SLATE it makes it easier...

  • jedberg 4 years ago

    I start with salet. Same letters but more likely to be in the right place.

alana314 4 years ago

I use (not today's puzzle):

    cat /usr/share/dict/words | grep -v 'w' | grep -v 'a' | grep 't' | egrep '^...er$'
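For comparison, the same three filters (letters ruled out, letters required, a positional pattern) can be written as one Python function. The function name, parameter names, and sample words below are made up for this sketch; reading `/usr/share/dict/words` is left as a comment because that file is not present on every system.

```python
import re


def filter_words(words, absent="", present="", pattern=".{5}"):
    """Keep words that match `pattern`, avoid `absent`, and contain `present`."""
    rx = re.compile(pattern)
    return [w for w in words
            if rx.fullmatch(w)
            and not any(c in w for c in absent)
            and all(c in w for c in present)]


sample = ["tiger", "water", "otter", "taker", "timer", "tower"]
print(filter_words(sample, absent="wa", present="t", pattern="...er"))
# ['tiger', 'otter', 'timer']

# With a system word list (path taken from the command above):
# with open("/usr/share/dict/words") as f:
#     words = [line.strip() for line in f]
```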

Asraelite 4 years ago

What are the "⬛ squares taking social media by storm"?

  • zwieback 4 years ago

    That was my question and it's still not answered well here. I guess people post the colors that led up to their solution but not the actual letters.

    Might be good to add it to the original post for clarification. I play Wordle but didn't quite get what they were using for source data.

    • bscphil 4 years ago

      It wasn't obvious to me when I first started playing Wordle, but you can actually share an emoji-fied version of your game (without the letters) by clicking "Share" when the statistics window pops up. I didn't think to do that at first, but when I noticed everyone on social media posting their Wordles with the exact same format, I figured it had to be buried in the game somewhere.

  • devoutsalsa 4 years ago

    It's the text you copy/paste to share your Wordle results on social media.

  • imadethis 4 years ago

    Huh, I thought HN stripped out emojis from posts. Has that changed, or is there a limited subset that are available?

    • zerocrates 4 years ago

      The black and white squares are in the older Unicode "Miscellaneous Symbols and Arrows" block so I guess they're allowed. Several things like that are sort of "retroactively" emoji... there's a "display as emoji" or "display as text" character you can put after them.

      What HN does and doesn't allow seems somewhat arbitrary, things like the star emoji are in that same block and yet are not allowed as far as I can tell.

    • croes 4 years ago

      Maybe because ■ is part of ASCII?

      Edit: ⬛⬜ work, the colored ones don't.

hkmaxpro 4 years ago

> Note that all of these 243 possibilities aren't valid in practice. For example YYYYM will never be seen because if the first four letters are correctly placed and the fifth is also in the word, it will be correctly placed.

Not true. For example if the correct answer is TWEED and you guess TWEET, then you’ll get YYYYM.

Edit: As pointed out by two commenters, the actual implementation contradicts the following claim in the post:

> “Maybe” - the letter is in the answer but in a different position

If the correct answer is TWEED and you guess TWEET, you will still get YYYYN, because the actual implementation uses a different definition of “Maybe” than what is written in the post.

  • unholiness 4 years ago

    I believe in the actual implementation it's correct. You can confirm a letter is not doubled, when one of the two letters in your guess is gray.

    • hkmaxpro 4 years ago

      I only played it occasionally and haven’t encountered doubled letters, so I don’t know what the actual response would be. Maybe you’re right.

      • jacobmischka 4 years ago

        Posting with such assurance without knowing what the actual response would be, lmao. Very on-brand for HN (and myself too, honestly).

        • hkmaxpro 4 years ago

          I blindly trusted the post’s claim that “Maybe” means “the letter is in the answer but in a different position”. If you use that definition, you’ll arrive at the same conclusion.

          The post should be updated with the correct definition.

          • jacobmischka 4 years ago

            It's a fine short summary for how the game works. Assuming no edge cases exist based on a 10-word summary is not the original author's fault.

  • KeytarHero 4 years ago

    > if the correct answer is TWEED and you guess TWEET, then you’ll get YYYYM

    No, this would give YYYYN

    • hkmaxpro 4 years ago

      If Yellow/“Maybe” really means “the letter is in the answer but in a different position”, then the final T satisfies this definition.

      Another commenter points out the actual implementation may deviate from this definition though.

      • KeytarHero 4 years ago

        Try TWEET in today's Wordle and you'll see what I mean

        • hkmaxpro 4 years ago

          I agree with you. The first E gives a “Maybe” and the second E gives a “No”.

          A “Maybe” response gives much more information than simply “the letter is in the answer but in a different position”.

        • a_t48 4 years ago

          :( Spoiled on hacker news, what a world
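The 243 patterns and the TWEED/TWEET case from this subthread can be checked by brute force. The sketch below scores every answer/guess pair over a three-letter alphabet, an assumption made purely to keep the search small, and lists the patterns that never occur. The `score` function is a two-pass illustration of the duplicate-letter rule, not the game's or the article's code.

```python
from collections import Counter
from itertools import product


def score(answer: str, guess: str) -> str:
    """Y = right place, M = elsewhere in the answer, N = absent or used up."""
    out = ["Y" if a == g else "N" for a, g in zip(answer, guess)]
    spare = Counter(a for a, g in zip(answer, guess) if a != g)
    for i, g in enumerate(guess):
        if out[i] == "N" and spare[g] > 0:
            out[i] = "M"
            spare[g] -= 1
    return "".join(out)


# Three outcomes per letter over five letters gives 3**5 = 243 patterns.
all_patterns = {"".join(p) for p in product("YMN", repeat=5)}

# Assumption: a 3-letter alphabet, so only 243 * 243 pairs need scoring.
strings = ["".join(s) for s in product("abc", repeat=5)]
reachable = {score(a, g) for a in strings for g in strings}
never_seen = sorted(all_patterns - reachable)

print(len(all_patterns))        # 243
print(score("tweed", "tweet"))  # YYYYN
print(never_seen)
```

Any pattern with four Y and one M must appear in `never_seen`: once four letters are matched in place, the only unmatched answer letter sits opposite the one wrong guess letter, so that letter cannot be an M. Patterns that need a larger alphabet would also show up here, so the list is an upper bound on the truly impossible rows.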

olliej 4 years ago

Something I’ve wondered about is how well you can guess what people’s guesses were from the images they post.

smaudet 4 years ago

I mean, I guess - or you can just use a private tab. Same difference, except way less complex.

mtoner23 4 years ago

Amazing, what a brilliant idea

cafed00d 4 years ago

Computer nerds strike wordle! Goddammit! How long before Skynet. Sigh. The end is nigh

/s

madcow2011 4 years ago

This is genius. I love it.

jl6 4 years ago

Lessons to be learned in the field of data anonymization!

  • tills13 4 years ago

    the point of wordle is to be simple, fun, and social so I don't think there's anything further to be learned here.

mrfusion 4 years ago

Would this be a good use of a hidden Markov model?

bushbaba 4 years ago

Guess the daily wordle by inspecting source code…yes every word is hard coded in the JavaScript in calendar order

  • jaredsohn 4 years ago

    Or use incognito for your first attempt at the puzzle and then redo it how you want in your normal account.

  • thamer 4 years ago

    Or open your browser's dev tools and type:

        $('game-app').solution
quotha 4 years ago

And people think bitcoin wastes energy.

faeyanpiraat 4 years ago

What is Wordle?

jcpham2 4 years ago

Can someone TL:dr what this Wordless thing is?

  • drdeca 4 years ago

    My understanding (I've never played it) is:

    each day there is a 5 letter word. You have a limited number of guesses as to what word it is (iirc 6 guesses). When you make a guess, it marks each letter with something indicating whether the actual word had that letter in that position, whether that word has a copy of that letter (and not one you already found), or whether that letter does not appear in that word.

    All of your guesses have to be words.

    In hard mode, all of your guesses have to contain all of the letters which you got right in a previous guess.

    At the end (if you get the word within 6 guesses?) you are given an option to share (on twitter mostly, I think) a representation of your game, in a way that doesn't reveal what words you guessed or what the final word was, just which positions had which of the 3 markings, which, in this share feature, are represented using emoji with the colored square blocks.

    This results in many people posting grids of colored square blocks, followed by some fraction out of 6.

ouid 4 years ago

Now someone make an adversarial twitter bot.

Axien 4 years ago

I am trying so hard to not know what Wordle is or how to play. Now it is showing up on Hacker News? Damn. I’ve not had this much trouble since I avoided Sudoku.

nerdjon 4 years ago

I like the idea of Wordle

But I hate that any guesses have to be words in its dictionary.

As someone who was never really a fan of crosswords, the need to find a real word that fits 5 letters every time severely limits how I can enjoy it.

not2b 4 years ago

You can guess it in one try by carefully reading the code. There is no server that knows the correct answer. The client already knows, based on the date. That is why you can only play once per day.
