Wordle as a Two-Player Game


Some time ago I implemented Adversarial Wordle (AKA Evil Wordle AKA Absurdle) on my website. It differs from normal Wordle in that it changes the word while you're guessing. (I did not mention on the page that it was adversarial, which allowed me to mess with my friends.) Of course playing this silly game can be fun, but it would be even more fun to overanalyze it in a game-theoretic framework and code a bot that can play both Adversarial Wordle and normal Wordle!

This is an example of competitive analysis, where you take an existing problem and lift it to a competitive setting with an opponent that intentionally tries to make the solution as difficult for you as possible. This keeps us from making subjective assumptions about the "probability" (what even is that) that a word is the correct solution, as the NYTimes WordleBot does. A strategy that works for Adversarial Wordle will also work for regular Wordle (though maybe not as well?) and is more general and objectively grounded.

Wordle has already been solved a thousand times over, so I'm not inventing anything new here; instead I want to discuss my problem solving process, relate the findings to existing theory, and introduce you to concepts you might not have heard about.

The Rules

In the normal Wordle game, one player (let's call them the Hinter) chooses a 5-letter word from a set of possible words, for instance the English dictionary. The other player (let's call them the Guesser) now has 6 guesses to find it, each time getting hints about which letters of their guess are correct or shuffled with respect to the hidden word.

However, there is no reason the Hinter can't cheat by changing the hidden word during play. After all, they only have to disclose the correct word at the end! They just have to make sure that all of the Guesser's previous guesses and the hints are still valid with respect to the newly chosen word. We have every ingredient of a game in the mathematical sense.

  • Two players: the Guesser and the Hinter.
  • A "game board" or the game state: the current list of chosen words and the hints assigned to them.
  • Moves: every turn, the Guesser picks a guess, and the Hinter gives back gray/yellow/green hints. Note: the hints given back by the Hinter are only legal if there is still at least 1 word from the possible set that matches the current and all previous hints.
  • Win conditions: the Hinter wins if they can give a word at the end that still matches all of the previous hints but was not picked by the Guesser, while the Guesser wins if the Hinter gives back all-green.

For convenience, let \( W \) be the set of possible words. This could be all 5-letter words in the English dictionary, but also something smaller if you want to make it easier for the Guesser. Define \( \Sigma = \{ A, \dots, Z \} \) to be the alphabet. Finally, let \( \ell \) be the number of turns, so in standard Wordle \( \ell = 6 \).

Here's an example game, where \( n \) = no (letter does not occur), \( y \) = yes (letter in correct position), \( s \) = shuffled (letter in different position):

| Guesser | Hinter | Remaining |
| --- | --- | --- |
| | | \( W = \{ \mathit{CRANE}, \mathit{ALONE}, \mathit{SHAME}, \mathit{ATOLE} \} \) |
| \( \mathit{SHAME} \) | \( nnsny \) | \( \{ \mathit{ALONE}, \mathit{ATOLE} \} \) |
| \( \mathit{ALONE} \) | \( ysyny \) | \( \{ \mathit{ATOLE} \} \) |
| \( \mathit{ATOLE} \) | \( yyyyy \) | (Guesser wins) |

Or, as you would see it while playing the actual Wordle game:

S H A M E
A L O N E
A T O L E

Game Analysis

If we want to design an algorithm to play this game (for either side), we should note the following properties.

  • We have perfect information. This might be confusing since there is supposed to be a "hidden word" that the Guesser is meant to find, but in reality there is no hidden word, just a finite set of remaining words that the Hinter could still use as the hidden word. Both the Guesser and the Hinter can calculate exactly what these words are.
  • The game always takes a finite number of moves, and there are finitely many legal moves every turn.

Thus, this belongs to the field of combinatorial game theory (as opposed to "economic" game theory).

  • Both sides have different possible moves.

Thus, it is a partisan game and not an impartial game; an impartial game, like Nim, is one where the same moves are available to both players.

Unfortunately there is no good general algorithm for solving partisan games like there is for impartial games, but we can still look at the game tree. This is a tree where the nodes are game board states, the edges are the moves, and the root is the starting position. For instance, for \( W = \{ \mathit{CRANE}, \mathit{ALONE}, \mathit{SHAME}, \mathit{ATOLE} \} \) and \( \ell = 3 \), the tree has depth \( 2\ell \) and looks as follows. The example game follows the path \( \mathit{SHAME} \rightarrow nnsny \rightarrow \mathit{ALONE} \rightarrow ysyny \rightarrow \mathit{ATOLE} \rightarrow yyyyy \).

state (no hints)
+ ATOLE -> state (no hints), ATOLE
  - yyyyy -> player 1 win
  - ynysy -> state ATOLE:ynysy
    + CRANE -> state ATOLE:ynysy, CRANE
      - nnsyy -> state CRANE:nnsyy, ATOLE:ynysy
        + SHAME -> state CRANE:nnsyy, ATOLE:ynysy, SHAME
          - nnsny -> player 2 win
        + ALONE -> state CRANE:nnsyy, ATOLE:ynysy, ALONE
          - yyyyy -> player 1 win
    + SHAME -> state ATOLE:ynysy, SHAME
      - nnsny -> state ATOLE:ynysy, SHAME:nnsny
        + CRANE -> state ATOLE:ynysy, SHAME:nnsny, CRANE
          - nnsyy -> player 2 win
        + ALONE -> state ATOLE:ynysy, SHAME:nnsny, ALONE
          - yyyyy -> player 1 win
    + ALONE -> state ATOLE:ynysy, ALONE
      - yyyyy -> player 1 win
  - snnny -> state ATOLE:snnny
    + SHAME -> state ATOLE:snnny, SHAME
      - yyyyy -> player 1 win
      - nnyny -> state ATOLE:snnny, SHAME:nnyny
        + CRANE -> state ATOLE:snnny, SHAME:nnyny, CRANE
          - yyyyy -> player 1 win
        + ALONE -> state ATOLE:snnny, SHAME:nnyny, ALONE
          - snnyy -> player 2 win
    + CRANE -> state ATOLE:snnny, CRANE
      - yyyyy -> player 1 win
      - nnyny -> state ATOLE:snnny, CRANE:nnyny
        + SHAME -> state ATOLE:snnny, CRANE:nnyny, SHAME
          - yyyyy -> player 1 win
        + ALONE -> state ATOLE:snnny, CRANE:nnyny, ALONE
          - snnny -> player 2 win
    + ALONE -> state ATOLE:snnny, ALONE
      - snnny -> state ATOLE:snnny, ALONE:snnny
        + CRANE -> state ATOLE:snnny, ALONE:snnny, CRANE
          - nnyny -> player 2 win
        + SHAME -> state ATOLE:snnny, ALONE:snnny, SHAME
          - yyyyy -> player 1 win
      - snnyy -> state ATOLE:snnny, ALONE:snnyy
        + CRANE -> state ATOLE:snnny, ALONE:snnyy, CRANE
          - yyyyy -> player 1 win
        + SHAME -> state ATOLE:snnny, ALONE:snnyy, SHAME
          - nnyny -> player 2 win
+ SHAME -> state (no hints), SHAME
  - yyyyy -> player 1 win
  - nnsny -> state SHAME:nnsny
    + CRANE -> state SHAME:nnsny, CRANE
      - nnsny -> state SHAME:nnsny, CRANE:nnsny
        + ATOLE -> state SHAME:nnsny, CRANE:nnsny, ATOLE
          - yyyyy -> player 1 win
        + ALONE -> state SHAME:nnsny, CRANE:nnsny, ALONE
          - ysyny -> player 2 win
      - nnsyy -> state SHAME:nnsny, CRANE:nnsyy
        + ALONE -> state SHAME:nnsny, CRANE:nnsyy, ALONE
          - yyyyy -> player 1 win
        + ATOLE -> state SHAME:nnsny, CRANE:nnsyy, ATOLE
          - ynysy -> player 2 win
    + ALONE -> state SHAME:nnsny, ALONE
      - yyyyy -> player 1 win
      - ysyny -> state SHAME:nnsny, ALONE:ysyny
        + CRANE -> state SHAME:nnsny, ALONE:ysyny, CRANE
          - nnsny -> player 2 win
        + ATOLE -> state SHAME:nnsny, ALONE:ysyny, ATOLE
          - yyyyy -> player 1 win
    + ATOLE -> state SHAME:nnsny, ATOLE
      - yyyyy -> player 1 win
      - ynysy -> state SHAME:nnsny, ATOLE:ynysy
        + CRANE -> state SHAME:nnsny, ATOLE:ynysy, CRANE
          - nnsyy -> player 2 win
        + ALONE -> state SHAME:nnsny, ATOLE:ynysy, ALONE
          - yyyyy -> player 1 win
+ ALONE -> state (no hints), ALONE
  - yyyyy -> player 1 win
  - ysyny -> state ALONE:ysyny
    + ATOLE -> state ALONE:ysyny, ATOLE
      - yyyyy -> player 1 win
    + SHAME -> state ALONE:ysyny, SHAME
      - nnsny -> state SHAME:nnsny, ALONE:ysyny
        + ATOLE -> state SHAME:nnsny, ALONE:ysyny, ATOLE
          - yyyyy -> player 1 win
        + CRANE -> state SHAME:nnsny, ALONE:ysyny, CRANE
          - nnsny -> player 2 win
    + CRANE -> state ALONE:ysyny, CRANE
      - nnsny -> state CRANE:nnsny, ALONE:ysyny
        + ATOLE -> state CRANE:nnsny, ALONE:ysyny, ATOLE
          - yyyyy -> player 1 win
        + SHAME -> state CRANE:nnsny, ALONE:ysyny, SHAME
          - nnsny -> player 2 win
  - snnny -> state ALONE:snnny
    + SHAME -> state ALONE:snnny, SHAME
      - yyyyy -> player 1 win
    + ATOLE -> state ALONE:snnny, ATOLE
      - snnny -> state ALONE:snnny, ATOLE:snnny
        + SHAME -> state ALONE:snnny, ATOLE:snnny, SHAME
          - yyyyy -> player 1 win
        + CRANE -> state ALONE:snnny, ATOLE:snnny, CRANE
          - nnyny -> player 2 win
    + CRANE -> state ALONE:snnny, CRANE
      - nnyny -> state CRANE:nnyny, ALONE:snnny
        + SHAME -> state CRANE:nnyny, ALONE:snnny, SHAME
          - yyyyy -> player 1 win
        + ATOLE -> state CRANE:nnyny, ALONE:snnny, ATOLE
          - snnny -> player 2 win
  - snnyy -> state ALONE:snnyy
    + SHAME -> state ALONE:snnyy, SHAME
      - nnyny -> state SHAME:nnyny, ALONE:snnyy
        + ATOLE -> state SHAME:nnyny, ALONE:snnyy, ATOLE
          - snnny -> player 2 win
        + CRANE -> state SHAME:nnyny, ALONE:snnyy, CRANE
          - yyyyy -> player 1 win
    + ATOLE -> state ALONE:snnyy, ATOLE
      - snnny -> state ALONE:snnyy, ATOLE:snnny
        + SHAME -> state ALONE:snnyy, ATOLE:snnny, SHAME
          - nnyny -> player 2 win
        + CRANE -> state ALONE:snnyy, ATOLE:snnny, CRANE
          - yyyyy -> player 1 win
    + CRANE -> state ALONE:snnyy, CRANE
      - yyyyy -> player 1 win
  - nnyny -> state SHAME:nnyny
    + CRANE -> state SHAME:nnyny, CRANE
      - yyyyy -> player 1 win
    + ATOLE -> state SHAME:nnyny, ATOLE
      - snnny -> state SHAME:nnyny, ATOLE:snnny
        + ALONE -> state SHAME:nnyny, ATOLE:snnny, ALONE
          - snnyy -> player 2 win
        + CRANE -> state SHAME:nnyny, ATOLE:snnny, CRANE
          - yyyyy -> player 1 win
    + ALONE -> state SHAME:nnyny, ALONE
      - snnyy -> state SHAME:nnyny, ALONE:snnyy
        + CRANE -> state SHAME:nnyny, ALONE:snnyy, CRANE
          - yyyyy -> player 1 win
        + ATOLE -> state SHAME:nnyny, ALONE:snnyy, ATOLE
          - snnny -> player 2 win
+ CRANE -> state (no hints), CRANE
  - yyyyy -> player 1 win
  - nnsny -> state CRANE:nnsny
    + SHAME -> state CRANE:nnsny, SHAME
      - nnsny -> state CRANE:nnsny, SHAME:nnsny
        + ALONE -> state CRANE:nnsny, SHAME:nnsny, ALONE
          - ysyny -> player 2 win
        + ATOLE -> state CRANE:nnsny, SHAME:nnsny, ATOLE
          - yyyyy -> player 1 win
    + ALONE -> state CRANE:nnsny, ALONE
      - ysyny -> state CRANE:nnsny, ALONE:ysyny
        + ATOLE -> state CRANE:nnsny, ALONE:ysyny, ATOLE
          - yyyyy -> player 1 win
        + SHAME -> state CRANE:nnsny, ALONE:ysyny, SHAME
          - nnsny -> player 2 win
    + ATOLE -> state CRANE:nnsny, ATOLE
      - yyyyy -> player 1 win
  - nnyny -> state CRANE:nnyny
    + ATOLE -> state CRANE:nnyny, ATOLE
      - snnny -> state CRANE:nnyny, ATOLE:snnny
        + SHAME -> state CRANE:nnyny, ATOLE:snnny, SHAME
          - yyyyy -> player 1 win
        + ALONE -> state CRANE:nnyny, ATOLE:snnny, ALONE
          - snnny -> player 2 win
    + SHAME -> state CRANE:nnyny, SHAME
      - yyyyy -> player 1 win
    + ALONE -> state CRANE:nnyny, ALONE
      - snnny -> state CRANE:nnyny, ALONE:snnny
        + ATOLE -> state CRANE:nnyny, ALONE:snnny, ATOLE
          - snnny -> player 2 win
        + SHAME -> state CRANE:nnyny, ALONE:snnny, SHAME
          - yyyyy -> player 1 win
  - nnsyy -> state CRANE:nnsyy
    + ALONE -> state CRANE:nnsyy, ALONE
      - yyyyy -> player 1 win
    + SHAME -> state CRANE:nnsyy, SHAME
      - nnsny -> state CRANE:nnsyy, SHAME:nnsny
        + ALONE -> state CRANE:nnsyy, SHAME:nnsny, ALONE
          - yyyyy -> player 1 win
        + ATOLE -> state CRANE:nnsyy, SHAME:nnsny, ATOLE
          - ynysy -> player 2 win
    + ATOLE -> state CRANE:nnsyy, ATOLE
      - ynysy -> state CRANE:nnsyy, ATOLE:ynysy
        + ALONE -> state CRANE:nnsyy, ATOLE:ynysy, ALONE
          - yyyyy -> player 1 win
        + SHAME -> state CRANE:nnsyy, ATOLE:ynysy, SHAME
          - nnsny -> player 2 win

If the game tree is already this huge with 4 words and 3 guesses, imagine how big it would get with all English 5-letter words. One common technique to shrink it is a "transposition table", where multiple occurrences of the same position are deduplicated. Observe that the order in which guesses are made is irrelevant, as it yields the exact same information regardless: for instance, ATOLE:snnny, ALONE:snnyy is the same board state as ALONE:snnyy, ATOLE:snnny, and thus the children of these nodes in the game tree are also the same. This gives us a "game DAG" (directed acyclic graph).

Figure 1: part of the game DAG for \( W = \{ \mathit{CRANE}, \mathit{ALONE}, \mathit{SHAME}, \mathit{ATOLE} \} \). Some paths lead to the same node, making this structure smaller than the game tree.

We can recursively calculate if, given a board state, the node is "winning" for the Guesser or the Hinter, meaning they have a winning strategy:

  • A board state where it's the Guesser's turn is winning for the Guesser if and only if either (1) the last hint given was all-yes (yyyyy), or (2) fewer than \( \ell \) guesses have been made and some guess leads to a winning state;
  • A board state where it's the Hinter's turn is winning for the Guesser if and only if every legal hint leads to a winning state.

Worst case, this naive algorithm will take time exponential in \( |W| \). The transposition table reduces the time from \( O(\text{# states in game tree}) \) to \( O(\text{# states in game DAG}) \), but this is still quite bad.

Set Representation

To hopefully reduce the number of unique states, we can choose an entirely different state representation. Instead of storing the guesses and hints so far, a state could store the set of words from \( W \) that are still possible. We will call this a set state, whereas we call the representation that stores all word-hint pairs explicitly a hint state.

For instance, the hint state \( \mathit{SHAME}:nnsny \) can also be represented by the set state \( \{ \mathit{ALONE}, \mathit{ATOLE} \} \), which is the set of words that is still consistent with those hints. As another example, both the hint state \( \mathit{SHAME}:nnsny, \mathit{ALONE}:ysyny \) and the hint state \( \mathit{SHAME}:nnsny, \mathit{CRANE}:nnsny \) can be represented by the set state \( \{ \mathit{ATOLE} \} \).

Definition: for a hint state \( S \), define \( [ S ]_W \) to be the set of words from \( W \) that are consistent with these hints.

Observation: since every hint state maps to one set state but not the other way around, the number of reachable set states is less than or equal to the number of reachable hint states. (Here, a "reachable" state is one for which there exists a sequence of moves from the starting state that reaches it.)

Thus, calculating the game DAG will likely be more efficient for this representation. It also gives us a clearer view of the information that we've gathered so far: if \( [ S_1 ]_W \) is a smaller set than \( [ S_2 ]_W \), that could give the Guesser an indication that being in state \( S_1 \) is better than being in state \( S_2 \).
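To make the recursion concrete, here is a minimal sketch of an exhaustive solver over set states. This is my own illustration (not the article's StockHorse code, and the function names are mine): a duplicate-aware feedback function plus the recursive win check, where the Hinter adversarially picks among all hints that keep at least one word alive.

```rust
use std::collections::HashMap;

/// Wordle feedback for `guess` against a hidden `answer`:
/// 'y' = correct position, 's' = shuffled, 'n' = no (duplicate-aware).
fn feedback(guess: &str, answer: &str) -> String {
    let g: Vec<u8> = guess.bytes().collect();
    let a: Vec<u8> = answer.bytes().collect();
    let mut hint = [b'n'; 5];
    let mut counts = [0u8; 26];
    // First pass: mark greens and count the remaining answer letters.
    for i in 0..5 {
        if g[i] == a[i] {
            hint[i] = b'y';
        } else {
            counts[(a[i] - b'A') as usize] += 1;
        }
    }
    // Second pass: mark yellows, consuming counts left to right.
    for i in 0..5 {
        let l = (g[i] - b'A') as usize;
        if hint[i] == b'n' && counts[l] > 0 {
            counts[l] -= 1;
            hint[i] = b's';
        }
    }
    String::from_utf8(hint.to_vec()).unwrap()
}

/// True iff the Guesser can force a win within `turns` guesses when the
/// words in `remaining` (indices into `words`) are still possible.
fn guesser_wins(words: &[&str], remaining: &[usize], turns: usize) -> bool {
    if turns == 0 {
        return false;
    }
    // The Guesser tries every word in W as a guess...
    (0..words.len()).any(|g| {
        // ...and the Hinter answers with the worst consistent hint, so
        // the guess only works if every nonempty hint cell is winning.
        let mut cells: HashMap<String, Vec<usize>> = HashMap::new();
        for &w in remaining {
            cells.entry(feedback(words[g], words[w])).or_default().push(w);
        }
        cells.iter().all(|(hint, cell)| {
            hint.as_str() == "yyyyy" || guesser_wins(words, cell, turns - 1)
        })
    })
}
```

On the four-word example above, guessing CRANE first splits the remaining words into singleton cells, so the Guesser can already force a win with \( \ell = 2 \). Memoizing `guesser_wins` on the (sorted) remaining set would turn the tree into the game DAG, and restricting the guess loop to `remaining` instead of all of `words` would give a Hard Mode variant.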

Info Representation

We can make this notion of information even more explicit with a third representation, storing the knowledge of what we know about the remaining possible solutions.

There are two aspects of information when receiving hints, namely which positions can hold certain letters, and how often a letter must occur in total.

  • Green hint (y): its position can only hold that letter, and the letter must occur at least once in total.
  • Yellow hint (s): its position cannot hold that letter, but the letter must occur at least once in total.
  • Gray hint (n): its position cannot hold that letter, and the letter cannot occur more often than its green/yellow hints in this guess indicate.

Example: hint \( \mathit{FIGHT}:nyyyy \) tells us that position 1 may not hold F, position 2 may only hold I, position 3 may only hold G, position 4 may only hold H, and position 5 may only hold T. It also tells us that the F must occur 0 times, while I/G/H/T must occur at least 1 time.

Example: hint \( \mathit{BROOK}:yysns \) tells us that position 1 may only hold the B, position 2 may only hold the R, positions 3 and 4 may not hold the O, and position 5 may not hold the K. It also tells us that B/R must both occur at least 1 time, O must occur exactly 1 time, and K must occur at least 1 time. Notice that a gray hint does not necessarily mean the letter cannot occur, it only means it can occur at most a certain number of times!

We encode this information as a triplet \( \mathcal{I} = ([L_1, \dots, L_5], \phi, \psi) \) with the following fields, where \( \Sigma \) is the alphabet set.

  • For every position \( 1 \leq i \leq 5 \), set \( L_i \subseteq \Sigma \) contains the letters that position \( i \) is allowed to hold.
  • \( \phi : \Sigma \rightarrow \mathbb{N} \) gives for every letter its minimum frequency.
  • \( \psi \subseteq \Sigma \) is the set of letters that have a maximum frequency, i.e., there were gray hints for them.

Definition: the move information function \( \mathit{move}(\mathit{guess}, \mathit{hint}) \), which returns the corresponding information given a move pair like \( \mathit{FIGHT}:nyyyy \).

Example: \( \mathit{move}(\mathit{FIGHT}, nyyyy) = ( [\Sigma \setminus \{ F \}, \,\, \{ I \}, \,\, \{ G \}, \,\, \{ H \}, \,\, \{ T \}], \,\, \{ I, G, H, T \mapsto 1, \,\, \_ \mapsto 0 \}, \,\, \{ F \} ) \), where \( \Sigma \setminus \{ F \} \) is the whole alphabet except F.

We define two standard information states: the top state which contains no information (all words are still possible) and the bottom state which contains the maximum amount of information (no words are possible anymore, which is an unreachable state as the Hinter is not allowed to leave a hint with no possible words).

Definition: the top information state \( \mathcal{I}_{\top} = ([\Sigma, \dots, \Sigma], \{ \_ \mapsto 0 \}, \emptyset) \).

Definition: the bottom information state \( \mathcal{I}_{\bot} = ([\emptyset, \dots, \emptyset], \{ \_ \mapsto 0 \}, \Sigma) \).

Definition: the knowledge order \( \sqsubseteq \) on these triples, where \( \mathcal{I} \sqsubseteq \mathcal{I}' \) if \( \mathcal{I} \) gives at least as much information as \( \mathcal{I}' \).

The set of information states ordered on \( \sqsubseteq \) is a lattice, so for any two \( \mathcal{I} \) and \( \mathcal{I}' \) we can calculate the greatest lower bound \( \mathcal{I} \sqcap \mathcal{I}' \), which gives as much information as both but not more. We do this by taking the intersection of the \( L \) sets, the pointwise maximum of both \( \phi \) functions, and the union of the \( \psi \) sets.

Definition: the word set \( [ \mathcal{I} ]_W \), those words that are still possible solutions given the information in \( \mathcal{I} \).

Observation: if \( \mathcal{I} \sqsubseteq \mathcal{I}' \), then \( [ \mathcal{I} ]_W \subseteq [ \mathcal{I}' ]_W \).

When we play a game, we start with state \( \mathcal{I} := \mathcal{I}_{\top} \). Whenever a move pair is made, i.e., a guess plus a hint, we go to the new state \( \mathcal{I}' := \mathcal{I} \sqcap \mathit{move}(\mathit{guess}, \mathit{hint}) \). If this results in \( [\mathcal{I}']_W = \emptyset \), then the move was illegal. Observe that always \( \mathcal{I}' \sqsubseteq \mathcal{I} \), i.e., we can only gain information, never lose it.
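As an illustration (my own sketch with made-up names, not the article's actual code), the triplet, the \( \mathit{move} \) function, and the meet can be implemented with the bitsets mentioned later in the article. Combining two states means intersecting the \( L \) sets, keeping the stronger (pointwise maximum) \( \phi \) lower bounds, and taking the union of the \( \psi \) sets:

```rust
/// A set of letters as a 26-bit mask (used for psi and each L_i).
type LetterSet = u32;
const FULL: LetterSet = (1 << 26) - 1;

fn bit(c: u8) -> LetterSet {
    1 << (c - b'A')
}

/// The information triplet ([L_1..L_5], phi, psi).
#[derive(Clone, Copy, PartialEq, Debug)]
struct Info {
    allowed: [LetterSet; 5], // L_i: letters position i may hold
    min_freq: [u8; 26],      // phi: minimum frequency per letter
    capped: LetterSet,       // psi: letters with a maximum frequency
}

/// The top state: no information yet, every word still possible.
fn top() -> Info {
    Info { allowed: [FULL; 5], min_freq: [0; 26], capped: 0 }
}

/// Greatest lower bound on the knowledge lattice: exactly the
/// combined information of both states.
fn meet(a: &Info, b: &Info) -> Info {
    let mut out = *a;
    for i in 0..5 {
        out.allowed[i] &= b.allowed[i]; // intersect the position sets
    }
    for l in 0..26 {
        // combined knowledge keeps the stronger (larger) lower bound
        out.min_freq[l] = out.min_freq[l].max(b.min_freq[l]);
    }
    out.capped |= b.capped; // a letter stays capped once any gray hint appears
    out
}

/// move(guess, hint): the information of one guess+hint pair,
/// with hints written as 'y'/'s'/'n'.
fn move_info(guess: &str, hint: &str) -> Info {
    let mut out = top();
    for (i, (g, h)) in guess.bytes().zip(hint.bytes()).enumerate() {
        match h {
            b'y' => out.allowed[i] = bit(g),   // position i holds exactly g
            b's' => out.allowed[i] &= !bit(g), // g occurs, but not here
            b'n' => {
                out.allowed[i] &= !bit(g);
                out.capped |= bit(g); // frequency of g is now capped
            }
            _ => panic!("hint characters must be y/s/n"),
        }
        if h != b'n' {
            out.min_freq[(g - b'A') as usize] += 1; // each y/s raises the minimum
        }
    }
    out
}
```

On \( \mathit{move}(\mathit{FIGHT}, nyyyy) \) this reproduces the example above: \( L_1 = \Sigma \setminus \{F\} \), \( L_2 = \{I\} \), and so on, with \( \phi \) at 1 for I/G/H/T and \( \psi = \{F\} \); meeting any state with \( \mathcal{I}_\top \) leaves it unchanged.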

Observation: since every information state maps to only one set state but not the other way around, the number of reachable set states is less than or equal to the number of reachable information states.

Observation: since every hint state maps to only one information state but not the other way around, the number of reachable information states is less than or equal to the number of reachable hint states.

So, in terms of reachable state count, this representation sits somewhere between the previous two. However, it is independent of \( W \), so the number of possible states does not grow with \( W \). We can also store it in \( O(1) \) space, unlike the set representation, which must store subsets of \( W \).

Note that there is some redundancy in the representation, so one trick is to reduce the number of reachable states by normalizing them. For instance, if we get \( \phi(A) = 0 \) and \( A \in \psi \), i.e., the A must occur exactly 0 times, then we can also remove \( A \) from all \( L \) sets, as it cannot occur in any position.

StockHorse Design

We now try to design an algorithm for solving Adversarial Wordle. I came up with the name StockHorse as a nod to Stockfish (the world's strongest chess engine).

There is no single efficient algorithm for partisan games, but Minimax with alpha-beta pruning is a popular metaheuristic. A heuristic is an algorithm that does not promise optimality or efficiency, but tries to do well in practice; a metaheuristic is a "template" heuristic algorithm.

Minimax is a depth-first search algorithm similar to branch-and-bound: it explores the "best"-looking branches of the game tree first up to a certain depth \( d \), stores the best solution found so far, and cuts off future branches that are definitely not good enough. We combine it with iterative deepening, which runs Minimax successively with depths \( d = 0, \dots, 2\ell \). This sounds counterintuitive (why not run Minimax once with \( d = 2\ell \)?), but redoing the work from lower-depth Minimax runs does not actually cost that much, and we can start with the best move of the previous run, which speeds up the algorithm significantly since move order matters a lot.
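As a generic illustration of the metaheuristic, here is a minimal Minimax with alpha-beta pruning wrapped in iterative deepening. This runs on an explicit toy tree rather than on Wordle states, and is my own sketch, not the engine's code; a real engine would also reuse each iteration's best move to order the next one.

```rust
/// A toy game-tree node with an explicit static evaluation.
struct Node {
    eval: i32,           // static evaluation (used at depth 0 and at leaves)
    children: Vec<Node>, // positions reachable in one move
}

/// Alpha-beta search with the maximizing player to move at even plies.
fn alphabeta(node: &Node, depth: u32, mut alpha: i32, mut beta: i32, maximizing: bool) -> i32 {
    if depth == 0 || node.children.is_empty() {
        return node.eval;
    }
    if maximizing {
        let mut best = i32::MIN;
        for child in &node.children {
            best = best.max(alphabeta(child, depth - 1, alpha, beta, false));
            alpha = alpha.max(best);
            if beta <= alpha { break; } // beta cutoff: opponent avoids this line
        }
        best
    } else {
        let mut best = i32::MAX;
        for child in &node.children {
            best = best.min(alphabeta(child, depth - 1, alpha, beta, true));
            beta = beta.min(best);
            if beta <= alpha { break; } // alpha cutoff
        }
        best
    }
}

/// Iterative deepening: search depths 0, 1, ..., max_depth in turn.
fn iterative_deepening(root: &Node, max_depth: u32) -> i32 {
    let mut value = root.eval;
    for d in 0..=max_depth {
        value = alphabeta(root, d, i32::MIN, i32::MAX, true);
    }
    value
}
```

Note how a deeper search can revise a shallow one: a child that looks great statically may turn out badly once the opponent's replies are examined.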

I did some preliminary investigation of which representation to use by computing the game DAG and counting the number of states. The results are shown in Table 1.

| Representation | small13 | small20 | medium50 |
| --- | --- | --- | --- |
| Hint | 15 933 | 78 436 | (too slow) |
| Set | 981 | 3 823 | 99 655 |
| Info | 14 841 | 71 317 | 2 478 958 |
| Info (normalized) | 14 841 | 70 582 | 2 452 132 |

Table 1: number of game DAG nodes for \( \ell = 3 \). Lower is better. We try with a 13-word, 20-word, and 50-word dataset.

On these small datasets, the set representation is a clear winner in terms of reachable state count, but I did find that the info representation took less time per state, as it is more efficient to store in a hash table. The info normalization gives about a 1% reduction.

After implementing Minimax with a transposition table and some other tricks, I tried both these representations again on bigger datasets and found different results.

| Representation | large500, up to \( d = 6 \) | full, up to \( d = 4 \) |
| --- | --- | --- |
| Set | 95 876 | 3 821 |
| Info | (too slow) | (too slow) |
| Info (normalized) | 124 871 | 3 842 |

Table 2: transposition table size after algorithm for \( \ell = 6 \). Lower is better. We try with a 500-word dataset and the full WordleBot possible solution dataset.

So, the info representation mostly catches up as it does not scale in size with the word list. The normalization also turns out to be much more important as \( W \) grows larger, since the non-normalized info representation caused the process to crash from an out-of-memory error. Since it is more efficient to store in a dictionary, the info representation actually turns out to be the winner.

We still use the set representation to estimate how good a game state is. The evaluation of a Guesser state is just \( |[\mathcal{I}]_W| \), the size of the set of words that are still possible (smaller is better for the Guesser), whereas the evaluation of a Hinter state additionally incorporates a term that estimates how good the most recently guessed word was, based on the info in \( \mathcal{I} \).

Bot Evaluation

What's nice is that the longer you run the bot, the better the move it will give you. At \( d = 4 \), it takes about 30 seconds on my laptop and it thinks the best move is \( \mathit{ALDER} \), but you can wait longer and get an even better first move.

The most interesting question is, what if we try to run our bot on 1-Player Wordle instead of Adversarial Wordle? Does it perform well?

| # of Guesses | StockHorse | 3Blue1Brown |
| --- | --- | --- |
| 1 | 0 | 0 |
| 2 | 33 (1.4%) | 80 (3.5%) |
| 3 | 582 (25.2%) | 1225 (52.9%) |
| 4 | 1417 (61.4%) | 965 (41.7%) |
| 5 | 269 (11.7%) | 45 (1.9%) |
| 6 | 0 | 0 |
| Loss | 8 (0.3%) | 0 |
| Average | 3.821 | 3.421 |

Distribution of # of guesses that StockHorse and 3Blue1Brown's original Wordle engine need when running up to \( d = 4 \) on the full Wordle solution list (as of 2026-02-24).

The 8 losses are because some of the words from the Wordle solution list (mostly semi-vulgar words) were not in the WordleBot possible solution list, which is what I picked for \( W \).

So, 3Blue1Brown's original Wordle engine is clearly better. I will now tell you how I cope. First of all, this is to be expected, since we are optimizing for different things: StockHorse tries to minimize the worst-case number of guesses, whereas most Wordle bots, like 3Blue1Brown's, try to minimize the expected number of guesses given the probabilities that certain words are the solution. I think the results are still pretty good: we get 4 most of the time, 3 pretty often, 5 sometimes. This is roughly on par with good human play, and theoretically we cannot do better than 5 in the worst case.

S T O C K
H O R S E

Bot Details

For those interested, here are some details. Also see the code, which is written in Rust.

State representation:

  • Store words in 25 bits (within 4 bytes), since storing a letter takes up 5 bits.
  • A set of letters can be stored as a bitset, so using 26 bits (within 4 bytes).
  • I represent states by the information states, which take up 128 bytes per instance.
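The word packing described above can be sketched as follows (my own minimal version with hypothetical names, not the repository's code): 5 letters at 5 bits each fit in the low 25 bits of a `u32`.

```rust
/// Pack an uppercase 5-letter word into a u32, 5 bits per letter.
fn pack(word: &str) -> u32 {
    word.bytes()
        .enumerate()
        .map(|(i, b)| ((b - b'A') as u32) << (5 * i))
        .sum()
}

/// Extract the letter at 0-based position `i` from a packed word.
fn letter(packed: u32, i: usize) -> u8 {
    b'A' + ((packed >> (5 * i)) & 0b11111) as u8
}
```

Since the fields never overlap, summing the shifted letters is the same as OR-ing them together.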

Search algorithm:

  • I use iterative deepening running Minimax (with alpha-beta pruning).
  • There is a transposition table that stores for Guesser states (1) the evaluation, (2) the principal variation, (3) the depth it was calculated at, and (4) whether this is an upper bound, lower bound, or exact value (necessary when using alpha-beta pruning).
  • Move ordering is done at the start of the game by sorting the words based on letter frequency, and the hints (\( nnnnn \) up to \( yyyyy \)) by how many \( s \)s and \( y \)s occur in them. During the search, the principal move from the last ID iteration (if available) is always placed first, and the rest follow in the usual order.

Evaluation:

  • The remaining set size \( |[\mathcal{I}]_W| \).
  • In Hinter states, it also subtracts a term based on how "bad" the last guess is expected to be: it has letters in positions where it cannot occur, it contains duplicate letters, or (early in the game) it contains letters that are already known to occur at most a certain number of times.

I did not tweak it that much, so many improvements can still be made. Move ordering could be improved, info normalization is quite slow, and calculation of \( [\mathcal{I}]_W \) could be optimized. The evaluation function for Hinter states is also not really tweaked. Feel free to play around with the code and send me an email/PR if you find improvements.

Interesting Questions

Our game can be defined in terms of the lattice that we discussed before. Can we generalize the concept of a game that goes down a lattice and needs to reach a certain state within a certain number of moves? Does the structure of a lattice tell us anything interesting, or can we perform certain optimizations in the solver algorithm that are not necessarily possible for arbitrary games?

Examples of similar games that could fall under such a generalization are Adversarial Hangman, Adversarial "I Spy With My Little Eye" (which would be very frustrating if you were the Guesser), and Adversarial Battleship.

For \( \ell = 6 \), what is the smallest set \( W \) of 5-letter words that is winning for the Hinter? What about different values for \( \ell \)? What if you are only allowed to use English words and not arbitrary 5-letter sequences?

What is the smallest decision tree (i.e., the one with the fewest branches not counting the final guaranteed correct guesses), that solves Wordle within 5 moves? Or 6 moves?

What if besides a set of possible words \( W \) we also have a different set of allowed words that the Guesser can choose from? For instance, if we have possible words \( W = \{ \mathit{NATTY}, \mathit{CATTY}, \mathit{PATTY} \} \) and \( \ell = 2 \) guesses, it would be really convenient for the Guesser if they could first choose \( \mathit{PINCH} \) to identify what the first letter was, and then always get it right the second time. It would however expand the game tree massively if this set was very large.

Can we adapt the bot for Wordle Hard Mode, where every guess must still be a valid possible answer given the information from all previous hints?