
Information about the site and solver

Two lists of words

The solver uses two lists of words: one with all the words that are usable as guesses, and one with all the words that could end up being the solution. I found them on the internet (I don't remember where), and they seem to match the ones that other algorithms use too, like MIT's algorithm.

From time to time the solution isn't in the list of possible solutions, so I check server side whether that's the case and, if it is, I add it to the list of possible solutions. (Because of the different timezones, it actually looks up the days before and after the current UTC day too.)
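A rough sketch of that server-side check, assuming the day's solution has already been fetched somehow. The names `days_to_check` and `ensure_solution` are made up for illustration and are not the site's actual code:

```python
from datetime import date, datetime, timedelta, timezone
from typing import List, Optional


def days_to_check(today_utc: Optional[date] = None) -> List[date]:
    """Return the previous, current and following day, relative to the UTC date."""
    if today_utc is None:
        today_utc = datetime.now(timezone.utc).date()
    return [today_utc + timedelta(days=offset) for offset in (-1, 0, 1)]


def ensure_solution(solutions: List[str], word: str) -> bool:
    """Append `word` to `solutions` if it is missing; return True if it was added."""
    if word in solutions:
        return False
    solutions.append(word)
    return True
```

Checking three days covers every local timezone, since a user's local date is never more than one day away from the UTC date.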

Wordle Solver, the way it works

At its core, the algorithm takes every possible guess and, for each one, looks at every possible solution and splits the solutions into groups according to the color pattern they would produce
(if the possible guess is "apple", then "might" and "right" will both be put in the group BBBBB, whereas "light" will be put in BBBYB).
Then, for each color group, it adds to the total score of the guess:
the probability of getting the color pattern * the number of words that it would remove from the possible solutions
A slight problem with that method is that the number of possible guesses and solutions is pretty big, 14_855 and 2309 respectively, which means that the function that calculates the colors needs to run 34_300_195 times for the first guess. That's a lot, and it makes the first guess take around 90 seconds to calculate.
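Here is a minimal sketch of the scoring described above. The function names are invented for the example, and the handling of repeated letters follows the usual Wordle rule (greens first, then yellows limited by the letters left over), which is an assumption about how the real color function behaves:

```python
from collections import Counter, defaultdict


def color_pattern(guess: str, solution: str) -> str:
    """Colors for `guess` against `solution`: G = green, Y = yellow, B = black."""
    colors = ["B"] * len(guess)
    remaining = Counter()  # solution letters not consumed by a green
    for i, (g, s) in enumerate(zip(guess, solution)):
        if g == s:
            colors[i] = "G"
        else:
            remaining[s] += 1
    for i, g in enumerate(guess):
        if colors[i] == "B" and remaining[g] > 0:
            colors[i] = "Y"
            remaining[g] -= 1
    return "".join(colors)


def score_guess(guess: str, solutions: list) -> float:
    """Sum over color groups of P(pattern) * number of solutions the pattern removes."""
    groups = defaultdict(int)
    for solution in solutions:
        groups[color_pattern(guess, solution)] += 1
    total = len(solutions)
    return sum((size / total) * (total - size) for size in groups.values())
```

With the example from the text, `color_pattern("apple", "might")` gives `BBBBB` and `color_pattern("apple", "light")` gives `BBBYB`. A higher score means the guess is expected to eliminate more solutions.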

Optimisations

Color patterns caching

The optimisation I first tried was to cache the color pattern for every possible guess-solution pair, which worked great, reducing the time from 90 seconds to only 12! A 'little' problem, though, was that it takes quite a bit of memory, and it basically made my VPS unusable shortly after I deployed it,
so I sadly had to remove that optimisation.
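To give an idea of why memory becomes the issue, here is a sketch of such a cache as a plain dictionary. `build_pattern_cache` and its `pattern_fn` parameter are hypothetical names, and the real cache may well have been stored differently:

```python
from itertools import product


def build_pattern_cache(guesses, solutions, pattern_fn):
    """Precompute pattern_fn(guess, solution) for every guess-solution pair."""
    return {(g, s): pattern_fn(g, s) for g, s in product(guesses, solutions)}


# Number of entries with the full word lists (figures from the text above).
FULL_CACHE_ENTRIES = 14_855 * 2_309  # 34_300_195 pairs
```

Every lookup becomes a dictionary access instead of a recomputation, but the table has to hold one entry per pair, so it grows with the product of the two list sizes.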

Complexity Reduction

I was doubtful that the second optimisation I came up with would work well, but it really did, more so than I could have thought.
The idea is actually pretty simple: instead of evaluating every guess on every solution, let's do a shallow evaluation first. That means we first evaluate all possible guesses on at most 100 possible solutions instead of up to 2309
(if there are already fewer than that, it doesn't change anything),
and then take the 40 guesses that scored the best and evaluate them on all the possible solutions like we did before.
In terms of complexity that's great (n = number of possible solutions, m = number of possible guesses):
- before: n * m (34_300_195 max)
- after: min(n, 100) * m + n * min(m, 40) (1_577_860 max)
This reduces the time to a bit more than a second! With that I was able to implement the edit mode.
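The two-stage evaluation can be sketched like this. How the 100 solutions are chosen isn't stated above, so the random sample here is an assumption, and `pick_best_guess`, `score_fn` and `pattern_calls` are illustrative names only:

```python
import random


def pick_best_guess(guesses, solutions, score_fn,
                    sample_size=100, shortlist_size=40, seed=0):
    """Shallow-score every guess on a sample, then fully score the best few."""
    if len(solutions) > sample_size:
        # Assumption: a random sample; any subset of that size costs the same.
        sample = random.Random(seed).sample(solutions, sample_size)
    else:
        sample = solutions
    # Stage 1: cheap score of every guess against the sample only.
    ranked = sorted(guesses, key=lambda g: score_fn(g, sample), reverse=True)
    shortlist = ranked[:shortlist_size]
    # Stage 2: full score of the shortlisted guesses against all solutions.
    return max(shortlist, key=lambda g: score_fn(g, solutions))


def pattern_calls(n, m, sample_size=100, shortlist_size=40):
    """Number of color-pattern computations for the two-stage evaluation."""
    return min(n, sample_size) * m + n * min(m, shortlist_size)
```

For the first guess, `pattern_calls(2309, 14_855)` works out to 1_577_860, about 22 times fewer pattern computations than the 34_300_195 of the full evaluation.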

Cython

The third optimisation is the simplest of all: I just compiled the Python code that does all the calculations with Cython. With that I managed to get slightly under a second (on my computer at least).

Caching the first two guesses

For the 4th one, back to the caching strategy, but instead of caching the color patterns, I cached the first two guesses. There is a bit more to it than that, though: I didn't just compute what my algorithm would give for the first guess and then for the second based on the color pattern. Instead, I did a more extensive calculation ahead of time to get even better words for the same response time.
The way I did that, for both the first and second guesses, is that I took the ~20 best guesses and, for each one, ran the algorithm described above on all possible solutions to measure its average number of guesses, then cached the best ones. With that strategy I went from 3.54 guesses on average to 3.499. By comparison, the best possible score is 3.421, as achieved by MIT's algorithm.
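Measuring the average number of guesses for a candidate opening word could look roughly like this. It is only a sketch: `average_guesses`, `next_guess` and `pattern_fn` are hypothetical names, and games that aren't solved within `max_turns` are simply counted as `max_turns`, which is an assumption:

```python
def average_guesses(first_guess, solutions, next_guess, pattern_fn, max_turns=6):
    """Play every solution starting from `first_guess`; return the mean number of guesses."""
    total = 0
    for solution in solutions:
        candidates = list(solutions)
        guess = first_guess
        turns = 1
        while guess != solution and turns < max_turns:
            pattern = pattern_fn(guess, solution)
            # Keep only the solutions that would have produced the same colors.
            candidates = [c for c in candidates if pattern_fn(guess, c) == pattern]
            guess = next_guess(candidates)
            turns += 1
        total += turns
    return total / len(solutions)
```

The opening word to cache would then be the one with the lowest average among the ~20 candidates, for example `min(candidates, key=lambda g: average_guesses(g, solutions, next_guess, pattern_fn))`.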

Site

For the frontend I used Next.js, React, TypeScript and Tailwind CSS.
For the backend I used Flask (Python).

The yellow/star button on the wordle-solver page colors all the guesses up to and including the current one according to the current Wordle solution. The current local day is collected using JS and passed in a request to my backend, so that whatever the local timezone, it will color the guesses accordingly. (I couldn't just do the request directly from the client to the official Wordle site because of CORS.)