Solving Foodle/Wordle
NOTE: This article was written in collaboration with Manit Kaushik, a BTech (CSE) undergraduate student at Indraprastha Institute of Information Technology, Delhi. Manit drafted this article on a Wordle/Foodle solver as part of his project.
Foodle URL: https://cosylab.iiitd.edu.in/foodle/
Wordle is a word-guessing game where players have six attempts to identify a secret five-letter word correctly. For each guess, the game provides feedback by coloring the letters of the guessed word: green if the letter is correct and in the proper position, yellow if the letter is correct but in the wrong position, and grey if the letter is not in the word. The objective is to deduce the target word through strategic guessing and the process of elimination within six attempts, fostering a blend of linguistic intuition and deductive reasoning.
The first guess in Wordle is almost always a blind guess, but what if we could mathematically increase our chances of getting yellow or green letters on the first try, or within the first two or three tries? One effective opening strategy is to pick a word made up of the most commonly occurring letters.
Each letter's frequency is the total number of times that letter appears across all the words in our chosen dataset. Here, we are dealing with two datasets. One is the list of valid Wordle solutions, which has 2,315 words; the other is the list of all valid guesses that Wordle accepts, which is approximately 13,000 five-letter English words. We aim to increase the probability of getting yellow and green letters on our first try, so we use the valid guesses dataset to look for the most common occurrences.
The first step is to compute the alphabet frequency of each letter, i.e. the total number of times that letter appears across the five-letter words.
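The counting step described here can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual code: the function name alphabet_frequency and the four-word sample list are assumptions made for the example, and the real input would be the full valid guesses dataset.

```python
from collections import Counter

def alphabet_frequency(words):
    """Count how many times each letter appears across all words.

    Repeated letters inside one word are counted every time they occur.
    """
    counts = Counter()
    for word in words:
        counts.update(word.lower())
    return counts

# Tiny illustrative list (hypothetical); the article uses ~13,000 valid guesses.
sample_words = ["alter", "bison", "duchy", "eater"]
af = alphabet_frequency(sample_words)

# Show the letters from most to least frequent.
for letter, count in af.most_common():
    print(letter, count)
```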
By looking at the above graph, we know which letters to use to increase our probability of yellow letters. However, to increase the likelihood of getting a green letter, we should also look at how often a letter occurs at a given index, namely the letter index frequency. A letter can occur at only five indexes in a five-letter word. To compute this, we go through the dataset of valid guesses and, for every occurrence of a letter, increment its letter index frequency for the index at which it occurs; the letter index probability is then derived from these counts. Here is the heatmap of the letter index frequency, along with the letter index probability values for all the letters.
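A possible sketch of the letter index frequency and probability tables follows. The function names are hypothetical, and the normalisation is an assumption: each letter's positional counts are divided by that letter's total count, so the five probabilities for a letter sum to 1. The article does not state which normalisation it uses.

```python
from collections import defaultdict

WORD_LENGTH = 5

def letter_index_frequency(words):
    """For each letter, count how often it occurs at each of the five indexes."""
    freq = defaultdict(lambda: [0] * WORD_LENGTH)
    for word in words:
        for index, letter in enumerate(word.lower()):
            freq[letter][index] += 1
    return dict(freq)

def letter_index_probability(freq):
    """Turn positional counts into probabilities.

    ASSUMPTION: counts are normalised per letter (each row sums to 1).
    """
    prob = {}
    for letter, counts in freq.items():
        total = sum(counts)
        prob[letter] = [count / total for count in counts]
    return prob

# Tiny illustrative list (hypothetical); the article uses the valid guesses dataset.
sample_words = ["alter", "eater", "bison"]
lif = letter_index_frequency(sample_words)
lip = letter_index_probability(lif)
print(lif["e"])  # positional counts for the letter 'e'
print(lip["e"])  # the same counts as probabilities
```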
Now, to improve the score for each word, we combine the Letter Index Probability (LIP) with the Alphabet Frequency (AF) in such a way that -
We then switch to the dataset of valid solutions only, so that our starting word is itself a possible solution. We calculate the score for each word in the valid solutions dataset. This is the final list, sorted in decreasing order of score.
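The scoring and ranking step could look roughly like the sketch below. The exact way LIP and AF are combined is not reproduced in this text, so the code uses a stand-in that is purely an assumption: a word's score is the sum, over its five positions, of AF(letter) times LIP(letter, index). The function names and sample words are also hypothetical. Only the overall flow (build the tables from the valid guesses, then score and sort the valid solutions) follows the description above.

```python
from collections import Counter

def build_tables(guess_words):
    """Build alphabet frequency (AF) and letter index probability (LIP) tables."""
    af = Counter("".join(guess_words))
    pos_counts = {ch: [0] * 5 for ch in af}
    for word in guess_words:
        for i, ch in enumerate(word):
            pos_counts[ch][i] += 1
    # ASSUMPTION: LIP is the positional count divided by the letter's total count.
    lip = {ch: [c / af[ch] for c in counts] for ch, counts in pos_counts.items()}
    return af, lip

def score_word(word, af, lip):
    # ASSUMPTION: stand-in combination, not the article's formula.
    return sum(af[ch] * lip.get(ch, [0.0] * 5)[i] for i, ch in enumerate(word))

def rank_words(solutions, af, lip):
    """Return the solution words sorted by decreasing score."""
    return sorted(solutions, key=lambda w: score_word(w, af, lip), reverse=True)

# Hypothetical miniature datasets for illustration only.
valid_guesses = ["alter", "eater", "bison"]
valid_solutions = ["bison", "alter"]
af, lip = build_tables(valid_guesses)
ranked = rank_words(valid_solutions, af, lip)
print(ranked)
```

Swapping in a different combination only requires changing score_word; the ranking code stays the same.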
The above work was based on maximizing alphabet frequency and letter index probability. To further improve our chances of solving the Wordle, we apply another strategy commonly used by players: spending the first three tries on three fixed seed words that are distinct, share no letters with one another, and together cover all the vowels. Because the seed words are so diverse, they usually reveal many critical cues, which helps players guess the answer in the remaining tries. One possible seed word set is Alter, Bison, and Duchy.
Before using this strategy, we must remove words with repeating letters from the score list. This is necessary because we want the number of unique letters across the three seed words to be as high as possible, which increases the probability of getting yellow letters, if not green ones. After this operation, the above list looks like this.
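Removing words with repeating letters is a one-line check. The sketch below is illustrative only: the helper name and the scores in the sample list are made-up placeholders, not values from the article.

```python
def has_unique_letters(word):
    """True when no letter occurs more than once in the word."""
    return len(set(word)) == len(word)

# (word, score) pairs; the scores are hypothetical placeholders.
scored_words = [("eater", 9.1), ("alter", 8.0), ("jolly", 4.2), ("bison", 5.0)]

# Keep only words whose five letters are all different.
filtered = [(word, score) for word, score in scored_words if has_unique_letters(word)]
print(filtered)
```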
Moving on, we now form combinations of three words from those remaining in the score list, keeping only combinations in which no letter is shared between the words. This follows the main idea of our strategy. At the same time, we calculate the total score of each combination by adding up the scores of its three words. This gives us seed words that are both diverse and as high-scoring as possible, maximizing the probability of getting yellow and green letters in our first three tries.
These are the possible seed word sets, listed in descending order of total score -
From the above list, we settle on Bonus, Camel, and Dirty as our seed words, i.e. our first three tries. Our Wordle solver will always use Bonus, Camel, and Dirty as its first three tries.
So far, the seed words have narrowed our search space considerably, but at the cost of three of our six tries. The remaining search space of words that fit our current yellow and green letters may still contain more than three words, so simply trying every remaining word in turn is not guaranteed to work.
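One way to enumerate such combinations is with itertools.combinations, as in the sketch below. The function name, the top_n parameter, and the sample scores are assumptions for illustration. Because the number of three-word combinations grows cubically with the list size, top_n is offered as a hypothetical way to restrict the search to the highest-scoring words; the article does not say whether any such limit was used.

```python
from itertools import combinations

def seed_triples(scores, top_n=None):
    """Return (total_score, (w1, w2, w3)) for every letter-disjoint triple.

    scores maps each word to its score. Results are sorted by decreasing total.
    """
    words = sorted(scores, key=scores.get, reverse=True)
    if top_n is not None:
        words = words[:top_n]  # hypothetical knob to keep the search small
    triples = []
    for trio in combinations(words, 3):
        letters = "".join(trio)
        # No repeated letter anywhere means the three words share nothing.
        if len(set(letters)) == len(letters):
            total = sum(scores[w] for w in trio)
            triples.append((total, trio))
    triples.sort(key=lambda item: item[0], reverse=True)
    return triples

# Hypothetical scores for illustration only.
sample_scores = {"alter": 4.0, "bonus": 3.0, "camel": 2.5, "dirty": 2.0, "bison": 1.0}
result = seed_triples(sample_scores)
print(result)
```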
Each word left in the search space has a score assigned in the previous steps. For each of the remaining guesses, we pick the word in the search space with the maximum score and use it as our next try. Then, using the new information, such as newly greyed-out letters and any new yellow and green letters, we eliminate words from the search space and repeat the process until we reach the answer or run out of guesses.
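The elimination step can be sketched as follows. This is one common way to implement it, not necessarily the project's: a candidate is kept only if it would have produced exactly the same colours for the guess that was played. The function names, the 'g'/'y'/'-' pattern encoding, and the sample scores are assumptions for the example. The feedback function follows the usual Wordle handling of repeated letters.

```python
from collections import Counter

def feedback(guess, answer):
    """Return a pattern string: 'g' green, 'y' yellow, '-' grey."""
    result = ["-"] * len(guess)
    unmatched = Counter()
    # First pass: mark greens and collect the answer's unmatched letters.
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            result[i] = "g"
        else:
            unmatched[a] += 1
    # Second pass: mark yellows while unmatched copies of the letter remain.
    for i, g in enumerate(guess):
        if result[i] == "-" and unmatched[g] > 0:
            result[i] = "y"
            unmatched[g] -= 1
    return "".join(result)

def prune(candidates, guess, pattern):
    """Keep only the words consistent with the feedback received."""
    return [w for w in candidates if feedback(guess, w) == pattern]

def next_guess(candidates, scores):
    """Pick the remaining word with the highest score."""
    return max(candidates, key=lambda w: scores.get(w, 0.0))

# Hypothetical search space and scores for illustration only.
candidates = ["alter", "eater", "later", "water"]
scores = {"alter": 8.0, "eater": 9.1, "later": 7.5, "water": 6.0}
guess = next_guess(candidates, scores)       # highest-scoring word
pattern = feedback(guess, "alter")           # pretend the answer is "alter"
candidates = prune(candidates, guess, pattern)
print(guess, pattern, candidates)
```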
Let's see how our Wordle solver strategy fares in the game. We test how the program handles every one of the 2,315 possible solutions to the puzzle.
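A test harness for this experiment might look like the sketch below: play the three seed words, then keep guessing the highest-scoring remaining candidate, and record whether each answer is found within six tries. All names, the miniature solution list, and the scores are hypothetical; the real experiment would pass the full 2,315-word solution list and the scores computed earlier.

```python
from collections import Counter

def feedback(guess, answer):
    """Return a pattern string: 'g' green, 'y' yellow, '-' grey."""
    result = ["-"] * len(guess)
    unmatched = Counter()
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            result[i] = "g"
        else:
            unmatched[a] += 1
    for i, g in enumerate(guess):
        if result[i] == "-" and unmatched[g] > 0:
            result[i] = "y"
            unmatched[g] -= 1
    return "".join(result)

def solve(answer, solutions, scores, seeds, max_tries=6):
    """Return the number of tries used, or None if the word was not found."""
    candidates = list(solutions)
    for attempt in range(1, max_tries + 1):
        if attempt <= len(seeds):
            guess = seeds[attempt - 1]  # fixed seed words come first
        elif candidates:
            guess = max(candidates, key=lambda w: scores.get(w, 0.0))
        else:
            return None  # nothing left to try
        if guess == answer:
            return attempt
        pattern = feedback(guess, answer)
        candidates = [w for w in candidates if feedback(guess, w) == pattern]
    return None

def accuracy(solutions, scores, seeds):
    """Fraction of solution words solved within the allowed tries."""
    solved = sum(solve(w, solutions, scores, seeds) is not None for w in solutions)
    return solved / len(solutions)

# Hypothetical miniature setup for illustration only.
solutions = ["bonus", "camel", "dirty", "alter", "later", "water"]
scores = {"bonus": 3.0, "camel": 2.5, "dirty": 2.0, "alter": 4.0, "later": 3.5, "water": 1.5}
seeds = ["bonus", "camel", "dirty"]
print(accuracy(solutions, scores, seeds))
```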
Out of all 2,315 possible solutions to Wordle, our program solved every word except 13. That's an accuracy of about 99.44% (2,302 out of 2,315)! The words the program couldn't guess in six tries or fewer were eager, eater, hover, jaunt, joist, jolly, rover, stave, vaunt, voter, wafer, waste, and wound.
Following the popularity of Wordle, we have designed an equivalent word game with a culinary lexicon: Foodle. The idea is to nudge people into eating better (nutritious and healthy food) by invoking the power of gamification.
Applying the same strategy to the game of Foodle, we get the following alphabet frequency graph and letter index probability graphs.
The seed word set comes out to be Salty, Boned, and Prick. When tested against the game of Foodle, this is how the program fares.
The program successfully solves every word in Foodle, suggesting that Foodle is a simpler version of Wordle!
To play Foodle: https://cosylab.iiitd.edu.in/foodle/