Wordle Solver with RL
A model that solves Wordle using reinforcement learning
The solution and code are a merge of Andrew Ho's reinforcement learning solution and Morvan Zhou's A3C PyTorch implementation.
Installation
Just make sure you have installed:
- python >= 3.8
- pip >= 22.3.1

And then, in the project root, execute:
pip install -r requirements.txt
Running and execution modes
To run the program, execute the following in the project root:
python main.py [ENV] [MODE] [PARAMETERS]
- [ENV] --> the environment, i.e. the size of the game loaded. It can be one of:
- WordleEnv100OneAction-v0 --> 100 eligible words but only one goal word
- WordleEnv100TwoAction-v0 --> 100 eligible words but only two goal words
- WordleEnv100fiftyAction-v0 --> 100 eligible words but fifty goal words
- WordleEnv100FullAction-v0 --> 100 eligible words and all words can be the goal word
- WordleEnv1000FullAction-v0 --> 1000 eligible words and all words can be the goal word
- WordleEnvFull-v0 --> all possible eligible words (2300) and all words can be the goal word
- [MODE] --> is the execution mode
- [PARAMETERS] --> are the arguments needed to run a certain mode.
There are three possible execution modes:
- train --> Train a model (or models) to learn to play Wordle with different game sizes and hyperparameters
- eval --> Evaluate the trained model(s) saved for a particular game size
- play --> Have a model suggest the next word for a Wordle game, given the words already tried and the game's feedback
Each execution mode is explained in more detail below.
Training
To train a model, run:
python main.py [ENV] train -g [GAMES] --gamma [GAMMA] --seed [SEED] --save --min_reward [MIN_REWARD] --every_n_save [EVERY_N] --model_name [MODEL_NAME]
Parameters:
- [ENV] --> the environment or game size for which the model will be trained (listed in the section above)
- -g [GAMES] --> number of training games played
- --gamma [GAMMA] --> optional, default 0, the discount factor for the reward obtained by playing each step of the game
- --seed [SEED] --> optional, default 100, seed used for the generation of random numbers
- --save --> optional, default False, if present, instances of the trained model are saved while training
- --min_reward [MIN_REWARD] --> optional, default 9.9, only works if --save is present, the minimum global reward the model must reach to be saved
- --every_n_save [EVERY_N] --> optional, default 100, only works if --save is present, indicates how often the model is saved (whether the model is actually saved depends on --min_reward)
- --model_name [MODEL_NAME] --> optional, the name of a saved model, if you want to continue training from a pretrained model.
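As a sketch of what --gamma controls: the agent's return is the discounted sum of per-step rewards, so with the default gamma of 0 only the immediate reward of each step counts. A minimal illustration (the function name is hypothetical, not part of the project):

```python
def discounted_return(rewards, gamma):
    """Discounted return G = r_0 + gamma*r_1 + gamma^2*r_2 + ..."""
    g = 0.0
    # Accumulate backwards: G_t = r_t + gamma * G_{t+1}
    for r in reversed(rewards):
        g = r + gamma * g
    return g
```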
Evaluation
To evaluate the models for a particular environment, run:
python main.py [ENV] eval
- [ENV] --> the environment on which the models will be evaluated, only models trained for that specific environment will be evaluated.
Play
For word suggestion, run:
python main.py [ENV] play --words [WORDS] --states [STATES] --model_name [MODEL_NAME]
- [ENV] --> the environment or game size from which a word is suggested
- --words [WORDS] --> list of words already played in the Wordle game in progress
- --states [STATES] --> list of states returned as the result of playing each of those words; each state must be represented following these rules:
- a 0 if the letter wasn't in the word
- a 1 if the letter was in the word but not in the correct position
- a 2 if the letter was in the word and in the correct position
- --model_name [MODEL_NAME] --> Name of the pretrained model file which will play the game
Branches
There are three different working branches in the project, each with a different representation of the state of the game:
- main --> state represented as a one-hot encoding of the letters plus the state of each letter (whether or not it was in the word) at every position of the guessed words
- cosin-state --> state represented as the letters used in every guess and the state of the letters in each guess. Letters are represented as a pair of cosine and sine functions applied to a numerical representation of each letter
- simpler-state --> state represented as the letters used in the current guess and the state of the letters in that guess. Letters are represented by converting their integer representation to binary
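For the cosin-state branch, the cos/sin pair can be pictured as placing each letter on the unit circle. The exact mapping used in that branch is not shown here; this is only an assumed sketch of the idea:

```python
import math

def letter_to_cos_sin(ch):
    # Hypothetical mapping: 'a'..'z' -> index 0..25 -> angle on the unit circle
    idx = ord(ch) - ord('a')
    angle = 2 * math.pi * idx / 26
    return (math.cos(angle), math.sin(angle))
```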