# Wordle Solver with RL

A model that solves Wordle using reinforcement learning.

The solution and code are a merge of [Andrew Ho's reinforcement learning solution](https://andrewkho.github.io/wordle-solver/) and [Morvan Zhou's A3C PyTorch implementation](https://github.com/MorvanZhou/pytorch-A3C).

## Installation
Make sure you have installed:
* python >= 3.8
* pip >= 22.3.1

Then, from the project root, execute:

`pip install -r requirements.txt`

## Running and execution modes
To run the program, on the project root execute:

`python main.py [ENV] [MODE] [PARAMETERS]`

* [ENV] --> the environment, i.e. the size of the game loaded. It can be:
    * WordleEnv100OneAction-v0 --> 100 eligible words but only one goal word
    * WordleEnv100TwoAction-v0 --> 100 eligible words but only two goal words
    * WordleEnv100fiftyAction-v0 --> 100 eligible words but fifty goal words
    * WordleEnv100FullAction-v0 --> 100 eligible words and all words can be the goal word
    * WordleEnv1000FullAction-v0 --> 1000 eligible words and all words can be the goal word
    * WordleEnvFull-v0 --> all possible eligible words (2300) and all words can be the goal word
* [MODE] --> the execution mode
* [PARAMETERS] --> the arguments needed to run a certain mode.

There are three possible execution modes:
* train --> Train a model (or models) to learn to play Wordle with different game sizes and hyperparameters
* eval --> Evaluate the trained model(s) saved for a particular game size
* play --> Have a model suggest the next word for a Wordle game, given the words already tried and the feedback returned for each

Each execution mode is explained in more detail below.
### Training
To train a model, run:

`python main.py [ENV] train -g [GAMES] --gamma [GAMMA] --seed [SEED] --save --min_reward [MIN_REWARD] --every_n_save [EVERY_N] --model_name [MODEL_NAME]`

Parameters:
* [ENV] --> the environment or game size for which the model will be trained; the options are listed in the section above
* -g [GAMES] --> number of training games played
* --gamma [GAMMA] --> optional, default 0, the discount factor for the reward obtained by playing each step of the game
* --seed [SEED] --> optional, default 100, seed used for the generation of random numbers
* --save --> optional, default False, if present, instances of the trained model are saved while training
* --min_reward [MIN_REWARD] --> optional, default 9.9, only works if --save is present, the minimum global reward the model has to reach to be saved
* --every_n_save [EVERY_N] --> optional, default 100, only works if --save is present, indicates how often the model is saved (whether the model is saved or not depends on --min_reward)
* --model_name [MODEL_NAME] --> optional, the name of a saved model, if you want to continue training from a pretrained model.
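As a quick illustration of what `--gamma` controls, here is a generic sketch of how a discount factor weights per-step rewards in RL (the function name is hypothetical and this is not the project's actual reward code):

```python
def discounted_return(rewards, gamma=0.0):
    """Compute G = r_0 + gamma*r_1 + gamma^2*r_2 + ... for one episode."""
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g

# With the default gamma=0, only the immediate reward counts:
print(discounted_return([1.0, 1.0, 1.0], gamma=0.0))  # 1.0
print(discounted_return([1.0, 1.0, 1.0], gamma=0.5))  # 1.75
```

A higher gamma makes rewards from later guesses matter more when evaluating earlier ones.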

### Evaluation
To evaluate the models trained for a particular environment, run:

`python main.py [ENV] eval`

* [ENV] --> the environment on which the models will be evaluated; only models trained for that specific environment are evaluated.

### Play
To get a word suggestion, run:

`python main.py [ENV] play --words [WORDS] --states [STATES] --model_name [MODEL_NAME]`

* [ENV] --> the environment or game size from which a word is suggested
* --words [WORDS] --> list of words already played in the Wordle game in progress
* --states [STATES] --> list of states returned as the result of playing each of those words; each state must be represented following these rules:
    * a 0 if the letter wasn't in the word
    * a 1 if the letter was in the word but not in the correct position
    * a 2 if the letter was in the word and in the correct position
* --model_name [MODEL_NAME] --> Name of the pretrained model file which will play the game
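The rules above can be applied mechanically. Here is a minimal sketch of how a state is derived from a guess and the goal word (the function name is hypothetical, and this simple version does not handle repeated letters the way official Wordle does):

```python
def wordle_state(guess, goal):
    """Map each guess letter to 0 (absent), 1 (present, wrong spot) or 2 (correct spot)."""
    state = []
    for i, letter in enumerate(guess):
        if letter == goal[i]:
            state.append(2)   # right letter, right position
        elif letter in goal:
            state.append(1)   # right letter, wrong position
        else:
            state.append(0)   # letter not in the goal word
    return state

print(wordle_state("slate", "crane"))  # [0, 0, 2, 0, 2]
```

So if "crane" were the goal word, playing "slate" would produce the state 0 0 2 0 2.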

## Branches
There are three different working branches in the project, each with a different representation of the state of the game:

* main --> state represented as a one-hot encoding of the letters and the state of each letter (whether or not it was in the word) at every position of the guessed words
* cosin-state --> state represented as the letters used in every guess and the state of the letters in each guess. Letters are represented as a pair of cosine and sine functions applied to a numerical representation of each letter
* simpler-state --> state represented as the letters used in the current guess and the state of the letters in that guess. Letters are represented by converting their integer representation to binary
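The exact encodings live in the code of each branch; as an illustrative sketch (hypothetical function names, assuming lowercase a–z and a 26-letter alphabet), the per-letter encodings of the cosin-state and simpler-state branches could look like:

```python
import math

def cos_sin_letter(letter):
    # cosin-state branch: map the letter index 0..25 onto the unit circle
    idx = ord(letter) - ord('a')
    angle = 2 * math.pi * idx / 26
    return (math.cos(angle), math.sin(angle))

def binary_letter(letter):
    # simpler-state branch: 5-bit binary representation of the letter index
    idx = ord(letter) - ord('a')
    return [(idx >> b) & 1 for b in range(4, -1, -1)]

print(cos_sin_letter('a'))  # (1.0, 0.0)
print(binary_letter('c'))   # [0, 0, 0, 1, 0]
```

The cos/sin pair keeps every letter at the same magnitude, while the binary form needs only 5 inputs per letter instead of 26 for a one-hot vector.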