Add README, including installation, execution modes and parameters, and branch explanation
README.md
ADDED
@@ -0,0 +1,74 @@
# Wordle Solver with RL

A model that solves Wordle using reinforcement learning.

## Installation
Make sure you have installed:
* python >= 3.8
* pip >= 22.3.1

Then, from the project root, run:

`pip install -r requirements.txt`

## Running and execution modes
To run the program, from the project root execute:

`python main.py [ENV] [MODE] [PARAMETERS]`

* [ENV] --> the environment (game size) to load. It can be:
    * WordleEnv100OneAction-v0 --> 100 eligible words but only one goal word
    * WordleEnv100TwoAction-v0 --> 100 eligible words but only two goal words
    * WordleEnv100fiftyAction-v0 --> 100 eligible words but fifty goal words
    * WordleEnv100FullAction-v0 --> 100 eligible words and any of them can be the goal word
    * WordleEnv1000FullAction-v0 --> 1000 eligible words and any of them can be the goal word
    * WordleEnvFull-v0 --> all eligible words (2300) and any of them can be the goal word
* [MODE] --> the execution mode
* [PARAMETERS] --> the arguments needed to run the chosen mode

There are three possible execution modes:
* train --> Train a model (or models) to play Wordle with different game sizes and hyperparameters
* eval --> Evaluate the trained model(s) saved for a particular game size
* play --> Have a model suggest the next word for a Wordle game, given the words already tried and the feedback the game returned for them

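The command shape described above (`[ENV] [MODE] [PARAMETERS]`) could be parsed along the following lines. This is only a sketch using the standard `argparse` module, not the project's actual `main.py`: the function name `parse_cli` is hypothetical, and the environment ids and mode names are copied from the lists above.

```python
import argparse

# Environment ids and modes copied from the lists above.
ENVS = [
    "WordleEnv100OneAction-v0",
    "WordleEnv100TwoAction-v0",
    "WordleEnv100fiftyAction-v0",
    "WordleEnv100FullAction-v0",
    "WordleEnv1000FullAction-v0",
    "WordleEnvFull-v0",
]
MODES = ["train", "eval", "play"]


def parse_cli(argv):
    """Split argv into (env, mode, remaining mode-specific parameters)."""
    parser = argparse.ArgumentParser(prog="main.py")
    parser.add_argument("env", choices=ENVS, help="environment / game size")
    parser.add_argument("mode", choices=MODES, help="execution mode")
    # Arguments the parser does not know are returned untouched, so that
    # each mode can interpret its own parameters afterwards.
    args, rest = parser.parse_known_args(argv)
    return args.env, args.mode, rest


if __name__ == "__main__":
    print(parse_cli(["WordleEnvFull-v0", "train", "-g", "1000"]))
```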
Each execution mode is explained in more detail below.
### Training
To train a model, run:

`python main.py [ENV] train -g [GAMES] --gamma [GAMMA] --seed [SEED] --save --min_reward [MIN_REWARD] --every_n_save [EVERY_N] --model_name [MODEL_NAME]`

Parameters:
* [ENV] --> the environment or game size for which the model will be trained; the options are listed in the section above
* -g [GAMES] --> number of training games to play
* --gamma [GAMMA] --> optional, default 0, the discount factor applied to the reward obtained at each step of the game
* --seed [SEED] --> optional, default 100, the seed used for random number generation
* --save --> optional, default False; if present, instances of the trained model are saved during training
* --min_reward [MIN_REWARD] --> optional, default 9.9, only takes effect if --save is present; the minimum global reward the model has to reach to be saved
* --every_n_save [EVERY_N] --> optional, default 100, only takes effect if --save is present; indicates how often a save is attempted (whether the model is actually saved depends on --min_reward)
* --model_name [MODEL_NAME] --> optional, the name of a saved model, if you want to continue training from a pretrained model

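To make the effect of `--gamma`, `--every_n_save` and `--min_reward` more concrete, here is a minimal sketch. Both functions are hypothetical illustrations, not the project's code: `discounted_returns` shows the usual discounted-return recursion (with the default gamma of 0, only the immediate reward of each step counts), and `should_save` assumes that a save is attempted every N games and succeeds when the global reward is at least the threshold.

```python
def discounted_returns(rewards, gamma=0.0):
    """Return G_t = r_t + gamma * G_{t+1} for every step t of one game."""
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step can reuse the return of the next step.
    for t in range(len(rewards) - 1, -1, -1):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def should_save(game_index, global_reward, every_n_save=100, min_reward=9.9):
    """Assumed rule: check every N games, save only if the reward is high enough."""
    return game_index % every_n_save == 0 and global_reward >= min_reward


if __name__ == "__main__":
    print(discounted_returns([1.0, 2.0, 3.0], gamma=0.0))
    print(discounted_returns([1.0, 2.0, 3.0], gamma=0.5))
    print(should_save(200, 10.0), should_save(200, 9.0))
```

With gamma set to 0 the returns equal the per-step rewards, which is why the default makes the model optimise each guess on its own.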
### Evaluation
To evaluate the models trained for a particular environment, run:

`python main.py [ENV] eval`

* [ENV] --> the environment on which the models will be evaluated; only models trained for that specific environment are evaluated

### Play
To get a word suggestion, run:

`python main.py [ENV] play --words [WORDS] --states [STATES] --model_name [MODEL_NAME]`

* [ENV] --> the environment or game size from which a word is suggested
* --words [WORDS] --> list of the words already played in the current Wordle game
* --states [STATES] --> list of the states the game returned for each of those words; each state must follow these rules:
    * a 0 if the letter isn't in the word
    * a 1 if the letter is in the word but not in the correct position
    * a 2 if the letter is in the word and in the correct position
* --model_name [MODEL_NAME] --> name of the pretrained model file that will play the game

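The 0/1/2 rules above can be illustrated with a short sketch that computes the state for one guess. The function name `wordle_states` is hypothetical, and the handling of repeated letters follows the standard Wordle rule (a letter is only marked 1 as many times as it remains unmatched in the goal word), which is an assumption since the rules above do not cover that case. The exact way the lists are written on the command line is not shown here.

```python
from collections import Counter


def wordle_states(guess, goal):
    """Return one state per letter of guess: 0 absent, 1 misplaced, 2 correct."""
    states = [0] * len(guess)
    # Goal letters that were not matched in place can still earn a 1.
    unmatched = Counter(g for g, s in zip(goal, guess) if g != s)
    for i, (letter, target) in enumerate(zip(guess, goal)):
        if letter == target:
            states[i] = 2
    for i, letter in enumerate(guess):
        if states[i] == 0 and unmatched[letter] > 0:
            states[i] = 1
            unmatched[letter] -= 1
    return states


if __name__ == "__main__":
    print(wordle_states("crane", "cache"))
```

For the guess `crane` against the goal `cache`, the first and last letters are in place, `a` is present but misplaced, and `r` and `n` are absent.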
## Branches
There are three working branches in the project, each with a different representation of the state of the game:

* main --> state represented as a one-hot encoding of the letters and of the state of each letter (whether or not it was in the word) in every position of the words guessed
* cosin-state --> state represented as the letters used in every guess and the state of the letters in each guess. Letters are represented as a pair of cos and sin functions applied to a numerical representation of the letter
* simpler-state --> state represented as the letters used in the current guess and the state of the letters in that guess. Letters are represented by converting their integer representation to binary

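The three letter representations mentioned above can be sketched as follows. This only illustrates the general idea and is not taken from any of the branches: the function names, the indexing of letters from `a` = 0, the angle `2π · index / 26` and the 5-bit width are all assumptions.

```python
import math
import string

ALPHABET = string.ascii_lowercase  # assumed indexing: 'a' -> 0, ..., 'z' -> 25


def one_hot(letter):
    """26-dimensional vector with a single 1 at the letter's index."""
    vec = [0] * len(ALPHABET)
    vec[ALPHABET.index(letter)] = 1
    return vec


def cos_sin(letter):
    """Place the letter on the unit circle and return (cos, sin) of its angle."""
    angle = 2 * math.pi * ALPHABET.index(letter) / len(ALPHABET)
    return (math.cos(angle), math.sin(angle))


def binary(letter, bits=5):
    """Most-significant-bit-first binary digits of the letter's index."""
    index = ALPHABET.index(letter)
    return [(index >> shift) & 1 for shift in range(bits - 1, -1, -1)]


if __name__ == "__main__":
    print(one_hot("c"))
    print(cos_sin("a"))
    print(binary("z"))
```

The trade-off is mainly input size: one-hot needs 26 values per letter, binary needs 5, and the cos/sin pair needs 2.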