Commit 01007c3 by santit96 · 1 parent: bb63fb8

Add readme, including installation, execution modes and parameters, and branch explanation

  1. README.md +74 -0
README.md ADDED
@@ -0,0 +1,74 @@
+ # Wordle Solver with RL
+
+ A model that solves Wordle using reinforcement learning.
+
+ ## Installation
+ Make sure you have installed:
+ * python >= 3.8
+ * pip >= 22.3.1
+ Then, from the project root, execute:
+
+ `pip install -r requirements.txt`
+
+ ## Running and execution modes
+ To run the program, execute the following from the project root:
+
+ `python main.py [ENV] [MODE] [PARAMETERS]`
+
+ * [ENV] --> the environment, or size of the game, to load. It can be:
+     * WordleEnv100OneAction-v0 --> 100 eligible words but only one goal word
+     * WordleEnv100TwoAction-v0 --> 100 eligible words but only two goal words
+     * WordleEnv100fiftyAction-v0 --> 100 eligible words but fifty goal words
+     * WordleEnv100FullAction-v0 --> 100 eligible words and all words can be the goal word
+     * WordleEnv1000FullAction-v0 --> 1000 eligible words and all words can be the goal word
+     * WordleEnvFull-v0 --> all possible eligible words (2300) and all words can be the goal word
+ * [MODE] --> the execution mode
+ * [PARAMETERS] --> the arguments needed to run a certain mode
+
+ There are three possible execution modes:
+ * train --> train a model (or models) to play Wordle with different game sizes and hyperparameters
+ * eval --> evaluate the trained model(s) saved for a particular game size
+ * play --> have a model suggest the next word for a Wordle game, given the words already tried and the game outputs
+
+ Each execution mode is explained in more detail below.
+ ### Training
+ To train a model, run:
+
+ `python main.py [ENV] train -g [GAMES] --gamma [GAMMA] --seed [SEED] --save --min_reward [MIN_REWARD] --every_n_save [EVERY_N] --model_name [MODEL_NAME]`
+
+ Parameters:
+ * [ENV] --> the environment or game size for which the model will be trained; the options are listed in the section above
+ * -g [GAMES] --> number of training games played
+ * --gamma [GAMMA] --> optional, default 0, the discount factor for the reward obtained by playing each step of the game
+ * --seed [SEED] --> optional, default 100, seed used for random number generation
+ * --save --> optional, default False; if present, instances of the trained model are saved while training
+ * --min_reward [MIN_REWARD] --> optional, default 9.9, only applies if --save is present; the minimum global reward the model has to reach to be saved
+ * --every_n_save [EVERY_N] --> optional, default 100, only applies if --save is present; indicates how often the model is saved (whether the model is actually saved depends on --min_reward)
+ * --model_name [MODEL_NAME] --> optional, the name of a saved model, used to continue training from a pretrained model
+
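As an illustration of what the `--gamma` discount factor does, here is a minimal sketch of the standard discounted-return computation used in reinforcement learning. The function name `discounted_returns` is hypothetical and is not taken from the project; the project's own training code may apply the discount differently.

```python
from typing import List


def discounted_returns(rewards: List[float], gamma: float) -> List[float]:
    """Return G_t = r_t + gamma * G_{t+1} for every step of one game.

    Hypothetical helper for illustration only; not part of the project.
    """
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step can reuse the return of the next step.
    for step in range(len(rewards) - 1, -1, -1):
        running = rewards[step] + gamma * running
        returns[step] = running
    return returns
```

With the default `--gamma 0`, each step's return is just its immediate reward; values closer to 1 give more weight to rewards obtained later in the game.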
+ ### Evaluation
+ To evaluate the models of a particular environment, run:
+
+ `python main.py [ENV] eval`
+
+ * [ENV] --> the environment on which the models will be evaluated; only models trained for that specific environment are evaluated
+
+ ### Play
+ For word suggestion, run:
+
+ `python main.py [ENV] play --words [WORDS] --states [STATES] --model_name [MODEL_NAME]`
+
+ * [ENV] --> the environment or game size from which a word is suggested
+ * --words [WORDS] --> list of the words already played in the current Wordle game
+ * --states [STATES] --> list of the states returned after playing each of the words; each state must be represented following these rules:
+     * a 0 if the letter wasn't in the word
+     * a 1 if the letter was in the word but not in the correct position
+     * a 2 if the letter was in the word and in the correct position
+ * --model_name [MODEL_NAME] --> name of the pretrained model file that will play the game
+
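The 0/1/2 rules for a state can be sketched as follows. This is only an illustration: `encode_feedback` is a hypothetical helper, not a function from the project, and it applies the three rules literally, without the special handling of repeated letters that the real Wordle game performs. How the resulting digits must be formatted on the command line for `--states` is not specified here.

```python
def encode_feedback(guess: str, goal: str) -> str:
    """Encode the result of one guess as a string of 0/1/2 digits.

    Hypothetical helper for illustration only; not part of the project.
    """
    digits = []
    for position, letter in enumerate(guess):
        if goal[position] == letter:
            digits.append("2")  # letter in the word and in the correct position
        elif letter in goal:
            digits.append("1")  # letter in the word but in another position
        else:
            digits.append("0")  # letter not in the word
    return "".join(digits)
```

For example, under these rules the guess `crane` against the goal `react` gives `11201`.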
+ ## Branches
+ There are three working branches in the project, each with a different representation of the state of the game:
+
+ * main --> state represented as a one-hot encoding of the letters plus the state of the letters (whether or not each letter was in the word) in every position of the guessed words
+ * cosin-state --> state represented as the letters used in every guess and the state of the letters in each guess. Letters are represented as a pair of cos and sin functions applied to a numerical representation of the letter
+ * simpler-state --> state represented as the letters used in the current guess and the state of the letters in that guess. Letters are represented by converting their integer representation to binary
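
To make the three letter representations more concrete, here is a rough sketch of how a single letter could be encoded in each style. Everything in it is an assumption made for illustration: the function names, the mapping a=0 ... z=25, the angle 2π·index/26 and the 5-bit width are not taken from the branches, which may use different conventions. The sketch also covers only the letter encoding, not the per-letter game state that each branch stores alongside it.

```python
import math
from typing import List, Tuple

ALPHABET_SIZE = 26  # assumed: lowercase English letters only


def letter_index(letter: str) -> int:
    """Assumed numerical representation: a=0, b=1, ..., z=25."""
    return ord(letter.lower()) - ord("a")


def one_hot(letter: str) -> List[int]:
    """One-hot style (as described for main): a single 1 among 26 slots."""
    vector = [0] * ALPHABET_SIZE
    vector[letter_index(letter)] = 1
    return vector


def cos_sin(letter: str) -> Tuple[float, float]:
    """Cos/sin style (as described for cosin-state): a point on the unit circle."""
    angle = 2 * math.pi * letter_index(letter) / ALPHABET_SIZE
    return (math.cos(angle), math.sin(angle))


def binary(letter: str, bits: int = 5) -> List[int]:
    """Binary style (as described for simpler-state): most significant bit first."""
    index = letter_index(letter)
    return [(index >> shift) & 1 for shift in range(bits - 1, -1, -1)]
```

Under these assumptions a letter takes 26 values in the one-hot form, 2 in the cos/sin form and 5 in the binary form, which shows how the alternative branches shrink the state.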