Commit 4182019 by gsar78 · verified · 1 Parent(s): 60ab72f

Update README.md

Files changed (1)
  1. README.md +1 -0
README.md CHANGED
@@ -42,6 +42,7 @@ Tokenized String: This Ġis Ġa Ġgame
 ### Recommendations
 
 When tokenizing Greek, Greek tokens may appear as gibberish, but actually this does not impact the downstream model pretraining.
+
 (An improved version of this tokenizer, without the gibberish Greek tokens can be found here: gsar78/Greek_Tokenizer)
 
 Can be used a good start for pretraining a GPT-based model or any other model using BPE.
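
Reviewer note on the "gibberish" remark in the README: the `Ġ` in the hunk header suggests a GPT-2 style byte-level BPE, although the diff does not confirm which tokenizer class is used. Under that assumption, the sketch below reproduces the standard byte-to-character table in plain Python and shows why Greek text looks garbled in token strings while still decoding back exactly. The helper names `to_byte_level` and `from_byte_level` are hypothetical and are not part of this repository.

```python
def bytes_to_unicode():
    """GPT-2 style table mapping each of the 256 byte values to a printable character."""
    # Bytes that are already printable keep their own code point.
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    cs = bs[:]
    n = 0
    # Remaining bytes (control characters, space, ...) are shifted to 256 and above.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, (chr(c) for c in cs)))


BYTE_TO_CHAR = bytes_to_unicode()
CHAR_TO_BYTE = {c: b for b, c in BYTE_TO_CHAR.items()}


def to_byte_level(text):
    # Hypothetical helper: render the UTF-8 bytes of `text` the way a
    # byte-level BPE vocabulary displays them.
    return "".join(BYTE_TO_CHAR[b] for b in text.encode("utf-8"))


def from_byte_level(rendered):
    # Hypothetical helper: invert the mapping and recover the original text.
    return bytes(CHAR_TO_BYTE[c] for c in rendered).decode("utf-8")


if __name__ == "__main__":
    greek = "Καλημέρα κόσμε"
    rendered = to_byte_level(greek)
    print(rendered)                   # looks garbled: each Greek letter is two bytes
    print(from_byte_level(rendered))  # the original Greek text is recovered
```

Each Greek letter occupies two UTF-8 bytes, so it appears as a pair of Latin-1 style characters in the vocabulary. The mapping is a bijection over bytes, which is consistent with the README's claim that the odd appearance does not affect pretraining.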