GenerTeam committed on
Commit 9f6e787 · verified · 1 Parent(s): 72c1d4f

Update README.md

Files changed (1)
  1. README.md +3 -3
README.md CHANGED
@@ -13,10 +13,10 @@ arxiv: 2502.07272
 
 ## **Important Notice**
 If you are using **GENERator** for sequence generation, please ensure that the length of each input sequence is a multiple of **6**. This can be achieved by either:
-1. Padding the sequence on the left with `'A'` (**left padding**), or
-2. Simply truncating the sequence from the left (**left truncation**).
+1. Padding the sequence on the left with `'A'` (**left padding**);
+2. Truncating the sequence from the left (**left truncation**).
 
-This requirement arises because **GENERator** employs a 6-mer tokenizer. If the input sequence length is not a multiple of **6**, the tokenizer will append an `<oov>` (out-of-vocabulary) token to the end of the token sequence. This can result in uninformative subsequent generations, such as repeated `'AAAAAA'`.
+This requirement arises because **GENERator** employs a 6-mer tokenizer. If the input sequence length is not a multiple of **6**, the tokenizer will append an `'<oov>'` (out-of-vocabulary) token to the end of the token sequence. This can result in uninformative subsequent generations, such as repeated `'AAAAAA'`.
 
 We apologize for any inconvenience this may cause and recommend adhering to the above guidelines to ensure accurate and meaningful generation results.
 
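
The two options in the notice can be sketched in plain Python as shown here. This is only an illustration under stated assumptions: the helper names `left_pad` and `left_truncate` are hypothetical and not part of GENERator, and the sequence is assumed to be an ordinary string prepared before it is passed to the tokenizer.

```python
K = 6  # k-mer size of the tokenizer, as stated in the notice


def left_pad(seq: str, k: int = K, pad_char: str = "A") -> str:
    """Hypothetical helper: prepend pad_char until len(seq) is a multiple of k."""
    remainder = len(seq) % k
    if remainder == 0:
        return seq  # already aligned, nothing to add
    return pad_char * (k - remainder) + seq


def left_truncate(seq: str, k: int = K) -> str:
    """Hypothetical helper: drop leading characters until len(seq) is a multiple of k."""
    remainder = len(seq) % k
    return seq[remainder:]  # remainder == 0 returns the sequence unchanged


if __name__ == "__main__":
    seq = "ACGTACGTAC"  # length 10, not a multiple of 6
    print(left_pad(seq))       # AAACGTACGTAC (length 12)
    print(left_truncate(seq))  # ACGTAC (length 6)
```

Both helpers modify only the left end, so the right end of the sequence, where generation continues, stays intact. Note that left truncation returns an empty string for inputs shorter than 6 characters, so left padding is likely the safer choice for very short prompts.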