spt / config.json
imdatta0: Add Sherlock Pretrained transformer (be3a39d)
{
"architectures": ["SPTForCausalLM"],
"model_type": "spt",
"n_layers": 12,
"vocab_size": 97,
"hidden_size": 512,
"n_attn_heads": 16,
"n_kv_heads": 16,
"intermediate_size": 2048,
"max_len": 2048,
"residual": true,
"normalise": true,
"bos_token_id": 95,
"eos_token_id": 95,
"pad_token_id": 95,
"unk_token_id": 96
}
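A quick way to sanity-check a config like this is to parse it and derive quantities the file implies but does not state, such as the per-head dimension. The sketch below inlines the JSON above; the field interpretations (e.g. that `hidden_size` is split evenly across `n_attn_heads`, and that equal `n_kv_heads` and `n_attn_heads` means standard multi-head attention rather than grouped-query attention) are assumptions based on common causal-LM conventions, not confirmed by the SPT code itself.

```python
import json

# The config above, inlined so the check is self-contained.
config_text = """
{
  "architectures": ["SPTForCausalLM"],
  "model_type": "spt",
  "n_layers": 12,
  "vocab_size": 97,
  "hidden_size": 512,
  "n_attn_heads": 16,
  "n_kv_heads": 16,
  "intermediate_size": 2048,
  "max_len": 2048,
  "residual": true,
  "normalise": true,
  "bos_token_id": 95,
  "eos_token_id": 95,
  "pad_token_id": 95,
  "unk_token_id": 96
}
"""

cfg = json.loads(config_text)

# Assumption: hidden_size is split evenly across attention heads,
# giving the per-head dimension.
head_dim = cfg["hidden_size"] // cfg["n_attn_heads"]
print(head_dim)  # 512 // 16 = 32

# Assumption: n_kv_heads == n_attn_heads implies plain multi-head
# attention (no grouped-query sharing of key/value heads).
print(cfg["n_kv_heads"] == cfg["n_attn_heads"])  # True
```

Note also that `bos_token_id`, `eos_token_id`, and `pad_token_id` all map to id 95, with `unk_token_id` at 96, just inside the 97-entry vocabulary.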