Intruction
This is a pre-trained weight for SimpleGPT.
It was trained on tinyshakespeare and used hyper-parameter as follows
n_layer: 12,
n_head: 12,
embed_dim: 768,
time_step: 256,
bias: False,
vocab_size: 65,
dropout: 0.0
iter_num: 50000
File Content
ckpt_iter_50000.pt contains 6 items
checkpoint = {
'state_dict': raw_model.state_dict(),
'optimizer': optimizer.state_dict(),
'model_args': model_args,
'iter_num': iter_num,
'best_val_loss': best_val_loss,
'config': config,
}
Use this little script to display them
import os
import torch
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
ckpt_path = "path/to/ckpt_iter_50000.pt"
assert os.path.exists(ckpt_path), f"{ckpt_path} doesn't exit."
checkpoint = torch.load(ckpt_path, map_location=device)
model_args = checkpoint['model_args']
print(model_args)
state_dict = checkpoint['state_dict']
for layer_name, weight_matrix in state_dict.items():
print(f"{layer_name}\t{weight_matrix.shape}")
Usage
git clone my repository SimpleGPT | whut-zhangwx. Follow the script generate.py to load checkpoint into GPT model to do generation.
Inference Providers
NEW
This model is not currently available via any of the supported Inference Providers.
The model cannot be deployed to the HF Inference API:
The model has no library tag.
Model tree for whut-zhangwx/SimpleGPT
Unable to build the model tree, the base model loops to the model itself. Learn more.