Introduction

This is a pre-trained weight checkpoint for SimpleGPT.

It was trained on the tinyshakespeare dataset with the following hyperparameters (also collected into a Python dict after the list):

n_layer: 12
n_head: 12
embed_dim: 768
time_step: 256
bias: False
vocab_size: 65
dropout: 0.0
iter_num: 50000
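
For reference, the same settings written as a Python dictionary; the key names and the comments interpreting them are assumed to mirror the model_args entry stored in the checkpoint.

# Hyperparameters of this checkpoint (key names assumed to mirror model_args)
model_args = dict(
    n_layer=12,      # number of transformer blocks
    n_head=12,       # attention heads per block
    embed_dim=768,   # embedding / hidden width
    time_step=256,   # context length in tokens
    bias=False,      # no bias terms in Linear / LayerNorm layers
    vocab_size=65,   # vocabulary size
    dropout=0.0,     # dropout disabled
)
iter_num = 50000     # training iteration at which the checkpoint was saved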

File Content

The checkpoint file ckpt_iter_50000.pt contains 6 items:

checkpoint = {
  'state_dict': raw_model.state_dict(),
  'optimizer': optimizer.state_dict(),
  'model_args': model_args,
  'iter_num': iter_num,
  'best_val_loss': best_val_loss,
  'config': config,
}

Use this small script to display them:

import os
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
ckpt_path = "path/to/ckpt_iter_50000.pt"
assert os.path.exists(ckpt_path), f"{ckpt_path} doesn't exist."
checkpoint = torch.load(ckpt_path, map_location=device)

model_args = checkpoint['model_args']
print(model_args)

# Print every parameter name together with its tensor shape
state_dict = checkpoint['state_dict']
for layer_name, weight_matrix in state_dict.items():
    print(f"{layer_name}\t{weight_matrix.shape}")

Usage

Clone the repository whut-zhangwx/SimpleGPT and follow the script generate.py to load the checkpoint into the GPT model for text generation.
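
For orientation, here is a minimal generation sketch. The import `from model import GPT, GPTConfig`, the constructor call, and the `model.generate(...)` signature are assumptions about the repository's interface; generate.py is the authoritative reference.

import torch
from model import GPT, GPTConfig  # assumed module layout inside the cloned repository

device = "cuda" if torch.cuda.is_available() else "cpu"
checkpoint = torch.load("path/to/ckpt_iter_50000.pt", map_location=device)

# Rebuild the model from the saved arguments and load the trained weights
model = GPT(GPTConfig(**checkpoint['model_args']))  # assumed constructor
model.load_state_dict(checkpoint['state_dict'])
model.to(device)
model.eval()

# Sample a short continuation starting from a single token (assumed generate signature)
context = torch.zeros((1, 1), dtype=torch.long, device=device)
with torch.no_grad():
    tokens = model.generate(context, max_new_tokens=200)
print(tokens[0].tolist())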
