# IJK Technology - ByteGPT-small
ByteGPT-small is a small GPT-style language model trained using byte tokenization inspired by the ByT5 paper. It is designed for use on compute- and memory-constrained devices, such as mobile phones and embedded systems.
## Overview
- Model Type: GPT-style causal language model
- Tokenizer: Byte-level tokenization (from ByT5)
- Intended Use: Edge devices, mobile phones, embedded systems
- Size: Small (initial prototype)
- Training: Custom-trained from scratch
## Why Byte Tokenization?
Byte tokenization offers several advantages for small-scale, efficient models:

- **Reduced Memory Footprint:** Byte-level tokenization drastically reduces the size of the embedding layer, making the model suitable for devices with limited RAM.
- **No External Dependencies:** Unlike subword tokenizers (e.g., SentencePiece, BPE), byte tokenization requires no external libraries; a simple Python script can handle it.
- **Robustness to Noise:** Byte-level models are more robust to misspellings, typos, and out-of-vocabulary words.
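The memory-footprint point can be made concrete with back-of-the-envelope arithmetic. The numbers below are illustrative assumptions, not ByteGPT-small's actual configuration:

```python
# Rough embedding-table sizes (illustrative assumptions,
# not ByteGPT-small's actual configuration).
d_model = 512           # assumed embedding dimension
byte_vocab = 259        # 256 byte values + a few special tokens (ByT5-style)
subword_vocab = 50257   # e.g. GPT-2's BPE vocabulary, for comparison

byte_params = byte_vocab * d_model        # 132,608 parameters
subword_params = subword_vocab * d_model  # 25,731,584 parameters
print(f"byte embedding:    {byte_params:,} params")
print(f"subword embedding: {subword_params:,} params")
print(f"ratio: ~{subword_params // byte_params}x")
```

At these assumed sizes, the byte-level embedding table is roughly two orders of magnitude smaller than a typical subword one, which is where most of the RAM savings come from.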
## Future Plans
This is the first in a series of models. While this model is not yet highly useful due to its small size, it represents the foundation for future versions. Upcoming releases will include:
- Larger Models: Scaled-up versions with better performance
- Distilled Models: Using GRPO distillation to create highly efficient small models
- Benchmark Results: Comparative performance on mobile devices
## Usage
Quick start with `transformers`:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# trust_remote_code=True is required because the model ships custom code
model = AutoModelForCausalLM.from_pretrained("ijktech/ByteGPT-small", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained("ijktech/ByteGPT-small")

input_text = "What is the capital of France?"
inputs = tokenizer(input_text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
### Tokenizer

The tokenizer is byte-level and compatible with Hugging Face's `AutoTokenizer`:

```python
tokenizer = AutoTokenizer.from_pretrained("ijktech/ByteGPT-small")
```
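Because the vocabulary is just the 256 byte values plus a handful of special tokens, tokenization can be reproduced in a few lines of plain Python. The sketch below is a ByT5-style illustration; the offset and special-token layout are assumptions, not necessarily what the shipped tokenizer uses:

```python
# Illustrative byte-level tokenizer (ByT5-style), not the shipped implementation.
SPECIAL_OFFSET = 3  # assume IDs 0-2 are reserved for special tokens (e.g. pad/eos/unk)

def encode(text: str) -> list[int]:
    # each UTF-8 byte maps directly to one token ID
    return [b + SPECIAL_OFFSET for b in text.encode("utf-8")]

def decode(ids: list[int]) -> str:
    data = bytes(i - SPECIAL_OFFSET for i in ids if i >= SPECIAL_OFFSET)
    return data.decode("utf-8", errors="ignore")

print(encode("Hi"))             # [75, 108]
print(decode(encode("héllo")))  # round-trips multi-byte UTF-8
```

This is why no external tokenizer library is needed on-device: the encode/decode logic is a direct byte mapping with no merge tables or vocabulary files.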
## License

- **Non-Commercial License:** Free for hobbyists and personal projects.
- **Commercial Use:** Contact IJK Technology Ltd for licensing.
## About IJK Technology Ltd
IJK Technology Ltd (IJKTech) develops innovative machine learning models optimized for on-device inference. Our focus is on efficiency, privacy, and usability across mobile and embedded platforms.