---
library_name: transformers
tags:
- gpt
- byte-tokenization
- mobile
- embedded
- onnx
license: cc-by-nc-4.0
datasets:
- custom
- web
language: en
widget:
- text: "In order to make pancakes, you need to"
- text: "Once upon a time"
---


# IJK Technology – ByteGPT-small

**ByteGPT-small** is a small GPT-style language model trained from scratch using byte-level tokenization inspired by the ByT5 paper. It is designed for compute- and memory-constrained devices, such as mobile phones and embedded systems.

## 🚀 Overview

- **Model Type:** GPT-style causal language model
- **Tokenizer:** Byte-level tokenization (from ByT5)
- **Intended Use:** Edge devices, mobile phones, embedded systems
- **Size:** Small (initial prototype)
- **Training:** Custom-trained from scratch

## 🧠 Why Byte Tokenization?

Byte tokenization offers several advantages for small-scale, efficient models:

1. **Reduced Memory Footprint:** Byte-level tokenization drastically reduces the size of the embedding layer, making the model suitable for devices with limited RAM.
2. **No External Dependencies:** Unlike subword tokenizers (e.g., SentencePiece, BPE), byte tokenization requires no external libraries; a simple Python script can handle it (see the sketch after this list).
3. **Robustness to Noise:** Byte-level models are more robust to misspellings, typos, and out-of-vocabulary tokens.
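To make point 2 concrete, here is a minimal sketch of what byte-level tokenization boils down to in plain Python. It assumes a direct UTF-8 byte-to-id mapping with no special tokens; the released tokenizer may reserve extra ids for special tokens (as ByT5 does), so treat this as an illustration rather than a drop-in replacement.

```python
# Minimal byte-level tokenizer sketch (assumption: ids are raw UTF-8
# byte values; the actual ByteGPT tokenizer may offset ids to make
# room for special tokens, as ByT5 does).
def encode(text: str) -> list[int]:
    # One id per UTF-8 byte, so the vocabulary never exceeds 256 values
    return list(text.encode("utf-8"))

def decode(ids: list[int]) -> str:
    # errors="replace" keeps decoding robust to malformed byte sequences
    return bytes(ids).decode("utf-8", errors="replace")

ids = encode("héllo")
print(ids)          # 'é' spans two UTF-8 bytes, so it becomes two ids
print(decode(ids))  # -> héllo
```

Because the vocabulary is bounded by 256 byte values plus a handful of special tokens, the embedding table stays tiny compared with a 30k+ entry subword vocabulary.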
## 💡 Future Plans

This is the **first** in a series of models. While this model is not yet highly useful due to its small size, it represents the foundation for future versions. Upcoming releases will include:

- **Larger Models:** Scaled-up versions with better performance
- **Distilled Models:** Using GRPO distillation to create highly efficient small models
- **Benchmark Results:** Comparative performance on mobile devices

## 💻 Usage

### Quick Start (with `transformers`)

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("ijktech/ByteGPT-small", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained("ijktech/ByteGPT-small")

input_text = "What is the capital of France?"
inputs = tokenizer(input_text, return_tensors="pt")

outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

### Tokenizer

The tokenizer is byte-level and compatible with `AutoTokenizer` from Hugging Face:

```python
tokenizer = AutoTokenizer.from_pretrained("ijktech/ByteGPT-small")
```

### ONNX

The model is also available in ONNX format and can be used with ONNX Runtime. The helper below reuses the `model` and `tokenizer` loaded in the Quick Start above (for the context length and for encoding/decoding text):

```python
import onnxruntime as ort
import torch

# Create an ONNX Runtime session
ort_session = ort.InferenceSession("model.onnx")

# Helper function to generate text using the ONNX model.
# `model.block_size` is the model's maximum context length.
def generate_with_onnx(prompt_ids, max_new_tokens=50, temperature=1.0):
    input_ids = prompt_ids.clone()
    for _ in range(max_new_tokens):
        # Keep only the last block_size tokens if the input is too long
        if input_ids.shape[1] > model.block_size:
            input_ids = input_ids[:, -model.block_size:]

        # Run inference
        ort_inputs = {"input": input_ids.cpu().numpy()}
        logits = ort_session.run(None, ort_inputs)[0]

        # Only take the last position's logits for the next token
        logits = torch.from_numpy(logits)[:, -1, :]

        # Apply temperature scaling
        if temperature != 1.0:
            logits = logits / temperature

        # Sample the next token from the softmax distribution
        probs = torch.nn.functional.softmax(logits, dim=-1)
        next_token = torch.multinomial(probs, num_samples=1)

        # Append the new token and continue
        input_ids = torch.cat([input_ids, next_token], dim=1)

    return input_ids

# Test the generation
prompt = "Hello"
prompt_ids = tokenizer(prompt, return_tensors="pt")["input_ids"]
generated_ids = generate_with_onnx(prompt_ids)
generated_text = tokenizer.decode(generated_ids[0], skip_special_tokens=True)
print(f"Generated text: {generated_text}")
# Generated text: Hello everyone!
# A dinner is only available for St. Loui
```

### Android Usage

We've just released an Android SDK. You can find the SDK on our [GitHub](https://github.com/ijktech/ByteGPT-Android). The SDK can be included in your Android project by adding the following to your `build.gradle` file:

```gradle
repositories {
    maven {
        url = uri("https://raw.githubusercontent.com/ijktech/ByteGPT-Android/maven-repo")
    }
}

dependencies {
    implementation("com.github.ijktech:ByteGPT-Android:1.0.9")
}
```

### iOS Usage

Coming Soon!

## 📜 License

📍 **CC-BY-NC-4.0**: Free for non-commercial use.

💼 **Commercial Use**: Contact IJK Technology Ltd for licensing at [james@ijktech.com](mailto:james@ijktech.com).

## 🛠️ About IJK Technology Ltd

IJK Technology Ltd (IJKTech) develops innovative machine learning models optimized for on-device inference. Our focus is on efficiency, privacy, and usability across mobile and embedded platforms.