ByteGPT-small / README.md

ijktech-jk


library_name: transformers
tags:
  - gpt
  - byte-tokenization
  - mobile
  - embedded
  - onnx
license: cc-by-nc-4.0
datasets:
  - custom
  - web
language: en
widget:
  - text: In order to make pancakes, you need to
  - text: Once upon a time

IJK Technology – ByteGPT-small

ByteGPT-small is a small GPT-style language model trained using byte tokenization inspired by the ByT5 paper. It is designed for use on compute- and memory-constrained devices, such as mobile phones and embedded systems.

🚀 Overview

Model Type: GPT-style causal language model
Tokenizer: Byte-level tokenization (from ByT5)
Intended Use: Edge devices, mobile phones, embedded systems
Size: Small (initial prototype)
Training: Custom-trained from scratch

🧠 Why Byte Tokenization?

Byte tokenization offers several advantages for small-scale, efficient models:

Reduced Memory Footprint:
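A byte-level vocabulary needs only 256 byte values plus a few special tokens, compared with tens of thousands of entries for a typical subword vocabulary. This drastically reduces the size of the embedding layer, making the model suitable for devices with limited RAM.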
No External Dependencies:
Unlike subword tokenizers (e.g., SentencePiece, BPE), byte tokenization requires no external libraries for tokenization. A simple Python script can handle tokenization.
Robustness to Noise:
Because any string can be encoded as bytes, there are no out-of-vocabulary tokens, and byte-level models tend to be more robust to misspellings and typos.
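To make the first two points concrete, here is a minimal sketch of byte-level encoding and decoding in plain Python, plus a rough embedding-size comparison. It is an illustration under stated assumptions, not the tokenizer shipped with this model: the `offset` parameter (slots reserved for special tokens) and the hidden size of 512 are hypothetical. ByT5 itself reserves 3 IDs (pad, eos, unk) before the byte values; the mapping used by ByteGPT-small may differ.

```python
# Minimal byte-level tokenizer sketch (assumption: IDs are UTF-8 byte values
# shifted by `offset` reserved special-token slots; not the model's own code).

def encode(text: str, offset: int = 0) -> list[int]:
    """Map text to token IDs, one ID per UTF-8 byte."""
    return [b + offset for b in text.encode("utf-8")]


def decode(ids: list[int], offset: int = 0) -> str:
    """Map token IDs back to text, skipping IDs outside the byte range."""
    data = bytes(i - offset for i in ids if 0 <= i - offset < 256)
    return data.decode("utf-8", errors="replace")


def embedding_params(vocab_size: int, hidden_dim: int) -> int:
    """Number of weights in a vocab_size x hidden_dim embedding table."""
    return vocab_size * hidden_dim


ids = encode("héllo")
print(ids)          # [104, 195, 169, 108, 108, 111]
print(decode(ids))  # héllo

# Hypothetical hidden size of 512: byte vocabulary vs. a 32,000-entry subword vocabulary.
print(embedding_params(256, 512))     # 131072
print(embedding_params(32_000, 512))  # 16384000
```

Note that non-ASCII characters such as "é" expand to several bytes, so byte-level sequences are longer than subword sequences for the same text; that is the trade-off for the smaller embedding table.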

💡 Future Plans

This is the first in a series of models. While this model is not yet highly useful due to its small size, it represents the foundation for future versions. Upcoming releases will include:

Larger Models: Scaled-up versions with better performance
Distilled Models: Using GRPO distillation to create highly efficient small models
Benchmark Results: Comparative performance on mobile devices

💻 Usage

Quick Start (with `transformers`):

from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("ijktech/ByteGPT-small", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained("ijktech/ByteGPT-small")

input_text = "What is the capital of France?"
inputs = tokenizer(input_text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=100)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Tokenizer

The tokenizer is byte-level and can be loaded with Hugging Face's AutoTokenizer:

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("ijktech/ByteGPT-small")

ONNX

The model is also available in ONNX format, and can be used with the ONNX Runtime:

import onnxruntime as ort
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Tokenizer used to encode the prompt and decode the generated IDs
tok = AutoTokenizer.from_pretrained("ijktech/ByteGPT-small")

# The PyTorch model is loaded here only to read its context length (block_size)
model = AutoModelForCausalLM.from_pretrained("ijktech/ByteGPT-small", trust_remote_code=True)
block_size = model.block_size

# Create ONNX Runtime session
ort_session = ort.InferenceSession("model.onnx")

# Helper function to generate text using the ONNX model
def generate_with_onnx(prompt_ids, max_new_tokens=50, temperature=1.0):
    input_ids = prompt_ids.clone()

    for _ in range(max_new_tokens):
        # Keep only the last block_size tokens if the input is too long
        if input_ids.shape[1] > block_size:
            input_ids = input_ids[:, -block_size:]

        # Run inference
        ort_inputs = {
            'input': input_ids.cpu().numpy()
        }
        logits = ort_session.run(None, ort_inputs)[0]

        # Get predictions for the next token
        logits = torch.from_numpy(logits)
        logits = logits[:, -1, :]  # Only take the last token's predictions

        # Apply temperature
        if temperature != 1.0:
            logits = logits / temperature

        # Sample from the distribution
        probs = torch.nn.functional.softmax(logits, dim=-1)
        next_token = torch.multinomial(probs, num_samples=1)

        # Append the new token
        input_ids = torch.cat([input_ids, next_token], dim=1)

    return input_ids

# Test the generation
prompt = "Hello"
prompt_ids = tok(prompt, return_tensors="pt")["input_ids"]
generated_ids = generate_with_onnx(prompt_ids)
generated_text = tok.decode(generated_ids[0], skip_special_tokens=True)
print(f"Generated text: {generated_text}")
#Generated text: Hello everyone!
#A dinner is only available for St. Loui
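The sampling step in the loop above relies on `torch` for the softmax and the multinomial draw. On a device where only ONNX Runtime and NumPy are available, the same step could be written without PyTorch. The following is a sketch under that assumption; the helper name `sample_next_token` and the dummy logits are illustrative and not part of the released code.

```python
import numpy as np


def sample_next_token(logits, temperature=1.0, rng=None):
    """Sample one token ID per batch row from the last position's logits.

    logits: array of shape (batch, seq_len, vocab_size), e.g. the first
    output of ort_session.run(...). Returns an int64 array of shape (batch, 1).
    """
    if rng is None:
        rng = np.random.default_rng()

    # Only the last position predicts the next token
    last = logits[:, -1, :].astype(np.float64) / temperature

    # Numerically stable softmax
    last = last - last.max(axis=-1, keepdims=True)
    probs = np.exp(last)
    probs = probs / probs.sum(axis=-1, keepdims=True)

    # Draw one ID per row according to its probability distribution
    next_ids = [rng.choice(probs.shape[-1], p=row) for row in probs]
    return np.array(next_ids, dtype=np.int64).reshape(-1, 1)


# Dummy logits (hypothetical vocab size of 256) with one dominant entry
dummy = np.zeros((1, 3, 256), dtype=np.float32)
dummy[0, -1, 65] = 1000.0
print(sample_next_token(dummy))  # [[65]]
```

The returned array can be appended to the running IDs with `np.concatenate([input_ids, next_token], axis=1)`, mirroring the `torch.cat` call in the loop above.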

Android Usage

We've just released an Android SDK. You can find the SDK on our GitHub.

The SDK can be included in your Android project by adding the following to your build.gradle file:

repositories {
    maven { 
        url = uri("https://raw.githubusercontent.com/ijktech/ByteGPT-Android/maven-repo") 
    }
}

dependencies {
    implementation("com.github.ijktech:ByteGPT-Android:1.0.9")
}

iOS Usage

Coming Soon!

📜 License

📍 CC-BY-NC-4.0: Free for non-commercial use.

💼 Commercial Use: Contact IJK Technology Ltd for licensing at [email protected].

🛠️ About IJK Technology Ltd

IJK Technology Ltd (IJKTech) develops innovative machine learning models optimized for on-device inference. Our focus is on efficiency, privacy, and usability across mobile and embedded platforms.