---
library_name: transformers
tags:
- gpt
- distillation
- mobile
- embedded
- onnx
license: cc-by-nc-4.0
datasets:
- custom
- web
language: en
widget:
- text: "In order to make pancakes, you need to"
- text: "Once upon a time"
---

# IJK Technology – ByteGPT-r1

**ByteGPT-r1** is a distilled version of DeepSeek's Qwen 1.5B model, optimized for mobile and edge computing environments. It aims to retain the base model's language capabilities while running on compute- and memory-constrained devices.

## 🚀 Overview

- **Model Type:** Distilled GPT-style causal language model
- **Base Model:** DeepSeek's Qwen 1.5B
- **Intended Use:** Edge devices, mobile phones, embedded systems
- **Size:** Optimized for mobile deployment
- **Training:** Knowledge distillation from Qwen 1.5B

## 🧠 Why ByteGPT-r1?

ByteGPT-r1 offers several advantages for mobile and edge deployment:

1. **Efficient Knowledge Distillation:** Distilled from DeepSeek's Qwen 1.5B model to preserve capabilities while reducing computational requirements.

2. **Mobile-First Design:** Architected for the constraints of mobile devices, with optimizations for both inference speed and memory usage.

3. **Balanced Performance:** Balances model size against language generation quality, making it practical for real-world mobile applications.

## 💡 Future Plans

This model is part of our ongoing effort to bring powerful language models to edge devices. Upcoming releases will include:

- **Specialized Variants:** Domain-specific versions optimized for particular use cases
- **Further Optimizations:** Continued improvements in efficiency and performance
- **Benchmark Results:** Comparative performance on various mobile devices
- **Integration Examples:** More code samples for popular mobile frameworks

## 💻 Usage

### Quick Start (with `transformers`)

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# The model ships custom modeling code, so trust_remote_code=True is required.
model = AutoModelForCausalLM.from_pretrained("ijktech/ByteGPT-r1", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained("ijktech/ByteGPT-r1")

input_text = "What is the capital of France?"
inputs = tokenizer(input_text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

### Tokenizer

The tokenizer is compatible with `AutoTokenizer` from Hugging Face:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("ijktech/ByteGPT-r1")
```

### ONNX

The model is also available in ONNX format and can be used with ONNX Runtime:

```python
import onnxruntime as ort
import torch
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("ijktech/ByteGPT-r1")

# Create the ONNX Runtime session
ort_session = ort.InferenceSession("model.onnx")

# Context length. `model` is the PyTorch model loaded in the Quick Start;
# if you skip that step, set this to the model's context length instead.
block_size = model.block_size


def generate_with_onnx(prompt_ids, max_new_tokens=50, temperature=1.0):
    """Sample up to max_new_tokens tokens autoregressively with the ONNX model."""
    input_ids = prompt_ids.clone()

    for _ in range(max_new_tokens):
        # Keep only the last block_size tokens if the input is too long
        if input_ids.shape[1] > block_size:
            input_ids = input_ids[:, -block_size:]

        # Run inference
        ort_inputs = {"input": input_ids.cpu().numpy()}
        logits = ort_session.run(None, ort_inputs)[0]

        # Keep only the last position's logits (the next-token predictions)
        logits = torch.from_numpy(logits)[:, -1, :]

        # Apply temperature
        if temperature != 1.0:
            logits = logits / temperature

        # Sample from the distribution
        probs = torch.nn.functional.softmax(logits, dim=-1)
        next_token = torch.multinomial(probs, num_samples=1)

        # Append the new token
        input_ids = torch.cat([input_ids, next_token], dim=1)

    return input_ids


# Test the generation
prompt = "Hello"
prompt_ids = tok(prompt, return_tensors="pt")["input_ids"]
generated_ids = generate_with_onnx(prompt_ids)
generated_text = tok.decode(generated_ids[0], skip_special_tokens=True)
print(f"Generated text: {generated_text}")
# Example output (sampling is random, so yours will differ):
# Generated text: Hello there! How can I assist you today? I'm a helpful AI assistant trained to provide information and answer questions on a wide range of topics.
```

### Android Usage

Coming Soon!

### iOS Usage

Coming Soon!

## 📜 License

📍 **CC-BY-NC-4.0**: Free for non-commercial use.

💼 **Commercial Use**: Contact IJK Technology Ltd for licensing at [james@ijktech.com](mailto:james@ijktech.com).

## 🛠️ About IJK Technology Ltd

IJK Technology Ltd (IJKTech) develops machine learning models optimized for on-device inference. Our focus is on efficiency, privacy, and usability across mobile and embedded platforms.
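
## 🧪 Appendix: Sampling Without PyTorch

The ONNX example in the Usage section converts logits to a PyTorch tensor before applying temperature and sampling. On devices where installing PyTorch is impractical, the same step can be done with the Python standard library alone. The sketch below is illustrative, not part of the released code: `sample_next_token` is a hypothetical helper, and it assumes you pass it the logits for the last position of a single sequence (for example `logits[0, -1]` from the ONNX Runtime output).

```python
import math
import random


def sample_next_token(logits, temperature=1.0, rng=None):
    """Sample one token id from a 1-D sequence of next-token logits.

    Hypothetical helper: a temperature of 0 or below is treated as
    greedy decoding (argmax); otherwise the id is drawn from the
    temperature-scaled softmax distribution.
    """
    rng = rng or random.Random()
    logits = [float(x) for x in logits]

    if temperature <= 0:
        # Greedy decoding: pick the index of the largest logit.
        return max(range(len(logits)), key=logits.__getitem__)

    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(logits)
    weights = [math.exp((x - peak) / temperature) for x in logits]

    # random.choices normalizes the weights, so no explicit division is needed.
    return rng.choices(range(len(weights)), weights=weights, k=1)[0]
```

Inside the generation loop, a call such as `next_id = sample_next_token(logits[0, -1], temperature=0.8)` would stand in for the `softmax` and `multinomial` calls, with the new id appended to the input array using NumPy instead of `torch.cat`.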