large-traversaal/Alif-1.0-8B-Instruct

@@ -15,7 +15,7 @@ language:
 # Model Card for Alif Llama 3.1 8B Instruct
-**Alif Llama 3.1 8B Instruct** is an open-weight research release of a model with highly advanced multilingual reasoning capabilities. It focuses on human refined multilingual synthetic data distillation paired with reasoning to enhance cultural nuance and reasoning capabilities in english and urdu languages.
 - **Developed by:** large-traversaal
 - **License:** apache-2.0
@@ -31,25 +31,42 @@ This llama model was trained 2x faster with [Unsloth](https://github.com/unsloth
 Install the transformers library and load Alif Llama 3.1 8B Instruct as follows:
 ```python
-from transformers import AutoTokenizer, AutoModelForCausalLM
-model_id = "large-traversaal/Alif-Llama-3.1-8B-Instruct"
 tokenizer = AutoTokenizer.from_pretrained(model_id)
-model = AutoModelForCausalLM.from_pretrained(model_id)
-# Format the message with the chat template
-messages = [{"role": "user", "content": "کراچی کی اہمیت کیا ہے؟"}]
-input_ids = tokenizer.apply_chat_template(messages, tokenize=True, add_generation_prompt=True, return_tensors="pt")
-gen_tokens = model.generate(
-    input_ids,
-    max_new_tokens=100,
-    do_sample=True,
-    temperature=0.3,
-    )
-gen_text = tokenizer.decode(gen_tokens[0])
-print(gen_text)
 ```
 ## Model Details
@@ -58,7 +75,7 @@ print(gen_text)
 **Output**: Models generate text only.
-**Model Architecture**: Aya Expanse 8B is an auto-regressive language model that uses an optimized transformer architecture. Post-training includes continued pretraining and supervised finetuning.
 For more details about how the model was trained, check out [our blogpost]().

 # Model Card for Alif Llama 3.1 8B Instruct
+**Alif Llama 3.1 8B Instruct** is an open-weight model with multilingual reasoning capabilities. It uses human-refined multilingual synthetic data paired with reasoning to improve cultural nuance and reasoning in English and Urdu.
 - **Developed by:** large-traversaal
 - **License:** apache-2.0
 Install the transformers library and load Alif Llama 3.1 8B Instruct as follows:
 ```python
+import torch
+from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig, pipeline
+model_id = "large-traversaal/Alif-Llama-3.1-8B-Instruct"  # Replace with your model
+# 4-bit quantization configuration
+quantization_config = BitsAndBytesConfig(
+    load_in_4bit=True,
+    bnb_4bit_compute_dtype=torch.float16,
+    bnb_4bit_use_double_quant=True,
+    bnb_4bit_quant_type="nf4"
+)
+# Load tokenizer and model in 4-bit
 tokenizer = AutoTokenizer.from_pretrained(model_id)
+model = AutoModelForCausalLM.from_pretrained(
+    model_id,
+    quantization_config=quantization_config,
+    device_map="auto"
+)
+# Create the text-generation pipeline; the model was already placed on devices by device_map="auto" above
+chatbot = pipeline("text-generation", model=model, tokenizer=tokenizer)
+# Send one prompt and return the generated text (by default this includes the prompt itself)
+def chat(message):
+    response = chatbot(message, max_new_tokens=100, do_sample=True, temperature=0.3)
+    return response[0]["generated_text"]
+# Example chat
+user_input = "کراچی کی اہمیت کیا ہے؟"
+bot_response = chat(user_input)
+print(bot_response)
 ```
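The `chat` helper in the snippet above passes the raw string to the pipeline, whereas the earlier version of this snippet formatted the prompt with the chat template. The following is a minimal sketch of a chat-formatted variant, assuming a recent `transformers` release whose text-generation pipeline accepts a list of `role`/`content` messages and returns the full conversation in `generated_text`. The names `build_messages`, `last_assistant_reply`, and `chat_with_template` are hypothetical helpers introduced here for illustration, not part of the model card or the library.

```python
from typing import Callable, Dict, List, Union

Message = Dict[str, str]


def build_messages(user_text: str, system_text: str = "") -> List[Message]:
    """Wrap a user prompt (and an optional system prompt) as chat messages."""
    messages: List[Message] = []
    if system_text:
        messages.append({"role": "system", "content": system_text})
    messages.append({"role": "user", "content": user_text})
    return messages


def last_assistant_reply(generated: Union[str, List[Message]]) -> str:
    """Return the assistant reply from a pipeline's `generated_text` value.

    Assumption: for chat-style input the value is a list of messages ending
    with the assistant turn; for a plain string prompt it is a string.
    """
    if isinstance(generated, str):
        return generated
    for message in reversed(generated):
        if message.get("role") == "assistant":
            return message.get("content", "")
    return ""


def chat_with_template(generate: Callable, user_text: str, **gen_kwargs) -> str:
    """Call a text-generation pipeline with chat messages; return only the reply."""
    response = generate(build_messages(user_text), **gen_kwargs)
    return last_assistant_reply(response[0]["generated_text"])
```

With the pipeline from the snippet above, this could be called as `chat_with_template(chatbot, user_input, max_new_tokens=100, do_sample=True, temperature=0.3)`. Taking the pipeline as an argument keeps the helper independent of how the model was loaded.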
 ## Model Details
 **Output**: Models generate text only.
+**Model Architecture**: Alif Llama 3.1 8B Instruct is an auto-regressive language model that uses an optimized transformer architecture. Post-training includes continued pretraining and supervised fine-tuning.
 For more details about how the model was trained, check out [our blogpost]().