curiositytech/MARS-v0.2

+---
+license: llama3
+language:
+- tr
+- en
+base_model: meta-llama/Meta-Llama-3-8B-Instruct
+model-index:
+- name: MARS
+  results:
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: AI2 Reasoning Challenge TR v0.2
+      type: ai2_arc
+      config: ARC-Challenge
+      split: test
+      args:
+        num_few_shot: 25
+    metrics:
+    - type: acc
+      value: 46.08
+      name: accuracy
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: MMLU TR v0.2
+      type: cais/mmlu
+      config: all
+      split: test
+      args:
+        num_few_shot: 5
+    metrics:
+    - type: acc
+      value: 47.02
+      name: accuracy
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: TruthfulQA TR v0.2
+      type: truthful_qa
+      config: multiple_choice
+      split: validation
+      args:
+        num_few_shot: 0
+    metrics:
+    - type: acc
+      value: 49.38
+      name: accuracy
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: Winogrande TR v0.2
+      type: winogrande
+      config: winogrande_xl
+      split: validation
+      args:
+        num_few_shot: 5
+    metrics:
+    - type: acc
+      value: 53.71
+      name: accuracy
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: GSM8k TR v0.2
+      type: gsm8k
+      config: main
+      split: test
+      args:
+        num_few_shot: 5
+    metrics:
+    - type: acc
+      value: 53.08
+      name: accuracy
+pipeline_tag: text-generation
+---
+<img src="MARS-1.0.png" alt="Curiosity MARS model logo" style="border-radius: 1rem; width: 100%">
+<div style="display: flex; justify-content: center; align-items: center; flex-direction: column">
+    <h1 style="font-size: 5em; margin-bottom: 0; padding-bottom: 0;">MARS</h1>
+    <aside>by <a href="https://curiosity.tech">Curiosity Technology</a></aside>
+</div>
+MARS is the first iteration of Curiosity Technology models, based on Llama 3 8B.
+We trained MARS on an in-house Turkish dataset, as well as several open-source datasets and their Turkish
+translations.
+We intend to release the Turkish translations in the near future so that the community can experiment with them.
+MARS was trained for 3 days on 4xA100 GPUs.
+## Model Details
+- **Base Model**: Meta Llama 3 8B Instruct
+- **Training Dataset**: In-house & Translated Open Source Turkish Datasets
+- **Training Method**: LoRA Fine Tuning
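
For readers unfamiliar with the training method listed above, here is a minimal pure-Python sketch of what a LoRA update computes: the base weight `W` stays frozen, and a low-rank product `B @ A`, scaled by `alpha / r`, is added to it. The matrix sizes, the rank `r`, and `alpha` are illustrative assumptions; the card does not state the hyperparameters used for MARS.

```python
# Illustrative LoRA arithmetic. Sizes, rank and alpha are hypothetical,
# not the values used to train MARS.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def lora_effective_weight(w, lora_a, lora_b, alpha):
    """Return W + (alpha / r) * (B @ A), where r is the adapter rank."""
    r = len(lora_a)                 # A has shape (r, in_features)
    delta = matmul(lora_b, lora_a)  # shape (out_features, in_features)
    scale = alpha / r
    return [
        [w[i][j] + scale * delta[i][j] for j in range(len(w[0]))]
        for i in range(len(w))
    ]


# Frozen 2x3 base weight and a rank-1 adapter.
W = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0]]
A = [[1.0, 2.0, 3.0]]  # shape (r=1, in_features=3), trainable
B = [[0.5], [0.0]]     # shape (out_features=2, r=1), trainable

W_eff = lora_effective_weight(W, A, B, alpha=2.0)
print(W_eff)
```

Only `A` and `B` are updated during training, so the number of trainable values grows with `r * (in_features + out_features)` rather than with `in_features * out_features`.
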
+## How to use
+You can run conversational inference either with the Transformers `pipeline` abstraction or with the Auto classes and the `generate()` function. Examples of both follow.
+### Transformers pipeline
+```python
+import transformers
+import torch
+model_id = "curiositytech/MARS"
+pipeline = transformers.pipeline(
+    "text-generation",
+    model=model_id,
+    model_kwargs={"torch_dtype": torch.bfloat16},
+    device_map="auto",
+)
+messages = [
+    {"role": "system", "content": "Sen korsan gibi konuşan bir korsan chatbotsun!"},
+    {"role": "user", "content": "Sen kimsin?"},
+]
+# Stop at either the tokenizer's EOS token or the Llama 3 end-of-turn token.
+terminators = [
+    pipeline.tokenizer.eos_token_id,
+    pipeline.tokenizer.convert_tokens_to_ids("<|eot_id|>")
+]
+outputs = pipeline(
+    messages,
+    max_new_tokens=256,
+    eos_token_id=terminators,
+    do_sample=True,
+    temperature=0.6,
+    top_p=0.9,
+)
+# The last entry of the returned conversation is the newly generated message.
+print(outputs[0]["generated_text"][-1])
+```
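
The final `print` above shows the last message in full rather than only its text. If only the reply text is wanted, a small helper can pull it out. This is a sketch: it assumes the pipeline returns the chat-style structure implied by the indexing above (a list holding one dict whose `generated_text` is the message list with the assistant turn appended). `last_assistant_reply` and the sample data, including the placeholder reply, are hypothetical and made up for illustration.

```python
def last_assistant_reply(outputs):
    """Return the text of the final assistant message from a chat pipeline result."""
    conversation = outputs[0]["generated_text"]
    last = conversation[-1]
    if last.get("role") != "assistant":
        raise ValueError("last message is not an assistant turn")
    return last["content"]


# Hypothetical stand-in for the structure the pipeline call is assumed to return.
sample_outputs = [
    {
        "generated_text": [
            {"role": "system", "content": "Sen korsan gibi konuşan bir korsan chatbotsun!"},
            {"role": "user", "content": "Sen kimsin?"},
            {"role": "assistant", "content": "placeholder reply"},
        ]
    }
]

reply = last_assistant_reply(sample_outputs)
print(reply)
```
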
+### Transformers AutoModelForCausalLM
+```python
+from transformers import AutoTokenizer, AutoModelForCausalLM
+import torch
+model_id = "curiositytech/MARS"
+tokenizer = AutoTokenizer.from_pretrained(model_id)
+model = AutoModelForCausalLM.from_pretrained(
+    model_id,
+    torch_dtype=torch.bfloat16,
+    device_map="auto",
+)
+messages = [
+    {"role": "system", "content": "Sen korsan gibi konuşan bir korsan chatbotsun!"},
+    {"role": "user", "content": "Sen kimsin?"},
+]
+input_ids = tokenizer.apply_chat_template(
+    messages,
+    add_generation_prompt=True,
+    return_tensors="pt"
+).to(model.device)
+# Stop at either the tokenizer's EOS token or the Llama 3 end-of-turn token.
+terminators = [
+    tokenizer.eos_token_id,
+    tokenizer.convert_tokens_to_ids("<|eot_id|>")
+]
+outputs = model.generate(
+    input_ids,
+    max_new_tokens=256,
+    eos_token_id=terminators,
+    do_sample=True,
+    temperature=0.6,
+    top_p=0.9,
+)
+# Drop the prompt tokens and decode only the newly generated ones.
+response = outputs[0][input_ids.shape[-1]:]
+print(tokenizer.decode(response, skip_special_tokens=True))
+```
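
Both examples pass `<|eot_id|>` as an extra terminator. The prompt layout shows why: in the Llama 3 Instruct chat format every turn ends with `<|eot_id|>`, so the model emits that token when it finishes its reply. The sketch below rebuilds the layout by hand for illustration only. It assumes MARS keeps the chat template of its base model, which the card does not state explicitly. `render_llama3_prompt` is a hypothetical helper; in real use, `tokenizer.apply_chat_template` is the call to rely on.

```python
BOS = "<|begin_of_text|>"
EOT = "<|eot_id|>"


def render_llama3_prompt(messages, add_generation_prompt=True):
    """Format chat messages in the Llama 3 Instruct layout (illustrative only)."""
    parts = [BOS]
    for message in messages:
        header = f"<|start_header_id|>{message['role']}<|end_header_id|>\n\n"
        parts.append(header + message["content"].strip() + EOT)
    if add_generation_prompt:
        # Open an assistant turn so the model continues with its reply.
        parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)


messages = [
    {"role": "system", "content": "Sen korsan gibi konuşan bir korsan chatbotsun!"},
    {"role": "user", "content": "Sen kimsin?"},
]

prompt = render_llama3_prompt(messages)
print(prompt)
```
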