---
license: llama3
language:
- tr
- en
base_model: meta-llama/Meta-Llama-3.1-8B-Instruct
model-index:
- name: MARS
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: AI2 Reasoning Challenge TR v0.2
      type: ai2_arc
      config: ARC-Challenge
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: acc
      value: 43.85
      name: accuracy
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: HellaSwag TR
      type: hellaswag
      split: validation
      args:
        num_few_shot: 10
    metrics:
    - type: acc
      value: 46.64
      name: accuracy
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: TruthfulQA TR v0.2
      type: truthful_qa
      config: multiple_choice
      split: validation
      args:
        num_few_shot: 0
    metrics:
    - type: acc
      name: accuracy
      value: 48.66
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Winogrande TR v0.2
      type: winogrande
      config: winogrande_xl
      split: validation
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 52.84
      name: accuracy
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GSM8k TR v0.2
      type: gsm8k
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 59.30
      name: accuracy
pipeline_tag: text-generation
---


<img src="MARS-2.0.png" alt="Curiosity MARS model logo" style="border-radius: 1rem; width: 100%">


<div style="display: flex; justify-content: center; align-items: center; flex-direction: column">
    <h1 style="font-size: 5em; margin-bottom: 0; padding-bottom: 0;">MARS-v0.2</h1>
    <aside>by <a href="https://curiosity.tech">Curiosity Technology</a></aside>
</div>

MARS-v0.2 is the second iteration of Curiosity Technology's models, built on Llama 3.1 8B Instruct. It extends the initial MARS model by fine-tuning on a more comprehensive dataset, with an increased emphasis on mathematical data to improve its reasoning and problem-solving capabilities.

We've continued our commitment to Turkish language processing, utilizing both in-house Turkish datasets and a broader selection of translated open-source datasets. We believe this version will serve the community with even more versatility and depth.

MARS-v0.2 was trained for 3 days on 4xA100 GPUs.

## Model Details

- **Base Model**: Meta Llama 3.1 8B Instruct
- **Training Dataset**: In-house & Translated Open Source Turkish Datasets
- **Training Method**: LoRA fine-tuning
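
As a rough illustration of why LoRA keeps fine-tuning affordable, the sketch below compares the trainable parameters of a single weight matrix under full fine-tuning with the two low-rank factors that LoRA trains instead. The helper `lora_param_counts` and the rank of 16 are hypothetical and chosen only for the example; the actual LoRA settings used for MARS-v0.2 are not stated here.

```python
def lora_param_counts(d_out: int, d_in: int, rank: int) -> tuple[int, int]:
    """Return (full, lora) trainable-parameter counts for one weight matrix."""
    full = d_out * d_in            # full fine-tuning updates every entry of W
    lora = rank * (d_out + d_in)   # LoRA trains B (d_out x rank) and A (rank x d_in)
    return full, lora


# 4096 is the hidden size of Llama 3.1 8B. rank=16 is an assumed value,
# not the setting used to train MARS-v0.2.
full, lora = lora_param_counts(4096, 4096, rank=16)
print(f"full: {full:,}  lora: {lora:,}  ratio: {lora / full:.2%}")
```

For a square 4096 x 4096 projection, the low-rank factors hold under one percent of the parameters of the full matrix, which is what makes multi-day training on a small number of GPUs feasible.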


## How to use

You can run conversational inference using the Transformers pipeline abstraction, or by leveraging the Auto classes with the `generate()` function. Let's see examples of both.

### Transformers pipeline

```python
import transformers
import torch

model_id = "curiositytech/MARS-v0.2"

pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

messages = [
    {"role": "system", "content": "Sen korsan gibi konuşan bir korsan chatbotsun!"},
    {"role": "user", "content": "Sen kimsin?"},
]

terminators = [
    pipeline.tokenizer.eos_token_id,
    pipeline.tokenizer.convert_tokens_to_ids("<|eot_id|>")
]

outputs = pipeline(
    messages,
    max_new_tokens=256,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)
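# The last entry of "generated_text" is the assistant's reply,
# a dict with "role" and "content" keys.
print(outputs[0]["generated_text"][-1])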
```

### Transformers AutoModelForCausalLM

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "curiositytech/MARS-v0.2"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {"role": "system", "content": "Sen korsan gibi konuşan bir korsan chatbotsun!"},
    {"role": "user", "content": "Sen kimsin?"},
]

input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt"
).to(model.device)

terminators = [
    tokenizer.eos_token_id,
    tokenizer.convert_tokens_to_ids("<|eot_id|>")
]

outputs = model.generate(
    input_ids,
    max_new_tokens=256,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)
# Drop the prompt tokens and decode only the newly generated ones.
response = outputs[0][input_ids.shape[-1]:]
print(tokenizer.decode(response, skip_special_tokens=True))
```
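
Both examples pass `<|eot_id|>` as an extra terminator because, in the Llama 3 chat layout, each turn ends with that token. The sketch below builds a prompt string by hand to make this visible. It is a simplified approximation: `format_llama3_chat` is a hypothetical helper, and the tokenizer's real template may add more content (for example, date lines in the system turn), so use `apply_chat_template` in practice.

```python
def format_llama3_chat(messages, add_generation_prompt=True):
    """Approximate the Llama 3 chat layout (simplified, for illustration only)."""
    prompt = "<|begin_of_text|>"
    for message in messages:
        # Each turn: a role header, a blank line, the content, then <|eot_id|>.
        prompt += (
            f"<|start_header_id|>{message['role']}<|end_header_id|>\n\n"
            f"{message['content'].strip()}<|eot_id|>"
        )
    if add_generation_prompt:
        # Open an assistant turn; the model is expected to close it with <|eot_id|>.
        prompt += "<|start_header_id|>assistant<|end_header_id|>\n\n"
    return prompt


messages = [
    {"role": "system", "content": "Sen korsan gibi konuşan bir korsan chatbotsun!"},
    {"role": "user", "content": "Sen kimsin?"},
]
prompt = format_llama3_chat(messages)
print(prompt)
```

Since the model ends its own turn with `<|eot_id|>`, listing that token in `eos_token_id` stops generation at the end of the assistant's reply.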