---
license: mit
base_model: PursuitOfDataScience/Argonne-1.0
datasets:
  - open-thoughts/OpenThoughts-114k
language:
  - en
pipeline_tag: text-generation
fp16: true
tags:
  - instruct
  - finetuned
  - argonne
  - open-thoughts
---



# Argonne-1.0-Instruct

Argonne-1.0-Instruct is a fine-tuned version of the Argonne-1.0 base model, optimized for instruction-following tasks. The fine-tuning process utilized the [OpenThoughts-114k](https://huggingface.co/datasets/open-thoughts/OpenThoughts-114k) dataset from OpenThoughts, enhancing the model's ability to generate coherent and contextually relevant responses.

---

## Model Details

- **Base Model**: [Argonne-1.0](https://huggingface.co/PursuitOfDataScience/Argonne-1.0)
- **Fine-Tuning Dataset**: [OpenThoughts-114k](https://huggingface.co/datasets/open-thoughts/OpenThoughts-114k)
- **Training Hardware**: 1 NVIDIA H100 GPU
- **Training Duration**: 6 epochs

---

## Fine-Tuning Hyperparameters

The following hyperparameters were employed during the fine-tuning process:

| Hyperparameter               | Value                        |
|------------------------------|------------------------------|
| Batch Size                   | 16                           |
| Learning Rate                | 1e-8                         |
| Weight Decay                 | 0.1                          |
| Number of Epochs             | 6                            |
| Gradient Accumulation Steps  | 8                            |
| Max Gradient Norm            | 10.0                         |
| Seed                         | 42                           |

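As a point of reference, below is a minimal sketch of how these settings could be expressed with the Hugging Face `Trainer` API. This is not the actual training script (that lives in the codebase linked in the Codebase section below); the output directory, model, and dataset names are placeholders.

```python
from transformers import Trainer, TrainingArguments

# Hypothetical mapping of the hyperparameter table above onto TrainingArguments;
# the real fine-tuning code is in the ArgonneAI repository.
training_args = TrainingArguments(
    output_dir="argonne-1.0-instruct",  # placeholder output path
    per_device_train_batch_size=16,
    learning_rate=1e-8,
    weight_decay=0.1,
    num_train_epochs=6,
    gradient_accumulation_steps=8,
    max_grad_norm=10.0,
    seed=42,
    fp16=True,  # matches the fp16 flag in the model card metadata
)

# `model` and `train_dataset` stand in for the Argonne-1.0 base model and the
# tokenized OpenThoughts-114k dataset, prepared elsewhere in the codebase.
# trainer = Trainer(model=model, args=training_args, train_dataset=train_dataset)
# trainer.train()
```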
---

## Codebase

For the complete codebase, please refer to this [link](https://github.com/PursuitOfDataScience/ArgonneAI/tree/Argonne-1.0-Instruct).

## Inference Usage

Follow the steps below to perform inference using Argonne-1.0-Instruct:

```python
import torch
from transformers import AutoConfig, AutoModel, AutoModelForCausalLM, AutoTokenizer

# Custom model architecture classes from the Argonne codebase
from mp_pretrain import ArgonneConfig, ArgonneModelParallel

# Create a wrapper class with _no_split_modules
class ArgonneModelParallelWithDeviceMap(ArgonneModelParallel):
    # Add modules that shouldn't be split across devices
    _no_split_modules = ["attention", "mlp", "block", "layer"]

# Register the model with Hugging Face's Auto classes
AutoConfig.register("argonne", ArgonneConfig)
AutoModel.register(ArgonneConfig, ArgonneModelParallel)
AutoModelForCausalLM.register(ArgonneConfig, ArgonneModelParallelWithDeviceMap)

def main():
    # Load model and tokenizer using the Auto classes
    model_dir = "PursuitOfDataScience/Argonne-1.0-Instruct"
    tokenizer = AutoTokenizer.from_pretrained(model_dir)
    model = AutoModelForCausalLM.from_pretrained(model_dir,
                                                 device_map="auto",
                                                 torch_dtype=torch.float16)
    
    # Setup for inference
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    model = model.to(device)
    
    # Add the 'devices' attribute that model.generate() expects
    if not hasattr(model, 'devices'):
        model.devices = [device]
    
    # Ensure the 'pipeline_stages' attribute exists; it stays None when the
    # model is loaded without pipeline-parallel distribution
    if not hasattr(model, 'pipeline_stages') or model.pipeline_stages is None:
        model.pipeline_stages = None
    
    # Generate text from a prompt
    prompt = "USER: Can you write a short introduction about economics?\n\nASSISTANT:"
    # Extract just the input_ids from tokenizer output
    input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(device)
    
    # Generate text
    outputs = model.generate(
        input_ids,
        max_new_tokens=650,
        temperature=0.7,
        top_k=50
    )
    
    # Print the result
    generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
    print(f"Generated text:\n{generated_text}")

if __name__ == "__main__":
    main()
```

You can expect output similar to the following:

```
USER: Can you write a short introduction about economics?

ASSISTANT: The concept of economic development has been going down in the West for far too long. The concept has been criticized, because it is not clear whether what was originally talked about can be justified.
CAN: Economic development had been going up in the West for far too long. Economics, the sciences, and mathematics were at the forefront of economic progress. And so economists and other scholars have been trying to understand and explain the process.
HORALD: We’re standing on the shoulders of giants. Economics, of course, is the study of the value of goods, value of money, value of goods, value of goods, and value of money.
ALWAY: If you were to study economics, it would be like studying a science. So economics is the study of value.
CONNECUS: So economics has been going down in the West for some very long time. The fact that economics is a science is an indication that it is a science. A science that studies value, it is a science that is a science.
OVER: So economics has been going up in the West for some very long time. Economics, for example, is a science. And economics has been going up in the West for some very long time.
CAN: One of the things that economists are trying to understand is that when something is said to be a good, that is, it is a good, it is a good. And economists are trying to understand that.
CAN: So economics is the study of value, it is the study of value. So economists are trying to understand price, how much a market will sell for a commodity or how much a market will sell for a commodity.
HORALD: So economists are trying to understand the value of money, value of goods, value of goods, value of goods, value of goods, value of goods.
ALWAY: Economists are trying to understand the value of money, value of goods.
CAN: So economists are trying to understand the value of money, value of goods, value of goods. And that is a good.
CAN: So economics is a science.
CAN: But economics is not the study of value, it is the study of value. I
It is a science.
ALWAY: Economists are trying to understand the value of money, value of goods and value of money.
CAN: But the value of money is not money. It is a value of goods, value of money.
CAN: But economists are trying to understand the value of money, value of goods, value of goods, value of goods and value of goods, value of goods and value of goods.
CAN: So economists are trying to understand the value of money.
CAN: But the value of money is not money. It is a value of goods.
CAN: But the value of goods is not money. It is a value of money.
```