Whispful committed (verified)
Commit ca52cd1 · 1 Parent(s): ab134a4

Remove README.md

Files changed (1)
  1. README.md +0 -170
README.md DELETED

---
library_name: peft
base_model: rayonlabs/6259c3f5-19eb-4f1a-9530-69e345bdfc69
tags:
- axolotl
- generated_from_trainer
model-index:
- name: c531f863-00d9-4f3a-9c41-8cf984d71723
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

[<img src="https://raw.githubusercontent.com/axolotl-ai-cloud/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/axolotl-ai-cloud/axolotl)
<details><summary>See axolotl config</summary>

axolotl version: `0.4.1`
```yaml
adapter: lora
base_model: rayonlabs/6259c3f5-19eb-4f1a-9530-69e345bdfc69
bf16: auto
bnb_4bit_compute_dtype: bfloat16
bnb_4bit_quant_type: nf4
chat_template: llama3
cosine_min_lr_ratio: 0.1
data_processes: 16
dataset_prepared_path: null
datasets:
- data_files:
  - c7ee023794d7d85d_train_data.json
  ds_type: json
  format: custom
  path: /workspace/input_data/c7ee023794d7d85d_train_data.json
  type:
    field_input: context
    field_instruction: question
    field_output: final_decision
    format: '{instruction} {input}'
    no_input_format: '{instruction}'
    system_format: '{system}'
    system_prompt: ''
debug: null
deepspeed: null
device_map: auto
do_eval: true
early_stopping_patience: 1
eval_batch_size: 1
eval_sample_packing: false
eval_steps: 25
evaluation_strategy: steps
flash_attention: false
fp16: null
fsdp: null
fsdp_config: null
gradient_accumulation_steps: 32
gradient_checkpointing: true
group_by_length: true
hub_model_id: Whispful/c531f863-00d9-4f3a-9c41-8cf984d71723
hub_repo: stevemonite
hub_strategy: checkpoint
hub_token: null
learning_rate: 1.6151457061744963e-05
load_in_4bit: true
load_in_8bit: false
local_rank: null
logging_steps: 1
lora_alpha: 3450000000
lora_dropout: 0.05
lora_fan_in_fan_out: null
lora_model_dir: null
lora_r: 128
lora_target_linear: true
lora_target_modules:
- q_proj
- v_proj
lr_scheduler: cosine
max_grad_norm: 1.0
max_memory:
  0: 76GiB
max_steps: 5
micro_batch_size: 1
mlflow_experiment_name: /tmp/c7ee023794d7d85d_train_data.json
model_type: AutoModelForCausalLM
num_epochs: 4
optim_args:
  adam_beta1: 0.9
  adam_beta2: 0.95
  adam_epsilon: 1e-5
optimizer: adamw_torch
output_dir: miner_id_24
pad_to_sequence_len: true
resume_from_checkpoint: null
s2_attention: null
sample_packing: false
save_steps: 50
save_strategy: steps
sequence_len: 2048
special_tokens:
  pad_token: <|end_of_text|>
strict: false
tf32: false
tokenizer_type: AutoTokenizer
torch_dtype: bfloat16
train_on_inputs: false
trust_remote_code: true
val_set_size: 50
wandb_entity: null
wandb_mode: online
wandb_name: eadd58e0-1e15-4859-bf02-0db212f00a46
wandb_project: Public_TuningSN
wandb_run: miner_id_24
wandb_runid: eadd58e0-1e15-4859-bf02-0db212f00a46
warmup_raio: 0.03
warmup_ratio: 3.5
weight_decay: 0.01
xformers_attention: null

```

</details><br>
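
For orientation, the LoRA portion of that config corresponds roughly to the following PEFT setup. The key-to-argument mapping below is an assumption made for illustration; axolotl constructs the actual `LoraConfig` internally.

```python
from peft import LoraConfig

# Rough PEFT equivalent of the LoRA keys in the config above (values copied verbatim).
# Note the unusually large lora_alpha: PEFT scales LoRA updates by alpha / r,
# so 3450000000 / 128 is roughly 2.7e7 here.
lora_config = LoraConfig(
    r=128,
    lora_alpha=3450000000,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # lora_target_linear: true would widen this to all linear layers
    task_type="CAUSAL_LM",
)
```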

# c531f863-00d9-4f3a-9c41-8cf984d71723

This model is a fine-tuned version of [rayonlabs/6259c3f5-19eb-4f1a-9530-69e345bdfc69](https://huggingface.co/rayonlabs/6259c3f5-19eb-4f1a-9530-69e345bdfc69) on an unspecified (`None`) dataset.
It achieves the following results on the evaluation set:
- Loss: 0.0568

## Model description

More information needed

## Intended uses & limitations

More information needed
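
Since usage is not documented, the following is only a rough loading sketch. It assumes the LoRA adapter was pushed to the `hub_model_id` from the config (`Whispful/c531f863-00d9-4f3a-9c41-8cf984d71723`), that the base model is accessible, and it mirrors the 4-bit NF4 settings used during training.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

base_id = "rayonlabs/6259c3f5-19eb-4f1a-9530-69e345bdfc69"
adapter_id = "Whispful/c531f863-00d9-4f3a-9c41-8cf984d71723"  # hub_model_id from the config

# Mirror the training-time quantization settings (4-bit NF4, bfloat16 compute).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(base_id, trust_remote_code=True)
base = AutoModelForCausalLM.from_pretrained(
    base_id,
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True,
)
model = PeftModel.from_pretrained(base, adapter_id)  # attach the LoRA adapter

inputs = tokenizer("question text context passage", return_tensors="pt").to(model.device)
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=8)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```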

## Training and evaluation data

More information needed
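
The config above does, however, point at `c7ee023794d7d85d_train_data.json` with `question`, `context`, and `final_decision` fields mapped through axolotl's custom format (`'{instruction} {input}'`, falling back to `'{instruction}'` when there is no input). A minimal sketch of how one record is presumably turned into a prompt/target pair; the helper below is illustrative and not part of axolotl.

```python
# Approximates the custom format declared in the config: field names come from
# field_instruction / field_input / field_output, templates from format / no_input_format.

def build_prompt(record: dict) -> tuple[str, str]:
    instruction = record["question"]         # field_instruction
    context = record.get("context", "")      # field_input
    if context:
        prompt = f"{instruction} {context}"  # format: '{instruction} {input}'
    else:
        prompt = instruction                 # no_input_format: '{instruction}'
    return prompt, record["final_decision"]  # field_output is the supervised target

example = {"question": "question text", "context": "context passage",
           "final_decision": "yes"}
print(build_prompt(example))
```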

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 1.6151457061744963e-05
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- gradient_accumulation_steps: 32
- total_train_batch_size: 32
- optimizer: adamw_torch; the defaults (betas=(0.9, 0.999), epsilon=1e-08) are overridden by the config's optim_args to betas=(0.9, 0.95) and epsilon=1e-05
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 17
- training_steps: 5
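
For reference, the effective batch size follows from the values above: micro_batch_size 1 × gradient_accumulation_steps 32 = total_train_batch_size 32, so the 5 training steps consume roughly 5 × 32 = 160 examples. The 17 warmup steps likewise appear to come from warmup_ratio 3.5 × 5 steps ≈ 17; since that exceeds the number of training steps, the learning rate seemingly never left the warmup phase.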

### Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 0.0638        | 0.0002 | 1    | 0.0568          |


### Framework versions

- PEFT 0.13.2
- Transformers 4.46.0
- Pytorch 2.5.0+cu124
- Datasets 3.0.1
- Tokenizers 0.20.1