marcuscedricridia committed on
Commit 879c76f · verified · 1 Parent(s): fdac54b

Upload folder using huggingface_hub

README.md CHANGED
@@ -1,10 +1,7 @@
  ---
  base_model:
- - marcuscedricridia/Hush-Qwen2.5-7B-della4
- - marcuscedricridia/Hush-Qwen2.5-7B-della3
- - marcuscedricridia/Hush-Qwen2.5-7B-della2
- - marcuscedricridia/Hush-Qwen2.5-7B-della1
- - marcuscedricridia/Hush-Qwen2.5-7B-RP-v1.4-1M
+ - Qwen/Qwen2.5-7B-Instruct-1M
+ - marcuscedricridia/Hush-Qwen2.5-7B-RP-v1.4
  library_name: transformers
  tags:
  - mergekit
@@ -18,32 +15,28 @@ This is a merge of pre-trained language models created using [mergekit](https://
  ## Merge Details
  ### Merge Method
 
- This model was merged using the [Model Stock](https://arxiv.org/abs/2403.19522) merge method using [marcuscedricridia/Hush-Qwen2.5-7B-RP-v1.4-1M](https://huggingface.co/marcuscedricridia/Hush-Qwen2.5-7B-RP-v1.4-1M) as a base.
+ This model was merged using the [SCE](https://arxiv.org/abs/2408.07990) merge method using [Qwen/Qwen2.5-7B-Instruct-1M](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct-1M) as a base.
 
  ### Models Merged
 
  The following models were included in the merge:
- * [marcuscedricridia/Hush-Qwen2.5-7B-della4](https://huggingface.co/marcuscedricridia/Hush-Qwen2.5-7B-della4)
- * [marcuscedricridia/Hush-Qwen2.5-7B-della3](https://huggingface.co/marcuscedricridia/Hush-Qwen2.5-7B-della3)
- * [marcuscedricridia/Hush-Qwen2.5-7B-della2](https://huggingface.co/marcuscedricridia/Hush-Qwen2.5-7B-della2)
- * [marcuscedricridia/Hush-Qwen2.5-7B-della1](https://huggingface.co/marcuscedricridia/Hush-Qwen2.5-7B-della1)
+ * [marcuscedricridia/Hush-Qwen2.5-7B-RP-v1.4](https://huggingface.co/marcuscedricridia/Hush-Qwen2.5-7B-RP-v1.4)
 
  ### Configuration
 
  The following YAML configuration was used to produce this model:
 
  ```yaml
- merge_method: model_stock
- base_model: marcuscedricridia/Hush-Qwen2.5-7B-RP-v1.4-1M
+ merge_method: sce
  models:
- - model: marcuscedricridia/Hush-Qwen2.5-7B-della1
- - model: marcuscedricridia/Hush-Qwen2.5-7B-della2
- - model: marcuscedricridia/Hush-Qwen2.5-7B-della3
- - model: marcuscedricridia/Hush-Qwen2.5-7B-della4
+ - model: marcuscedricridia/Hush-Qwen2.5-7B-RP-v1.4
+ base_model: Qwen/Qwen2.5-7B-Instruct-1M
+ parameters:
+   select_topk: 1
  dtype: bfloat16
  tokenizer_source: base
- int8_mask: true
  normalize: true
- name: Hush-Qwen2.5-7B-v1.4
+ int8_mask: true
+ name: Hush-Qwen2.5-7B-RP-v1.4-1M
 
  ```
config.json CHANGED
@@ -1,5 +1,5 @@
  {
- "_name_or_path": "marcuscedricridia/Hush-Qwen2.5-7B-RP-v1.4-1M",
+ "_name_or_path": "Qwen/Qwen2.5-7B-Instruct-1M",
  "architectures": [
  "Qwen2ForCausalLM"
  ],
mergekit_config.yml CHANGED
@@ -1,12 +1,11 @@
- merge_method: model_stock
- base_model: marcuscedricridia/Hush-Qwen2.5-7B-RP-v1.4-1M
+ merge_method: sce
  models:
- - model: marcuscedricridia/Hush-Qwen2.5-7B-della1
- - model: marcuscedricridia/Hush-Qwen2.5-7B-della2
- - model: marcuscedricridia/Hush-Qwen2.5-7B-della3
- - model: marcuscedricridia/Hush-Qwen2.5-7B-della4
+ - model: marcuscedricridia/Hush-Qwen2.5-7B-RP-v1.4
+ base_model: Qwen/Qwen2.5-7B-Instruct-1M
+ parameters:
+   select_topk: 1
  dtype: bfloat16
  tokenizer_source: base
- int8_mask: true
  normalize: true
- name: Hush-Qwen2.5-7B-v1.4
+ int8_mask: true
+ name: Hush-Qwen2.5-7B-RP-v1.4-1M
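
To make the configuration change in this commit easier to see at a glance, the following Python sketch represents the old and new `mergekit_config.yml` as plain dictionaries and reports which top-level keys differ. The values are copied from the diff; the nesting of `select_topk` under `parameters` and the helper name `changed_keys` are assumptions made for illustration and are not part of mergekit.

```python
"""Sketch: compare the old and new merge configs from this commit."""

# Old config (model_stock merge); values copied from the removed lines.
OLD_CONFIG = {
    "merge_method": "model_stock",
    "base_model": "marcuscedricridia/Hush-Qwen2.5-7B-RP-v1.4-1M",
    "models": [
        {"model": "marcuscedricridia/Hush-Qwen2.5-7B-della1"},
        {"model": "marcuscedricridia/Hush-Qwen2.5-7B-della2"},
        {"model": "marcuscedricridia/Hush-Qwen2.5-7B-della3"},
        {"model": "marcuscedricridia/Hush-Qwen2.5-7B-della4"},
    ],
    "dtype": "bfloat16",
    "tokenizer_source": "base",
    "int8_mask": True,
    "normalize": True,
    "name": "Hush-Qwen2.5-7B-v1.4",
}

# New config (sce merge); values copied from the added lines.
NEW_CONFIG = {
    "merge_method": "sce",
    "models": [{"model": "marcuscedricridia/Hush-Qwen2.5-7B-RP-v1.4"}],
    "base_model": "Qwen/Qwen2.5-7B-Instruct-1M",
    "parameters": {"select_topk": 1},  # assumed nesting under `parameters`
    "dtype": "bfloat16",
    "tokenizer_source": "base",
    "normalize": True,
    "int8_mask": True,
    "name": "Hush-Qwen2.5-7B-RP-v1.4-1M",
}


def changed_keys(old, new):
    """Return {key: (old_value, new_value)} for top-level keys that differ.

    A key missing on one side is reported with None on that side.
    """
    changes = {}
    for key in sorted(set(old) | set(new)):
        before = old.get(key)
        after = new.get(key)
        if before != after:
            changes[key] = (before, after)
    return changes


if __name__ == "__main__":
    for key, (before, after) in changed_keys(OLD_CONFIG, NEW_CONFIG).items():
        print(f"{key}: {before!r} -> {after!r}")
```

Under these assumptions five keys change (`merge_method`, `base_model`, `models`, `parameters`, `name`), while `dtype`, `tokenizer_source`, `normalize`, and `int8_mask` keep their values; `int8_mask` only moves to a different position in the file.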
model-00001-of-00004.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:e0e265d51702279301544912f9e2e88614faa93dac57adb48f4facb88d8512b1
+ oid sha256:e6ab227a818d16ed0d99568eb39128f6bc5eff3a52b91d8970298aabcaea2fe0
  size 4970978712
model-00002-of-00004.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:2af3f58da59eca08804c7cd8bb7072f964642b9809f6e9a5e093ed9051595168
+ oid sha256:1cad3f2fefdb8ac91adb28b6c8fb3d3f9ac5574b3a1cb73ec2fd1877e962f4aa
  size 4932751032
model-00003-of-00004.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:a635b04ad03e8f0e754233d14675656bda3140e298e4dd1662c610166699cd0a
+ oid sha256:9f63487eaba343672a7fe1e339025106d9201988d24ac5ee364a40c2feca75b8
  size 4991495808
model-00004-of-00004.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:8efd5a03b34100f0d96e1fcc82eb552e786942baf5bddf0648422faa819a147b
+ oid sha256:511c1588760348f57f0877b013cf62822a45c2d2c8e158cb6adaafe63bc40f78
  size 330326240
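
Each `.safetensors` entry in this diff is a Git LFS pointer rather than the weights themselves: three lines giving the spec version, a SHA-256 object ID, and the size in bytes. A minimal Python sketch of parsing such a pointer and checking a downloaded file against it might look like the following. `parse_lfs_pointer` and `matches_pointer` are hypothetical helper names, not part of Git LFS or `huggingface_hub`.

```python
import hashlib
import os


def parse_lfs_pointer(text):
    """Parse the `key value` lines of a Git LFS pointer into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    # The oid line has the form `<algorithm>:<hex digest>`.
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` has the pointer's size and digest."""
    if pointer["algo"] != "sha256":
        raise ValueError(f"unsupported hash algorithm: {pointer['algo']}")
    # Cheap check first: a size mismatch rules the file out without hashing.
    if os.path.getsize(path) != pointer["size"]:
        return False
    hasher = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so multi-gigabyte shards are not loaded into memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


# New pointer for model-00004-of-00004.safetensors, copied from the diff.
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:511c1588760348f57f0877b013cf62822a45c2d2c8e158cb6adaafe63bc40f78
size 330326240
"""

if __name__ == "__main__":
    pointer = parse_lfs_pointer(POINTER_TEXT)
    print(pointer["algo"], pointer["size"])
```

In this commit every shard keeps its `size` while its `oid` changes, which is consistent with the tensor shapes and dtype staying the same while the weight values differ.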