machinez committed · Commit 143f5d3 · verified · 1 parent: d733a3c

Update README.md

Files changed (1)
  1. README.md +55 -0
README.md CHANGED
@@ -25,6 +25,61 @@ Each branch contains an individual bits per weight, with the main one containing
 
  <a href="https://huggingface.co/machinez/zephyr-orpo-141b-A35b-v0.1-exl2/tree/2_75">2.75 bits per weight - Fits Quad Nvidia Tesla P100 16gb at 16k context</a>
 
+ ## Sample instructions to load in TabbyAPI @ 1.5bpw on 3x Nvidia Tesla P100 16gb at 4k context
+ ```JSON
+ {
+ "name": "Machinez_zephyr-orpo-141b-A35b-v0.1_1.5bpw",
+ "max_seq_len": 4096,
+ "override_base_seq_len": 4096,
+ "gpu_split_auto": false,
+ "autosplit_reserve": [
+ 96
+ ],
+ "gpu_split": [
+ 14.15,
+ 14,
+ 15
+ ],
+ "rope_scale": 1,
+ "rope_alpha": 1,
+ "no_flash_attention": false,
+ "cache_mode": "fp16",
+ "prompt_template": "string",
+ "num_experts_per_token": 0,
+ "use_cfg": true,
+ "fasttensors": false,
+ "skip_queue": false
+ }
+ ```
+
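As a rough sanity check on the 1.5bpw split above, the quantized weight size can be estimated from the parameter count and the bits per weight. The sketch below is only an approximation under stated assumptions: it takes the "141b" in the model name as the total parameter count, treats each `gpu_split` entry as gigabytes of VRAM on one GPU, and applies the bpw figure uniformly. The helper names are hypothetical, and the estimate ignores the KV cache, activations, and any layers stored at higher precision.

```python
def estimate_weight_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate quantized weight size in GB (10^9 bytes).

    params_billion * 1e9 weights * bits_per_weight / 8 bits-per-byte / 1e9.
    """
    return params_billion * bits_per_weight / 8


def split_headroom_gb(gpu_split, params_billion, bits_per_weight):
    """GB left in the manual split after the estimated weights are placed.

    Assumption: gpu_split entries are GB of VRAM allotted per GPU.
    """
    return sum(gpu_split) - estimate_weight_gb(params_billion, bits_per_weight)


# Values copied from the 1.5bpw sample config above.
weights = estimate_weight_gb(141, 1.5)
headroom = split_headroom_gb([14.15, 14, 15], 141, 1.5)
print(f"estimated weights: {weights:.1f} GB, headroom in split: {headroom:.1f} GB")
```

Under these assumptions the weights come to about 26.4 GB, which leaves roughly 16.7 GB of the 43.15 GB split for the cache and other overhead at 4k context.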
+ ## Sample instructions to load in TabbyAPI @ 2.75bpw on 4x Nvidia Tesla P100 16gb at 16k context
+ ```JSON
+ {
+ "name": "Machinez_zephyr-orpo-141b-A35b-v0.1_2.75bpw",
+ "max_seq_len": 16384,
+ "override_base_seq_len": 16384,
+ "gpu_split_auto": false,
+ "autosplit_reserve": [
+ 96
+ ],
+ "gpu_split": [
+ 12.5,
+ 13,
+ 13,
+ 16.1
+ ],
+ "rope_scale": 1,
+ "rope_alpha": 1,
+ "no_flash_attention": false,
+ "cache_mode": "fp16",
+ "prompt_template": "string",
+ "num_experts_per_token": 0,
+ "use_cfg": true,
+ "fasttensors": false,
+ "skip_queue": false
+ }
+ ```
+
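The JSON samples above look like request bodies for TabbyAPI's model load call. A minimal sketch of submitting one from Python follows. It assumes a local TabbyAPI instance at `http://127.0.0.1:5000`, a load endpoint at `/v1/model/load`, and admin authentication through an `x-admin-key` header; check all three against your TabbyAPI version. The function name `build_load_request` and the placeholder key are hypothetical, and only a subset of the fields from the 2.75bpw sample is included.

```python
import json
import urllib.request


def build_load_request(base_url: str, admin_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying the load config as JSON.

    Assumption: the server exposes /v1/model/load and reads the admin key
    from the x-admin-key header.
    """
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/model/load",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-admin-key": admin_key},
        method="POST",
    )


# Subset of the 2.75bpw sample config above.
payload = {
    "name": "Machinez_zephyr-orpo-141b-A35b-v0.1_2.75bpw",
    "max_seq_len": 16384,
    "gpu_split_auto": False,
    "gpu_split": [12.5, 13, 13, 16.1],
    "cache_mode": "fp16",
}

request = build_load_request("http://127.0.0.1:5000", "YOUR_ADMIN_KEY", payload)
```

To actually trigger the load, pass `request` to `urllib.request.urlopen`. The request is built separately from sending so the URL, headers, and body can be inspected first.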
  ## Download instructions
 
  With git: