kevin009 commited on
Commit
0187039
·
verified ·
1 Parent(s): a5b3473

Create README.md

Browse files
Files changed (1) hide show
  1. README.md +150 -0
README.md ADDED
@@ -0,0 +1,150 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ ---
2
+ license: apache-2.0
3
+ language:
4
+ - en
5
+ base_model:
6
+ - meta-llama/Llama-3.1-8B-instruct
7
+ pipeline_tag: text-generation
8
+ tags:
9
+ - lora
10
+ - adapter
11
+ - writing
12
+ - CoT
13
+ ---
14
+ # Merged-Llama-Adapters-317-320
15
+
16
+ A merged LoRA adapter combining four fine-tuned adapters (317-320) for the Llama-3.1-8B language model.
17
+
18
+ ## Model Details
19
+
20
+ - Base Model: meta-llama/Llama-3.1-8B-instruct
21
+ - Adaptation Method: Merged LoRA
22
+ - Source Adapters:
23
+ - https://huggingface.co/kevin009/llama317
24
+ - https://huggingface.co/kevin009/llama318
25
+ - https://huggingface.co/kevin009/llama319
26
+ - https://huggingface.co/kevin009/llama320
27
+ - https://huggingface.co/kevin009/llama326
28
+ - https://huggingface.co/kevin009/llama324
29
+
30
+ ## Merger Configuration
31
+
32
+ ### Source Adapters
33
+
34
+ All source adapters share the following configuration:
35
+ - Rank (r): 16
36
+ - Alpha: 16
37
+ - Target Modules:
38
+ - q_proj (Query projection)
39
+ - k_proj (Key projection)
40
+ - v_proj (Value projection)
41
+ - o_proj (Output projection)
42
+ - up_proj (Upsampling projection)
43
+ - down_proj (Downsampling projection)
44
+ - gate_proj (Gate projection)
45
+
46
+ ### Merger Details
47
+
48
+ - Merger Method: Linear interpolation
49
+ - Merger Weights: Equal weights (0.25) for each adapter
50
+ - Combined Rank: 16 (maintained from source adapters)
51
+
52
+ ## Usage
53
+
54
+ This merged adapter must be used with the base Llama-3.1-8B-instruct model.
55
+
56
+ ### Loading the Model
57
+
58
+ ```python
59
+ # Initialize with first F32 model
60
+ peft_model = PeftModel.from_pretrained(model, "llama319", adapter_name="llama319")
61
+
62
+ # Load F32 models (higher precision)
63
+ peft_model.load_adapter("llama324", adapter_name="llama324")
64
+ peft_model.load_adapter("llama320", adapter_name="llama320")
65
+ peft_model.load_adapter("llama318", adapter_name="llama318")
66
+ peft_model.load_adapter("llama317", adapter_name="llama317")
67
+
68
+ adapters = ["llama319", "llama320", "llama317", "llama324", "llama318"]
69
+ weights = [1.0, 1.0, 1.0, 1.0, 1.0]
70
+ peft_model.add_weighted_adapter(adapters, weights, "merge", combination_type="ties", density=0.2)
71
+
72
+ peft_model.set_adapter("merges")
73
+ peft_model.save_pretrained("merged")
74
+ ```
75
+
76
+ ## Limitations and Biases
77
+
78
+ - This merged adapter inherits limitations and biases from:
79
+ - The base Llama-3.1-8B-instruct model
80
+ - All four source adapters
81
+ - The merging process may result in:
82
+ - Potential loss of specialized capabilities from individual adapters
83
+ - Averaged behavior across different adapter specializations
84
+ - Possible interference between adapter weights
85
+
86
+ ## Merging Process
87
+
88
+ The adapters were merged using the following approach:
89
+ 1. Linear interpolation of adapter weights
90
+ 2. Equal weighting (0.25) applied to each source adapter
91
+ 3. Preservation of original LoRA rank and architecture
92
+
93
+ ### Method Used
94
+
95
+ The adapters were merged using PEFT (Parameter-Efficient Fine-Tuning) library's weighted adapter combination feature. The process combines multiple LoRA adapters using linear interpolation with specified weights.
96
+
97
+
98
+ ### Key Parameters
99
+
100
+ - `combination_type="ties"`: Uses the TIES (Task Interference Edge Selection) method for combining adapters
101
+ - `density=0.2`: Controls the sparsity of the merged weights
102
+
103
+
104
+ ### Notes
105
+
106
+ - The order of loading adapters may affect the final result
107
+ - Equal weights were chosen to maintain balanced influence from each adapter
108
+ - The merged adapter maintains the same architecture and rank as the original adapters
109
+ - While this adapter merges multiple fine-tunes, each component was developed as part of independent research efforts to explore and language model capabilities as part of R&D process.
110
+
111
+
112
+ ## Datasets
113
+
114
+ - Not yet released, but should be released after evaluation has completed.
115
+ - Creating dataset alone tooks more than 3 month for creating 30k pairs dataset.
116
+ - Only 1k pairs example considered to be synthetic dataset, the rest half synthetic and human written text.
117
+
118
+ ### Use Cases
119
+
120
+ - This merged adapter can be used for a wide range of tasks, including but not limited to:
121
+ - Accessibility
122
+ - Revision & Editing
123
+ - instruction-following use with xml tags
124
+ - Thinking & reasoning with xml tag of <thinking> and </thinking>, if being asked i the instructions.
125
+
126
+
127
+ These Models not optimized for code, math, or other specialized tasks that need Perefence Optimization.
128
+
129
+ ## Why SFT Instead of RLHF/DPO?
130
+ - RLHF and DPO approaches often lead to vocabulary limitations and overfitting due to their optimization objectives
131
+
132
+
133
+ ## Why Multiple Adapters?
134
+ - Resource Issue: Placing the training into smaller adapters requires less GPU memory and compute time while gives more control over the training process.
135
+ - Iterative Development: Each adapter can be developed and tested independently
136
+ - Training Infrastructure: The complete fine-tuning process was conducted across multiple sessions, totaling over 100 hours on high-end GPUs (H100, H200, or L40s)
137
+ - Flexibility: Multiple adapters allow for different combinations or weightings
138
+
139
+
140
+ ## License
141
+
142
+ Licensed under Apache 2.0 License.
143
+
144
+ This merged adapter is part of independent individual research work. While the code is open-source under the Apache 2.0 license, please note:
145
+
146
+ - You are free to use, modify, and distribute this adapter following the Apache 2.0 license terms
147
+ - This work is provided "as is" without warranties or conditions of any kind
148
+ - This is an independent research project and not affiliated with any organization
149
+ - Attribution is appreciated but not required
150
+ - For full license details, see: https://www.apache.org/licenses/LICENSE-2.0