Commit 750e02a by aberrio (1 parent: b0c74f0)

docs: Update Refact 1.6B FIM GGUF Documentation


- Add Acknowledgments section for tensor conversion
- Include example shell command for testing against Hugging Face
- Resolve llama.cpp issue #3061

This commit updates the documentation for the Refact 1.6B FIM GGUF model, adding an Acknowledgments section and an additional usage example, and records the resolution of llama.cpp issue #3061.

Files changed (1): README.md ADDED (+110, -0)
---
pipeline_tag: text-generation
inference: true
widget:
- text: 'def print_hello_world():'
  example_title: Hello world
  group: Python
license: bigscience-openrail-m
pretrain-datasets:
- books
- arxiv
- c4
- falcon-refinedweb
- wiki
- github-issues
- stack_markdown
- self-made dataset of permissive github code
datasets:
- bigcode/the-stack-dedup
- rombodawg/2XUNCENSORED_MegaCodeTraining188k
- bigcode/commitpackft
library_name: llama.cpp
tags:
- code
language:
- en
---

# Refact 1.6B FIM GGUF

## Introduction

The Refact 1.6B FIM GGUF model is a state-of-the-art AI-powered coding assistant developed by Small Magellanic Cloud AI Ltd. This model is designed to assist developers with code completion, refactoring, and chat-based interactions, excelling in code-related natural language understanding and generation tasks.

## Quantized Model Files

The model comes in various quantized versions to suit different computational needs; a quick run sketch follows the list:

- **refact-1.6B-fim-q4_0.gguf**: A 4-bit quantized model with a file size of 878 MB.
- **refact-1.6B-fim-q5_0.gguf**: A 5-bit quantized model with a file size of 1.1 GB.
- **refact-1.6B-fim-q8_0.gguf**: An 8-bit quantized model with a file size of 1.6 GB.

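As a quick smoke test, any of these files can be run directly with llama.cpp's `main` example. A minimal sketch, assuming the `.gguf` file has already been downloaded into the current directory (the path and quantization level are illustrative):

```sh
# Quick check that a downloaded quant loads and generates; point -m at
# wherever the .gguf file actually lives.
./main -m ./refact-1.6B-fim-q4_0.gguf -n 128 \
  -p "write a function to multiply two integers in python"
```

As a rule of thumb, q4_0 is the smallest and fastest of the three, while q8_0 stays closest to the original f16 weights at roughly twice the size.
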
## Features and Usage

The model is versatile and can be employed for:

- Code completion (via fill-in-the-middle; see the sketch after this list)
- Code refactoring
- Chat-based interactions

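For completion inside existing code, fill-in-the-middle is driven by special tokens in the prompt. The sketch below assumes StarCoder-style FIM tokens (`<fim_prefix>`, `<fim_suffix>`, `<fim_middle>`), the format shown on the upstream Refact model card; verify against the tokenizer config if completions look wrong:

```sh
# Fill-in-the-middle: the model generates the code between the prefix and
# the suffix. The -e flag tells llama.cpp to expand \n escapes in the prompt.
./main -m ./refact-1.6B-fim-q4_0.gguf -n 64 -e \
  -p '<fim_prefix>def multiply(a, b):\n<fim_suffix>\n    return a * b\n<fim_middle>'
```
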
### Example Usage

Here's a sample shell command to invoke the model:

```sh
# Sample shell command to use the model (f16 conversion).
# With --top-k 1 decoding is effectively greedy, so the remaining
# sampling flags are neutral here.
./main -m models/smallcloudai/Refact-1_6B-fim/ggml-model-f16.gguf -n 300 -p "write a function to multiply two integers in python" --temp 1.0 --top-p 1.0 --top-k 1 --repeat_penalty 1.0
```

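For interactive or editor-driven use, llama.cpp also ships a `server` example that exposes the model over a local HTTP endpoint. A minimal sketch; the port and context size below are arbitrary choices, not values from this card:

```sh
# Serve the model locally; llama.cpp's server example provides a simple
# completion API (and a small web UI) on the chosen port.
./server -m ./refact-1.6B-fim-q8_0.gguf -c 2048 --port 8080
```
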
## Performance Metrics

The model outperforms many existing models in both code completion and chat-based interactions, as evidenced by the HumanEval results below.

| Model              | Size | HumanEval pass@1 | HumanEval pass@10 |
|--------------------|------|------------------|-------------------|
| **Refact-1.6-fim** | 1.6B | 32.0%            | 53.0%             |
| StableCode         | 3B   | 20.2%            | 33.8%             |
| ReplitCode v1      | 3B   | 21.9%            | N/A               |

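For reference, pass@k on HumanEval is conventionally reported with the unbiased estimator from the benchmark's original paper (whether these exact figures were computed that way is an assumption): for n samples per problem of which c pass the unit tests,

$$
\text{pass@}k = \mathbb{E}_{\text{problems}}\left[ 1 - \frac{\binom{n-c}{k}}{\binom{n}{k}} \right]
$$

so pass@1 reduces to the average fraction of single samples that pass.
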
## Installation and Setup

The model can be integrated into your IDE via the [Refact plugin](https://refact.ai/). For self-hosting, an [open-source Docker container](https://github.com/smallcloudai/refact) is available.

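A hedged sketch of starting the self-hosting server with Docker; the image name, port, and volume below follow the smallcloudai/refact repository's instructions at the time of writing and should be checked against its README:

```sh
# Launch the Refact self-hosting container (image name, port, and volume
# are assumptions -- verify against the smallcloudai/refact README).
docker run -d --gpus all -p 8008:8008 \
  -v refact-perm-storage:/perm_storage \
  smallcloud/refact_self_hosting
```
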
## Limitations and Bias

The model primarily focuses on English text, which may result in lower performance for non-English languages.

## Technical Specifications

- **Architecture**: LLaMA-like model with multi-query attention
- **Training Tokens**: 1.2T for pretraining, 40B for fine-tuning
- **Precision**: bfloat16
- **Training Time**: 28 days

## License

The model is licensed under the BigScience OpenRAIL-M v1 license agreement.

## Citation

If you use this model in your work, please cite it by linking back to the following page for proper attribution:

[Refact 1.6B FIM Model](https://huggingface.co/smallcloudai/Refact-1_6B-fim)

## Acknowledgments

Special thanks to [ds5t5](https://github.com/ggerganov/llama.cpp/pull/3329) for implementing the conversion of this model's tensors from the Hugging Face format to GGUF. Their work has been instrumental in enhancing the model's versatility.

### Example Command for Testing

To convert the original Hugging Face checkpoint to GGUF and verify the result, you can use the following commands:

```sh
# Convert the Hugging Face checkpoint to an f16 GGUF file
# (conversion script from llama.cpp PR #3329).
python3 convert-refact-hf-to-gguf.py ./Refact-1_6B-fim 1

# Run the converted model with llama.cpp to check its output.
./main -m ./Refact-1_6B-fim/ggml-model-f16.gguf -n 300 -p "write a function to multiply two integers in python" --temp 1.0 --top-p 1.0 --top-k 1 --repeat_penalty 1.0
```

This resolves llama.cpp issue [#3061](https://github.com/ggerganov/llama.cpp/issues/3061).