fix hallucination
README.md
CHANGED
@@ -51,8 +51,8 @@ Think step by step. Solve this problem without removing any existing functionali
 | 5* | Qwen2-7b-Instruct bf16 | 78.22 | Average, can think of correct approaches | Sometimes helps generate new ideas | High speed, useful for generating ideas. |
 | 5* | AutoCoder.IQ4_K.gguf (this repo) | 26.43 | Excellent at solutions that require one to few lines of edits | Generates useful short code segments | Try Precise Mode or Balanced Mode. |
 | 7 | GPT-4o-mini | N/A | Decent, but struggles with complex debugging tasks | Reliable for shorter or simpler code generation tasks | Suitable for less complex coding tasks. |
-| 8 | Meta-Llama-3.1-70B-Instruct-IQ2_XS.gguf | 2.55 | Poor,
-| 9 | Trinity-2-Codestral-22B-Q6_K_L | N/A | Poor, similar issues to DeepSeekV2 in outputing the same code |
+| 8 | Meta-Llama-3.1-70B-Instruct-IQ2_XS.gguf | 2.55 | Poor, occasionally helps generate ideas | --- | Speed is a significant limitation. |
+| 9 | Trinity-2-Codestral-22B-Q6_K_L | N/A | Poor, similar issues to DeepSeekV2 in outputing the same code | --- | Similar problem to DeepSeekV2, not recommended for my complex tasks. |
 | 10 | DeepSeekV2 Coder Lite Instruct Q_8L | N/A | Poor, repeats code similar to other models in its family | Not as effective in my context | Not recommended overall based on my criteria. |
 
 