smorce/Qwen2.5-Coder-32B-Instruct-karakuri-thinking-slerp-AWQ

Tags: Text Generation · text-generation-inference · Inference Endpoints · 4-bit precision


karakuri-lm-32b-thinking-2501-exp-AWQ

This model is a merge of karakuri-lm-32b-thinking-2501-exp, released by KARAKURI, and Qwen2.5-Coder-32B-Instruct, released by the Qwen team, which was then quantized to 4 bits with AWQ.

The calibration dataset was izumi-lab/wikipedia-ja-20230720.
Note: it was not TFMC/imatrix-dataset-for-japanese-llm.

The pre-quantization model and its merge settings are available here:
Qwen2.5-Coder-32B-Instruct-karakuri-thinking-slerp
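
The "slerp" in the model name refers to spherical linear interpolation. The actual merge settings live on the linked model page and are not reproduced here, so the following is only a minimal sketch of the interpolation itself on two small vectors. The function name `slerp`, the interpolation factor `t = 0.5`, and the toy vectors are assumptions for illustration; merge tools typically apply the same idea to each weight tensor.

```python
import math

def slerp(t, v0, v1, eps=1e-6):
    """Spherical linear interpolation between two vectors (illustrative sketch)."""
    norm0 = math.sqrt(sum(x * x for x in v0))
    norm1 = math.sqrt(sum(x * x for x in v1))
    # Degenerate input: fall back to plain linear interpolation.
    if norm0 < eps or norm1 < eps:
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    # Angle between the two directions, clamped for numerical safety.
    dot = sum(a * b for a, b in zip(v0, v1)) / (norm0 * norm1)
    dot = max(-1.0, min(1.0, dot))
    theta = math.acos(dot)
    # Nearly parallel vectors: linear interpolation is accurate enough.
    if abs(math.sin(theta)) < eps:
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    s0 = math.sin((1 - t) * theta) / math.sin(theta)
    s1 = math.sin(t * theta) / math.sin(theta)
    return [s0 * a + s1 * b for a, b in zip(v0, v1)]

# Hypothetical toy "weights" standing in for one tensor from each model.
weights_a = [1.0, 0.0]
weights_b = [0.0, 1.0]
merged = slerp(0.5, weights_a, weights_b)
print(merged)
```

Unlike a plain average, the midpoint of two orthogonal unit vectors keeps unit length here, which is the usual argument for preferring slerp when merging weights.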

Purpose

This model was created to add coding ability to a Japanese reasoning model.

Quantization settings

quant_config = {
  "zero_point": True,
  "q_group_size": 128,
  "w_bit": 4,
  "version": "GEMM"
}
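
To make the settings above concrete, here is a minimal pure-Python sketch of group-wise asymmetric quantization: weights are split into groups of `q_group_size`, and each group gets its own scale and zero point for `w_bit`-bit integers. This shows only the plain rounding step. It does not include AWQ's activation-aware scaling or the GEMM kernel packing, and the helper names and the random test row are assumptions for illustration.

```python
import random

quant_config = {"zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM"}

def quantize_group(weights, w_bit):
    """Asymmetric (zero-point) quantization of one group of floats."""
    q_max = 2 ** w_bit - 1                  # 15 for 4-bit
    w_min, w_max = min(weights), max(weights)
    scale = (w_max - w_min) / q_max
    if scale == 0.0:                        # constant group: avoid division by zero
        scale = 1.0
    zero = max(0, min(q_max, round(-w_min / scale)))
    q = [max(0, min(q_max, round(w / scale) + zero)) for w in weights]
    return q, scale, zero

def dequantize_group(q, scale, zero):
    """Map the integer codes back to approximate float weights."""
    return [(v - zero) * scale for v in q]

def quantize_row(row, group_size, w_bit):
    """Quantize a row of weights group by group."""
    return [
        quantize_group(row[start:start + group_size], w_bit)
        for start in range(0, len(row), group_size)
    ]

# Hypothetical weight row: 256 values, i.e. two groups of 128.
random.seed(0)
row = [random.uniform(-1, 1) for _ in range(256)]
groups = quantize_row(row, quant_config["q_group_size"], quant_config["w_bit"])

restored = [w for q, scale, zero in groups for w in dequantize_group(q, scale, zero)]
max_error = max(abs(a - b) for a, b in zip(row, restored))
print(f"groups: {len(groups)}, max reconstruction error: {max_error:.4f}")
```

A smaller group size gives each scale fewer weights to cover and so lowers the rounding error, at the cost of storing more scales and zero points.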

This model collapsed, so the attempt was a failure.


Safetensors

Model size: 5.73B params
Tensor type: I32 · BF16 · FP16


Model tree for smorce/Qwen2.5-Coder-32B-Instruct-karakuri-thinking-slerp-AWQ

Base model: smorce/Qwen2.5-Coder-32B-Instruct-karakuri-thinking-slerp
Quantized (1): this model
