yuanzu/DeepSeek-R1-INT8

Text Generation

8-bit precision

yuanzu committed 28 days ago

Commit 48de0c9 · verified · 1 Parent(s): 3eff130

Update README.md

Files changed (1)

README.md +4 -0

@@ -46,6 +46,10 @@ library_name: transformers
   <a href="https://github.com/deepseek-ai/DeepSeek-R1/blob/main/DeepSeek_R1.pdf"><b>Paper Link</b>👁️</a>
 </p>
 ## 1. Introduction

   <a href="https://github.com/deepseek-ai/DeepSeek-R1/blob/main/DeepSeek_R1.pdf"><b>Paper Link</b>👁️</a>
 </p>
+## 0. INT8 Quantization
+We apply INT8 quantization to the BF16 checkpoints, where each weight scale is determined by dividing the block-wise maximum of element values by the INT8 type maximum.
+The quantization script is provided in `inference/bf16_case_int8.py`.
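
As a rough illustration of the scale rule described above, here is a minimal NumPy sketch. It is not the repository's script: the function names, the 128x128 block size, the use of the absolute maximum within each block, and the round-to-nearest step are all assumptions made for this example.

```python
import numpy as np

INT8_MAX = 127  # largest value representable in signed INT8


def quantize_blockwise_int8(weight, block=128):
    """Quantize a 2-D float array to INT8 with one scale per block x block tile.

    Assumptions (not taken from the source): block size of 128, absolute
    maximum as the "block-wise maximum", and round-to-nearest.
    """
    rows, cols = weight.shape
    n_br = -(-rows // block)  # ceiling division: number of row blocks
    n_bc = -(-cols // block)  # ceiling division: number of column blocks
    q = np.empty((rows, cols), dtype=np.int8)
    scales = np.empty((n_br, n_bc), dtype=np.float32)
    for i in range(n_br):
        for j in range(n_bc):
            r = slice(i * block, (i + 1) * block)
            c = slice(j * block, (j + 1) * block)
            tile = weight[r, c].astype(np.float32)
            amax = np.abs(tile).max()
            # scale = block-wise maximum / INT8 type maximum
            scale = amax / INT8_MAX if amax > 0 else 1.0
            scales[i, j] = scale
            q[r, c] = np.clip(np.rint(tile / scale), -INT8_MAX, INT8_MAX).astype(np.int8)
    return q, scales


def dequantize_blockwise_int8(q, scales, block=128):
    """Recover an approximate float array by multiplying each tile by its scale."""
    rows, cols = q.shape
    expanded = np.repeat(np.repeat(scales, block, axis=0), block, axis=1)[:rows, :cols]
    return q.astype(np.float32) * expanded
```

With one scale per tile, the round-trip error of each element is bounded by half of that tile's scale, so a single outlier only degrades precision within its own block rather than across the whole tensor.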
 ## 1. Introduction