shubhrapandit committed on
Commit ac009f1 · verified · 1 Parent(s): db3d325

Update README.md

Files changed (1)
  1. README.md +3 -3
README.md CHANGED
@@ -19,8 +19,8 @@ library_name: transformers
  - **Input:** Vision-Text
  - **Output:** Text
  - **Model Optimizations:**
- - **Weight quantization:** INT4
- - **Activation quantization:** FP16
+ - **Weight quantization:** FP8
+ - **Activation quantization:** FP8
  - **Release Date:** 2/24/2025
  - **Version:** 1.0
  - **Model Developers:** Neural Magic
@@ -29,7 +29,7 @@ Quantized version of [Qwen/Qwen2-VL-72B-Instruct](https://huggingface.co/Qwen/Qw
 
  ### Model Optimizations
 
- This model was obtained by quantizing the weights of [Qwen/Qwen2-VL-72B-Instruct](https://huggingface.co/Qwen/Qwen2-VL-72B-Instruct) to INT8 data type, ready for inference with vLLM >= 0.5.2.
+ This model was obtained by quantizing the weights of [Qwen/Qwen2-VL-72B-Instruct](https://huggingface.co/Qwen/Qwen2-VL-72B-Instruct) to FP8 data type, ready for inference with vLLM >= 0.5.2.
 
  ## Deployment
 