Update README.md
README.md
CHANGED
@@ -19,8 +19,8 @@ library_name: transformers
 - **Input:** Vision-Text
 - **Output:** Text
 - **Model Optimizations:**
-- **Weight quantization:**
-- **Activation quantization:**
+- **Weight quantization:** FP8
+- **Activation quantization:** FP8
 - **Release Date:** 2/24/2025
 - **Version:** 1.0
 - **Model Developers:** Neural Magic
@@ -29,7 +29,7 @@ Quantized version of [Qwen/Qwen2-VL-72B-Instruct](https://huggingface.co/Qwen/Qw
 
 ### Model Optimizations
 
-This model was obtained by quantizing the weights of [Qwen/Qwen2-VL-72B-Instruct](https://huggingface.co/Qwen/Qwen2-VL-72B-Instruct) to
+This model was obtained by quantizing the weights of [Qwen/Qwen2-VL-72B-Instruct](https://huggingface.co/Qwen/Qwen2-VL-72B-Instruct) to FP8 data type, ready for inference with vLLM >= 0.5.2.
 
 ## Deployment
 