Update README.md
README.md
CHANGED
@@ -80,7 +80,6 @@ See details at [https://huggingface.co/microsoft/Phi-4-mini-instruct/blob/main/R
 ## Performance Comparison
 |Hardware | ONNX | PyTorch | speedup |
 |-------|----------|------|---------|
-| A100 GPU | fp16 = 136.359 tokens/sec int4-rtn = 233.036 tokens/sec | fp16: 34.200 tokens/sec fp32: 37.899 tokens/sec | 4X (fp16), 6X(int4-rtn) |
 | RTX 4090 GPU | int4: 260.045 tokens/sec fp16: 97.463 tokens/se fp32: 19.320 tokens/sec | fp16: 43.957 tokens/sec | 5x(fp16) |
 | Intel Xeon Platinum 8272CL CPU | int4: 16.89 tokens/sec | fp32: 1.636 tokens/sec | 10x |
 | Intel Xeon Platinum 8573B CPU | int4: 23.978 tokens/sec | fp32: 4.479 tokens/sec | 5.35X |
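
The README does not say how the `speedup` column is derived. For the two CPU rows, the values are consistent with dividing ONNX throughput by PyTorch throughput. The sketch below checks that arithmetic under this assumption; the `speedup` helper is hypothetical and not part of the repository.

```python
def speedup(onnx_tps: float, pytorch_tps: float) -> float:
    """Return ONNX throughput divided by PyTorch throughput (both in tokens/sec)."""
    if pytorch_tps <= 0:
        raise ValueError("PyTorch throughput must be positive")
    return onnx_tps / pytorch_tps


# CPU rows copied from the table: (ONNX int4, PyTorch fp32) in tokens/sec.
cpu_rows = {
    "Intel Xeon Platinum 8272CL CPU": (16.89, 1.636),
    "Intel Xeon Platinum 8573B CPU": (23.978, 4.479),
}

for hardware, (onnx_tps, pytorch_tps) in cpu_rows.items():
    # Assumed definition of the speedup column: a plain throughput ratio.
    print(f"{hardware}: {speedup(onnx_tps, pytorch_tps):.2f}x")
```

Under that assumption the ratios come out at roughly 10.32 and 5.35, which matches the `10x` and `5.35X` entries. The GPU rows mix several precisions per cell, so which pair each GPU speedup refers to is not recoverable from the table alone.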