parinitarahi committed on
Commit 5c6b624 · verified · 1 Parent(s): eb74a42

Update README.md

  1. README.md +0 -1
README.md CHANGED
@@ -80,7 +80,6 @@ See details at [https://huggingface.co/microsoft/Phi-4-mini-instruct/blob/main/R
  ## Performance Comparison
  |Hardware | ONNX | PyTorch | speedup |
  |-------|----------|------|---------|
- | A100 GPU | fp16 = 136.359 tokens/sec int4-rtn = 233.036 tokens/sec | fp16: 34.200 tokens/sec fp32: 37.899 tokens/sec | 4X (fp16), 6X(int4-rtn) |
  | RTX 4090 GPU | int4: 260.045 tokens/sec fp16: 97.463 tokens/se fp32: 19.320 tokens/sec | fp16: 43.957 tokens/sec | 5x(fp16) |
  | Intel Xeon Platinum 8272CL CPU | int4: 16.89 tokens/sec | fp32: 1.636 tokens/sec | 10x |
  | Intel Xeon Platinum 8573B CPU | int4: 23.978 tokens/sec | fp32: 4.479 tokens/sec | 5.35X |
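
The README does not say how the speedup column was derived. For the two CPU rows, the figures are consistent with dividing ONNX throughput by PyTorch throughput. A minimal sketch under that assumption follows; the `speedup` helper and the `cpu_rows` dictionary are hypothetical names used only for illustration, and the throughput numbers are copied from the table.

```python
def speedup(onnx_tokens_per_sec: float, pytorch_tokens_per_sec: float) -> float:
    """Return ONNX throughput divided by PyTorch throughput.

    Assumption: this ratio is what the table's "speedup" column reports.
    """
    if pytorch_tokens_per_sec <= 0:
        raise ValueError("PyTorch throughput must be positive")
    return onnx_tokens_per_sec / pytorch_tokens_per_sec


# (ONNX int4 tokens/sec, PyTorch fp32 tokens/sec), taken from the CPU rows.
cpu_rows = {
    "Intel Xeon Platinum 8272CL CPU": (16.89, 1.636),
    "Intel Xeon Platinum 8573B CPU": (23.978, 4.479),
}

for hardware, (onnx_tps, pytorch_tps) in cpu_rows.items():
    print(f"{hardware}: {speedup(onnx_tps, pytorch_tps):.2f}x")
# Intel Xeon Platinum 8272CL CPU: 10.32x
# Intel Xeon Platinum 8573B CPU: 5.35x
```

The results line up with the table's rounded 10x and 5.35X. The GPU rows mix several precisions per cell, so which pair of numbers feeds their speedup figures is not clear from the table alone.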