shubhrapandit committed on
Commit d7486e2 · verified · 1 Parent(s): 322e34f

Update README.md

Files changed (1)
  1. README.md +63 -1
README.md CHANGED
@@ -345,6 +345,37 @@ The following performance benchmarks were conducted with [vLLM](https://docs.vll
  </tr>
  </thead>
  <tbody style="text-align: center">
  <tr>
  <th rowspan="3" valign="top">A100x1</th>
  <th>Qwen/Qwen2.5-VL-7B-Instruct</th>
@@ -439,9 +470,40 @@ The following performance benchmarks were conducted with [vLLM](https://docs.vll
  </tr>
  </thead>
  <tbody style="text-align: center">
  <tr>
  <th rowspan="3" valign="top">A100x1</th>
- <th>Qwen/Qwen2.5-VL-7B-Instruct-quantized.</th>
  <td></td>
  <td>0.7</td>
  <td>1347</td>
 
  </tr>
  </thead>
  <tbody style="text-align: center">
+ <tr>
+ <th rowspan="3" valign="top">A6000x1</th>
+ <th>Qwen/Qwen2.5-VL-7B-Instruct</th>
+ <td></td>
+ <td>4.9</td>
+ <td>912</td>
+ <td>3.2</td>
+ <td>1386</td>
+ <td>3.1</td>
+ <td>1431</td>
+ </tr>
+ <tr>
+ <th>neuralmagic/Qwen2.5-VL-7B-Instruct-quantized.w8a8</th>
+ <td>1.50</td>
+ <td>3.6</td>
+ <td>1248</td>
+ <td>2.1</td>
+ <td>2163</td>
+ <td>2.0</td>
+ <td>2237</td>
+ </tr>
+ <tr>
+ <th>neuralmagic/Qwen2.5-VL-7B-Instruct-quantized.w4a16</th>
+ <td>2.05</td>
+ <td>3.3</td>
+ <td>1351</td>
+ <td>1.4</td>
+ <td>3252</td>
+ <td>1.4</td>
+ <td>3321</td>
+ </tr>
  <tr>
  <th rowspan="3" valign="top">A100x1</th>
  <th>Qwen/Qwen2.5-VL-7B-Instruct</th>
 
  </tr>
  </thead>
  <tbody style="text-align: center">
+ <tr>
+ <th rowspan="3" valign="top">A6000x1</th>
+ <th>Qwen/Qwen2.5-VL-7B-Instruct</th>
+ <td></td>
+ <td>0.4</td>
+ <td>1837</td>
+ <td>1.5</td>
+ <td>6846</td>
+ <td>1.7</td>
+ <td>7638</td>
+ </tr>
+ <tr>
+ <th>neuralmagic/Qwen2.5-VL-7B-Instruct-quantized.w8a8</th>
+ <td>1.41</td>
+ <td>0.5</td>
+ <td>2297</td>
+ <td>2.3</td>
+ <td>10137</td>
+ <td>2.5</td>
+ <td>11472</td>
+ </tr>
+ <tr>
+ <th>neuralmagic/Qwen2.5-VL-7B-Instruct-quantized.w4a16</th>
+ <td>1.60</td>
+ <td>0.4</td>
+ <td>1828</td>
+ <td>2.7</td>
+ <td>12254</td>
+ <td>3.4</td>
+ <td>15477</td>
+ </tr>
  <tr>
  <th rowspan="3" valign="top">A100x1</th>
+ <th>Qwen/Qwen2.5-VL-7B-Instruct</th>
  <td></td>
  <td>0.7</td>
  <td>1347</td>