Update README.md
README.md
CHANGED
@@ -156,7 +156,17 @@ lm_eval --model hf --model_args pretrained=microsoft/Phi-4-mini-instruct --tasks
 
 ## int8 dynamic activation and int4 weight quantization (8da4w)
 ```
-lm_eval --model hf --model_args pretrained=microsoft/Phi-4-mini-instruct --tasks
+import lm_eval
+from lm_eval import evaluator
+from lm_eval.utils import (
+    make_table,
+)
+
+lm_eval_model = lm_eval.models.huggingface.HFLM(pretrained=quantized_model, batch_size=64)
+results = evaluator.simple_evaluate(
+    lm_eval_model, tasks=["hellaswag"], device="cuda:0", batch_size="auto"
+)
+print(make_table(results))
 ```
 
 | Benchmark | | |
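For readers who want intuition for what "8da4w" means, here is a rough sketch of the arithmetic: weights are quantized offline to int4 with per-group scales, while activations are quantized to int8 dynamically at call time. This is illustrative pure Python with made-up values and group size, not torchao's or ExecuTorch's actual implementation:

```python
# Illustrative sketch of 8da4w: int4 per-group weights (quantized once,
# offline) combined with int8 per-tensor activations (quantized on the fly).

def quantize_symmetric(values, num_bits):
    """Symmetric quantization: scale = max|x| / qmax, q = round(x / scale)."""
    qmax = 2 ** (num_bits - 1) - 1          # 7 for int4, 127 for int8
    scale = max(abs(v) for v in values) / qmax or 1.0
    q = [max(-qmax - 1, min(qmax, round(v / scale))) for v in values]
    return q, scale

def quantize_weights_4bit(weight_row, group_size=4):
    """Per-group int4 quantization of one weight row (the offline '4w' part)."""
    groups = []
    for i in range(0, len(weight_row), group_size):
        q, scale = quantize_symmetric(weight_row[i:i + group_size], num_bits=4)
        groups.append((q, scale))
    return groups

def dynamic_int8_matvec(activation, weight_groups, group_size=4):
    """Quantize the activation to int8 at runtime (the dynamic '8da' part),
    accumulate in integers per group, then dequantize with both scales."""
    a_q, a_scale = quantize_symmetric(activation, num_bits=8)
    total = 0.0
    for g, (w_q, w_scale) in enumerate(weight_groups):
        start = g * group_size
        acc = sum(a_q[start + j] * w_q[j] for j in range(len(w_q)))  # int accumulate
        total += acc * a_scale * w_scale                             # dequantize
    return total

activation = [0.12, -0.5, 0.33, 0.9, -0.27, 0.05, 0.61, -0.44]
weight_row = [0.2, -0.1, 0.05, 0.4, -0.3, 0.25, -0.15, 0.1]

exact = sum(a * w for a, w in zip(activation, weight_row))
approx = dynamic_int8_matvec(activation, quantize_weights_4bit(weight_row))
print(exact, approx)  # the quantized result tracks the exact dot product closely
```

The per-group scales are what keep int4 usable: each small group of weights gets its own scale, so one outlier weight only hurts the precision of its own group.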
@@ -187,7 +197,7 @@ We can run the quantized model on a mobile phone using [ExecuTorch](https://gith
 Once ExecuTorch is [set up](https://pytorch.org/executorch/main/getting-started.html), exporting and running the model on device is a breeze.
 
 We first convert the quantized checkpoint to one that ExecuTorch's LLM export script expects by renaming some of the checkpoint keys.
-The following script does this for you.
+The following script does this for you. We have uploaded phi4-mini-8da4w-converted.bin here for convenience.
 ```
 python -m executorch.examples.models.phi_4_mini.convert_weights pytorch_model.bin phi4-mini-8da4w-converted.bin
 ```
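Under the hood, a conversion step like this is essentially a key-renaming pass over the checkpoint's state dict. The sketch below shows the general shape of such a remapping; the prefix names here are hypothetical, and the real mapping lives in `executorch.examples.models.phi_4_mini.convert_weights`:

```python
# Hypothetical sketch of checkpoint key renaming; the actual key names used by
# convert_weights may differ.
def convert_keys(state_dict, prefix_map):
    """Rename checkpoint keys by first-matching prefix substitution."""
    converted = {}
    for key, tensor in state_dict.items():
        new_key = key
        for old_prefix, new_prefix in prefix_map.items():
            if key.startswith(old_prefix):
                new_key = new_prefix + key[len(old_prefix):]
                break
        converted[new_key] = tensor
    return converted

# Made-up example mapping from HF-style names to export-script-style names.
prefix_map = {
    "model.embed_tokens.": "tok_embeddings.",
    "model.layers.": "layers.",
    "lm_head.": "output.",
}

checkpoint = {
    "model.embed_tokens.weight": "...",
    "model.layers.0.self_attn.q_proj.weight": "...",
    "lm_head.weight": "...",
}
print(convert_keys(checkpoint, prefix_map))
```

The tensors themselves are untouched; only the names change, so the quantized values produced earlier carry through to the exported model unchanged.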