Update README.md
--- a/README.md
+++ b/README.md
@@ -26,9 +26,10 @@ We provide the [quantized pte](https://huggingface.co/pytorch/Qwen3-4B-8da4w/blo
 
 # Running in a mobile app
 The [pte file](https://huggingface.co/pytorch/Qwen3-4B-8da4w/blob/main/qwen3-4B-8da4w-1024-cxt.pte) can be run with ExecuTorch on a mobile phone. See the [instructions](https://pytorch.org/executorch/main/llm/llama-demo-ios.html) for doing this in iOS.
-On iPhone 15 Pro, the model runs at
+On iPhone 15 Pro, the model runs at 14.8 tokens/sec and uses 3379 Mb of memory.
+
+
 
-[TODO: ADD SCREENSHOT]
 
 # Quantization Recipe
 