# llama.cpp/example/run

The purpose of this example is to demonstrate a minimal usage of llama.cpp for running models.
```bash
llama-run granite3-moe
```
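The full set of options is documented in the tool's help text, reproduced below; it can be printed with the `-h`/`--help` flag:

```bash
llama-run --help
```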
```bash
Description:
  Runs a llm

Usage:
  llama-run [options] model [prompt]

Options:
  -c, --context-size <value>
      Context size (default: 2048)
  -n, -ngl, --ngl <value>
      Number of GPU layers (default: 0)
  --temp <value>
      Temperature (default: 0.8)
  -v, --verbose, --log-verbose
      Set verbosity level to infinity (i.e. log all messages, useful for debugging)
  -h, --help
      Show help message

Commands:
  model
      Model is a string with an optional prefix of
      huggingface:// (hf://), ollama://, https:// or file://.
      If no protocol is specified and a file exists in the specified
      path, file:// is assumed, otherwise if a file does not exist in
      the specified path, ollama:// is assumed. Models that are being
      pulled are downloaded with .partial extension while being
      downloaded and then renamed as the file without the .partial
      extension when complete.

Examples:
  llama-run llama3
  llama-run ollama://granite-code
  llama-run ollama://smollm:135m
  llama-run hf://QuantFactory/SmolLM-135M-GGUF/SmolLM-135M.Q2_K.gguf
  llama-run huggingface://bartowski/SmolLM-1.7B-Instruct-v0.2-GGUF/SmolLM-1.7B-Instruct-v0.2-IQ3_M.gguf
  llama-run https://example.com/some-file1.gguf
  llama-run some-file2.gguf
  llama-run file://some-file3.gguf
  llama-run --ngl 999 some-file4.gguf
  llama-run --ngl 999 some-file5.gguf Hello World
```
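The options above can be combined in a single invocation. As a sketch (the model reference and prompt here are illustrative, not prescriptive), a run that pulls a model via the ollama:// protocol, offloads all layers to the GPU, enlarges the context window, and lowers the sampling temperature might look like:

```bash
# Pull smollm:135m via ollama:// if it is not already cached, offload all
# layers to the GPU (--ngl 999), use a 4096-token context (-c), and sample
# at a lower temperature (--temp); the trailing argument is the prompt.
llama-run --ngl 999 -c 4096 --temp 0.2 ollama://smollm:135m "Write a haiku about llamas"
```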