Changing response length
app.py CHANGED
```diff
@@ -16,7 +16,7 @@ llm = LlamaCPP(
     # optionally, you can set the path to a pre-downloaded model instead of model_url
     model_path=None,
     temperature=0.01,
-    max_new_tokens=
+    max_new_tokens=256,  # could be larger but requires more time
     # llama2 has a context window of 4096 tokens, but we set it lower to allow for some wiggle room
     context_window=3900,
     # kwargs to pass to __call__()
```
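For context, a minimal sketch of the surrounding `LlamaCPP` constructor that this hunk edits, following the llama_index LlamaCPP example. The `model_url` and the `generate_kwargs`/`model_kwargs`/prompt-helper values are illustrative assumptions, not necessarily what app.py uses; older llama_index releases import these as `from llama_index.llms import LlamaCPP` instead of the paths shown.

```python
from llama_index.llms.llama_cpp import LlamaCPP
from llama_index.llms.llama_cpp.llama_utils import (
    messages_to_prompt,
    completion_to_prompt,
)

llm = LlamaCPP(
    # hypothetical GGUF download URL; substitute the model app.py actually loads
    model_url="https://huggingface.co/TheBloke/Llama-2-13B-chat-GGUF/resolve/main/llama-2-13b-chat.Q4_0.gguf",
    # optionally, you can set the path to a pre-downloaded model instead of model_url
    model_path=None,
    temperature=0.01,
    max_new_tokens=256,  # could be larger but requires more time
    # llama2 has a context window of 4096 tokens, but we set it lower to allow for some wiggle room
    context_window=3900,
    # kwargs to pass to __call__()
    generate_kwargs={},
    # kwargs to pass to model init; n_gpu_layers offloads layers to GPU if one is available
    model_kwargs={"n_gpu_layers": 1},
    # convert chat messages and completions into the llama2 prompt format
    messages_to_prompt=messages_to_prompt,
    completion_to_prompt=completion_to_prompt,
    verbose=True,
)
```

Raising `max_new_tokens` allows longer responses at the cost of generation time, and the output budget plus the prompt must still fit within `context_window`.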