Spaces: AstroMLab/AstroSage-8B

Tijmen2 committed on Nov 20, 2024

Commit 839a5ef (verified) · 1 parent: 086415c

Update app.py

Files changed (1)

app.py CHANGED

@@ -15,6 +15,10 @@ llm = Llama(
     chat_format="llama-3",
     n_gpu_layers=-1,  # ensure all layers are on GPU
     n_threads=1, # no CPU multi-threading
+    offload_kqv=True, # store kqv on GPU
+    vocab_only=False,
+    use_mmap=True,
+    use_mlock=False,
 )
 # Placeholder responses for when context is empty
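
For readers unfamiliar with these options, the following is a minimal sketch of what the `Llama(...)` call looks like after this commit, written as a plain dict of keyword arguments so it can be inspected without loading a model. The helper `build_llama_kwargs` and the file name `model.gguf` are hypothetical; the diff does not show which model file app.py loads or any arguments above the hunk. The comments describe each option as documented in llama-cpp-python. As far as I know, the four added values match the library defaults in recent releases, so the commit likely makes existing behavior explicit rather than changing it.

```python
from typing import Any, Dict


def build_llama_kwargs(model_path: str) -> Dict[str, Any]:
    """Collect the keyword arguments visible in the diff into one dict.

    `model_path` is a hypothetical placeholder: the hunk does not show
    how app.py locates its model file.
    """
    return {
        "model_path": model_path,  # assumption: not visible in the diff
        "chat_format": "llama-3",
        "n_gpu_layers": -1,        # -1 requests offloading every layer to the GPU
        "n_threads": 1,            # a single CPU thread
        # Options added by this commit:
        "offload_kqv": True,       # keep the KV cache and K/Q/V ops on the GPU
        "vocab_only": False,       # load the weights, not only the vocabulary
        "use_mmap": True,          # memory-map the model file where supported
        "use_mlock": False,        # do not lock the model's pages in RAM
    }


kwargs = build_llama_kwargs("model.gguf")  # hypothetical file name

# With llama-cpp-python installed and a real GGUF file on disk:
#     from llama_cpp import Llama
#     llm = Llama(**kwargs)
```

Keeping the arguments in a dict is only a convenience for this sketch; the commit itself passes them directly to the constructor.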