Christoph Holthaus committed
Commit 4a70f4c · 1 Parent(s): 42e10de
Files changed (1): README.md (+14 -1)

README.md CHANGED
@@ -11,4 +11,17 @@ license: apache-2.0
  Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference


- This is a test ...
+ This is a test ...
+
+ TASKS:
+ - rewrite generation from scratch, or reuse the Mistral space's implementation if possible; alternatively use https://github.com/abetlen/llama-cpp-python#chat-completion
+ - state IN LARGE LETTERS that this is not the original model but a quantized one that can run on free CPU inference
+ - check how much parallel generation is possible (or whether there is only one queue) and set max context etc. accordingly. Maybe live-log free RAM etc. to the interface on a "system health" graph if that is not too resource-hungry on its own ... -> Gradio for display??
+ - live-stream the response (see the Mistral space!!)
+ - log memory usage to the console? Maybe auto-reboot if memory gets too slim
+ - re-add a system prompt? Maybe check how LM Studio sets this up - could be a dropdown, with an option to fix one via env var when only one model is available ...
+ - move the model download URL into an env var, with proper error handling
+ - chore: clean up ignore files, Dockerfile etc.
+ - update all deps to a current version, then PIN them!
+ - add a short note on how to clone and run custom 7B models in separate spaces
+ - open a PR for popular repos to include this in their README etc.
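
The llama-cpp-python chat-completion task above could be sketched roughly as follows. This is a minimal sketch, not the Space's actual code: the `Llama` usage in the comment and the helper names are assumptions; only the message-building helper is concrete, so it can be exercised without a model file.

```python
def build_messages(history, user_msg, system_prompt=None):
    """Turn (user, assistant) history pairs into the OpenAI-style
    messages list that llama-cpp-python's create_chat_completion expects."""
    messages = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    for user, assistant in history:
        messages.append({"role": "user", "content": user})
        messages.append({"role": "assistant", "content": assistant})
    messages.append({"role": "user", "content": user_msg})
    return messages


def chat(llm, history, user_msg, system_prompt=None):
    # llm would be a llama_cpp.Llama instance, e.g. (illustrative only):
    #   Llama(model_path="model.gguf", n_ctx=2048)
    resp = llm.create_chat_completion(
        messages=build_messages(history, user_msg, system_prompt)
    )
    return resp["choices"][0]["message"]["content"]
```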
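
For the live-streaming task, Gradio chat interfaces stream output when the handler is a generator that yields the growing partial response. A minimal sketch of that accumulation pattern, assuming some token iterator as input:

```python
def stream_response(token_iter):
    """Accumulate tokens and yield the growing partial text each step,
    which is the shape a Gradio streaming chat handler yields."""
    partial = ""
    for token in token_iter:
        partial += token
        yield partial
```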
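
The memory-logging / auto-reboot task could be sketched like this, assuming a Linux container (as on free CPU Spaces) where `/proc/meminfo` is readable; the 256 MB threshold is an arbitrary placeholder:

```python
import sys


def mem_available_mb():
    """Read MemAvailable from /proc/meminfo (Linux only); returns None
    on other platforms so callers can skip the check."""
    try:
        with open("/proc/meminfo") as f:
            for line in f:
                if line.startswith("MemAvailable:"):
                    return int(line.split()[1]) // 1024  # kB -> MB
    except OSError:
        return None
    return None


def exit_if_low_memory(threshold_mb=256):
    """Exit non-zero when memory runs low; the Space's container restart
    then doubles as the 'auto reboot' mentioned in the task list."""
    avail = mem_available_mb()
    if avail is not None and avail < threshold_mb:
        print(f"low memory ({avail} MB), exiting for restart", file=sys.stderr)
        sys.exit(1)
```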
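
Moving the model download URL into an env var with proper error handling might look like this sketch; the variable name `MODEL_URL` is an assumption, and failing fast with a clear message replaces the hardcoded path:

```python
import os
import sys


def model_url_from_env(var="MODEL_URL"):
    """Read the model download URL from an env var instead of hardcoding it,
    exiting early with a clear message when it is missing or malformed."""
    url = os.environ.get(var, "").strip()
    if not url:
        sys.exit(f"error: {var} is not set; point it at a GGUF download URL")
    if not url.startswith(("http://", "https://")):
        sys.exit(f"error: {var}={url!r} does not look like a URL")
    return url
```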