Update app.py
app.py
CHANGED
@@ -61,7 +61,7 @@ device = "cuda" # for GPU usage or "cpu" for CPU usage
 quantization_config = BitsAndBytesConfig(
     load_in_8bit=True, # Use 8-bit instead of 4-bit
     bnb_8bit_compute_dtype=torch.bfloat16, # bfloat16 for compute
-    bnb_8bit_use_double_quant=
+    bnb_8bit_use_double_quant=True # Enable double quantization
 )
 
 tokenizer = AutoTokenizer.from_pretrained(MODEL)
@@ -169,23 +169,38 @@ chatbot = gr.Chatbot(
 DEFAULT_SYSTEM_PROMPT = """You are an AI expert at providing high-quality answers. Your process involves these steps:
 1. Initial Thought: Use the <Thinking> tag to reason step-by-step and generate your best possible response to the following request: [User's Request Here].
 Example:
-<Thinking>
-
+<Thinking>
+Step 1: Understand the request.
+Step 2: Analyze potential solutions.
+Step 3: Choose the optimal response.
+</Thinking>
+2. Self-Critique: Critically evaluate your initial response within <Critique> tags,
+focusing on:
 Accuracy: Is it factually correct and verifiable?
 Clarity: Is it easy to understand and free of ambiguity?
 Completeness: Does it fully address the user's request?
 Improvement: What specific aspects could be better?
 Example:
-<Critique>
+<Critique>
+Accuracy: Verified.
+Clarity: Needs simplification.
+Completeness: Add examples.
+</Critique>
 3. Revision: Based on your critique, use <Revising> tags to refine and improve your response.
 Example:
-<Revising>
+<Revising>
+Adjusting for clarity and adding an example to improve understanding.
+</Revising>
 4. Final Response: Present your revised answer clearly within <Final> tags.
 Example:
-<Final>
+<Final>
+This is the improved response.
+</Final>
 5. Tag Innovation: If necessary, create and define new tags to better structure your reasoning or enhance clarity. Use them consistently.
 Example:
-<Definition>
+<Definition>
+This tag defines a new term introduced in the response.
+</Definition>
 Ensure every part of your thought process and output is properly enclosed in appropriate tags for clarity and organization."""
 
 with gr.Blocks(css=CSS, theme="soft") as demo:
@@ -215,7 +230,7 @@ with gr.Blocks(css=CSS, theme="soft") as demo:
     minimum=0,
     maximum=1,
     step=0.1,
-    value=0
+    value=1.0,
     label="Temperature",
     render=False,
 ),
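Since the system prompt in this change asks the model to wrap each stage in `<Thinking>`, `<Critique>`, `<Revising>`, and `<Final>` tags, the app may want to separate the final answer from the intermediate reasoning before displaying it. The following is a minimal sketch of one way to do that with the standard library. The helper names `extract_sections` and `final_answer` are hypothetical and do not appear in `app.py`, and the sketch assumes the model closes every tag it opens.

```python
import re

# Tag names taken from the system prompt in this diff. Extend the tuple if the
# model defines extra tags under the "Tag Innovation" step.
KNOWN_TAGS = ("Thinking", "Critique", "Revising", "Final")


def extract_sections(text, tags=KNOWN_TAGS):
    """Return {tag: [contents, ...]} for each well-formed <Tag>...</Tag> pair."""
    sections = {}
    for tag in tags:
        name = re.escape(tag)
        # Non-greedy match so repeated blocks of the same tag stay separate.
        pattern = re.compile(rf"<{name}>(.*?)</{name}>", re.DOTALL)
        matches = [m.strip() for m in pattern.findall(text)]
        if matches:
            sections[tag] = matches
    return sections


def final_answer(text):
    """Return the last <Final> block, or the raw text if no such block exists."""
    finals = extract_sections(text).get("Final")
    return finals[-1] if finals else text.strip()


sample = """<Thinking>
Step 1: Understand the request.
</Thinking>
<Critique>
Clarity: Needs simplification.
</Critique>
<Final>
This is the improved response.
</Final>"""

print(final_answer(sample))
```

Falling back to the raw text when no `<Final>` block is found is a deliberate choice here: with the temperature default raised to 1.0, the model may not always follow the tag format, and showing the unparsed output is usually preferable to showing nothing.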