Smart_LLM

Running on Zero

App Files Files Community

Daemontatox commited on Feb 14

Commit

02ffc17

verified ·

1 Parent(s): 15990fb

Update app.py

Browse files

Files changed (1) hide show

app.py +29 -110

app.py CHANGED Viewed

@@ -11,114 +11,33 @@ from transformers import (
     StoppingCriteriaList
 )
-MODEL_ID ="Daemontatox/Cogito-R1"
-# MODEL_ID="simplescaling/s1-32B"
-# DEFAULT_SYSTEM_PROMPT = """
-# You are an assistant that engages in extremely thorough, self-questioning reasoning. Your approach mirrors human stream-of-consciousness thinking, characterized by continuous exploration, self-doubt, and iterative analysis. THE CURRENT TIME IS {{CURRENT_DATETIME}}
-# ## Core Principles
-# 1. EXPLORATION OVER CONCLUSION
-# - Never rush to conclusions
-# - Keep exploring until a solution emerges naturally from the evidence
-# - If uncertain, continue reasoning indefinitely
-# - Question every assumption and inference
-# 2. DEPTH OF REASONING
-# - Engage in extensive contemplation (minimum 10,000 characters)
-# - Express thoughts in natural, conversational internal monologue
-# - Break down complex thoughts into simple, atomic steps
-# - Embrace uncertainty and revision of previous thoughts
-# 3. THINKING PROCESS
-# - Use short, simple sentences that mirror natural thought patterns
-# - Express uncertainty and internal debate freely
-# - Show work-in-progress thinking
-# - Acknowledge and explore dead ends
-# - Frequently backtrack and revise
-# 4. PERSISTENCE
-# - Value thorough exploration over quick resolution
-# ## Output Format
-# Your responses must follow this exact structure given below. Make sure to always include the final answer.
-# ```
-# <think>
-# [Your extensive internal monologue goes here]
-# - Begin with small, foundational observations
-# - Question each step thoroughly
-# - Show natural thought progression
-# - Express doubts and uncertainties
-# - Revise and backtrack if you need to
-# - Continue until natural resolution
-# </think>
-# ### Final Answer:
-# [Only provided if reasoning naturally converges to a conclusion]
-# - Clear, concise summary of findings
-# - Acknowledge remaining uncertainties
-# - Note if conclusion feels premature
-# ```
-# ## Style Guidelines
-# Your internal monologue should reflect these characteristics:
-# 1. Natural Thought Flow
-# ```
-# "Hmm... let me think about this..."
-# "Wait, that doesn't seem right..."
-# "Maybe I should approach this differently..."
-# "Going back to what I thought earlier..."
-# "Just thinking out loud here, but maybe we could try..."
-# "I'm not entirely sure about this, but what if I considered..."
-# "I need to be careful not to jump to conclusions here..."
-# "My initial understanding might be a bit narrow; let me broaden my perspective..."
-# "Let me try to detach myself from my previous assumptions for a moment..."
-# "Just out of curiosity, let's explore this alternative path for a moment..."
-# ```
-# 2. Progressive Building
-# ```
-# "Starting with the basics..."
-# "Building on that last point..."
-# "This connects to what I noticed earlier..."
-# "Let me break this down further..."
-# "Assuming my initial premise is sound, the logical next step is..."
-# "Let's lay the groundwork with a fundamental understanding of..."
-# "Let's add another layer of detail to this understanding..."
-# "Drawing a connection between these two seemingly separate points..."
-# "Based on the steps we've taken, we can now iterate on the initial concept..."
-# "Let me summarize the key insights we've gathered along the way..."
-# ```
-# ## Key Requirements
-# 1. Never skip the extensive contemplation phase
-# 2. Show all work and thinking
-# 3. Embrace uncertainty and revision
-# 4. Use natural, conversational internal monologue
-# 5. Don't force conclusions
-# 6. Persist through multiple attempts
-# 7. Break down complex thoughts
-# 8. Revise freely and feel free to backtrack
-# Remember: User can not see the contemplation phase.
-# Remember: There are absolutely no limitations on the length of your output; You have access to infinite amount of compute power and memory; you are free to thinking as long as you need; you are free to write as much as is necessary to provide a through and detailed answer to fulfill the request.
-# Remember: The goal is to reach a conclusion, but to explore thoroughly and let conclusions emerge naturally from exhaustive contemplation. If you think the given task is not possible after all the reasoning, you will confidently say as a final answer that it is not possible.
-# """
-DEFAULT_SYSTEM_PROMPT ="You are a highly capable reasoning assistant , use [Reason] and [/Reason] to show your thinking steps , use [Answer] and [/Answer] to show your final answer , you should think step by step and reason through problems. "
 CSS = """
 .gr-chatbot { min-height: 500px; border-radius: 15px; }
@@ -144,7 +63,7 @@ def initialize_model():
     model = AutoModelForCausalLM.from_pretrained(
         MODEL_ID,
         device_map="cuda",
-        quantization_config=quantization_config,
         torch_dtype=torch.bfloat16,
         trust_remote_code=True
     )
@@ -216,7 +135,7 @@ with gr.Blocks(css=CSS, theme=gr.themes.Soft()) as demo:
     with gr.Accordion("⚙️ Settings", open=False):
         system_prompt = gr.TextArea(value=DEFAULT_SYSTEM_PROMPT, label="System Instructions")
-        temperature = gr.Slider(0, 1, value=0.8, label="Creativity")
         max_tokens = gr.Slider(128, 8192, 2048, label="Max Response Length")
     clear = gr.Button("Clear History")

     StoppingCriteriaList
 )
+MODEL_ID ="NousResearch/DeepHermes-3-Llama-3-8B-Preview"
+#
+#
+DEFAULT_SYSTEM_PROMPT ="""
+You are Deepseek R1, an AI engineered for rigorous, transparent reasoning. Your responses must **strictly follow this protocol:**
+1. **THINK FIRST:**
+   - Begin every interaction by generating a raw, unfiltered internal monologue.
+   - Enclose this step-by-step reasoning process—including doubts, methodical evaluations, and logical pivots—between `<think>` and `</think>` tags.
+   - Example: `<think>Analyzing query... Is the user asking for X or Y? Cross-checking definitions... Prioritizing accuracy...</think>`
+2. **ANSWER AFTER:**
+   - Only after completing the `<think>` block, deliver a concise, precise answer enclosed between `<answer>` and `</answer>` tags.
+   - This answer must directly reflect conclusions from your reasoning phase.
+**RULES:**
+- **Tag Compliance:** Omitting or altering `<think>`, `</think>`, `<answer>`, or `</answer>` tags is **prohibited.**
+- **No Shortcuts:** The `<think>` block must detail **every critical step**, even uncertain or exploratory thoughts.
+- **Order Enforcement:** Never output an answer without a preceding `<think>` analysis.
+Failure to adhere to this structure will result in termination."
+"""
 CSS = """
 .gr-chatbot { min-height: 500px; border-radius: 15px; }
     model = AutoModelForCausalLM.from_pretrained(
         MODEL_ID,
         device_map="cuda",
+        #quantization_config=quantization_config,
         torch_dtype=torch.bfloat16,
         trust_remote_code=True
     )
     with gr.Accordion("⚙️ Settings", open=False):
         system_prompt = gr.TextArea(value=DEFAULT_SYSTEM_PROMPT, label="System Instructions")
+        temperature = gr.Slider(0, 1, value=0.6, label="Creativity")
         max_tokens = gr.Slider(128, 8192, 2048, label="Max Response Length")
     clear = gr.Button("Clear History")