Spaces: Fraser/piclets


Fraser committed 15 days ago

Commit 6a18e94

1 Parent(s): 3eb1d35

zephyr only

Files changed (4)

CLAUDE.md +9 -10
src/lib/components/MonsterGenerator/MonsterGenerator.svelte +2 -11
src/lib/services/qwen3Client.ts +0 -239
src/lib/services/textGenerationClient.ts +0 -149

CLAUDE.md CHANGED

@@ -85,9 +85,8 @@ const client = await window.gradioClient.Client.connect("space-name");
 **Current Gradio Connections:**
 - **Flux Image Generation**: `Fraser/flux`
-- **Joy Caption**: `fancyfeast/joy-caption-pre-alpha`
-- **Zephyr-7B Text Generation**: `Fraser/zephyr-7b` (fallback)
-- **Qwen3 Text Generation**: `Qwen/Qwen3-Demo` (primary)
 **Build Notes:**
 - DO NOT install Gradio Client via npm (`npm install @gradio/client`) - it causes build failures
@@ -95,10 +94,10 @@ const client = await window.gradioClient.Client.connect("space-name");
 - All Gradio connections should use the established pattern from App.svelte
 ### Text Generation Architecture
-The project uses a smart fallback system:
-1. **Primary**: Qwen3 via proper Gradio Client connection to `/add_message` endpoint
-2. **Fallback**: Zephyr-7B for when Qwen3 is unavailable
-3. **Manager**: `textGenerationManager` handles automatic switching with connection testing
 ## Troubleshooting
@@ -108,13 +107,13 @@ The project uses a smart fallback system:
 - **Missing dependencies**: Run `npm install` if packages are missing
 ### Monster Generation Issues
-- **Name extraction problems**: Check `MonsterGenerator.svelte` line 322 - regex should extract content after `# Monster Name`
-- **Qwen3 connection failures**: System automatically falls back to Zephyr-7B if Qwen3 is unavailable
 - **Image processing errors**: Verify Flux and Joy Caption clients are properly connected
 ### Performance
 - **Large image files**: Consider image compression before upload
-- **Slow generation**: Qwen3 may take 10-30 seconds for complex monster concepts
 - **Battle lag**: IndexedDB operations are async - ensure proper await usage
 ## Important Notes

 **Current Gradio Connections:**
 - **Flux Image Generation**: `Fraser/flux`
+- **Joy Caption**: `fancyfeast/joy-caption-alpha-two`
+- **Zephyr-7B Text Generation**: `Fraser/zephyr-7b`
 **Build Notes:**
 - DO NOT install Gradio Client via npm (`npm install @gradio/client`) - it causes build failures
 - All Gradio connections should use the established pattern from App.svelte
 ### Text Generation Architecture
+The project uses a simple, direct approach:
+1. **Zephyr-7B**: Direct connection to `Fraser/zephyr-7b` space for all text generation
+2. **Direct API calls**: Components use `zephyrClient.predict("/chat", [...])` directly
+3. **No fallback complexity**: Simple, reliable single-client architecture
 ## Troubleshooting
 - **Missing dependencies**: Run `npm install` if packages are missing
 ### Monster Generation Issues
+- **Name extraction problems**: Check `MonsterGenerator.svelte` - regex should extract content after `# Monster Name`
+- **Zephyr-7B connection failures**: Verify `Fraser/zephyr-7b` space is accessible
 - **Image processing errors**: Verify Flux and Joy Caption clients are properly connected
 ### Performance
 - **Large image files**: Consider image compression before upload
+- **Slow generation**: Zephyr-7B may take 10-30 seconds for complex monster concepts
 - **Battle lag**: IndexedDB operations are async - ensure proper await usage
 ## Important Notes
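
As a rough illustration of the direct-call pattern described in the updated CLAUDE.md, the sketch below builds the positional parameter array for `/chat` and unwraps the first element of `data`. The helper names (`buildChatParams`, `extractText`, `generateText`), the `ChatClient` interface, and the numeric sampling defaults are assumptions made for illustration and are not part of this commit. Only the parameter order and the `{data: [response_text]}` result shape come from the comments in the deleted `qwen3Client.ts`.

```typescript
// Minimal shape of a connected Gradio client, matching how predict() is called in this diff.
interface ChatClient {
  predict(endpoint: string, params: unknown[]): Promise<{ data: unknown[] }>;
}

// Hypothetical sampling options; the values the app really passes are not visible in the diff.
interface ChatOptions {
  maxNewTokens: number;
  temperature: number;
  topP: number;
  topK: number;
  repetitionPenalty: number;
}

// Assumed defaults, chosen only so the sketch runs.
const ASSUMED_DEFAULTS: ChatOptions = {
  maxNewTokens: 1024,
  temperature: 0.7,
  topP: 0.95,
  topK: 50,
  repetitionPenalty: 1.0,
};

// Order: message, chat_history, system_prompt, max_new_tokens,
// temperature, top_p, top_k, repetition_penalty.
function buildChatParams(
  message: string,
  systemPrompt: string,
  options: Partial<ChatOptions> = {}
): unknown[] {
  const o = { ...ASSUMED_DEFAULTS, ...options };
  return [
    message,
    [], // chat_history: empty for a single-turn request
    systemPrompt,
    o.maxNewTokens,
    o.temperature,
    o.topP,
    o.topK,
    o.repetitionPenalty,
  ];
}

// The response text is expected as the first element of `data`.
function extractText(result: { data: unknown[] }): string {
  const first = result.data[0];
  if (typeof first !== "string") {
    throw new Error("Unexpected /chat response shape");
  }
  return first;
}

// Single-client call with no fallback: any failure propagates to the caller.
async function generateText(
  client: ChatClient,
  message: string,
  systemPrompt: string
): Promise<string> {
  const result = await client.predict("/chat", buildChatParams(message, systemPrompt));
  return extractText(result);
}
```

With a single client, errors surface directly at the call site, so the caller decides how to report an unreachable `Fraser/zephyr-7b` space.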

src/lib/components/MonsterGenerator/MonsterGenerator.svelte CHANGED

@@ -9,20 +9,11 @@
   import { extractPicletMetadata } from '$lib/services/picletMetadata';
   import { savePicletInstance } from '$lib/db/piclets';
   import { PicletType, TYPE_DATA } from '$lib/types/picletTypes';
-  import { textGenerationManager } from '$lib/services/textGenerationClient';
   interface Props extends MonsterGeneratorProps {}
   let { joyCaptionClient, zephyrClient, fluxClient }: Props = $props();
-  // Initialize text generation manager with Zephyr-7B fallback support
-  $effect(() => {
-    if (zephyrClient) {
-      textGenerationManager.setFallbackClient(zephyrClient);
-      textGenerationManager.initialize();
-    }
-  });
   let state: MonsterWorkflowState = $state({
     currentStep: 'upload',
     userImage: null,
@@ -228,7 +219,7 @@ Focus on: colors, body shape, eyes, limbs, mouth, and key visual features. Omit
       console.log('Using smart text generation for visual description extraction');
       try {
-        const output = await textGenerationManager.predict("/chat", [
           promptGenerationPrompt, // message
           [],                     // chat_history
           systemPrompt,          // system_prompt
@@ -391,7 +382,7 @@ Write your response within \`\`\`json\`\`\``;
     console.log('Generating monster stats from concept');
     try {
-      const output = await textGenerationManager.predict("/chat", [
         statsPrompt,          // message
         [],                   // chat_history
         systemPrompt,         // system_prompt

   import { extractPicletMetadata } from '$lib/services/picletMetadata';
   import { savePicletInstance } from '$lib/db/piclets';
   import { PicletType, TYPE_DATA } from '$lib/types/picletTypes';
   interface Props extends MonsterGeneratorProps {}
   let { joyCaptionClient, zephyrClient, fluxClient }: Props = $props();
   let state: MonsterWorkflowState = $state({
     currentStep: 'upload',
     userImage: null,
       console.log('Using smart text generation for visual description extraction');
       try {
+        const output = await zephyrClient!.predict("/chat", [
           promptGenerationPrompt, // message
           [],                     // chat_history
           systemPrompt,          // system_prompt
     console.log('Generating monster stats from concept');
     try {
+      const output = await zephyrClient!.predict("/chat", [
         statsPrompt,          // message
         [],                   // chat_history
         systemPrompt,         // system_prompt
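
The `zephyrClient!` non-null assertion in the added lines satisfies the type checker, but if the prop is still unset when generation starts, the call fails at runtime with a generic `TypeError`. One possible guard is sketched below; `requireClient` is a hypothetical helper, not something this commit introduces.

```typescript
// Returns the client unchanged, or throws a descriptive error when it is missing.
function requireClient<T>(client: T | null | undefined, name: string): T {
  if (client === null || client === undefined) {
    throw new Error(`${name} is not connected; wait for the Gradio connection before generating`);
  }
  return client;
}

// Usage inside the component would look roughly like:
//   const output = await requireClient(zephyrClient, "zephyrClient").predict("/chat", [...]);
```

The error then lands in the existing `try`/`catch` around each `predict` call with a message that names the missing client.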

src/lib/services/qwen3Client.ts DELETED

@@ -1,239 +0,0 @@
-/**
- * Qwen3 Client - Drop-in replacement for rwkvClient using Qwen3 HF Space
- * Compatible with existing rwkvClient.predict("/chat", [...]) API
- * Uses proper Gradio Client connection instead of direct HTTP calls
- */
-interface Qwen3Message {
-  role: 'user' | 'assistant' | 'system';
-  content: string;
-}
-interface Qwen3ClientOptions {
-  huggingFaceSpace: string;
-  model: string;
-  apiKey?: string;
-}
-export class Qwen3Client {
-  private options: Qwen3ClientOptions;
-  private sessionId: string;
-  private gradioClient: any = null;
-  constructor(options: Partial<Qwen3ClientOptions> = {}) {
-    this.options = {
-      huggingFaceSpace: 'Qwen/Qwen3-Demo',
-      model: 'qwen2.5-72b-instruct', // Use Qwen2.5-72B for best performance
-      ...options
-    };
-    this.sessionId = this.generateSessionId();
-  }
-  private generateSessionId(): string {
-    return Math.random().toString(36).substring(2, 15) + Math.random().toString(36).substring(2, 15);
-  }
-  /**
-   * Initialize Gradio Client connection to Qwen3 Space
-   */
-  private async initializeGradioClient(): Promise<void> {
-    if (this.gradioClient) {
-      return; // Already initialized
-    }
-    try {
-      // Use the same approach as App.svelte - access window.gradioClient
-      if (!window.gradioClient?.Client) {
-        throw new Error('Gradio Client not available - ensure App.svelte has loaded the client');
-      }
-      console.log(`🔗 Connecting to ${this.options.huggingFaceSpace}...`);
-      this.gradioClient = await window.gradioClient.Client.connect(this.options.huggingFaceSpace);
-      console.log(`✅ Connected to Qwen3 space: ${this.options.huggingFaceSpace}`);
-    } catch (error) {
-      console.error('Failed to initialize Qwen3 Gradio Client:', error);
-      throw new Error(`Could not connect to Qwen3 space: ${error}`);
-    }
-  }
-  /**
-   * Predict method that mimics rwkvClient.predict("/chat", [...]) API
-   * @param endpoint Should be "/chat" for compatibility
-   * @param params Array of parameters: [message, chat_history, system_prompt, max_new_tokens, temperature, top_p, top_k, repetition_penalty]
-   * @returns Promise<{data: any[]}>
-   */
-  async predict(endpoint: string, params: any[]): Promise<{data: any[]}> {
-    if (endpoint !== '/chat') {
-      throw new Error('Qwen3Client only supports "/chat" endpoint');
-    }
-    // Note: Qwen3-Demo only uses these 3 parameters from the rwkv-compatible API
-    const [
-      message,
-      chat_history = [],
-      system_prompt = "You are a helpful assistant."
-    ] = params;
-    try {
-      // Ensure Gradio client is initialized
-      await this.initializeGradioClient();
-      // Use the proper Gradio Client API to call the add_message function
-      // Only pass parameters that actually exist in the Qwen3 Gradio app
-      const response = await this.callQwen3API(message, {
-        sys_prompt: system_prompt,
-        model: this.options.model
-      });
-      // Return in the expected format: {data: [response_text]}
-      return {
-        data: [response]
-      };
-    } catch (error) {
-      console.error('Qwen3Client error:', error);
-      throw new Error(`Qwen3 API call failed: ${error}`);
-    }
-  }
-  private async callQwen3API(message: string, options: any): Promise<string> {
-    try {
-      if (!this.gradioClient) {
-        throw new Error('Gradio client not initialized');
-      }
-      // Prepare settings for the Qwen3 space based on actual app.py structure
-      // Only use parameters that actually exist in the Gradio app
-      const settingsFormValue = {
-        model: options.model || this.options.model,
-        sys_prompt: options.sys_prompt || "You are a helpful assistant.",
-        thinking_budget: 38 // Use maximum thinking budget for best quality
-      };
-      // Thinking button state - disable for faster responses
-      const thinkingBtnState = {
-        enable_thinking: false
-      };
-      // Initial state for the conversation
-      const stateValue = {
-        conversation_contexts: {},
-        conversations: [],
-        conversation_id: this.sessionId
-      };
-      console.log(`🤖 Calling Qwen3 add_message with: "${message.substring(0, 50)}..."`);
-      // Call the add_message function from the Gradio app
-      // Based on app.py line 170: add_message(input_value, settings_form_value, thinking_btn_state_value, state_value)
-      const result = await this.gradioClient.predict("/add_message", [
-        message,                // input_value
-        settingsFormValue,      // settings_form_value
-        thinkingBtnState,       // thinking_btn_state_value
-        stateValue              // state_value
-      ]);
-      console.log('🔍 Raw Qwen3 response:', result);
-      // Extract the response text from the Gradio result
-      if (result && result.data && Array.isArray(result.data)) {
-        // The response format should include the chatbot data
-        // Look for the chatbot component data (usually index 2 or 3)
-        for (let i = 0; i < result.data.length; i++) {
-          const item = result.data[i];
-          if (Array.isArray(item) && item.length > 0) {
-            // Look for the last assistant message
-            const lastMessage = item[item.length - 1];
-            if (lastMessage && lastMessage.role === 'assistant' && lastMessage.content) {
-              // Extract text content from the structured content
-              if (Array.isArray(lastMessage.content)) {
-                for (const contentItem of lastMessage.content) {
-                  if (contentItem.type === 'text' && contentItem.content) {
-                    console.log('✅ Extracted Qwen3 response:', contentItem.content.substring(0, 100) + '...');
-                    return contentItem.content;
-                  }
-                }
-              } else if (typeof lastMessage.content === 'string') {
-                console.log('✅ Extracted Qwen3 response:', lastMessage.content.substring(0, 100) + '...');
-                return lastMessage.content;
-              }
-            }
-          }
-        }
-      }
-      // If we can't extract the response, throw an error to trigger fallback
-      throw new Error('Could not extract text response from Qwen3 API result');
-    } catch (error) {
-      console.warn('Qwen3 Gradio API call failed, using fallback strategy:', error);
-      // Development fallback: Generate a reasonable response based on the input
-      // If it's a JSON generation request, provide a structured response
-      if (message.includes('JSON') || message.includes('json') || options.sys_prompt?.includes('JSON')) {
-        if (message.includes('monster') || message.includes('stats')) {
-          return this.generateFallbackMonsterStats(message);
-        }
-        return '```json\n{"status": "Qwen3 temporarily unavailable", "using_fallback": true}\n```';
-      }
-      // For text generation, provide a reasonable response
-      if (message.includes('visual description') || message.includes('image generation')) {
-        return this.generateFallbackImageDescription(message);
-      }
-      return `I understand you're asking about: "${message.substring(0, 100)}..."\n\nHowever, I'm currently unable to connect to the Qwen3 service. The system will automatically fall back to an alternative model for your request.`;
-    }
-  }
-  private generateFallbackMonsterStats(userMessage: string): string {
-    // Extract key information from the user message to generate reasonable stats
-    const isRare = userMessage.toLowerCase().includes('rare') || userMessage.toLowerCase().includes('legendary');
-    const isCommon = userMessage.toLowerCase().includes('common') || userMessage.toLowerCase().includes('basic');
-    let baseStats = isRare ? 70 : isCommon ? 25 : 45;
-    let variation = isRare ? 25 : isCommon ? 15 : 20;
-    const stats = {
-      rarity: isRare ? 'rare' : isCommon ? 'common' : 'uncommon',
-      picletType: 'beast', // Default fallback
-      height: Math.round((Math.random() * 3 + 0.5) * 10) / 10,
-      weight: Math.round((Math.random() * 100 + 10) * 10) / 10,
-      HP: Math.round(Math.max(10, Math.min(100, baseStats + Math.random() * variation - variation/2))),
-      defence: Math.round(Math.max(10, Math.min(100, baseStats + Math.random() * variation - variation/2))),
-      attack: Math.round(Math.max(10, Math.min(100, baseStats + Math.random() * variation - variation/2))),
-      speed: Math.round(Math.max(10, Math.min(100, baseStats + Math.random() * variation - variation/2))),
-      monsterLore: "A mysterious creature discovered through advanced AI analysis. Its true nature remains to be studied.",
-      specialPassiveTraitDescription: "Adaptive Resilience - This creature adapts to its environment.",
-      attackActionName: "Strike",
-      attackActionDescription: "A focused attack that deals moderate damage.",
-      buffActionName: "Focus",
-      buffActionDescription: "Increases concentration, boosting attack power temporarily.",
-      debuffActionName: "Intimidate",
-      debuffActionDescription: "Reduces the opponent's confidence, lowering their attack.",
-      specialActionName: "Signature Move",
-      specialActionDescription: "A powerful technique unique to this creature."
-    };
-    return '```json\n' + JSON.stringify(stats, null, 2) + '\n```';
-  }
-  private generateFallbackImageDescription(userMessage: string): string {
-    // Generate a basic visual description based on common elements
-    const colors = ['vibrant blue', 'emerald green', 'golden yellow', 'deep purple', 'crimson red'];
-    const features = ['large expressive eyes', 'sleek form', 'distinctive markings', 'graceful limbs'];
-    const color = colors[Math.floor(Math.random() * colors.length)];
-    const feature = features[Math.floor(Math.random() * features.length)];
-    return `A ${color} creature with ${feature}, designed in an anime-inspired style with clean lines and appealing proportions.`;
-  }
-  /**
-   * No connection testing - let natural failures trigger fallback to Zephyr-7B
-   */
-}
-// Export a default instance
-export const qwen3Client = new Qwen3Client();

src/lib/services/textGenerationClient.ts DELETED

@@ -1,149 +0,0 @@
-/**
- * Text Generation Client Manager
- * Provides unified interface for text generation with automatic fallback
- * Primary: Qwen3 (Qwen/Qwen3-Demo), Fallback: Zephyr-7B (Fraser/zephyr-7b)
- */
-import { qwen3Client } from './qwen3Client';
-interface TextGenerationClient {
-  predict(endpoint: string, params: any[]): Promise<{data: any[]}>;
-}
-class TextGenerationManager {
-  private primaryClient: TextGenerationClient;
-  private fallbackClient: TextGenerationClient | null = null;
-  private useQwen3: boolean = true;
-  private connectionTested: boolean = false;
-  constructor() {
-    this.primaryClient = qwen3Client;
-  }
-  /**
-   * Set the fallback client (Zephyr-7B)
-   */
-  setFallbackClient(client: TextGenerationClient) {
-    this.fallbackClient = client;
-  }
-  /**
-   * Initialize without testing - assume Qwen3 is available and test on first real use
-   */
-  async initialize(): Promise<void> {
-    if (this.connectionTested) return;
-    console.log('🔧 Initializing text generation manager - using Qwen3 but will fallback to Zephyr-7B on failure');
-    // Default to using Qwen3, test will happen on first predict() call
-    this.useQwen3 = true;
-    this.connectionTested = true;
-    console.log('✅ Text generation manager initialized - ready to use Qwen3 (with fallback to Zephyr-7B)');
-  }
-  /**
-   * Get the active client for text generation
-   */
-  private getActiveClient(): TextGenerationClient {
-    if (this.useQwen3) {
-      return this.primaryClient;
-    } else if (this.fallbackClient) {
-      return this.fallbackClient;
-    } else {
-      console.warn('No fallback client available, using Qwen3 client');
-      return this.primaryClient;
-    }
-  }
-  /**
-   * Predict method with automatic fallback - tests on first failure
-   */
-  async predict(endpoint: string, params: any[]): Promise<{data: any[]}> {
-    // Ensure initialization has been attempted
-    if (!this.connectionTested) {
-      await this.initialize();
-    }
-    const activeClient = this.getActiveClient();
-    const clientName = this.useQwen3 ? 'Qwen3' : 'Zephyr-7B';
-    console.log(`🤖 Using ${clientName} for text generation`);
-    try {
-      const result = await activeClient.predict(endpoint, params);
-      return result;
-    } catch (error) {
-      console.error(`${clientName} prediction failed:`, error);
-      // If primary client fails and we have a fallback, try it
-      if (this.useQwen3 && this.fallbackClient) {
-        console.log('🔄 Qwen3 failed, switching to fallback Zephyr-7B...');
-        try {
-          const fallbackResult = await this.fallbackClient.predict(endpoint, params);
-          // Mark for future calls to use fallback
-          this.useQwen3 = false;
-          console.log('✅ Fallback to Zephyr-7B successful - will use Zephyr-7B for future requests');
-          return fallbackResult;
-        } catch (fallbackError) {
-          console.error('Fallback client also failed:', fallbackError);
-          throw new Error(`Both primary (${clientName}) and fallback clients failed`);
-        }
-      }
-      throw error;
-    }
-  }
-  /**
-   * Force switch to Qwen3
-   */
-  useQwen3Client() {
-    this.useQwen3 = true;
-    console.log('🔄 Switched to Qwen3 client');
-  }
-  /**
-   * Force switch to fallback (Zephyr-7B)
-   */
-  useFallbackClient() {
-    if (this.fallbackClient) {
-      this.useQwen3 = false;
-      console.log('🔄 Switched to fallback (Zephyr-7B) client');
-    } else {
-      console.warn('No fallback client available');
-    }
-  }
-  /**
-   * Get current client status
-   */
-  getStatus() {
-    return {
-      usingQwen3: this.useQwen3,
-      hasFallback: this.fallbackClient !== null,
-      connectionTested: this.connectionTested,
-      activeClient: this.useQwen3 ? 'Qwen3' : 'Zephyr-7B'
-    };
-  }
-  /**
-   * Reset connection testing to allow re-initialization
-   */
-  resetConnectionTest() {
-    this.connectionTested = false;
-    console.log('🔄 Connection test reset - will re-test on next prediction');
-  }
-  /**
-   * Force re-test connection and re-initialize
-   */
-  async retestConnection(): Promise<void> {
-    this.connectionTested = false;
-    await this.initialize();
-  }
-}
-// Export singleton instance
-export const textGenerationManager = new TextGenerationManager();