AdithyaSK committed
Commit 744e5e2 · 1 Parent(s): 2ea19ae

Eureka agent init - Adithya S K

Files changed (10)
  1. .env.example +29 -0
  2. .gitattributes +2 -0
  3. LICENSE +200 -0
  4. README.md +153 -13
  5. app.py +1761 -0
  6. jupyter_agent.py +1463 -0
  7. jupyter_handler.py +1161 -0
  8. modal_sandbox.py +794 -0
  9. requirements.txt +17 -0
  10. system_prompt.txt +326 -0
.env.example ADDED
@@ -0,0 +1,29 @@
+ # OpenAI Configuration (choose ONE of the options below)
+
+ # Option 1: Standard OpenAI
+ # OPENAI_API_KEY=sk-your-openai-api-key-here
+ # MODEL_NAME=gpt-4
+
+ # Option 2: Azure OpenAI
+ # AZURE_OPENAI_ENDPOINT=https://your-resource-name.openai.azure.com/
+ # AZURE_OPENAI_API_KEY=your-azure-openai-api-key-here
+ # MODEL_NAME=gpt-4 # This should be your deployment name in Azure
+
+ # Option 3: Custom Provider (e.g., Cerebras)
+ # PROVIDER_API_ENDPOINT=https://api.cerebras.ai/v1
+ # PROVIDER_API_KEY=your-cerebras-api-key-here
+ # MODEL_NAME=llama3.1-70b
+
+ # Phoenix Tracing (Optional)
+ # PHOENIX_API_KEY=your-phoenix-api-key-here
+ # PHOENIX_COLLECTOR_ENDPOINT=https://app.phoenix.arize.com/v1/traces
+
+ # Modal Configuration (for sandbox execution)
+ # MODAL_TOKEN_ID=your-modal-token-id
+ # MODAL_TOKEN_SECRET=your-modal-token-secret
+
+ # Hugging Face Token (Optional)
+ # HF_TOKEN=your-huggingface-token
+
+ # Tavily API Key (for web search)
+ # TAVILY_API_KEY=
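
Only one of the three provider blocks above needs to be uncommented. A minimal sketch of how an app might pick whichever option is fully configured (the function name and the precedence order are illustrative assumptions, not the app's actual logic):

```python
import os

def resolve_provider():
    # Hypothetical helper: return the first provider option that is
    # fully configured, in the order the options are listed above.
    if os.environ.get("OPENAI_API_KEY"):
        return "openai"
    if os.environ.get("AZURE_OPENAI_ENDPOINT") and os.environ.get("AZURE_OPENAI_API_KEY"):
        return "azure"
    if os.environ.get("PROVIDER_API_ENDPOINT") and os.environ.get("PROVIDER_API_KEY"):
        return "custom"
    return None  # no complete configuration found

os.environ["OPENAI_API_KEY"] = "sk-example"
print(resolve_provider())  # → openai
```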
.gitattributes CHANGED
@@ -33,3 +33,5 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
33   *.zip filter=lfs diff=lfs merge=lfs -text
34   *.zst filter=lfs diff=lfs merge=lfs -text
35   *tfevents* filter=lfs diff=lfs merge=lfs -text
36 + jupyter-agent-2.png filter=lfs diff=lfs merge=lfs -text
37 + powered-by.png filter=lfs diff=lfs merge=lfs -text
LICENSE ADDED
@@ -0,0 +1,200 @@
+ Apache License
+ Version 2.0, January 2004
+ http://www.apache.org/licenses/
+
+ TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
+
+ 1. Definitions.
+
+ "License" shall mean the terms and conditions for use, reproduction,
+ and distribution as defined by Sections 1 through 9 of this document.
+
+ "Licensor" shall mean the copyright owner or entity granting the License.
+
+ "Legal Entity" shall mean the union of the acting entity and all
+ other entities that control, are controlled by, or are under common
+ control with that entity. For the purposes of this definition,
+ "control" means (i) the power, direct or indirect, to cause the
+ direction or management of such entity, whether by contract or
+ otherwise, or (ii) ownership of fifty percent (50%) or more of the
+ outstanding shares, or (iii) beneficial ownership of such entity.
+
+ "You" (or "Your") shall mean an individual or Legal Entity
+ exercising permissions granted by this License.
+
+ "Source" form shall mean the preferred form for making modifications,
+ including but not limited to software source code, documentation
+ source, and configuration files.
+
+ "Object" form shall mean any form resulting from mechanical
+ transformation or translation of a Source form, including but
+ not limited to compiled object code, generated documentation,
+ and conversions to other media types.
+
+ "Work" shall mean the work of authorship, whether in Source or
+ Object form, made available under the License, as indicated by a
+ copyright notice that is included in or attached to the work
+ (which may not be construed as modifying the License).
+
+ "Derivative Works" shall mean any work, whether in Source or Object
+ form, that is based upon (or derived from) the Work and for which the
+ editorial revisions, annotations, elaborations, or other modifications
+ represent, as a whole, an original work of authorship. For the purposes
+ of this License, Derivative Works shall not include works that remain
+ separable from, or merely link (or bind by name) to the interfaces of,
+ the Work.
+
+ "Contribution" shall mean any work of authorship, including
+ the original version of the Work and any modifications or additions
+ to that Work or Derivative Works thereof, that is intentionally
+ submitted to Licensor for inclusion in the Work by the copyright owner
+ or by an individual or Legal Entity authorized to submit on behalf of
+ the copyright owner. For the purposes of this definition, "submitted"
+ means any form of electronic, verbal, or written communication sent
+ to the Licensor or its representatives, including but not limited to
+ communication on electronic mailing lists, source code control systems,
+ and issue tracking systems that are managed by, or on behalf of, the
+ Licensor for the purpose of discussing and improving the Work, but
+ excluding communication that is conspicuously marked or otherwise
+ designated in writing by the copyright owner as "Not a Contribution."
+
+ "Contributor" shall mean Licensor and any individual or Legal Entity
+ on behalf of whom a Contribution has been received by Licensor and
+ subsequently incorporated within the Work.
+
+ 2. Grant of Copyright License. Subject to the terms and conditions of
+ this License, each Contributor hereby grants to You a perpetual,
+ worldwide, non-exclusive, no-charge, royalty-free, irrevocable
+ copyright license to use, reproduce, prepare Derivative Works of,
+ publicly perform, publicly display, sublicense, and distribute the
+ Work and Derivative Works thereof in Source or Object form.
+
+ 3. Grant of Patent License. Subject to the terms and conditions of
+ this License, each Contributor hereby grants to You a perpetual,
+ worldwide, non-exclusive, no-charge, royalty-free, irrevocable
+ (except as stated in this section) patent license to make, have made,
+ use, offer to sell, sell, import, and otherwise transfer the Work,
+ where such license applies only to those patent claims licensable
+ by such Contributor that are necessarily infringed by their
+ Contribution(s) alone or by combination of their Contribution(s)
+ with the Work to which such Contribution(s) was submitted. If You
+ institute patent litigation against any entity (including a
+ cross-claim or counterclaim in a lawsuit) alleging that the Work
+ or a Contribution incorporated within the Work constitutes direct
+ or contributory patent infringement, then any patent licenses
+ granted to You under this License for that Work shall terminate
+ as of the date such litigation is filed.
+
+ 4. Redistribution. You may reproduce and distribute copies of the
+ Work or Derivative Works thereof in any medium, with or without
+ modifications, and in Source or Object form, provided that You
+ meet the following conditions:
+
+ (a) You must give any other recipients of the Work or
+ Derivative Works a copy of this License; and
+
+ (b) You must cause any modified files to carry prominent notices
+ stating that You changed the files; and
+
+ (c) You must retain, in the Source form of any Derivative Works
+ that You distribute, all copyright, trademark, patent,
+ attribution and other notices from the Source form of the Work,
+ excluding those notices that do not pertain to any part of
+ the Derivative Works; and
+
+ (d) If the Work includes a "NOTICE" file as part of its
+ distribution, then any Derivative Works that You distribute must
+ include a readable copy of the attribution notices contained
+ within such NOTICE file, excluding those notices that do not
+ pertain to any part of the Derivative Works, in at least one
+ of the following places: within a NOTICE file distributed
+ as part of the Derivative Works; within the Source form or
+ documentation, if provided along with the Derivative Works; or,
+ within a display generated by the Derivative Works, if and
+ wherever such third-party notices normally appear. The contents
+ of the NOTICE file are for informational purposes only and
+ do not modify the License. You may add Your own attribution
+ notices within Derivative Works that You distribute, alongside
+ or as an addendum to the NOTICE text from the Work, provided
+ that such additional attribution notices cannot be construed
+ as modifying the License.
+
+ You may add Your own copyright statement for Your modifications
+ and may provide additional or different license terms and conditions
+ for use, reproduction, or distribution of Your modifications, or
+ for any such Derivative Works as a whole, provided Your use,
+ reproduction, and distribution of the Work otherwise complies with
+ the conditions stated in this License.
+
+ 5. Submission of Contributions. Unless You explicitly state otherwise,
+ any Contribution intentionally submitted for inclusion in the Work
+ by You to the Licensor shall be under the terms and conditions of
+ this License, without any additional terms or conditions.
+ Notwithstanding the above, nothing herein shall supersede or modify
+ the terms of any separate license agreement you may have executed
+ with Licensor regarding such Contributions.
+
+ 6. Trademarks. This License does not grant permission to use the trade
+ names, trademarks, service marks, or product names of the Licensor,
+ except as required for reasonable and customary use in describing the
+ origin of the Work and reproducing the content of the NOTICE file.
+
+ 7. Disclaimer of Warranty. Unless required by applicable law or
+ agreed to in writing, Licensor provides the Work (and each
+ Contributor provides its Contributions) on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
+ implied, including, without limitation, any warranties or conditions
+ of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
+ PARTICULAR PURPOSE. You are solely responsible for determining the
+ appropriateness of using or redistributing the Work and assume any
+ risks associated with Your exercise of permissions under this License.
+
+ 8. Limitation of Liability. In no event and under no legal theory,
+ whether in tort (including negligence), contract, or otherwise,
+ unless required by applicable law (such as deliberate and grossly
+ negligent acts) or agreed to in writing, shall any Contributor be
+ liable to You for damages, including any direct, indirect, special,
+ incidental, or consequential damages of any character arising as a
+ result of this License or out of the use or inability to use the
+ Work (including but not limited to damages for loss of goodwill,
+ work stoppage, computer failure or malfunction, or any and all
+ other commercial damages or losses), even if such Contributor
+ has been advised of the possibility of such damages.
+
+ 9. Accepting Support, Warranty or Additional Liability. While redistributing
+ the Work or Derivative Works thereof, You may choose to offer,
+ and charge a fee for, acceptance of support, warranty, indemnity,
+ or other liability obligations and/or rights consistent with this
+ License. However, in accepting such obligations, You may act only
+ on Your own behalf and on Your sole responsibility, not on behalf
+ of any other Contributor, and only if You agree to indemnify,
+ defend, and hold each Contributor harmless for any liability
+ incurred by, or claims asserted against, such Contributor by reason
+ of your accepting any such warranty or additional liability.
+
+ END OF TERMS AND CONDITIONS
+
+ APPENDIX: How to apply the Apache License to your work.
+
+ To apply the Apache License to your work, attach the following
+ boilerplate notice, with the fields enclosed by brackets "[]"
+ replaced with your own identifying information. (Don't include
+ the brackets!) The text should be enclosed in the appropriate
+ comment syntax for the file format. We also recommend that a
+ file or class name and description of purpose be included on the
+ same "printed page" as the copyright notice for easier
+ identification within third-party archives.
+
+ Copyright 2024 adithya-s-k
+
+ Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License.
+ You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
README.md CHANGED
@@ -1,13 +1,153 @@
- ---
- title: EurekaAgent
- emoji: 😻
- colorFrom: pink
- colorTo: pink
- sdk: gradio
- sdk_version: 5.43.1
- app_file: app.py
- pinned: false
- license: apache-2.0
- ---
-
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+ # Eureka Agent
+
+ An AI-powered research automation system that can execute Python code, analyze data, and generate insights through an interactive Jupyter-like interface.
+
+ <img width="1936" height="855" alt="Screenshot 2025-08-22 at 11 45 12 PM" src="https://github.com/user-attachments/assets/8d4ea793-4027-4aa3-8d6f-cbebbbd6e0c2" />
+
+ ## 🎯 What it does
+
+ Eureka Agent automates research workflows by:
+
+ - **Executing Python code** in a secure containerized environment
+ - **Analyzing data** with full context awareness across conversations
+ - **Generating visualizations** and interactive outputs
+ - **Developing iteratively**, building on previous code and results
+ - **Recovering from errors**, learning from execution failures to improve
+
+ ## ⚡ Key Features
+
+ - **Stateful Jupyter Environment**: Variables and imports persist across all code executions
+ - **GPU/CPU Support**: Configurable hardware (CPU, T4, L4, A100, H100)
+ - **Interactive Development**: Build complex solutions incrementally
+ - **Rich Output Support**: Plots, tables, HTML, and multimedia content
+ - **Error Handling**: Intelligent error recovery and debugging assistance
+ - **File Upload**: Process your own datasets and documents
+
+ ## 🚀 Quick Start
+
+ ### Prerequisites
+
+ - Python 3.8+
+ - A Modal account (for containerized execution)
+ - An OpenAI API key or compatible LLM provider
+
+ ### Installation
+
+ 1. Clone the repository:
+
+ ```bash
+ git clone https://github.com/adithya-s-k/EurekaAgent
+ cd EurekaAgent
+ ```
+
+ 2. Install dependencies:
+
+ ```bash
+ pip install -r requirements.txt
+ ```
+
+ 3. Set up environment variables:
+
+ ```bash
+ export OPENAI_API_KEY="your-api-key"
+ export MODAL_TOKEN_ID="your-modal-token-id"
+ export MODAL_TOKEN_SECRET="your-modal-token-secret"
+ ```
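
Alternatively, the same variables can be kept in a `.env` file based on `.env.example`; `app.py` loads one via `python-dotenv` when not running on a Space. A rough stdlib-only sketch of what that loading step amounts to (illustrative; the real app simply calls `load_dotenv()`):

```python
import os
from pathlib import Path

def load_env_file(path=".env"):
    # Minimal stand-in for python-dotenv's load_dotenv: parse KEY=value
    # lines, skip blanks and comments, and keep already-set variables.
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        os.environ.setdefault(key.strip(), value.strip())
```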
+
+ ### Running the Application
+
+ ```bash
+ python app.py
+ ```
+
+ The application will launch a Gradio interface accessible via your web browser.
+
+ ## 🔧 Configuration
+
+ ### Environment Variables
+
+ | Variable                     | Description                   | Required | Format/Example                  |
+ | ---------------------------- | ----------------------------- | -------- | ------------------------------- |
+ | `MODAL_TOKEN_ID`             | Modal token ID                | Yes      | `ak-...`                        |
+ | `MODAL_TOKEN_SECRET`         | Modal token secret            | Yes      | `as-...`                        |
+ | `PROVIDER_API_KEY`           | AI Provider API key           | Yes\*    | `sk-...`, `gsk_...`, `csk-...`  |
+ | `PROVIDER_API_ENDPOINT`      | AI Provider API endpoint      | Yes\*    | `https://api.anthropic.com/v1/` |
+ | `MODEL_NAME`                 | Model to use                  | Yes\*    | `claude-sonnet-4-20250514`      |
+ | `HF_TOKEN`                   | Hugging Face token (optional) | No       | `hf_...`                        |
+ | `TAVILY_API_KEY`             | Tavily API key for web search | No       | `tvly-...`                      |
+ | `PHOENIX_API_KEY`            | Phoenix tracing API key       | No       | -                               |
+ | `PHOENIX_COLLECTOR_ENDPOINT` | Phoenix collector endpoint    | No       | -                               |
+ | `ENVIRONMENT`                | Environment mode              | No       | `dev`/`prod`                    |
+
+ \*At least one complete AI provider configuration must be provided
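
The formats in the table can be sanity-checked up front; a simplified sketch of the prefix checks that `app.py`'s `validate_api_key_format` performs (the helper name here is illustrative):

```python
def check_key_prefix(name, value):
    # Expected prefixes per the table above; PROVIDER_API_KEY accepts
    # several vendor prefixes (sk-, gsk_, csk-).
    prefixes = {
        "MODAL_TOKEN_ID": ("ak-",),
        "MODAL_TOKEN_SECRET": ("as-",),
        "HF_TOKEN": ("hf_",),
        "TAVILY_API_KEY": ("tvly-",),
        "PROVIDER_API_KEY": ("sk-", "gsk_", "csk-"),
    }
    expected = prefixes.get(name)
    if expected is None:
        return True  # no prefix rule for this key
    return value.startswith(expected)

print(check_key_prefix("PROVIDER_API_KEY", "sk-demo"))  # → True
```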
83
+
84
+ **Legacy OpenAI Support:**
85
+ | Variable | Description | Required |
86
+ | ----------------------- | ----------------------------- | -------- |
87
+ | `OPENAI_API_KEY` | OpenAI API key | No |
88
+ | `AZURE_OPENAI_ENDPOINT` | Azure OpenAI endpoint | No |
89
+ | `AZURE_OPENAI_API_KEY` | Azure OpenAI API key | No |
90
+
91
+ ### Hardware Options
92
+
93
+ - **CPU Only**: Free, suitable for basic tasks
94
+ - **NVIDIA T4**: Low-cost GPU for small models
95
+ - **NVIDIA L4**: Mid-range GPU for better performance
96
+ - **NVIDIA A100**: High-end GPU for large models (40GB/80GB variants)
97
+ - **NVIDIA H100**: Latest flagship GPU for maximum performance
98
+
99
+ ## 💡 Usage Examples
100
+
101
+ ### Basic Data Analysis
102
+
103
+ ```
104
+ "Analyze the uploaded CSV file and create visualizations showing key trends"
105
+ ```
106
+
107
+ ### Machine Learning
108
+
109
+ ```
110
+ "Train a neural network to classify the iris dataset and evaluate its performance"
111
+ ```
112
+
113
+ ### Research Tasks
114
+
115
+ ```
116
+ "Download stock price data for the last year and perform technical analysis"
117
+ ```
118
+
119
+ ## 🏗️ Architecture
120
+
121
+ - **Frontend**: Gradio web interface with real-time status updates
122
+ - **Backend**: Python application with multi-provider AI integration
123
+ - **Execution Environment**: Modal containerized sandboxes with GPU support
124
+ - **Code Execution**: Persistent Jupyter-like stateful environment
125
+ - **Session Management**: Comprehensive session state tracking with Phoenix tracing
126
+ - **Storage**: File-based session persistence with notebook compatibility
127
+ - **Web Search**: Integrated Tavily search for current information
128
+ - **Hardware Support**: CPU, T4, L4, A100, H100 configurations
129
+
130
+ ## 📁 Project Structure
131
+
132
+ ```
133
+ EurekaAgent/
134
+ ├── app.py # Main Gradio application with API key management
135
+ ├── jupyter_handler.py # Jupyter notebook management and rendering
136
+ ├── jupyter_agent.py # Utility functions, execution logic, and session management
137
+ ├── modal_sandbox.py # Modal sandbox configuration with GPU support
138
+ ├── system_prompt.txt # System prompt for the AI agent
139
+ ├── requirements.txt # Python dependencies
140
+ └── temp/ # Temporary files, notebooks, and session states
141
+ ├── <session_id>/
142
+ │ ├── session_state.json # Complete session state and history
143
+ │ └── jupyter-agent.ipynb # Legacy notebook file for UI compatibility
144
+ └── jupyter-agent.ipynb # Default notebook template
145
+ ```
146
+
147
+ ## 🤝 Contributing
148
+
149
+ This project is a fork of [Jupyter Agent 2](https://huggingface.co/spaces/lvwerra/jupyter-agent-2) by Hugging Face. Contributions are welcome!
150
+
151
+ ## 📄 License
152
+
153
+ See [LICENSE](./LICENSE) file for details.
app.py ADDED
@@ -0,0 +1,1761 @@
+ import os
+ import logging
+ import gradio as gr
+ from gradio.utils import get_space
+ from modal_sandbox import create_modal_sandbox
+ from pathlib import Path
+ import json
+ from datetime import datetime
+ import threading
+ import re
+ from openai import OpenAI, AzureOpenAI
+ from jupyter_handler import JupyterNotebook
+
+ if not get_space():
+     try:
+         from dotenv import load_dotenv
+
+         load_dotenv()
+     except (ImportError, ModuleNotFoundError):
+         pass
+ from jupyter_agent import (
+     run_interactive_notebook_with_session_state,
+     SessionStateManager,
+ )
+
+ TMP_DIR = './temp/'
+
+ # Module-level logger (used by the helpers below)
+ logger = logging.getLogger(__name__)
+
+ # Environment and API key management utilities
+ def get_environment():
+     """Get the current environment (dev/prod)"""
+     return os.environ.get("ENVIRONMENT", "prod").lower()
+
+ def is_dev_environment():
+     """Check if running in development environment"""
+     return get_environment() == "dev"
+
+ def get_required_api_keys():
+     """Get dictionary of required API keys and their current status"""
+     required_keys = {
+         "MODAL_TOKEN_ID": {
+             "value": os.environ.get("MODAL_TOKEN_ID"),
+             "required": True,
+             "description": "Modal Token ID for sandbox access"
+         },
+         "MODAL_TOKEN_SECRET": {
+             "value": os.environ.get("MODAL_TOKEN_SECRET"),
+             "required": True,
+             "description": "Modal Token Secret for sandbox access"
+         },
+         "HF_TOKEN": {
+             "value": os.environ.get("HF_TOKEN"),
+             "required": False,
+             "description": "Hugging Face Token for model access"
+         },
+         "PROVIDER_API_KEY": {
+             "value": os.environ.get("PROVIDER_API_KEY"),
+             "required": True,
+             "description": "AI Provider API Key (Anthropic, OpenAI, etc.)"
+         },
+         "PROVIDER_API_ENDPOINT": {
+             "value": os.environ.get("PROVIDER_API_ENDPOINT"),
+             "required": True,
+             "description": "AI Provider API Endpoint"
+         },
+         "MODEL_NAME": {
+             "value": os.environ.get("MODEL_NAME"),
+             "required": True,
+             "description": "Model name to use"
+         },
+         "TAVILY_API_KEY": {
+             "value": os.environ.get("TAVILY_API_KEY"),
+             "required": False,
+             "description": "Tavily API Key for web search functionality"
+         }
+     }
+     return required_keys
+
+ def get_missing_api_keys():
+     """Get list of missing required API keys"""
+     required_keys = get_required_api_keys()
+     missing_keys = {}
+
+     for key, config in required_keys.items():
+         if config["required"] and not config["value"]:
+             missing_keys[key] = config
+
+     return missing_keys
+
+ def validate_api_key_format(key_name, key_value):
+     """Basic validation for API key formats"""
+     if not key_value or not key_value.strip():
+         return False, "API key cannot be empty"
+
+     key_value = key_value.strip()
+
+     # Basic format validation
+     if key_name == "MODAL_TOKEN_ID" and not key_value.startswith("ak-"):
+         return False, "Modal Token ID should start with 'ak-'"
+     elif key_name == "MODAL_TOKEN_SECRET" and not key_value.startswith("as-"):
+         return False, "Modal Token Secret should start with 'as-'"
+     elif key_name == "HF_TOKEN" and not key_value.startswith("hf_"):
+         return False, "Hugging Face token should start with 'hf_'"
+     elif key_name == "PROVIDER_API_KEY":
+         # Check for common API key prefixes
+         valid_prefixes = ["sk-", "gsk_", "csk-"]
+         if not any(key_value.startswith(prefix) for prefix in valid_prefixes):
+             return False, "API key format may be invalid (expected prefixes: sk-, gsk_, csk-)"
+     elif key_name == "PROVIDER_API_ENDPOINT" and not (key_value.startswith("http://") or key_value.startswith("https://")):
+         return False, "API endpoint should start with http:// or https://"
+     elif key_name == "TAVILY_API_KEY" and not key_value.startswith("tvly-"):
+         return False, "Tavily API key should start with 'tvly-'"
+
+     return True, "Valid format"
+
+ def apply_user_api_keys(api_keys_dict):
+     """Apply user-provided API keys to environment"""
+     for key, value in api_keys_dict.items():
+         if value and value.strip():
+             os.environ[key] = value.strip()
+             logger.info(f"Applied user-provided API key: {key}")
+
+ def get_previous_notebooks():
+     """Get list of previous notebook sessions (dev only)"""
+     if not is_dev_environment():
+         return []
+
+     notebooks = []
+     tmp_dir = Path(TMP_DIR)
+
+     if not tmp_dir.exists():
+         return notebooks
+
+     for session_dir in tmp_dir.iterdir():
+         if session_dir.is_dir() and session_dir.name != ".":
+             notebook_file = session_dir / "jupyter-agent.ipynb"
+             if notebook_file.exists():
+                 try:
+                     # Get creation time and basic info
+                     stat = notebook_file.stat()
+                     size = stat.st_size
+                     modified = stat.st_mtime
+
+                     # Try to read basic notebook info
+                     with open(notebook_file, 'r') as f:
+                         notebook_data = json.load(f)
+                         cell_count = len(notebook_data.get('cells', []))
+
+                     # Format timestamp
+                     formatted_time = datetime.fromtimestamp(modified).strftime("%Y-%m-%d %H:%M")
+
+                     # Try to load session state for additional info
+                     config_info = ""
+                     try:
+                         session_manager = SessionStateManager(session_dir.name, TMP_DIR)
+                         session_state = session_manager.load_state()
+                         if session_state:
+                             hardware = session_state.get("hardware_config", {})
+                             gpu = hardware.get("gpu_type", "unknown")
+                             config_info = f", {gpu}"
+                     except Exception:
+                         pass
+
+                     notebooks.append({
+                         'session_id': session_dir.name,
+                         'path': str(notebook_file),
+                         'modified': modified,
+                         'size': size,
+                         'cell_count': cell_count,
+                         'display_name': f"{session_dir.name} ({cell_count} cells{config_info}, {formatted_time})"
+                     })
+                 except Exception as e:
+                     logger.warning(f"Failed to read notebook info for {session_dir.name}: {e}")
+
+     # Sort by modification time (newest first)
+     notebooks.sort(key=lambda x: x['modified'], reverse=True)
+     return notebooks
+
+ def parse_environment_variables(env_vars_text):
+     """
+     Parse environment variables from text input
+
+     Args:
+         env_vars_text: String containing environment variables in KEY=value format, one per line
+
+     Returns:
+         dict: Dictionary of parsed environment variables
+     """
+     env_dict = {}
+     if not env_vars_text or not env_vars_text.strip():
+         return env_dict
+
+     for line in env_vars_text.strip().split('\n'):
+         line = line.strip()
+         if not line or line.startswith('#'):  # Skip empty lines and comments
+             continue
+
+         if '=' in line:
+             key, value = line.split('=', 1)  # Split only on first =
+             key = key.strip()
+             value = value.strip()
+             if key:  # Only add if key is not empty
+                 env_dict[key] = value
+         else:
+             logger.warning(f"Skipping invalid environment variable format: {line}")
+
+     return env_dict
+
+ def create_notification_html(message, notification_type="info", show_spinner=False):
+     """
+     Create HTML for notification messages
+
+     Args:
+         message: The notification message
+         notification_type: Type of notification ('info', 'success', 'warning', 'error')
+         show_spinner: Whether to show a loading spinner
+     """
+     colors = {
+         'info': '#3498db',
+         'success': '#27ae60',
+         'warning': '#f39c12',
+         'error': '#e74c3c',
+         'loading': '#6c5ce7'
+     }
+
+     icons = {
+         'info': '🔄',
+         'success': '✅',
+         'warning': '⚠️',
+         'error': '❌',
+         'loading': '⏳'
+     }
+
+     color = colors.get(notification_type, colors['info'])
+     icon = icons.get(notification_type, icons['info'])
+
+     spinner_html = ""
+     if show_spinner or notification_type == 'loading':
+         spinner_html = """
+         <div style="
+             display: inline-block;
+             width: 20px;
+             height: 20px;
+             border: 2px solid #f3f3f3;
+             border-top: 2px solid {color};
+             border-radius: 50%;
+             animation: spin 1s linear infinite;
+             margin-right: 8px;
+         "></div>
+         <style>
+         @keyframes spin {{
+             0% {{ transform: rotate(0deg); }}
+             100% {{ transform: rotate(360deg); }}
+         }}
+         </style>
+         """.format(color=color)
+
+     return f"""
+     <div style="
+         background-color: {color}20;
+         border-left: 4px solid {color};
+         padding: 12px 16px;
+         margin: 10px 0;
+         border-radius: 4px;
+         font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif;
+         font-size: 14px;
+         color: #2c3e50;
+         display: flex;
+         align-items: center;
+     ">
+         {spinner_html}
+         <strong>{icon} {message}</strong>
+     </div>
+     """
+
+ def create_progress_notification(message, progress_percent=None):
+     """Create a progress notification with optional progress bar"""
+     progress_html = ""
+     if progress_percent is not None:
+         progress_html = f"""
+         <div style="
+             width: 100%;
+             background-color: #e0e0e0;
+             border-radius: 5px;
+             margin-top: 8px;
+             height: 8px;
+         ">
+             <div style="
+                 width: {progress_percent}%;
+                 background-color: #3498db;
+                 height: 8px;
+                 border-radius: 5px;
+                 transition: width 0.3s ease;
+             "></div>
+         </div>
+         <small style="color: #666; margin-top: 4px; display: block;">{progress_percent}% complete</small>
+         """
+
+     return create_notification_html(message, "loading", show_spinner=True) + progress_html
+
+
+ def initialize_phoenix_tracing():
+     """Initialize Phoenix tracing with proper error handling and session support"""
+     try:
+         from phoenix.otel import register
+
+         phoenix_api_key = os.getenv("PHOENIX_API_KEY")
+         collector_endpoint = os.getenv("PHOENIX_COLLECTOR_ENDPOINT")
+
+         if not phoenix_api_key:
+             logger.info("Phoenix API key not found, skipping Phoenix tracing initialization")
311
+ return None
312
+
313
+ if not collector_endpoint:
314
+ logger.info("Phoenix collector endpoint not found, skipping Phoenix tracing initialization")
315
+ return None
316
+
317
+ logger.info("Initializing Phoenix tracing with session support...")
318
+
319
+ # Set required environment variables
320
+ os.environ["PHOENIX_API_KEY"] = phoenix_api_key
321
+ os.environ["PHOENIX_COLLECTOR_ENDPOINT"] = collector_endpoint
322
+ os.environ["OTEL_EXPORTER_OTLP_HEADERS"] = f"api_key={phoenix_api_key}"
323
+ os.environ["PHOENIX_CLIENT_HEADERS"] = f"api_key={phoenix_api_key}"
324
+
325
+ # Configure the Phoenix tracer with OpenAI instrumentation enabled
326
+ tracer_provider = register(
327
+ project_name="eureka-agent",
328
+ auto_instrument=True, # Keep auto-instrument enabled for OpenAI tracing
329
+ set_global_tracer_provider=True
330
+ )
331
+
332
+ # Additional instrumentation setup for session tracking
333
+ try:
334
+ from openinference.instrumentation.openai import OpenAIInstrumentor
335
+
336
+ # Ensure OpenAI instrumentation is properly configured
337
+ if not OpenAIInstrumentor().is_instrumented_by_opentelemetry:
338
+ OpenAIInstrumentor().instrument()
339
+ logger.info("OpenAI instrumentation configured for Phoenix session tracking")
340
+ else:
341
+ logger.info("OpenAI instrumentation already active")
342
+
343
+ except ImportError:
344
+ logger.warning("OpenAI instrumentation not available - session grouping may not work optimally")
345
+ except Exception as e:
346
+ logger.warning(f"Failed to configure OpenAI instrumentation: {str(e)}")
347
+
348
+ logger.info("Phoenix tracing initialized successfully with session support")
349
+ return tracer_provider
350
+
351
+ except ImportError:
352
+ logger.info("Phoenix not installed, skipping tracing initialization")
353
+ return None
354
+ except Exception as e:
355
+ logger.warning(f"Failed to initialize Phoenix tracer (non-critical): {str(e)}")
356
+ return None
357
+
358
+
359
+
360
+ # Configure logging
361
+ logging.basicConfig(
362
+ level=logging.INFO,
363
+ format='%(asctime)s - %(name)s - %(levelname)s - %(message)s',
364
+ handlers=[
365
+ logging.FileHandler('jupyter_agent.log'),
366
+ logging.StreamHandler()
367
+ ]
368
+ )
369
+ logger = logging.getLogger(__name__)
370
+
371
+ # Initialize Phoenix tracing
372
+ tracer_provider = initialize_phoenix_tracing()
373
+
374
+
375
+ MODAL_TOKEN_ID = os.environ.get("MODAL_TOKEN_ID")
376
+ MODAL_TOKEN_SECRET = os.environ.get("MODAL_TOKEN_SECRET")
377
+ HF_TOKEN = os.environ.get("HF_TOKEN")
378
+ SANDBOXES = {}
379
+ SANDBOX_TIMEOUT = 300
380
+ STOP_EVENTS = {} # Store stop events for each session
381
+ EXECUTION_STATES = {} # Store execution states for each session
382
+
383
+ # GPU configuration options for the UI
384
+ GPU_OPTIONS = [
385
+ ("CPU Only", "cpu"),
386
+ ("NVIDIA T4 (16GB)", "T4"),
387
+ ("NVIDIA L4 (24GB)", "L4"),
388
+ ("NVIDIA A100 (40GB)", "A100-40GB"),
389
+ ("NVIDIA A100 (80GB)", "A100-80GB"),
390
+ ("NVIDIA H100 (80GB)", "H100")
391
+ ]
392
+
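The display-label half of `GPU_OPTIONS` is also rebuilt inline in several handlers below. A hedged sketch of a shared helper (`format_gpu_label` is an assumption, not app code) that captures that repeated logic:

```python
# Hypothetical helper consolidating the GPU-label formatting that the
# handlers below repeat inline.
def format_gpu_label(gpu_type):
    if gpu_type == "cpu":
        return "CPU Only"
    if gpu_type in ("T4", "L4", "A100-40GB", "A100-80GB", "H100"):
        return f"NVIDIA {gpu_type}"
    return gpu_type.upper()  # fallback for unrecognized configs

print(format_gpu_label("T4"))   # → NVIDIA T4
print(format_gpu_label("cpu"))  # → CPU Only
```

If adopted, the three inline copies of this branch in `execute_jupyter_agent` could call the helper instead.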
393
+
394
+ def initialize_openai_client():
395
+ """Initialize OpenAI client with proper error handling and fallbacks"""
396
+ client = None
397
+ model_name = None
398
+
399
+ # Check if we have any API keys configured
400
+ has_azure = os.environ.get("AZURE_OPENAI_ENDPOINT") and os.environ.get("AZURE_OPENAI_API_KEY")
401
+ has_provider = os.environ.get("PROVIDER_API_ENDPOINT") and os.environ.get("PROVIDER_API_KEY")
402
+ has_openai = os.environ.get("OPENAI_API_KEY")
403
+
404
+ if not (has_azure or has_provider or has_openai):
405
+ logger.warning("No API keys found in environment - client will be initialized later when user provides keys")
406
+ return None, None
407
+
408
+ try:
409
+ # Option 1: Azure OpenAI
410
+ if has_azure:
411
+ logger.info("Initializing Azure OpenAI client")
412
+ client = AzureOpenAI(
413
+ api_version="2024-12-01-preview",
414
+ azure_endpoint=os.environ.get("AZURE_OPENAI_ENDPOINT"),
415
+ api_key=os.environ.get("AZURE_OPENAI_API_KEY")
416
+ )
417
+ model_name = os.environ.get("MODEL_NAME", "gpt-4") # Default fallback
418
+ logger.info(f"Azure OpenAI client initialized with model: {model_name}")
419
+
420
+ # Option 2: Custom Provider (Cerebras, etc.)
421
+ elif has_provider:
422
+ logger.info("Initializing custom provider OpenAI client")
423
+ client = OpenAI(
424
+ base_url=os.environ.get("PROVIDER_API_ENDPOINT"),
425
+ api_key=os.environ.get("PROVIDER_API_KEY")
426
+ )
427
+ model_name = os.environ.get("MODEL_NAME", "gpt-4") # Default fallback
428
+ logger.info(f"Custom provider client initialized with model: {model_name}")
429
+
430
+ # Option 3: Standard OpenAI
431
+ elif has_openai:
432
+ logger.info("Initializing standard OpenAI client")
433
+ client = OpenAI(
434
+ api_key=os.environ.get("OPENAI_API_KEY")
435
+ )
436
+ model_name = os.environ.get("MODEL_NAME", "gpt-4") # Default fallback
437
+ logger.info(f"OpenAI client initialized with model: {model_name}")
438
+
439
+ # Test the client with a simple request (optional - skip if client initialization should be fast)
440
+ if client:
441
+ logger.info("Testing client connection...")
442
+ try:
443
+ # Simple test to verify the client works
444
+ _ = client.chat.completions.create(
445
+ model=model_name,
446
+ messages=[{"role": "user", "content": "Hello"}],
447
+ max_tokens=5
448
+ )
449
+ logger.info("Client connection test successful")
450
+ except Exception as test_error:
451
+ logger.error(f"Client connection test failed: {str(test_error)}")
452
+ # Don't raise here, let the main application handle it
453
+
454
+ return client, model_name
455
+
456
+ except Exception as e:
457
+ logger.error(f"Failed to initialize OpenAI client: {str(e)}")
458
+ logger.warning("Client will be initialized later when user provides valid API keys")
459
+ return None, None
460
+
461
+ client, model_name = initialize_openai_client()
462
+
463
+ # If no client was initialized, it means no API keys are available
464
+ if client is None:
465
+ logger.info("No OpenAI client initialized - waiting for user to provide API keys through UI")
466
+
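`initialize_openai_client` resolves credentials in a fixed priority order: Azure OpenAI first, then a custom OpenAI-compatible provider, then standard OpenAI. A standalone sketch of just that resolution step (`pick_provider` is illustrative, not part of the app):

```python
# Sketch of the provider-priority resolution used above:
# Azure > custom OpenAI-compatible provider > standard OpenAI.
def pick_provider(env):
    if env.get("AZURE_OPENAI_ENDPOINT") and env.get("AZURE_OPENAI_API_KEY"):
        return "azure"
    if env.get("PROVIDER_API_ENDPOINT") and env.get("PROVIDER_API_KEY"):
        return "custom"
    if env.get("OPENAI_API_KEY"):
        return "openai"
    return None  # no credentials: defer until the user supplies keys in the UI

print(pick_provider({"OPENAI_API_KEY": "sk-test"}))  # → openai
```

Note the Azure and custom branches require both an endpoint and a key; a key alone is not enough to select them.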
467
+
468
+
469
+ init_notebook = JupyterNotebook()
470
+
471
+ if not os.path.exists(TMP_DIR):
472
+ os.makedirs(TMP_DIR)
473
+ logger.info(f"Created temporary directory: {TMP_DIR}")
474
+ else:
475
+ logger.info(f"Using existing temporary directory: {TMP_DIR}")
476
+
477
+ with open(TMP_DIR+"jupyter-agent.ipynb", 'w', encoding='utf-8') as f:
478
+ json.dump(JupyterNotebook().data, f, indent=2)
479
+ logger.info(f"Initialized default notebook file: {TMP_DIR}jupyter-agent.ipynb")
480
+
481
+ try:
482
+ with open("system_prompt.txt", "r") as f:
483
+ DEFAULT_SYSTEM_PROMPT = f.read()
484
+ logger.info("Loaded system prompt from system_prompt.txt")
485
+ except FileNotFoundError:
486
+ logger.warning("system_prompt.txt not found, using fallback system prompt")
+ # Fallback so DEFAULT_SYSTEM_PROMPT is always defined (this except path previously left it unset)
+ DEFAULT_SYSTEM_PROMPT = "You are a helpful data-science agent. Write and execute code in a Jupyter notebook to solve the user's task."
487
+
488
+
489
+ def execute_jupyter_agent(
490
+ user_input, files, message_history, gpu_type, cpu_cores, memory_gb, timeout_sec, env_vars_text,
491
+ modal_token_id, modal_token_secret, hf_token, provider_api_key, provider_api_endpoint, user_model_name,
492
+ tavily_api_key, enable_web_search, request: gr.Request
493
+ ):
494
+ session_id = request.session_hash
495
+ logger.info(f"Starting execution for session {session_id}")
496
+ logger.info(f"Hardware config: GPU={gpu_type}, CPU={cpu_cores}, Memory={memory_gb}GB, Timeout={timeout_sec}s")
497
+ logger.info(f"User input length: {len(user_input)} characters")
498
+
499
+ # Check if execution is already running for this session
500
+ if session_id in EXECUTION_STATES and EXECUTION_STATES[session_id].get("running", False):
501
+ error_message = "❌ Execution already in progress for this session. Please wait for it to complete or stop it first."
502
+ error_notification = create_notification_html(error_message, "warning")
503
+
504
+ # Return current state without starting new execution
505
+ session_dir = os.path.join(TMP_DIR, session_id)
506
+ save_dir = os.path.join(session_dir, 'jupyter-agent.ipynb')
507
+ if os.path.exists(save_dir):
508
+ yield error_notification, message_history, save_dir
509
+ else:
510
+ yield error_notification, message_history, TMP_DIR + "jupyter-agent.ipynb"
511
+ return
512
+
513
+ # Initialize session state manager
514
+ session_manager = SessionStateManager(session_id, TMP_DIR)
515
+
516
+ # Check if this is a continuing session
517
+ existing_session_state = session_manager.load_state()
518
+ is_continuing_session = existing_session_state is not None
519
+
520
+ if is_continuing_session:
521
+ logger.info(f"Found existing session state for {session_id} - continuing from previous state")
522
+ else:
523
+ logger.info(f"No existing session state found for {session_id} - starting new session")
524
+
525
+ # Apply user-provided API keys if any are provided
526
+ user_api_keys = {}
527
+ if modal_token_id:
528
+ user_api_keys["MODAL_TOKEN_ID"] = modal_token_id
529
+ if modal_token_secret:
530
+ user_api_keys["MODAL_TOKEN_SECRET"] = modal_token_secret
531
+ if hf_token:
532
+ user_api_keys["HF_TOKEN"] = hf_token
533
+ if provider_api_key:
534
+ user_api_keys["PROVIDER_API_KEY"] = provider_api_key
535
+ if provider_api_endpoint:
536
+ user_api_keys["PROVIDER_API_ENDPOINT"] = provider_api_endpoint
537
+ if user_model_name:
538
+ user_api_keys["MODEL_NAME"] = user_model_name
539
+ if tavily_api_key:
540
+ user_api_keys["TAVILY_API_KEY"] = tavily_api_key
541
+
542
+ # Check if we have a client or need to initialize one with user keys
543
+ global client, model_name
544
+ if client is None and not user_api_keys:
545
+ missing_keys = get_missing_api_keys()
546
+ if missing_keys:
547
+ error_message = f"""❌ Missing Required API Keys
548
+
549
+ Please provide the following API keys to continue:
550
+ {chr(10).join([f"• {key}: {config['description']}" for key, config in missing_keys.items()])}
551
+
552
+ You can either:
553
+ 1. Add them to your .env file, or
554
+ 2. Enter them in the API Keys section above"""
555
+ error_notification = create_notification_html(error_message, "error")
556
+ yield error_notification, message_history, TMP_DIR + "jupyter-agent.ipynb"
557
+ return
558
+
559
+ # Validate user-provided API keys
560
+ if user_api_keys:
561
+ validation_message = "🔍 Validating API keys..."
562
+ validation_notification = create_progress_notification(validation_message)
563
+ yield validation_notification, message_history, TMP_DIR + "jupyter-agent.ipynb"
564
+
565
+ validation_errors = []
566
+ for key, value in user_api_keys.items():
567
+ is_valid, message = validate_api_key_format(key, value)
568
+ if not is_valid:
569
+ validation_errors.append(f"{key}: {message}")
570
+
571
+ if validation_errors:
572
+ error_message = "❌ API Key Validation Failed:\n" + "\n".join(f"• {error}" for error in validation_errors)
573
+ error_notification = create_notification_html(error_message, "error")
574
+ yield error_notification, message_history, TMP_DIR + "jupyter-agent.ipynb"
575
+ return
576
+
577
+ logger.info(f"Applying user-provided API keys: {list(user_api_keys.keys())}")
578
+ apply_user_api_keys(user_api_keys)
579
+
580
+ # Reinitialize OpenAI client with new keys if provider keys were updated
581
+ if any(key in user_api_keys for key in ["PROVIDER_API_KEY", "PROVIDER_API_ENDPOINT", "MODEL_NAME"]):
582
+ try:
583
+ reinit_message = "🔄 Reinitializing AI client with new credentials..."
584
+ reinit_notification = create_progress_notification(reinit_message)
585
+ yield reinit_notification, message_history, TMP_DIR + "jupyter-agent.ipynb"
586
+
587
+ client, model_name = initialize_openai_client()
588
+ if client is None:
589
+ error_message = "Failed to initialize client with provided API keys. Please check your credentials."
590
+ logger.error(error_message)
591
+ error_notification = create_notification_html(error_message, "error")
592
+ yield error_notification, message_history, TMP_DIR + "jupyter-agent.ipynb"
593
+ return
594
+ logger.info("Reinitialized OpenAI client with user-provided keys")
595
+
596
+ success_message = "✅ API credentials validated and applied successfully!"
597
+ success_notification = create_notification_html(success_message, "success")
598
+ yield success_notification, message_history, TMP_DIR + "jupyter-agent.ipynb"
599
+ except Exception as e:
600
+ error_message = f"Failed to initialize client with provided API keys: {str(e)}"
601
+ logger.error(error_message)
602
+ error_notification = create_notification_html(error_message, "error")
603
+ yield error_notification, message_history, TMP_DIR + "jupyter-agent.ipynb"
604
+ return
605
+
606
+ # Initialize or reset stop event for this session
607
+ STOP_EVENTS[session_id] = threading.Event()
608
+ EXECUTION_STATES[session_id] = {"running": True, "paused": False, "current_phase": "initializing"}
609
+
610
+ # Set up save directory early for notifications
611
+ session_dir = os.path.join(TMP_DIR, request.session_hash)
612
+ os.makedirs(session_dir, exist_ok=True)
613
+ save_dir = os.path.join(session_dir, 'jupyter-agent.ipynb')
614
+
615
+ # Create initial notebook file so it exists for Gradio
616
+ with open(save_dir, 'w', encoding='utf-8') as f:
617
+ json.dump(init_notebook.data, f, indent=2)
618
+ logger.info(f"Initialized notebook for session {session_id}")
619
+
620
+ # Session configuration is now handled by SessionStateManager
621
+
622
+ if request.session_hash not in SANDBOXES:
623
+ logger.info(f"Creating new Modal sandbox for session {session_id}")
624
+
625
+ # Show initialization notification with spinner
626
+ gpu_info = gpu_type.upper() if gpu_type != "cpu" else "CPU Only"
627
+ if gpu_type in ["T4", "L4", "A100-40GB", "A100-80GB", "H100"]:
628
+ gpu_info = f"NVIDIA {gpu_type}"
629
+
630
+ init_message = f"Initializing {gpu_info} sandbox with {cpu_cores} CPU cores and {memory_gb}GB RAM..."
631
+ notification_html = create_progress_notification(init_message)
632
+ yield notification_html, message_history, save_dir
633
+
634
+ # Create Modal sandbox with user-specified configuration
635
+ environment_vars = {}
636
+ if MODAL_TOKEN_ID and MODAL_TOKEN_SECRET:
637
+ environment_vars.update({
638
+ "MODAL_TOKEN_ID": MODAL_TOKEN_ID,
639
+ "MODAL_TOKEN_SECRET": MODAL_TOKEN_SECRET
640
+ })
641
+ logger.debug(f"Modal credentials configured for session {session_id}")
642
+
643
+ # Parse and add user-provided environment variables
644
+ user_env_vars = parse_environment_variables(env_vars_text)
645
+ if user_env_vars:
646
+ environment_vars.update(user_env_vars)
647
+ logger.info(f"Added {len(user_env_vars)} custom environment variables for session {session_id}")
648
+ logger.debug(f"Custom environment variables: {list(user_env_vars.keys())}")
649
+
650
+ try:
651
+ SANDBOXES[request.session_hash] = create_modal_sandbox(
652
+ gpu_config=gpu_type,
653
+ cpu_cores=cpu_cores,
654
+ memory_gb=memory_gb,
655
+ timeout=int(timeout_sec),
656
+ environment_vars=environment_vars
657
+ )
658
+ logger.info(f"Successfully created Modal sandbox for session {session_id}")
659
+
660
+ # Show success notification
661
+ success_message = f"✨ {gpu_info} sandbox ready! Environment initialized with all packages."
662
+ success_notification_html = create_notification_html(success_message, "success")
663
+ yield success_notification_html, message_history, save_dir
664
+
665
+ except Exception as e:
666
+ logger.error(f"Failed to create Modal sandbox for session {session_id}: {str(e)}")
667
+ # Show error notification
668
+ error_message = f"Failed to initialize sandbox: {str(e)}"
669
+ error_notification_html = create_notification_html(error_message, "error")
670
+ yield error_notification_html, message_history, save_dir
671
+ raise
672
+ else:
673
+ logger.info(f"Reusing existing Modal sandbox for session {session_id}")
674
+ # Show reuse notification
675
+ gpu_info = gpu_type.upper() if gpu_type != "cpu" else "CPU Only"
676
+ if gpu_type in ["T4", "L4", "A100-40GB", "A100-80GB", "H100"]:
677
+ gpu_info = f"NVIDIA {gpu_type}"
678
+ reuse_message = f"Using existing {gpu_info} sandbox - ready to execute!"
679
+ reuse_notification_html = create_notification_html(reuse_message, "success")
680
+ yield reuse_notification_html, message_history, save_dir
681
+
682
+ sbx = SANDBOXES[request.session_hash]
683
+ logger.debug(f"Notebook will be saved to: {save_dir}")
684
+
685
+ # Initial notebook render
686
+ yield init_notebook.render(), message_history, save_dir
687
+
688
+
689
+
690
+ filenames = []
691
+ if files is not None:
692
+ logger.info(f"Processing {len(files)} uploaded files for session {session_id}")
693
+ for filepath in files:
694
+ file_path = Path(filepath)
695
+ try:
696
+ # Get file size for verification
697
+ file_size = os.path.getsize(filepath)
698
+
699
+ with open(filepath, "rb") as file:
700
+ logger.info(f"Uploading file {filepath} ({file_size} bytes) to session {session_id}")
701
+ sbx.files.write(file_path.name, file)
702
+
703
+ # Verify upload succeeded
704
+ if sbx.files.verify_file_upload(file_path.name, file_size):
705
+ filenames.append(file_path.name)
706
+ logger.debug(f"Successfully uploaded and verified {file_path.name}")
707
+ else:
708
+ logger.error(f"File upload verification failed for {file_path.name}")
709
+ raise RuntimeError(f"File upload verification failed for {file_path.name}")
710
+
711
+ except Exception as e:
712
+ logger.error(f"Failed to upload file {filepath} for session {session_id}: {str(e)}")
713
+ raise
714
+ else:
715
+ logger.info(f"No files to upload for session {session_id}")
716
+
717
+ # Initialize or continue session state
718
+ if is_continuing_session:
719
+ # Load existing session state
720
+ session_state = existing_session_state
721
+
722
+ # Validate and repair conversation history to prevent API errors
723
+ session_manager.validate_and_repair_conversation(session_state)
724
+
725
+ message_history = session_manager.get_conversation_history(session_state)
726
+ logger.info(f"Continuing session {session_id} with {len(message_history)} existing messages")
727
+
728
+ # Add new user input if provided
729
+ if user_input and user_input.strip():
730
+ # Check if this input was already added by comparing with the last message
731
+ last_message = message_history[-1] if message_history else None
732
+ should_add_input = True
733
+
734
+ if last_message and last_message.get("role") == "user":
735
+ # If the last message is from user and has the same content, don't add duplicate
736
+ if last_message.get("content") == user_input:
737
+ should_add_input = False
738
+ logger.debug(f"User input already present in session {session_id}")
739
+
740
+ if should_add_input:
741
+ session_manager.add_message(session_state, "user", user_input)
742
+ message_history = session_manager.get_conversation_history(session_state)
743
+ logger.info(f"Added new user input to existing session {session_id}")
744
+
745
+ # Show notification that we're continuing the conversation
746
+ continue_message = "🔄 Continuing conversation with new input..."
747
+ continue_notification = create_progress_notification(continue_message)
748
+ yield continue_notification, message_history, save_dir
749
+ else:
750
+ # Create new session state
751
+ logger.info(f"Initializing new session {session_id}")
752
+
753
+ # Format files section
754
+ if files is None:
755
+ files_section = "- None"
756
+ else:
757
+ files_section = "- " + "\n- ".join(filenames)
758
+ logger.info(f"System prompt includes {len(filenames)} files: {filenames}")
759
+
760
+ # Format GPU information
761
+ gpu_info = gpu_type.upper() if gpu_type != "cpu" else "CPU Only"
762
+ if gpu_type in ["T4", "L4", "A100-40GB", "A100-80GB", "H100"]:
763
+ gpu_info = f"NVIDIA {gpu_type}"
764
+
765
+ # Format available packages based on hardware configuration
766
+ packages_list = sbx.available_packages
767
+ packages_section = "\n".join([f"- {package}" for package in packages_list])
768
+
769
+ # Format the complete system prompt with named placeholders
770
+ system_prompt = DEFAULT_SYSTEM_PROMPT.replace("{AVAILABLE_FILES}", files_section)
771
+ system_prompt = system_prompt.replace("{GPU_TYPE}", gpu_info)
772
+ system_prompt = system_prompt.replace("{CPU_CORES}", str(cpu_cores))
773
+ system_prompt = system_prompt.replace("{MEMORY_GB}", str(memory_gb))
774
+ system_prompt = system_prompt.replace("{TIMEOUT_SECONDS}", str(timeout_sec))
775
+ system_prompt = system_prompt.replace("{AVAILABLE_PACKAGES}", packages_section)
776
+
777
+ # Create session state with configuration
778
+ hardware_config = {
779
+ "gpu_type": gpu_type,
780
+ "cpu_cores": cpu_cores,
781
+ "memory_gb": memory_gb,
782
+ "timeout_sec": timeout_sec
783
+ }
784
+
785
+ api_config = {
786
+ "model_name": model_name or user_model_name or "unknown",
787
+ "provider_endpoint": os.environ.get("PROVIDER_API_ENDPOINT") or provider_api_endpoint,
788
+ "provider_type": "openai_compatible"
789
+ }
790
+
791
+ environment_config = {
792
+ "variables": env_vars_text or "",
793
+ "files_uploaded": filenames if filenames else []
794
+ }
795
+
796
+ # Create initial session state
797
+ session_state = session_manager.create_initial_state(
798
+ hardware_config, api_config, environment_config, system_prompt
799
+ )
800
+
801
+ # Add user input if provided
802
+ if user_input and user_input.strip():
803
+ session_manager.add_message(session_state, "user", user_input)
804
+
805
+ # Get conversation history
806
+ message_history = session_manager.get_conversation_history(session_state)
807
+
808
+ # Save initial state
809
+ session_manager.save_state(session_state)
810
+
811
+ logger.info(f"Created new session {session_id} with {len(message_history)} messages")
812
+
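The chain of `str.replace` calls that builds the system prompt can be expressed as a small loop. A hedged sketch (`fill_placeholders` is illustrative, not app code); plain `replace` is used rather than `str.format`, which would choke on any other literal braces in a large prompt template:

```python
# Illustrative condensation of the placeholder substitution above.
def fill_placeholders(template, values):
    for placeholder, value in values.items():
        template = template.replace(placeholder, value)
    return template

prompt = fill_placeholders(
    "Hardware: {GPU_TYPE}, {CPU_CORES} cores, {MEMORY_GB}GB RAM",
    {"{GPU_TYPE}": "NVIDIA T4", "{CPU_CORES}": "4", "{MEMORY_GB}": "16"},
)
print(prompt)  # → Hardware: NVIDIA T4, 4 cores, 16GB RAM
```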
813
+ logger.debug(f"Session {session_id} ready with {len(message_history)} messages")
814
+
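The duplicate-input guard in the continuing-session branch above (skip appending when the new input matches the last user turn) can be sketched standalone; `should_add` is a hypothetical name:

```python
# Sketch of the duplicate-input guard: append the new user message only
# when the last message is not an identical user turn.
def should_add(history, text):
    last = history[-1] if history else None
    return not (last and last.get("role") == "user" and last.get("content") == text)

history = [{"role": "user", "content": "hi"}]
print(should_add(history, "hi"))            # → False
print(should_add(history, "new question"))  # → True
```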
815
+ # Determine which tools to use based on web search toggle
816
+ from jupyter_agent import TOOLS
817
+ if enable_web_search:
818
+ # Check if Tavily API key is available
819
+ tavily_key = os.environ.get("TAVILY_API_KEY") or tavily_api_key
820
+ if tavily_key:
821
+ selected_tools = TOOLS # Use all tools (code + search)
822
+ logger.info(f"Web search enabled for session {session_id} - using all tools")
823
+ else:
824
+ selected_tools = TOOLS[:1] # Use only code execution tool
825
+ logger.warning(f"Web search enabled but no Tavily API key found for session {session_id} - using code tool only")
826
+ else:
827
+ selected_tools = TOOLS[:1] # Use only code execution tool
828
+ logger.info(f"Web search disabled for session {session_id} - using code tool only")
829
+
830
+ logger.info(f"Starting interactive notebook execution for session {session_id}")
831
+
832
+ # Import Phoenix session context if available
833
+ try:
834
+ from jupyter_agent import create_phoenix_session_context
835
+ phoenix_available = True
836
+ except ImportError:
837
+ phoenix_available = False
838
+
839
+ # Prepare session metadata for Phoenix tracing at the session level
840
+ if phoenix_available:
841
+ session_level_metadata = {
842
+ "agent_type": "eureka-agent",
843
+ "session_type": "jupyter_execution",
844
+ "gpu_type": gpu_type,
845
+ "cpu_cores": cpu_cores,
846
+ "memory_gb": memory_gb,
847
+ "timeout_sec": timeout_sec,
848
+ "web_search_enabled": enable_web_search,
849
+ "tools_available": len(selected_tools)
850
+ }
851
+
852
+ # Add API provider info if available
853
+ if model_name:
854
+ session_level_metadata["model"] = model_name
855
+
856
+ session_context = create_phoenix_session_context(
857
+ session_id=session_id,
858
+ user_id=None, # Could add user identification if available
859
+ metadata=session_level_metadata
860
+ )
861
+ else:
862
+ from contextlib import nullcontext
863
+ session_context = nullcontext()
864
+
865
+ # Wrap the entire execution in a Phoenix session context
866
+ with session_context:
867
+ logger.debug(f"Starting session-level Phoenix tracing for {session_id}")
868
+ # Defaults so the final save/yield below cannot hit unbound locals if the generator yields nothing
+ notebook_html, notebook_data = init_notebook.render(), init_notebook.data
+ try:
869
+ for notebook_html, notebook_data, messages in run_interactive_notebook_with_session_state(
870
+ client, model_name, session_manager, session_state, sbx, STOP_EVENTS[session_id], selected_tools
871
+ ):
872
+ message_history = messages
873
+ logger.debug(f"Interactive notebook yield for session {session_id}")
874
+ # Update session state and yield with legacy notebook file for UI compatibility
875
+ session_manager.update_notebook_data(session_state, notebook_data)
876
+ session_manager.save_state(session_state)
877
+
878
+ # Create legacy notebook file for UI download compatibility
879
+ with open(save_dir, 'w', encoding='utf-8') as f:
880
+ json.dump(notebook_data, f, indent=2)
881
+
882
+ yield notebook_html, message_history, save_dir
883
+
884
+ except Exception as e:
885
+ logger.error(f"Error during interactive notebook execution for session {session_id}: {str(e)}")
886
+ # Save error state
887
+ session_manager.update_execution_state(session_state, is_running=False, last_execution_successful=False)
888
+ session_manager.save_state(session_state)
889
+ raise
890
+
891
+ # Final save and cleanup
892
+ try:
893
+ session_manager.update_execution_state(session_state, is_running=False)
894
+ session_manager.save_state(session_state)
895
+ logger.info(f"Final session state saved for session {session_id}")
896
+
897
+ # Create final legacy notebook file for UI
898
+ with open(save_dir, 'w', encoding='utf-8') as f:
899
+ json.dump(notebook_data, f, indent=2)
900
+
901
+ except Exception as e:
902
+ logger.error(f"Failed to save final session state for session {session_id}: {str(e)}")
903
+ raise
904
+
905
+ yield notebook_html, message_history, save_dir
906
+ logger.info(f"Completed execution for session {session_id}")
907
+
908
+ # Update legacy execution state for compatibility
909
+ if session_id in EXECUTION_STATES:
910
+ EXECUTION_STATES[session_id]["running"] = False
911
+
912
+ def clear(msg_state, request: gr.Request):
913
+ """Clear notebook but keep session data (less destructive than shutdown)"""
914
+ session_id = request.session_hash
915
+ logger.info(f"Clearing notebook for session {session_id}")
916
+
917
+ # Stop any running execution
918
+ if session_id in STOP_EVENTS:
919
+ STOP_EVENTS[session_id].set()
920
+
921
+ # Clear execution states but keep session data
922
+ if session_id in EXECUTION_STATES:
923
+ EXECUTION_STATES[session_id]["running"] = False
924
+ EXECUTION_STATES[session_id]["paused"] = False
925
+ EXECUTION_STATES[session_id]["current_phase"] = "ready"
926
+
927
+ # Reset message state for UI
928
+ msg_state = []
929
+ logger.info(f"Reset notebook display for session {session_id}")
930
+
931
+ return init_notebook.render(), msg_state
932
+
933
+ def stop_execution(request: gr.Request):
934
+ """Stop the current execution for this session"""
935
+ session_id = request.session_hash
936
+ logger.info(f"Stopping execution for session {session_id}")
937
+
938
+ if session_id in STOP_EVENTS and session_id in EXECUTION_STATES:
939
+ # Check if execution is actually running
940
+ if EXECUTION_STATES[session_id].get("running", False):
941
+ STOP_EVENTS[session_id].set()
942
+ logger.info(f"Stop signal sent for session {session_id}")
943
+
944
+ # Update execution state
945
+ EXECUTION_STATES[session_id]["running"] = False
946
+ EXECUTION_STATES[session_id]["paused"] = True
947
+ EXECUTION_STATES[session_id]["current_phase"] = "stopping"
948
+
949
+ # Also update session state if available
950
+ session_manager = SessionStateManager(session_id, TMP_DIR)
951
+ session_state = session_manager.load_state()
952
+ if session_state:
953
+ session_manager.update_execution_state(
954
+ session_state, is_running=False, is_paused=True, current_phase="stopping"
955
+ )
956
+ session_manager.save_state(session_state)
957
+
958
+ return "⏸️ Execution stopped - click Run to resume with new input"
959
+ else:
960
+ logger.info(f"No active execution to stop for session {session_id}")
961
+ return "⚪ No active execution to stop"
962
+ else:
963
+ logger.warning(f"No execution session found for {session_id}")
964
+ return "❌ No execution session found"
965
+
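`stop_execution` relies on cooperative cancellation: setting a `threading.Event` that the running worker polls, rather than killing the thread. A minimal self-contained sketch of that pattern (names here are illustrative, mirroring how `STOP_EVENTS` is used above):

```python
import threading
import time

# Cooperative-cancellation sketch: the worker polls the event and exits
# on its own once it is set.
stop = threading.Event()
result = {}

def worker():
    steps = 0
    while not stop.is_set() and steps < 1000:
        steps += 1
        time.sleep(0.001)
    result["steps"] = steps  # record how far we got before stopping

t = threading.Thread(target=worker)
t.start()
stop.set()       # equivalent to STOP_EVENTS[session_id].set()
t.join(timeout=2)
```

The worker always reaches its cleanup code, which is why the app can persist session state after a stop instead of losing it to a hard kill.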
966
+ def shutdown_sandbox(request: gr.Request):
967
+ """Shutdown the sandbox while preserving all session data and files"""
968
+ session_id = request.session_hash
969
+ logger.info(f"Shutting down sandbox for {session_id} (preserving all session data and files)")
970
+
971
+ try:
972
+ # 1. Stop any running execution first
973
+ if session_id in STOP_EVENTS:
974
+ STOP_EVENTS[session_id].set()
975
+ logger.info(f"Stopped execution for session {session_id}")
976
+
977
+ # 2. Shutdown Modal sandbox only
978
+ if session_id in SANDBOXES:
979
+ logger.info(f"Killing Modal sandbox for session {session_id}")
980
+ SANDBOXES[session_id].kill()
981
+ SANDBOXES.pop(session_id)
982
+ logger.info(f"Successfully shutdown sandbox for session {session_id}")
983
+
984
+ # 3. Log what's being preserved (but don't remove anything)
985
+ session_manager = SessionStateManager(session_id, TMP_DIR)
986
+ if session_manager.session_exists():
987
+ logger.info(f"Preserving session data for {session_id}")
988
+
989
+ # Load session state to show what's being preserved
990
+ session_state = session_manager.load_state()
991
+ if session_state:
992
+ # Log what we're preserving
993
+ stats = session_state.get("session_stats", {})
994
+ llm_interactions = len(session_state.get("llm_interactions", []))
995
+ tool_executions = len(session_state.get("tool_executions", []))
996
+
997
+ logger.info(f"Preserving session {session_id}: "
998
+ f"{stats.get('total_messages', 0)} messages, "
999
+ f"{llm_interactions} LLM interactions, "
1000
+ f"{tool_executions} tool executions, "
1001
+ f"{stats.get('total_code_executions', 0)} code runs")
1002
+
1003
+ # Log all preserved files
1004
+ if session_manager.session_dir.exists():
1005
+ try:
1006
+ preserved_files = []
1007
+ for file_path in session_manager.session_dir.iterdir():
1008
+ if file_path.is_file():
1009
+ preserved_files.append(file_path.name)
1010
+
1011
+ if preserved_files:
1012
+ logger.info(f"Preserving {len(preserved_files)} files in {session_id}: {preserved_files}")
1013
+ else:
1014
+ logger.info(f"No files found in session {session_id}")
1015
+
1016
+ except OSError as e:
1017
+ logger.warning(f"Could not check session directory {session_id}: {e}")
1018
+
1019
+ # 4. Keep execution tracking data (don't clear anything)
1020
+ logger.info(f"Preserving execution state and stop events for {session_id}")
1021
+
1022
+ logger.info(f"Sandbox shutdown completed for session {session_id} (all data preserved)")
1023
+ return gr.Button(visible=False)
1024
+
1025
+ except Exception as e:
1026
+ logger.error(f"Error during shutdown for session {session_id}: {str(e)}")
1027
+ return gr.Button(visible=True)  # error already logged above; click handler expects a single Button output
1028
+
1029
+ # continue_execution function removed - functionality integrated into execute_jupyter_agent
1030
+
1031
+ def get_execution_status(request: gr.Request):
1032
+ """Get the current execution status for UI updates"""
1033
+ session_id = request.session_hash
1034
+
1035
+ if session_id not in EXECUTION_STATES:
1036
+ return "⚪ Ready"
1037
+
1038
+ state = EXECUTION_STATES[session_id]
1039
+ if state["running"]:
1040
+ if session_id in STOP_EVENTS and STOP_EVENTS[session_id].is_set():
1041
+ return "⏸️ Stopping..."
1042
+ else:
1043
+ # Check if we have more detailed phase information
1044
+ phase = state.get("current_phase", "running")
1045
+ if phase == "generating":
1046
+ return "🟢 Generating response..."
1047
+ elif phase == "executing_code":
1048
+ return "🟢 Executing code..."
1049
+ elif phase == "searching":
1050
+ return "🟢 Searching web..."
1051
+ else:
1052
+ return "🟢 Running"
1053
+ elif state.get("paused", False):
1054
+ return "⏸️ Paused - Click Run to continue"
1055
+ else:
1056
+ return "⚪ Ready"
1057
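The branching above is easiest to verify as a pure function. The sketch below mirrors the same state-to-status mapping without the module-level `EXECUTION_STATES`/`STOP_EVENTS` dictionaries (the phase names are taken from the code above; the function name is illustrative):

```python
# Pure-function sketch of the execution-status logic above.
def status_for(state: dict, stop_requested: bool = False) -> str:
    if state.get("running"):
        if stop_requested:
            return "⏸️ Stopping..."
        # Map the finer-grained phase to a user-facing label
        return {
            "generating": "🟢 Generating response...",
            "executing_code": "🟢 Executing code...",
            "searching": "🟢 Searching web...",
        }.get(state.get("current_phase", "running"), "🟢 Running")
    if state.get("paused"):
        return "⏸️ Paused - Click Run to continue"
    return "⚪ Ready"

print(status_for({"running": True, "current_phase": "executing_code"}))
```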
+
1058
+ def is_sandbox_active(request: gr.Request):
1059
+ """Check if sandbox is active for the current session"""
1060
+ session_id = request.session_hash
1061
+ return session_id in SANDBOXES
1062
+
1063
+ def get_sandbox_status_and_visibility(request: gr.Request):
1064
+ """Get sandbox status message and button visibility"""
1065
+ session_id = request.session_hash
1066
+ if session_id in SANDBOXES:
1067
+ return "🟢 Sandbox active", gr.Button(visible=True)
1068
+ else:
1069
+ return "⚪ No sandbox active", gr.Button(visible=False)
1070
+
1071
+ def update_sandbox_button_visibility(request: gr.Request):
1072
+ """Update only the button visibility based on sandbox status"""
1073
+ session_id = request.session_hash
1074
+ return gr.Button(visible=session_id in SANDBOXES)
1075
+
1076
+ def reset_ui_after_shutdown(request: gr.Request):
1077
+ """Reset UI components after complete shutdown"""
1078
+ session_id = request.session_hash
1079
+
1080
+ # Check if session is truly cleared
1081
+ is_cleared = (session_id not in SANDBOXES and
1082
+ session_id not in EXECUTION_STATES and
1083
+ session_id not in STOP_EVENTS)
1084
+
1085
+ if is_cleared:
1086
+ # Return reset state for all UI components
1087
+ return (
1088
+ init_notebook.render(), # Reset notebook display
1089
+ [], # Clear message state
1090
+ "⚪ Ready", # Reset status
1091
+ "⚪ No sandbox active", # Reset sandbox status
1092
+ gr.Button(visible=False) # Hide shutdown button
1093
+ )
1094
+ else:
1095
+ # Return current state if not fully cleared
1096
+ status = get_execution_status(request)
1097
+ sandbox_status, button_vis = get_sandbox_status_and_visibility(request)
1098
+ return (
1099
+ init_notebook.render(), # Still reset notebook display
1100
+ [], # Still clear message state
1101
+ status,
1102
+ sandbox_status,
1103
+ button_vis
1104
+ )
1105
+
1106
+ def reconstruct_message_history_from_notebook(notebook_data):
1107
+ """Reconstruct message history from notebook cells"""
1108
+ message_history = []
1109
+ cells = notebook_data.get('cells', [])
1110
+
1111
+ system_prompt = None
1112
+ current_conversation = []
1113
+
1114
+ for cell in cells:
1115
+ cell_type = cell.get('cell_type', '')
1116
+
1117
+ if cell_type == 'markdown':
1118
+ content = cell.get('source', '')
1119
+ if isinstance(content, list):
1120
+ content = ''.join(content)
1121
+
1122
+ # Check if this is a system message
1123
+ if 'System' in content and 'IMPORTANT EXECUTION GUIDELINES' in content:
1124
+ # Extract the system prompt content
1125
+ system_content = content
1126
+ # Clean up the HTML and extract the actual content
1127
+ # Remove HTML tags and extract the text content
1128
+ clean_content = re.sub(r'<[^>]+>', '', system_content)
1129
+ clean_content = re.sub(r'\n+', '\n', clean_content).strip()
1130
+ system_prompt = clean_content
1131
+
1132
+ elif 'User' in content and not any(word in content for word in ['Assistant', 'System']):
1133
+ # This is a user message
1134
+ # Extract the user content after the User header
1135
+ user_content = content.split('User')[1] if 'User' in content else content
1136
+ # Clean up HTML and formatting
1137
+ user_content = re.sub(r'<[^>]+>', '', user_content)
1138
+ user_content = re.sub(r'-{3,}', '', user_content)
1139
+ user_content = user_content.strip()
1140
+
1141
+ if user_content:
1142
+ current_conversation.append({
1143
+ "role": "user",
1144
+ "content": user_content
1145
+ })
1146
+
1147
+ elif 'Assistant' in content:
1148
+ # This is an assistant message
1149
+ assistant_content = content.split('Assistant')[1] if 'Assistant' in content else content
1150
+ # Clean up HTML and formatting
1151
+ assistant_content = re.sub(r'<[^>]+>', '', assistant_content)
1152
+ assistant_content = re.sub(r'-{3,}', '', assistant_content)
1153
+ assistant_content = assistant_content.strip()
1154
+
1155
+ if assistant_content:
1156
+ current_conversation.append({
1157
+ "role": "assistant",
1158
+ "content": assistant_content
1159
+ })
1160
+
1161
+ # Build the final message history
1162
+ if system_prompt:
1163
+ message_history.append({
1164
+ "role": "system",
1165
+ "content": system_prompt
1166
+ })
1167
+
1168
+ # Add the conversation messages
1169
+ message_history.extend(current_conversation)
1170
+
1171
+ return message_history
1172
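The reconstruction above can be exercised on a miniature notebook. The sketch below is a simplified, self-contained version of the same parsing (the HTML headers and separator dashes are assumptions about how the app renders markdown cells, and the cell contents are hypothetical):

```python
import re

# Hypothetical two-cell notebook mimicking the structure parsed above.
notebook = {
    "cells": [
        {"cell_type": "markdown", "source": "<h3>User</h3>\ntrain a model\n---"},
        {"cell_type": "markdown", "source": "<h3>Assistant</h3>\nSure, starting now.\n---"},
    ]
}

messages = []
for cell in notebook["cells"]:
    src = cell["source"]
    src = "".join(src) if isinstance(src, list) else src
    role = "user" if "User" in src else "assistant" if "Assistant" in src else None
    if role:
        # Take the text after the role header, then strip HTML tags and rules
        text = re.sub(r"<[^>]+>", "", src.split(role.capitalize())[1])
        text = re.sub(r"-{3,}", "", text).strip()
        messages.append({"role": role, "content": text})

print(messages)
```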
+
1173
+ def load_previous_notebook(notebook_choice, request: gr.Request):
1174
+ """Load a previous notebook with complete session configuration (dev only)"""
1175
+ if not is_dev_environment():
1176
+ return (init_notebook.render(), [], "Load previous notebooks is only available in development mode",
1177
+ None, None, None, None, None, "", "", "", "", "", "", "", False)
1178
+
1179
+ if not notebook_choice or notebook_choice == "None":
1180
+ return (init_notebook.render(), [], "Please select a notebook to load",
1181
+ None, None, None, None, None, "", "", "", "", "", "", "", False)
1182
+
1183
+ try:
1184
+ # Parse the notebook choice to get the session ID
1185
+ session_id = notebook_choice.split(" ")[0]
1186
+ notebook_path = Path(TMP_DIR) / session_id / "jupyter-agent.ipynb"
1187
+
1188
+ if not notebook_path.exists():
1189
+ return (init_notebook.render(), [], f"Notebook file not found: {notebook_path}",
1190
+ None, None, None, None, None, "", "", "", "", "", "", "", False)
1191
+
1192
+ # Load the notebook
1193
+ with open(notebook_path, 'r') as f:
1194
+ notebook_data = json.load(f)
1195
+
1196
+ # Load session state
1197
+ temp_session_manager = SessionStateManager(session_id, TMP_DIR)
1198
+ session_state = temp_session_manager.load_state()
1199
+ session_config = None # For backward compatibility
1200
+
1201
+ # Extract config from session state for UI restoration
1202
+ if session_state:
1203
+ session_config = {
1204
+ "hardware": session_state.get("hardware_config", {}),
1205
+ "environment_vars": session_state.get("environment", {}).get("variables", ""),
1206
+ "api_keys": {
1207
+ "model_name": session_state.get("api_config", {}).get("model_name", "")
1208
+ }
1209
+ }
1210
+
1211
+ # Create a new JupyterNotebook instance with the loaded data
1212
+ loaded_notebook = JupyterNotebook()
1213
+ loaded_notebook.data = notebook_data
1214
+
1215
+ # Reconstruct message history from notebook cells
1216
+ message_history = reconstruct_message_history_from_notebook(notebook_data)
1217
+
1218
+ # Store the loaded notebook info in session for continue functionality
1219
+ session_id_hash = request.session_hash
1220
+ if session_id_hash not in EXECUTION_STATES:
1221
+ EXECUTION_STATES[session_id_hash] = {}
1222
+
1223
+ EXECUTION_STATES[session_id_hash]["loaded_notebook"] = {
1224
+ "notebook_data": notebook_data,
1225
+ "message_history": message_history,
1226
+ "original_session": session_id,
1227
+ "session_config": session_config
1228
+ }
1229
+
1230
+ logger.info(f"Successfully loaded notebook from {notebook_path}")
1231
+ logger.info(f"Reconstructed message history with {len(message_history)} messages")
1232
+
1233
+ # Prepare configuration values to restore UI state
1234
+ config_loaded = ""
1235
+ gpu_type = None
1236
+ cpu_cores = None
1237
+ memory_gb = None
1238
+ timeout_sec = None
1239
+ env_vars = ""
1240
+ modal_token_id = ""
1241
+ modal_token_secret = ""
1242
+ hf_token = ""
1243
+ provider_api_key = ""
1244
+ provider_api_endpoint = ""
1245
+ model_name = ""
1246
+
1247
+ if session_config:
1248
+ hardware = session_config.get("hardware", {})
1249
+ gpu_type = hardware.get("gpu_type")
1250
+ cpu_cores = hardware.get("cpu_cores")
1251
+ memory_gb = hardware.get("memory_gb")
1252
+ timeout_sec = hardware.get("timeout_sec")
1253
+ env_vars = session_config.get("environment_vars", "")
1254
+
1255
+ api_keys = session_config.get("api_keys", {})
1256
+ modal_token_id = api_keys.get("modal_token_id", "")
1257
+ modal_token_secret = api_keys.get("modal_token_secret", "")
1258
+ hf_token = api_keys.get("hf_token", "")
1259
+ provider_api_key = api_keys.get("provider_api_key", "")
1260
+ provider_api_endpoint = api_keys.get("provider_api_endpoint", "")
1261
+ model_name = api_keys.get("model_name", "")
1262
+
1263
+ config_loaded = f"✅ Configuration restored: GPU={gpu_type}, CPU={cpu_cores}, Memory={memory_gb}GB, Timeout={timeout_sec}s"
1264
+
1265
+ success_message = f"✅ Loaded notebook: {session_id} ({len(notebook_data.get('cells', []))} cells, {len(message_history)} messages)"
1266
+ if config_loaded:
1267
+ success_message += f"\n{config_loaded}"
1268
+
1269
+ return (loaded_notebook.render(), message_history, success_message,
1270
+ gpu_type, cpu_cores, memory_gb, timeout_sec, env_vars,
1271
+ modal_token_id, modal_token_secret, hf_token, provider_api_key, provider_api_endpoint, model_name,
1272
+ "", False) # Default empty tavily_api_key and False for enable_web_search
1273
+
1274
+ except Exception as e:
1275
+ logger.error(f"Failed to load notebook {notebook_choice}: {str(e)}")
1276
+ error_message = f"❌ Failed to load notebook: {str(e)}"
1277
+ return (init_notebook.render(), [], error_message,
1278
+ None, None, None, None, None, "", "", "", "", "", "", "", False)
1279
+
1280
+ def get_notebook_options():
1281
+ """Get options for notebook dropdown (dev only)"""
1282
+ if not is_dev_environment():
1283
+ return ["Load previous notebooks is only available in development mode"]
1284
+
1285
+ notebooks = get_previous_notebooks()
1286
+ if not notebooks:
1287
+ return ["No previous notebooks found"]
1288
+
1289
+ options = ["None"] + [nb['display_name'] for nb in notebooks[:20]] # Limit to 20 most recent
1290
+ return options
1291
+
1292
+ def refresh_notebook_options():
1293
+ """Refresh the notebook options dropdown"""
1294
+ return gr.Dropdown(choices=get_notebook_options(), value="None")
1295
+
1296
+ # Legacy session configuration functions removed - replaced by SessionStateManager
1297
+ # All session data is now stored in a single comprehensive session_state.json file
1298
+
1299
+
1300
+ css = """
1301
+ #component-0 {
1302
+ height: 100vh;
1303
+ overflow-y: auto;
1304
+ padding: 20px;
1305
+ }
1306
+
1307
+ .gradio-container {
1308
+ height: 100vh !important;
1309
+ }
1310
+
1311
+ .contain {
1312
+ height: 100vh !important;
1313
+ }
1314
+
1315
+ /* Button states for execution control */
1316
+ .button-executing {
1317
+ opacity: 0.6 !important;
1318
+ pointer-events: none !important;
1319
+ cursor: not-allowed !important;
1320
+ }
1321
+
1322
+ .button-executing::after {
1323
+ content: " ⏳";
1324
+ }
1325
+
1326
+ .status-running {
1327
+ animation: pulse 2s infinite;
1328
+ }
1329
+
1330
+ @keyframes pulse {
1331
+ 0% { opacity: 1; }
1332
+ 50% { opacity: 0.5; }
1333
+ 100% { opacity: 1; }
1334
+ }
1335
+ """
1336
+
1337
+
1338
+ # Create the interface
1339
+ with gr.Blocks(css=css) as demo:
1340
+ msg_state = gr.State(value=[])
1341
+
1342
+ # Environment info display
1343
+ env_info = gr.Markdown(f"""
1344
+ **Environment**: {get_environment().upper()} | **Features**: {"Development features enabled" if is_dev_environment() else "Production mode"}
1345
+ """)
1346
+
1347
+ html_output = gr.HTML(value=JupyterNotebook().render())
1348
+
1349
+ user_input = gr.Textbox(
1350
+ # value="train a 5 neuron neural network to classify the iris dataset",
1351
+ value="can you finetune llama 3.2 1b on tiny stories dataset and using unsloth",
1352
+ lines=3,
1353
+ label="Agent task"
1354
+ )
1355
+
1356
+ with gr.Accordion("Upload files ⬆ | Download notebook ⬇", open=False):
1357
+ files = gr.File(label="Upload files to use", file_count="multiple")
1358
+ file = gr.File(os.path.join(TMP_DIR, "jupyter-agent.ipynb"), label="Download Jupyter Notebook")
1359
+
1360
+
1361
+ with gr.Row():
1362
+ # Web Search Configuration
1363
+ with gr.Accordion("🔍 Web Search Settings", open=False):
1364
+ with gr.Row():
1365
+ enable_web_search = gr.Checkbox(
1366
+ label="Enable Web Search",
1367
+ value=bool(os.environ.get("TAVILY_API_KEY")), # Default to True if API key is available
1368
+ info="Allow the agent to search the web for current information and documentation"
1369
+ )
1370
+
1371
+ # Show web search status with better formatting
1372
+ tavily_status = "✅ Available" if os.environ.get("TAVILY_API_KEY") else "❌ API Key Required"
1373
+ gr.Markdown(f"**Status:** {tavily_status}")
1374
+
1375
+ gr.Markdown("""
1376
+ **Web Search Features:**
1377
+ - 🌐 Search for current tutorials, documentation, and best practices
1378
+ - 🐛 Find solutions to error messages and debugging help
1379
+ - 📚 Access up-to-date library documentation and examples
1380
+ - 💡 Get recent examples and code snippets from the web
1381
+
1382
+ ⚠️ **Note**: Web search requires a Tavily API key. Get one free at [tavily.com](https://tavily.com)
1383
+ """)
1384
+ # Previous notebooks section (dev only)
1385
+ if is_dev_environment():
1386
+ with gr.Accordion("📂 Load Previous Notebook (Dev Only)", open=False):
1387
+ notebook_dropdown = gr.Dropdown(
1388
+ choices=get_notebook_options(),
1389
+ value="None",
1390
+ label="Select Previous Notebook",
1391
+ info="Load a previously created notebook session"
1392
+ )
1393
+ with gr.Row():
1394
+ load_notebook_btn = gr.Button("📖 Load Selected", variant="secondary")
1395
+ refresh_notebooks_btn = gr.Button("🔄 Refresh List", variant="secondary")
1396
+
1397
+ load_status = gr.Textbox(
1398
+ label="Load Status",
1399
+ interactive=False,
1400
+ visible=False
1401
+ )
1402
+ # Check for missing API keys and show input fields conditionally
1403
+ missing_keys = get_missing_api_keys()
1404
+
1405
+ # API Key Configuration (shown only if keys are missing)
1406
+ if missing_keys:
1407
+ with gr.Accordion("🔑 Required API Keys (Missing from .env)", open=True):
1408
+ gr.Markdown("""
1409
+ **⚠️ Some required API keys are missing from your .env file.**
1410
+ Please provide them below to use the application:
1411
+ """)
1412
+
1413
+ api_key_components = {}
1414
+
1415
+ if "MODAL_TOKEN_ID" in missing_keys:
1416
+ api_key_components["modal_token_id"] = gr.Textbox(
1417
+ label="Modal Token ID",
1418
+ placeholder="ak-...",
1419
+ info="Modal Token ID for sandbox access",
1420
+ type="password"
1421
+ )
1422
+ else:
1423
+ api_key_components["modal_token_id"] = gr.Textbox(visible=False)
1424
+
1425
+ if "MODAL_TOKEN_SECRET" in missing_keys:
1426
+ api_key_components["modal_token_secret"] = gr.Textbox(
1427
+ label="Modal Token Secret",
1428
+ placeholder="as-...",
1429
+ info="Modal Token Secret for sandbox access",
1430
+ type="password"
1431
+ )
1432
+ else:
1433
+ api_key_components["modal_token_secret"] = gr.Textbox(visible=False)
1434
+
1435
+ if "HF_TOKEN" in missing_keys:
1436
+ api_key_components["hf_token"] = gr.Textbox(
1437
+ label="Hugging Face Token (Optional)",
1438
+ placeholder="hf_...",
1439
+ info="Hugging Face Token for model access",
1440
+ type="password"
1441
+ )
1442
+ else:
1443
+ api_key_components["hf_token"] = gr.Textbox(visible=False)
1444
+
1445
+ if "PROVIDER_API_KEY" in missing_keys:
1446
+ api_key_components["provider_api_key"] = gr.Textbox(
1447
+ label="AI Provider API Key",
1448
+ placeholder="sk-, gsk_, or csk-...",
1449
+ info="API Key for your AI provider (Anthropic, OpenAI, Cerebras, etc.)",
1450
+ type="password"
1451
+ )
1452
+ else:
1453
+ api_key_components["provider_api_key"] = gr.Textbox(visible=False)
1454
+
1455
+ if "PROVIDER_API_ENDPOINT" in missing_keys:
1456
+ api_key_components["provider_api_endpoint"] = gr.Textbox(
1457
+ label="AI Provider API Endpoint",
1458
+ placeholder="https://api.anthropic.com/v1/",
1459
+ info="API endpoint for your AI provider"
1460
+ )
1461
+ else:
1462
+ api_key_components["provider_api_endpoint"] = gr.Textbox(visible=False)
1463
+
1464
+ if "MODEL_NAME" in missing_keys:
1465
+ api_key_components["model_name"] = gr.Textbox(
1466
+ label="Model Name",
1467
+ placeholder="claude-sonnet-4-20250514",
1468
+ info="Name of the model to use"
1469
+ )
1470
+ else:
1471
+ api_key_components["model_name"] = gr.Textbox(visible=False)
1472
+
1473
+ if "TAVILY_API_KEY" in missing_keys:
1474
+ api_key_components["tavily_api_key"] = gr.Textbox(
1475
+ label="Tavily API Key (Optional)",
1476
+ placeholder="tvly-...",
1477
+ info="Tavily API Key for web search functionality",
1478
+ type="password"
1479
+ )
1480
+ else:
1481
+ api_key_components["tavily_api_key"] = gr.Textbox(visible=False)
1482
+ else:
1483
+ # Create hidden components when no keys are missing
1484
+ api_key_components = {
1485
+ "modal_token_id": gr.Textbox(visible=False),
1486
+ "modal_token_secret": gr.Textbox(visible=False),
1487
+ "hf_token": gr.Textbox(visible=False),
1488
+ "provider_api_key": gr.Textbox(visible=False),
1489
+ "provider_api_endpoint": gr.Textbox(visible=False),
1490
+ "model_name": gr.Textbox(visible=False),
1491
+ "tavily_api_key": gr.Textbox(visible=False)
1492
+ }
1493
+
1494
+
1495
+
1496
+
1497
+
1498
+ with gr.Accordion("Hardware Configuration ⚙️", open=False):
1499
+ with gr.Row():
1500
+ with gr.Column():
1501
+ env_vars = gr.Textbox(
1502
+ label="Environment Variables",
1503
+ placeholder="Enter environment variables (one per line):\nAPI_KEY=your_key_here\nDATA_PATH=/path/to/data\nDEBUG=true",
1504
+ lines=5,
1505
+ info="Add custom environment variables for the sandbox. Format: KEY=value (one per line)"
1506
+ )
1507
+
1508
+ env_vars_info = gr.Markdown("""
1509
+ **Environment Variables Info:**
1510
+ - Variables will be available in the sandbox environment
1511
+ - Use KEY=value format, one per line
1512
+ - Common examples: API keys, data paths, configuration flags
1513
+ - Variables are session-specific and not persisted between sessions
1514
+
1515
+ ⚠️ **Security**: Avoid sensitive credentials in shared environments
1516
+ """)
1517
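The `KEY=value, one per line` format described in the box above can be parsed with a small helper. This is a sketch; the function name is illustrative, not the app's actual implementation:

```python
# Hypothetical parser for the textbox's "KEY=value, one per line" format.
def parse_env_vars(text: str) -> dict:
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blanks, comments, and malformed lines
        key, _, value = line.partition("=")  # split on the first '=' only
        env[key.strip()] = value.strip()
    return env

print(parse_env_vars("API_KEY=abc123\nDEBUG=true\n# comment\nbad-line"))
```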
+ with gr.Column():
1518
+ with gr.Row():
1519
+ gpu_type = gr.Dropdown(
1520
+ choices=GPU_OPTIONS,
1521
+ value="cpu",
1522
+ label="GPU Type",
1523
+ info="Select hardware acceleration"
1524
+ )
1525
+ cpu_cores = gr.Slider(
1526
+ minimum=0.25,
1527
+ maximum=16,
1528
+ value=2.0,
1529
+ step=0.25,
1530
+ label="CPU Cores",
1531
+ info="Number of CPU cores"
1532
+ )
1533
+ with gr.Row():
1534
+ memory_gb = gr.Slider(
1535
+ minimum=0.5,
1536
+ maximum=64,
1537
+ value=8.0,
1538
+ step=0.5,
1539
+ label="Memory (GB)",
1540
+ info="RAM allocation"
1541
+ )
1542
+ timeout_sec = gr.Slider(
1543
+ minimum=60,
1544
+ maximum=1800,
1545
+ value=300,
1546
+ step=60,
1547
+ label="Timeout (seconds)",
1548
+ info="Maximum execution time"
1549
+ )
1550
+
1551
+ hardware_info = gr.Markdown("""
1552
+ **Hardware Options:**
1553
+ - **CPU Only**: Free, good for basic tasks
1554
+ - **T4**: Low-cost GPU, good for small models
1555
+ - **L4**: Mid-range GPU, better performance
1556
+ - **A100 40/80GB**: High-end GPU for large models
1557
+ - **H100**: Latest flagship GPU for maximum performance
1558
+
1559
+ ⚠️ **Note**: GPU instances cost more. Choose based on your workload.
1560
+ """)
1561
+
1562
+ # with gr.Accordion("Environment Variables 🔧", open=False):
1563
+
1564
+
1565
+ with gr.Row():
1566
+ generate_btn = gr.Button("Run!", variant="primary")
1567
+ stop_btn = gr.Button("⏸️ Stop", variant="secondary")
1568
+ # continue_btn removed - Run button handles continuation automatically
1569
+ clear_btn = gr.Button("Clear Notebook", variant="stop")
1570
+ shutdown_btn = gr.Button("🔴 Shutdown Sandbox", variant="stop", visible=False)
1571
+
1572
+ # Status display
1573
+ status_display = gr.Textbox(
1574
+ value="⚪ Ready",
1575
+ label="Execution Status",
1576
+ interactive=False,
1577
+ max_lines=1
1578
+ )
1579
+
1580
+ generate_btn.click(
1581
+ fn=execute_jupyter_agent,
1582
+ inputs=[
1583
+ user_input, files, msg_state, gpu_type, cpu_cores, memory_gb, timeout_sec, env_vars,
1584
+ api_key_components["modal_token_id"], api_key_components["modal_token_secret"],
1585
+ api_key_components["hf_token"], api_key_components["provider_api_key"],
1586
+ api_key_components["provider_api_endpoint"], api_key_components["model_name"],
1587
+ api_key_components["tavily_api_key"], enable_web_search
1588
+ ],
1589
+ outputs=[html_output, msg_state, file],
1590
+ show_progress="hidden",
1591
+ )
1592
+
1593
+ stop_btn.click(
1594
+ fn=stop_execution,
1595
+ outputs=[status_display],
1596
+ show_progress="hidden",
1597
+ )
1598
+
1599
+ # continue_btn.click handler removed - Run button handles continuation automatically
1600
+
1601
+ clear_btn.click(fn=clear, inputs=[msg_state], outputs=[html_output, msg_state])
1602
+
1603
+ shutdown_btn.click(
1604
+ fn=shutdown_sandbox,
1605
+ outputs=[shutdown_btn],
1606
+ show_progress="hidden",
1607
+ )
1608
+
1609
+ # Add event handlers for notebook loading (dev only)
1610
+ if is_dev_environment():
1611
+ load_notebook_btn.click(
1612
+ fn=load_previous_notebook,
1613
+ inputs=[notebook_dropdown],
1614
+ outputs=[
1615
+ html_output, msg_state, load_status,
1616
+ gpu_type, cpu_cores, memory_gb, timeout_sec, env_vars,
1617
+ api_key_components["modal_token_id"], api_key_components["modal_token_secret"],
1618
+ api_key_components["hf_token"], api_key_components["provider_api_key"],
1619
+ api_key_components["provider_api_endpoint"], api_key_components["model_name"],
1620
+ api_key_components["tavily_api_key"], enable_web_search
1621
+ ],
1622
+ show_progress="hidden"
1623
+ )
1624
+
1625
+ refresh_notebooks_btn.click(
1626
+ fn=refresh_notebook_options,
1627
+ outputs=[notebook_dropdown],
1628
+ show_progress="hidden"
1629
+ )
1630
+
1631
+ # Show/hide load status based on selection
1632
+ notebook_dropdown.change(
1633
+ fn=lambda choice: gr.Textbox(visible=choice != "None"),
1634
+ inputs=[notebook_dropdown],
1635
+ outputs=[load_status]
1636
+ )
1637
+
1638
+ # Periodic status update using timer
1639
+ status_timer = gr.Timer(2.0) # Update every 2 seconds
1640
+ status_timer.tick(
1641
+ fn=get_execution_status,
1642
+ outputs=[status_display],
1643
+ show_progress="hidden"
1644
+ )
1645
+
1646
+ # Update button visibility periodically
1647
+ button_timer = gr.Timer(3.0) # Check every 3 seconds
1648
+ button_timer.tick(
1649
+ fn=update_sandbox_button_visibility,
1650
+ outputs=[shutdown_btn],
1651
+ show_progress="hidden"
1652
+ )
1653
+
1654
+ demo.load(
1655
+ fn=None,
1656
+ inputs=None,
1657
+ outputs=None,
1658
+ js=""" () => {
1659
+ if (document.querySelectorAll('.dark').length) {
1660
+ document.querySelectorAll('.dark').forEach(el => el.classList.remove('dark'));
1661
+ }
1662
+
1663
+ // Add execution state management functions
1664
+ window.setExecutionState = function(isExecuting) {
1665
+ // Find Run button by text content since variant attribute might not be reliable
1666
+ const buttons = document.querySelectorAll('button');
1667
+ let runButton = null;
1668
+ let stopButton = null;
1669
+
1670
+ buttons.forEach(button => {
1671
+ const text = button.textContent.trim().toLowerCase();
1672
+ if (text.includes('run') && !text.includes('stop')) {
1673
+ runButton = button;
1674
+ } else if (text.includes('stop') || text.includes('⏸️')) {
1675
+ stopButton = button;
1676
+ }
1677
+ });
1678
+
1679
+ if (runButton) {
1680
+ if (isExecuting) {
1681
+ runButton.classList.add('button-executing');
1682
+ runButton.disabled = true;
1683
+ runButton.style.opacity = '0.6';
1684
+ runButton.style.cursor = 'not-allowed';
1685
+ runButton.style.pointerEvents = 'none';
1686
+ if (runButton.textContent.indexOf('⏳') === -1) {
1687
+ runButton.textContent = runButton.textContent.replace('!', '! ⏳');
1688
+ }
1689
+ } else {
1690
+ runButton.classList.remove('button-executing');
1691
+ runButton.disabled = false;
1692
+ runButton.style.opacity = '1';
1693
+ runButton.style.cursor = 'pointer';
1694
+ runButton.style.pointerEvents = 'auto';
1695
+ runButton.textContent = runButton.textContent.replace(' ⏳', '');
1696
+ }
1697
+ }
1698
+
1699
+ // Also update stop button visibility/state
1700
+ if (stopButton) {
1701
+ stopButton.style.display = isExecuting ? 'block' : 'inline-block';
1702
+ }
1703
+ };
1704
+
1705
+ // Monitor for status changes and update button states
1706
+ window.monitorExecutionStatus = function() {
1707
+ // Try multiple ways to find the status element
1708
+ let statusElement = document.querySelector('input[label*="Execution Status"], input[label*="Status"], textarea[label*="Status"]');
1709
+
1710
+ if (!statusElement) {
1711
+ // Fallback: look for any input that might contain status
1712
+ const allInputs = document.querySelectorAll('input, textarea');
1713
+ allInputs.forEach(input => {
1714
+ if (input.value && (input.value.includes('🟢') || input.value.includes('⚪') || input.value.includes('⏸️'))) {
1715
+ statusElement = input;
1716
+ }
1717
+ });
1718
+ }
1719
+
1720
+ if (statusElement) {
1721
+ const status = statusElement.value || '';
1722
+ const isRunning = status.includes('🟢') || status.includes('Running') || status.includes('Generating') || status.includes('Executing');
1723
+ const isReady = status.includes('⚪') || status.includes('Ready');
1724
+
1725
+ window.setExecutionState(isRunning);
1726
+
1727
+ // Add visual indicator to status element
1728
+ if (isRunning) {
1729
+ statusElement.style.background = '#e3f2fd';
1730
+ statusElement.style.borderColor = '#2196f3';
1731
+ } else if (isReady) {
1732
+ statusElement.style.background = '#f5f5f5';
1733
+ statusElement.style.borderColor = '#ccc';
1734
+ } else {
1735
+ statusElement.style.background = '#fff3e0';
1736
+ statusElement.style.borderColor = '#ff9800';
1737
+ }
1738
+ }
1739
+ };
1740
+
1741
+ // Set up mutation observer to watch for status changes
1742
+ const observer = new MutationObserver(function(mutations) {
1743
+ mutations.forEach(function(mutation) {
1744
+ if (mutation.type === 'childList' || mutation.type === 'attributes') {
1745
+ setTimeout(window.monitorExecutionStatus, 100);
1746
+ }
1747
+ });
1748
+ });
1749
+
1750
+ // Start observing
1751
+ observer.observe(document.body, {
1752
+ childList: true,
1753
+ subtree: true,
1754
+ attributes: true
1755
+ });
1756
+ }
1757
+ """
1758
+ )
1759
+
1760
+ logger.info("Starting Gradio application")
1761
+ demo.launch(ssr_mode=False)
jupyter_agent.py ADDED
@@ -0,0 +1,1463 @@
1
+ from jupyter_handler import JupyterNotebook
2
+ import json
3
+ import logging
4
+ import os
5
+ import datetime
6
+ from pathlib import Path
7
+ from typing import Dict, List, Any, Optional
8
+ try:
+ from tavily import TavilyClient
+ except ImportError:
+ TavilyClient = None  # web search disabled when tavily-python isn't installed
9
+
10
+ # Phoenix tracing imports
11
+ try:
12
+ from openinference.instrumentation import using_session
13
+ PHOENIX_AVAILABLE = True
14
+ print("Phoenix session tracking imports successful")
15
+ except ImportError:
16
+ PHOENIX_AVAILABLE = False
17
+ print("Phoenix session tracking not available - missing openinference packages")
18
+
19
+ # Configure logging for utils module
20
+ logger = logging.getLogger(__name__)
21
+
22
+ # Initialize Tavily client
23
+ TAVILY_API_KEY = os.getenv("TAVILY_API_KEY")
24
+ tavily_client = TavilyClient(api_key=TAVILY_API_KEY) if (TAVILY_API_KEY and TavilyClient) else None
25
+
26
+
27
+ TOOLS = [
28
+ {
29
+ "type": "function",
30
+ "function": {
31
+ "name": "add_and_execute_jupyter_code_cell",
32
+ "description": "A Python code execution environment that runs code in a Jupyter notebook interface. This is stateful - variables and imports persist between executions.",
33
+ "parameters": {
34
+ "type": "object",
35
+ "properties": {
36
+ "code": {
37
+ "type": "string",
38
+ "description": "The Python code to execute."
39
+ }
40
+ },
41
+ "required": ["code"]
42
+ }
43
+ }
44
+ },
45
+ {
46
+ "type": "function",
47
+ "function": {
48
+ "name": "edit_and_execute_current_cell",
49
+ "description": "Edit the current/last code cell and execute the new code. Use this to fix errors or modify the previous code instead of creating a new cell.",
50
+ "parameters": {
51
+ "type": "object",
52
+ "properties": {
53
+ "code": {
54
+ "type": "string",
55
+ "description": "The updated Python code to replace the current cell with and execute."
56
+ }
57
+ },
58
+ "required": ["code"]
59
+ }
60
+ }
61
+ },
62
+ {
63
+ "type": "function",
64
+ "function": {
65
+ "name": "execute_shell_command",
66
+ "description": "Execute shell/system commands like ls, cat, mkdir, etc. This runs independently of Python and provides terminal-style output.",
67
+ "parameters": {
68
+ "type": "object",
69
+ "properties": {
70
+ "command": {
71
+ "type": "string",
72
+ "description": "The shell command to execute (e.g., 'ls -la', 'cat file.txt', 'mkdir new_folder')."
73
+ }
74
+ },
75
+ "required": ["command"]
76
+ }
77
+ }
78
+ },
79
+ {
80
+ "type": "function",
81
+ "function": {
82
+ "name": "web_search",
83
+ "description": "Search the web for current information, documentation, tutorials, and solutions to coding problems. Use this to get context before starting tasks or when encountering errors.",
84
+ "parameters": {
85
+ "type": "object",
86
+ "properties": {
87
+ "query": {
88
+ "type": "string",
89
+ "description": "Search query (max 400 characters). Be specific and include relevant keywords."
90
+ }
91
+ },
92
+ "required": ["query"]
93
+ }
94
+ }
95
+ },
96
+ ]
97
+
98
+ # TOOLS = TOOLS[:1]
99
+
100
+ MAX_TURNS = 20
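A model response that selects one of these tools arrives as a tool call carrying a JSON-encoded `arguments` string that must be decoded before dispatch. A minimal dispatch sketch — the handlers here are hypothetical stand-ins, not the executors defined later in this module:

```python
import json

# Hypothetical handlers standing in for the real executors in this module.
def run_code(code):
    return f"executed {len(code)} chars"

def run_search(query):
    return f"searched: {query}"

DISPATCH = {
    "add_and_execute_jupyter_code_cell": lambda args: run_code(args["code"]),
    "web_search": lambda args: run_search(args["query"]),
}

def handle_tool_call(tool_call):
    # tool_call mirrors the OpenAI chat-completions shape:
    # {"id": ..., "function": {"name": ..., "arguments": "<json string>"}}
    name = tool_call["function"]["name"]
    args = json.loads(tool_call["function"]["arguments"])
    return DISPATCH[name](args)

result = handle_tool_call({
    "id": "call_1",
    "function": {"name": "web_search", "arguments": '{"query": "pandas merge"}'},
})
print(result)  # → searched: pandas merge
```

The `name` field in each schema is the routing key, which is why the tool names above must stay in sync with whatever dispatch table consumes them.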
101
+
102
+
103
+ def create_phoenix_session_context(session_id: str, user_id: str = None, metadata: Dict = None):
104
+ """
105
+ Create a Phoenix session context for tracing LLM interactions.
106
+
107
+ Args:
108
+ session_id: Unique identifier for the session
109
+         user_id: Optional user identifier (accepted for future use; not currently attached to traces)
110
+         metadata: Additional metadata (accepted for future use; not currently attached to traces)
111
+
112
+ Returns:
113
+ Context manager for Phoenix session tracking
114
+ """
115
+ if not PHOENIX_AVAILABLE:
116
+ # Return a no-op context manager if Phoenix is not available
117
+ from contextlib import nullcontext
118
+ return nullcontext()
119
+
120
+ try:
121
+ # Use using_session for proper session grouping in Phoenix
122
+ # This ensures all LLM calls within this context are grouped under the same session
123
+ logger.debug(f"Creating Phoenix session context for session_id: {session_id}")
124
+ return using_session(session_id)
125
+ except Exception as e:
126
+ logger.warning(f"Failed to create Phoenix session context for {session_id}: {e}")
127
+ # Fallback to no-op context if Phoenix session creation fails
128
+ from contextlib import nullcontext
129
+ return nullcontext()
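The degrade-gracefully pattern above — return the real tracing context manager when it is available, `nullcontext()` otherwise — keeps every call site unchanged whether or not Phoenix is installed. An illustrative reduction of the same idea (`make_session_context` and its `factory` parameter are hypothetical, not part of this module):

```python
from contextlib import nullcontext

def make_session_context(session_id, factory=None):
    """Return factory(session_id) when a tracer is available, else a no-op."""
    if factory is None:
        return nullcontext()
    try:
        return factory(session_id)
    except Exception:
        # Tracing must never break the main execution path.
        return nullcontext()

# With no tracer installed, the `with` block still runs unchanged:
calls = []
with make_session_context("session-123"):
    calls.append("llm-call")
print(calls)  # → ['llm-call']
```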
130
+
131
+
132
+ class SessionStateManager:
133
+ """Manages comprehensive session state in a single JSON file"""
134
+
135
+ def __init__(self, session_id: str, base_dir: str = './temp/'):
136
+ self.session_id = session_id
137
+ self.base_dir = Path(base_dir)
138
+ self.session_dir = self.base_dir / session_id
139
+ self.state_file = self.session_dir / 'session_state.json'
140
+ self.session_dir.mkdir(parents=True, exist_ok=True)
141
+ logger.info(f"SessionStateManager initialized for {session_id}")
142
+
143
+ def create_initial_state(self, hardware_config: Dict, api_config: Dict,
144
+ environment: Dict, system_prompt: str) -> Dict:
145
+ """Create initial session state structure"""
146
+ timestamp = datetime.datetime.now(datetime.timezone.utc).isoformat()
147
+
148
+ initial_state = {
149
+ "session_id": self.session_id,
150
+ "created_at": timestamp,
151
+ "last_updated": timestamp,
152
+ "version": "1.0",
153
+
154
+ "hardware_config": hardware_config,
155
+ "api_config": api_config,
156
+ "environment": environment,
157
+
158
+ "conversation_history": [
159
+ {
160
+ "role": "system",
161
+ "content": system_prompt,
162
+ "timestamp": timestamp,
163
+ "metadata": {"type": "system_initialization"}
164
+ }
165
+ ],
166
+
167
+ "llm_interactions": [], # Complete API call logs
168
+ "tool_executions": [], # All tool calls and results
169
+
170
+ "notebook_data": {
171
+ "cells": [],
172
+ "metadata": {
173
+ "kernel_info": {"name": "python3"},
174
+ "language_info": {"name": "python", "version": "3.12"},
175
+ },
176
+ "nbformat": 4,
177
+ "nbformat_minor": 0
178
+ },
179
+
180
+ "execution_state": {
181
+ "current_turn": 0,
182
+ "max_turns": MAX_TURNS,
183
+ "is_running": False,
184
+ "is_paused": False,
185
+ "last_execution_successful": None,
186
+ "sandbox_active": False,
187
+ "sandbox_info": None
188
+ },
189
+
190
+ "session_stats": {
191
+ "total_messages": 1,
192
+ "total_code_executions": 0,
193
+ "total_searches": 0,
194
+ "total_errors": 0,
195
+ "session_duration_seconds": 0
196
+ }
197
+ }
198
+
199
+ logger.info("Created initial session state for %s", self.session_id)
200
+ return initial_state
201
+
202
+ def load_state(self) -> Optional[Dict]:
203
+ """Load session state from file with improved error handling"""
204
+ if not self.state_file.exists():
205
+ logger.info(f"No existing session state found for {self.session_id}")
206
+ return None
207
+
208
+ try:
209
+ with open(self.state_file, 'r', encoding='utf-8') as f:
210
+ state = json.load(f)
211
+ logger.info(f"Loaded session state for {self.session_id} with {len(state.get('conversation_history', []))} messages")
212
+ return state
213
+ except json.JSONDecodeError as e:
214
+ logger.error(f"JSON corruption in session state for {self.session_id}: {str(e)}")
215
+ logger.info(f"Creating backup of corrupted file: {self.state_file}.corrupted")
216
+ try:
217
+ import shutil
218
+ shutil.copy2(self.state_file, str(self.state_file) + ".corrupted")
219
+ logger.info(f"Backup created successfully")
220
+ except Exception as backup_error:
221
+ logger.warning(f"Failed to create backup: {backup_error}")
222
+ return None
223
+ except Exception as e:
224
+ logger.error(f"Failed to load session state for {self.session_id}: {str(e)}")
225
+ return None
226
+
227
+ def save_state(self, state: Dict) -> bool:
228
+ """Save session state to file with improved error handling"""
229
+ try:
230
+ # Update last_updated timestamp
231
+ state["last_updated"] = datetime.datetime.now(datetime.timezone.utc).isoformat()
232
+
233
+ # Update session stats
234
+ if "session_stats" not in state:
235
+ state["session_stats"] = {}
236
+
237
+ created_at = datetime.datetime.fromisoformat(state["created_at"])
238
+ current_time = datetime.datetime.now(datetime.timezone.utc)
239
+ state["session_stats"]["session_duration_seconds"] = int((current_time - created_at).total_seconds())
240
+ state["session_stats"]["total_messages"] = len(state.get("conversation_history", []))
241
+
242
+ # Validate JSON serializability before writing
243
+ try:
244
+ json.dumps(state, ensure_ascii=False)
245
+ except (TypeError, ValueError) as e:
246
+ logger.error(f"State contains non-serializable data: {e}")
247
+ logger.info("Attempting to clean non-serializable data...")
248
+ state = self._clean_non_serializable_data(state)
249
+
250
+ # Write to temporary file first, then rename for atomic operation
251
+ temp_file = self.state_file.with_suffix('.tmp')
252
+ with open(temp_file, 'w', encoding='utf-8') as f:
253
+ json.dump(state, f, indent=2, ensure_ascii=False)
254
+
255
+ # Atomic rename
256
+ temp_file.replace(self.state_file)
257
+
258
+ logger.debug(f"Saved session state for {self.session_id} ({len(json.dumps(state))} characters)")
259
+ return True
260
+ except Exception as e:
261
+ logger.error(f"Failed to save session state for {self.session_id}: {str(e)}")
262
+ # Clean up temp file if it exists
263
+ temp_file = self.state_file.with_suffix('.tmp')
264
+ if temp_file.exists():
265
+ try:
266
+ temp_file.unlink()
267
+ except Exception:
268
+ pass
269
+ return False
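The temp-file-plus-rename dance in `save_state` is what makes the write atomic: a crash mid-write leaves the previous `session_state.json` intact, because `Path.replace` swaps the files in a single filesystem operation (atomic on POSIX when both paths are on the same filesystem). A stripped-down sketch of the same pattern:

```python
import json
import tempfile
from pathlib import Path

def atomic_write_json(path: Path, data: dict) -> None:
    # Write the full payload to a sibling temp file first...
    tmp = path.with_suffix(".tmp")
    tmp.write_text(json.dumps(data, indent=2, ensure_ascii=False), encoding="utf-8")
    # ...then swap it in; readers never observe a half-written file.
    tmp.replace(path)

with tempfile.TemporaryDirectory() as d:
    target = Path(d) / "session_state.json"
    atomic_write_json(target, {"session_id": "abc", "turn": 3})
    loaded = json.loads(target.read_text(encoding="utf-8"))

print(loaded["turn"])  # → 3
```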
270
+
271
+ def _clean_non_serializable_data(self, obj):
272
+ """Recursively clean non-serializable data from objects"""
273
+ if isinstance(obj, dict):
274
+ cleaned = {}
275
+ for key, value in obj.items():
276
+ try:
277
+ json.dumps(value)
278
+ cleaned[key] = self._clean_non_serializable_data(value)
279
+ except (TypeError, ValueError):
280
+ logger.warning(f"Removing non-serializable field: {key}")
281
+ cleaned[key] = f"<non-serializable: {type(value).__name__}>"
282
+ return cleaned
283
+ elif isinstance(obj, list):
284
+ cleaned = []
285
+ for item in obj:
286
+ try:
287
+ json.dumps(item)
288
+ cleaned.append(self._clean_non_serializable_data(item))
289
+ except (TypeError, ValueError):
290
+ cleaned.append(f"<non-serializable: {type(item).__name__}>")
291
+ return cleaned
292
+ else:
293
+ return obj
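The recursive cleaner replaces anything `json.dumps` rejects with a typed placeholder instead of failing the whole save. A self-contained sketch of the same idea (this `clean` is a simplified stand-in, not the method above):

```python
import json

def clean(obj):
    """Replace values json.dumps cannot encode with a typed placeholder."""
    if isinstance(obj, dict):
        return {k: clean(v) for k, v in obj.items()}
    if isinstance(obj, list):
        return [clean(v) for v in obj]
    try:
        json.dumps(obj)
        return obj
    except (TypeError, ValueError):
        return f"<non-serializable: {type(obj).__name__}>"

# Sets and functions are not JSON-encodable, so they become placeholders:
state = {"turn": 1, "payload": {"ids": {1, 2}}, "cb": (lambda x: x)}
cleaned = clean(state)
print(cleaned["payload"]["ids"])  # → <non-serializable: set>
```

After cleaning, the entire structure round-trips through `json.dumps` without raising, which is exactly the property `save_state` needs before its atomic write.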
294
+
295
+ def log_llm_interaction(self, state: Dict, request_data: Dict, response_data: Dict,
296
+ model: str, turn: int) -> None:
297
+ """Log complete LLM API interaction"""
298
+ timestamp = datetime.datetime.now(datetime.timezone.utc).isoformat()
299
+
300
+ interaction = {
301
+ "timestamp": timestamp,
302
+ "turn": turn,
303
+ "model": model,
304
+ "request": {
305
+ "messages_count": len(request_data.get("messages", [])),
306
+ "tools_count": len(request_data.get("tools", [])),
307
+ "model": request_data.get("model"),
308
+ "tool_choice": request_data.get("tool_choice")
309
+ },
310
+ "response": {
311
+ "content": response_data.get("choices", [{}])[0].get("message", {}).get("content"),
312
+ "tool_calls": response_data.get("choices", [{}])[0].get("message", {}).get("tool_calls"),
313
+ "finish_reason": response_data.get("choices", [{}])[0].get("finish_reason"),
314
+ "usage": response_data.get("usage")
315
+ }
316
+ }
317
+
318
+ if "llm_interactions" not in state:
319
+ state["llm_interactions"] = []
320
+ state["llm_interactions"].append(interaction)
321
+
322
+ # Log Phoenix session information for easy debugging
323
+ logger.debug(f"Logged LLM interaction for turn {turn} in session {self.session_id}")
324
+ logger.debug(f"Phoenix session tracking: session_id={self.session_id}, turn={turn}, model={model}")
325
+
326
+ # Log usage information if available for monitoring
327
+ usage = response_data.get("usage")
328
+ if usage:
329
+ logger.info(f"Session {self.session_id} turn {turn}: "
330
+ f"prompt_tokens={usage.get('prompt_tokens', 0)}, "
331
+ f"completion_tokens={usage.get('completion_tokens', 0)}, "
332
+ f"total_tokens={usage.get('total_tokens', 0)}")
333
+
334
+ def log_tool_execution(self, state: Dict, tool_call_id: str, tool_name: str,
335
+ tool_args: Dict, result: str, execution_data: Any = None) -> None:
336
+ """Log tool execution with full details"""
337
+ timestamp = datetime.datetime.now(datetime.timezone.utc).isoformat()
338
+
339
+ # Safely serialize execution_data to prevent JSON corruption
340
+ safe_execution_data = None
341
+ if execution_data is not None:
342
+ try:
343
+ # Convert execution_data to a safe, serializable format
344
+ if hasattr(execution_data, '__dict__'):
345
+ safe_execution_data = {
346
+ "type": type(execution_data).__name__,
347
+ "error": str(execution_data.error) if hasattr(execution_data, 'error') and execution_data.error else None,
348
+ "has_results": hasattr(execution_data, 'results') and bool(execution_data.results),
349
+ "has_stdout": hasattr(execution_data, 'logs') and hasattr(execution_data.logs, 'stdout') and bool(execution_data.logs.stdout),
350
+ "has_stderr": hasattr(execution_data, 'logs') and hasattr(execution_data.logs, 'stderr') and bool(execution_data.logs.stderr)
351
+ }
352
+ else:
353
+ # For simple types, convert to string safely
354
+ safe_execution_data = str(execution_data)[:200] # Limit length
355
+ except Exception as e:
356
+ logger.warning(f"Failed to serialize execution_data for {tool_call_id}: {e}")
357
+ safe_execution_data = {"serialization_error": str(e)}
358
+
359
+ tool_execution = {
360
+ "timestamp": timestamp,
361
+ "tool_call_id": tool_call_id,
362
+ "tool_name": tool_name,
363
+ "arguments": tool_args,
364
+ "result_summary": result[:500] + "..." if len(result) > 500 else result,
365
+ "result_length": len(result),
366
+ "execution_data": safe_execution_data,
367
+ "success": execution_data is None or (hasattr(execution_data, 'error') and execution_data.error is None) if execution_data else True
368
+ }
369
+
370
+ if "tool_executions" not in state:
371
+ state["tool_executions"] = []
372
+ state["tool_executions"].append(tool_execution)
373
+
374
+ # Update stats
375
+ if tool_name == "add_and_execute_jupyter_code_cell":
376
+ state["session_stats"]["total_code_executions"] = state["session_stats"].get("total_code_executions", 0) + 1
377
+ elif tool_name == "web_search":
378
+ state["session_stats"]["total_searches"] = state["session_stats"].get("total_searches", 0) + 1
379
+
380
+ if not tool_execution["success"]:
381
+ state["session_stats"]["total_errors"] = state["session_stats"].get("total_errors", 0) + 1
382
+
383
+ logger.debug(f"Logged tool execution {tool_name} ({tool_call_id}) in session {self.session_id}")
384
+
385
+ def add_message(self, state: Dict, role: str, content: str,
386
+ tool_calls: List = None, tool_call_id: str = None,
387
+ raw_execution: Any = None, metadata: Dict = None) -> None:
388
+ """Add message to conversation history with full context"""
389
+ timestamp = datetime.datetime.now(datetime.timezone.utc).isoformat()
390
+
391
+ message = {
392
+ "role": role,
393
+ "content": content,
394
+ "timestamp": timestamp
395
+ }
396
+
397
+ if tool_calls:
398
+ message["tool_calls"] = tool_calls
399
+ if tool_call_id:
400
+ message["tool_call_id"] = tool_call_id
401
+ if raw_execution:
402
+ message["raw_execution"] = raw_execution
403
+ if metadata:
404
+ message["metadata"] = metadata
405
+
406
+ state["conversation_history"].append(message)
407
+ logger.debug(f"Added {role} message to session {self.session_id} conversation history")
408
+
409
+ def update_execution_state(self, state: Dict, **kwargs) -> None:
410
+ """Update execution state fields"""
411
+ for key, value in kwargs.items():
412
+ if key in state["execution_state"]:
413
+ state["execution_state"][key] = value
414
+ logger.debug(f"Updated execution state {key}={value} for session {self.session_id}")
415
+
416
+ # Try to sync with global EXECUTION_STATES for UI consistency (if available)
417
+ try:
418
+ import sys
419
+ if 'app' in sys.modules:
420
+ execution_states = getattr(sys.modules['app'], 'EXECUTION_STATES', None)
421
+ if execution_states and self.session_id in execution_states:
422
+ for key, value in kwargs.items():
423
+ execution_states[self.session_id][key] = value
424
+ except (ImportError, AttributeError):
425
+ pass # Ignore if we can't sync with global state
426
+
427
+ def update_notebook_data(self, state: Dict, notebook_data: Dict) -> None:
428
+ """Update notebook data in session state"""
429
+ state["notebook_data"] = notebook_data
430
+ logger.debug(f"Updated notebook data for session {self.session_id} ({len(notebook_data.get('cells', []))} cells)")
431
+
432
+ def get_conversation_history(self, state: Dict) -> List[Dict]:
433
+ """Get conversation history suitable for LLM API calls"""
434
+ return state.get("conversation_history", [])
435
+
436
+ def validate_and_repair_conversation(self, state: Dict) -> None:
437
+ """Validate and repair conversation history to ensure tool calls have responses"""
438
+ conversation = state.get("conversation_history", [])
439
+ if not conversation:
440
+ return
441
+
442
+ pending_tool_calls = set()
443
+ valid_messages = []
444
+
445
+ for message in conversation:
446
+ if message.get("role") == "assistant" and message.get("tool_calls"):
447
+ # Track tool calls
448
+ for tool_call in message["tool_calls"]:
449
+ pending_tool_calls.add(tool_call["id"])
450
+ valid_messages.append(message)
451
+
452
+ elif message.get("role") == "tool" and message.get("tool_call_id"):
453
+ # Remove from pending when we find a response
454
+ pending_tool_calls.discard(message["tool_call_id"])
455
+ valid_messages.append(message)
456
+
457
+ else:
458
+ # Regular message (system, user, assistant without tool calls)
459
+ valid_messages.append(message)
460
+
461
+ # If there are incomplete tool calls, remove the assistant messages that created them
462
+ if pending_tool_calls:
463
+ logger.warning(f"Found incomplete tool calls in conversation: {pending_tool_calls}")
464
+ logger.warning("Removing incomplete assistant messages to repair conversation")
465
+
466
+ repaired_messages = []
467
+ for message in valid_messages:
468
+ if (message.get("role") == "assistant" and
469
+ message.get("tool_calls") and
470
+ any(tc["id"] in pending_tool_calls for tc in message["tool_calls"])):
471
+ logger.debug("Removing assistant message with incomplete tool calls")
472
+ continue
473
+ repaired_messages.append(message)
474
+
475
+ # Update conversation history
476
+ state["conversation_history"] = repaired_messages
477
+ logger.info(f"Repaired conversation: {len(conversation)} -> {len(repaired_messages)} messages")
478
+
479
+ # Save the repaired state
480
+ self.save_state(state)
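The repair pass enforces one invariant: every id in an assistant message's `tool_calls` must be answered by a later `tool` message with the matching `tool_call_id`. The pairing check on its own, as a sketch:

```python
def pending_tool_call_ids(conversation):
    """Return ids of tool calls that never received a tool response."""
    pending = set()
    for msg in conversation:
        if msg.get("role") == "assistant" and msg.get("tool_calls"):
            pending.update(tc["id"] for tc in msg["tool_calls"])
        elif msg.get("role") == "tool" and msg.get("tool_call_id"):
            pending.discard(msg["tool_call_id"])
    return pending

history = [
    {"role": "assistant", "tool_calls": [{"id": "a"}, {"id": "b"}]},
    {"role": "tool", "tool_call_id": "a", "content": "ok"},
]
print(pending_tool_call_ids(history))  # → {'b'}
```

Any assistant message whose ids remain in the pending set is the one the repair step removes, since chat-completions APIs reject histories with unanswered tool calls.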
481
+
482
+ def session_exists(self) -> bool:
483
+ """Check if session state file exists"""
484
+ return self.state_file.exists()
485
+
486
+ def get_session_summary(self, state: Dict) -> str:
487
+ """Get human-readable session summary"""
488
+ stats = state.get("session_stats", {})
489
+ created = datetime.datetime.fromisoformat(state["created_at"])
490
+
491
+ return f"""Session {self.session_id}:
492
+ - Created: {created.strftime('%Y-%m-%d %H:%M:%S UTC')}
493
+ - Messages: {stats.get('total_messages', 0)}
494
+ - Code Executions: {stats.get('total_code_executions', 0)}
495
+ - Web Searches: {stats.get('total_searches', 0)}
496
+ - Errors: {stats.get('total_errors', 0)}
497
+ - Duration: {stats.get('session_duration_seconds', 0)}s
498
+ - Hardware: {state.get('hardware_config', {}).get('gpu_type', 'unknown')}
499
+ - Model: {state.get('api_config', {}).get('model_name', 'unknown')}"""
500
+
501
+
502
+ def execute_code(sbx, code):
503
+ logger.debug(f"Executing code in sandbox ({len(code)} characters)")
504
+ execution = sbx.run_code(code, on_stdout=lambda data: logger.debug(f'stdout: {data}'))
505
+ output = ""
506
+ if len(execution.logs.stdout) > 0:
507
+ output += "\n".join(execution.logs.stdout)
508
+ logger.debug(f"Execution produced {len(execution.logs.stdout)} stdout lines")
509
+     if len(execution.logs.stderr) > 0:
510
+         if output:
+             output += "\n"  # keep stderr from running into the stdout text
+         output += "\n".join(execution.logs.stderr)
511
+         logger.debug(f"Execution produced {len(execution.logs.stderr)} stderr lines")
512
+     if execution.error is not None:
513
+         if output:
+             output += "\n"
+         output += execution.error.traceback
514
+ logger.warning(f"Execution error: {execution.error.name}: {execution.error.value}")
515
+ logger.debug(f"Code execution completed, output length: {len(output)}")
516
+ return output, execution
517
+
518
+
519
+ def parse_exec_result_llm(execution, max_code_output=1000):
520
+ logger.debug(f"Parsing execution result for LLM (max_output: {max_code_output})")
521
+ output = []
522
+
523
+ def truncate_if_needed(text):
524
+ if len(text) > max_code_output:
525
+ return (text[:max_code_output] + f"\n[Output is truncated as it is more than {max_code_output} characters]")
526
+ return text
527
+
528
+ if execution.results:
529
+ results_text_parts = []
530
+ plot_count = 0
531
+
532
+ for result in execution.results:
533
+ if hasattr(result, 'text') and result.text:
534
+ results_text_parts.append(result.text)
535
+ elif hasattr(result, 'png') and result.png:
536
+ plot_count += 1
537
+ results_text_parts.append(f"[Plot {plot_count} generated and displayed]")
538
+ elif hasattr(result, 'html') and result.html:
539
+ results_text_parts.append("[HTML output generated]")
540
+
541
+ if results_text_parts:
542
+ results_text = "\n".join(results_text_parts)
543
+ output.append(truncate_if_needed(results_text))
544
+
545
+ logger.debug(f"Added {len(execution.results)} execution results (including {plot_count} plots)")
546
+ if execution.logs.stdout:
547
+ stdout_text = "\n".join(execution.logs.stdout)
548
+ output.append(truncate_if_needed(stdout_text))
549
+ logger.debug(f"Added stdout output ({len(execution.logs.stdout)} lines)")
550
+ if execution.logs.stderr:
551
+ stderr_text = "\n".join(execution.logs.stderr)
552
+ output.append(truncate_if_needed(stderr_text))
553
+ logger.debug(f"Added stderr output ({len(execution.logs.stderr)} lines)")
554
+ if execution.error is not None:
555
+ output.append(truncate_if_needed(execution.error.traceback))
556
+ logger.debug(f"Added error traceback: {execution.error.name}")
557
+
558
+ final_output = "\n".join(output)
559
+ logger.debug(f"Parsed execution result for LLM: {len(final_output)} characters")
560
+ return final_output
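The truncation rule above is character-based with an explicit marker, so the model knows output was cut rather than naturally short. Exercised in isolation with a small limit:

```python
def truncate_if_needed(text, max_code_output=1000):
    # Cut long output and append an explicit marker so the consumer knows
    # the text was clipped rather than naturally short.
    if len(text) > max_code_output:
        return (text[:max_code_output]
                + f"\n[Output is truncated as it is more than {max_code_output} characters]")
    return text

short = truncate_if_needed("x" * 10, max_code_output=20)
clipped = truncate_if_needed("x" * 50, max_code_output=20)
print(clipped.splitlines()[-1])  # → [Output is truncated as it is more than 20 characters]
```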
561
+
562
+ def clean_messages_for_api(messages):
563
+ """
564
+ Create a clean copy of messages without raw_execution fields and metadata for API calls.
565
+ Also validates that tool calls have corresponding tool responses.
566
+ This prevents 413 errors and API validation errors.
567
+ """
568
+ logger.debug(f"Cleaning {len(messages)} messages for API call")
569
+ cleaned_messages = []
570
+ raw_execution_count = 0
571
+ metadata_count = 0
572
+ pending_tool_calls = set()
573
+
574
+ for message in messages:
575
+ cleaned_message = message.copy()
576
+
577
+ # Remove raw_execution data
578
+ if "raw_execution" in cleaned_message:
579
+ cleaned_message.pop("raw_execution")
580
+ raw_execution_count += 1
581
+
582
+ # Remove metadata and timestamp
583
+ if "metadata" in cleaned_message:
584
+ cleaned_message.pop("metadata")
585
+ metadata_count += 1
586
+ if "timestamp" in cleaned_message:
587
+ cleaned_message.pop("timestamp")
588
+
589
+ # Track tool calls and responses for validation
590
+ if cleaned_message.get("role") == "assistant" and cleaned_message.get("tool_calls"):
591
+ for tool_call in cleaned_message["tool_calls"]:
592
+ pending_tool_calls.add(tool_call["id"])
593
+ elif cleaned_message.get("role") == "tool" and cleaned_message.get("tool_call_id"):
594
+ pending_tool_calls.discard(cleaned_message["tool_call_id"])
595
+
596
+ cleaned_messages.append(cleaned_message)
597
+
598
+ # If there are pending tool calls without responses, remove the assistant message with tool calls
599
+ if pending_tool_calls:
600
+ logger.warning(f"Found {len(pending_tool_calls)} tool calls without responses: {pending_tool_calls}")
601
+ logger.warning("Removing incomplete tool call messages to prevent API errors")
602
+
603
+ # Remove messages with incomplete tool calls
604
+ filtered_messages = []
605
+ for message in cleaned_messages:
606
+ if (message.get("role") == "assistant" and
607
+ message.get("tool_calls") and
608
+ any(tc["id"] in pending_tool_calls for tc in message["tool_calls"])):
609
+ logger.debug("Removing assistant message with incomplete tool calls")
610
+ continue
611
+ filtered_messages.append(message)
612
+
613
+ cleaned_messages = filtered_messages
614
+
615
+ logger.debug(f"Cleaned messages: removed raw_execution from {raw_execution_count}, metadata from {metadata_count}")
616
+ logger.debug(f"Final cleaned message count: {len(cleaned_messages)}")
617
+ return cleaned_messages
618
+
619
+
620
+ def web_search(query):
621
+ """
622
+ Perform web search using Tavily API with automatic year addition and formatting.
623
+
624
+ Args:
625
+ query (str): Search query (max 400 characters)
626
+
627
+ Returns:
628
+ str: Formatted search results for LLM consumption
629
+ """
630
+ if not tavily_client:
631
+ logger.error("Tavily client not initialized - API key missing")
632
+ return "❌ Search unavailable: Tavily API key not configured"
633
+
634
+ # Validate query length
635
+ if len(query) > 400:
636
+ logger.warning(f"Query too long ({len(query)} chars), truncating to 400")
637
+ query = query[:400]
638
+
639
+ # Add current year to query for more recent results
640
+ current_year = datetime.datetime.now().year
641
+ if str(current_year) not in query:
642
+ # Only add year if query has room for it
643
+ year_addition = f" {current_year}"
644
+ if len(query + year_addition) <= 400:
645
+ query += year_addition
646
+ logger.debug(f"Added current year to query: {current_year}")
647
+
648
+ logger.info(f"Performing Tavily search: '{query}' ({len(query)} chars)")
649
+
650
+ try:
651
+ # Perform search with optimized parameters
652
+ response = tavily_client.search(
653
+ query=query,
654
+ search_depth="basic", # Use basic for faster results
655
+ max_results=5, # Limit results to avoid overwhelming context
656
+ include_answer=True, # Include AI-generated answer
657
+ include_raw_content=False, # Don't include raw content to save tokens
658
+ include_images=False # Don't include images
659
+ )
660
+
661
+ logger.info(f"Search completed: {len(response.get('results', []))} results found")
662
+
663
+ # Format results for LLM consumption
664
+ formatted_results = format_search_results_for_llm(response)
665
+
666
+ logger.debug(f"Formatted search results: {len(formatted_results)} characters")
667
+ return formatted_results
668
+
669
+ except Exception as e:
670
+ logger.error(f"Tavily search failed: {str(e)}")
671
+ return f"❌ Search failed: {str(e)}"
672
+
673
+
674
+ def format_search_results_for_llm(response):
675
+ """Format Tavily search results for LLM consumption"""
676
+
677
+ query = response.get('query', 'Unknown query')
678
+ results = response.get('results', [])
679
+ answer = response.get('answer', '')
680
+
681
+ formatted = f"🔍 **Web Search Results for:** {query}\n\n"
682
+
683
+ if answer:
684
+ formatted += f"**Quick Answer:** {answer}\n\n"
685
+
686
+ if results:
687
+ formatted += f"**Found {len(results)} relevant sources:**\n\n"
688
+
689
+ for i, result in enumerate(results, 1):
690
+ title = result.get('title', 'Untitled')
691
+ url = result.get('url', '')
692
+ content = result.get('content', '')
693
+ score = result.get('score', 0)
694
+
695
+             # Content is passed through untruncated; re-enable this cap if
696
+             # long results start to crowd the model's context window:
697
+             # if len(content) > 300:
+             #     content = content[:300] + "..."
698
+
699
+ formatted += f"**{i}. {title}** (Relevance: {score:.2f})\n"
700
+ formatted += f" 🔗 {url}\n"
701
+ formatted += f" 📄 {content}\n\n"
702
+ else:
703
+ formatted += "No results found.\n"
704
+
705
+ return formatted
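Given the response shape Tavily returns (`query`, `answer`, and `results` entries with `title`/`url`/`content`/`score`), the formatter can be exercised with a stubbed payload. This standalone sketch mirrors its structure in plain-text form (names and layout here are illustrative, not the function above):

```python
def format_results(response):
    lines = [f"Web search results for: {response.get('query', 'Unknown query')}"]
    if response.get("answer"):
        lines.append(f"Quick answer: {response['answer']}")
    for i, r in enumerate(response.get("results", []), 1):
        lines.append(f"{i}. {r.get('title', 'Untitled')} "
                     f"(score {r.get('score', 0):.2f}) {r.get('url', '')}")
    return "\n".join(lines)

stub = {
    "query": "pandas merge how",
    "answer": "Use DataFrame.merge.",
    "results": [{"title": "Merging", "url": "https://example.com", "score": 0.91}],
}
print(format_results(stub))
```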
706
+
707
+
708
+ def run_interactive_notebook_with_session_state(client, model, session_state_manager, session_state, sbx, stop_event=None, tools=None):
709
+ logger.info(f"Starting interactive notebook with session state for {session_state_manager.session_id}")
710
+
711
+ # Get conversation history from session state
712
+ messages = session_state_manager.get_conversation_history(session_state)
713
+ notebook = JupyterNotebook(messages)
714
+
715
+ # Update execution state
716
+ session_state_manager.update_execution_state(session_state, is_running=True, sandbox_active=True, current_phase="initializing")
717
+
718
+ # Use provided tools or default to all tools
719
+ if tools is None:
720
+ tools = TOOLS
721
+
722
+ try:
723
+ sbx_info = sbx.get_info()
724
+ notebook.add_sandbox_countdown(sbx_info.started_at, sbx_info.end_at)
725
+
726
+ # Store sandbox info in session state
727
+ session_state["execution_state"]["sandbox_info"] = {
728
+ "started_at": sbx_info.started_at.isoformat(),
729
+ "end_at": sbx_info.end_at.isoformat(),
730
+ "timeout_seconds": int((sbx_info.end_at - sbx_info.started_at).total_seconds())
731
+ }
732
+
733
+ logger.debug(f"Added sandbox countdown: {sbx_info.started_at} to {sbx_info.end_at}")
734
+ except Exception as e:
735
+ logger.warning(f"Failed to get sandbox info: {str(e)}")
736
+
737
+ logger.debug("Initial notebook yield in 'generating' mode")
738
+
739
+ # Update notebook data in session state
740
+ session_state_manager.update_notebook_data(session_state, notebook.data)
741
+
742
+ # Save initial state
743
+ session_state_manager.save_state(session_state)
744
+
745
+ yield notebook.render(mode="generating"), notebook.data, messages
746
+
747
+ max_code_output = 1000
748
+ turns = session_state["execution_state"]["current_turn"]
749
+ done = False
750
+ previous_execution_had_error = False
751
+ previous_execution_had_warnings = False
752
+
753
+ logger.info(f"Starting interactive loop from turn {turns} with max_output={max_code_output}, max_turns={MAX_TURNS}")
754
+
755
+     while not done and (turns < MAX_TURNS) and (stop_event is None or not stop_event.is_set()):
756
+ turns += 1
757
+ logger.info(f"Starting turn {turns}/{MAX_TURNS}")
758
+
759
+ try:
760
+ # Update phase to generating
761
+ session_state_manager.update_execution_state(session_state, current_phase="generating")
762
+
763
+ # Refresh messages from session state before API call
764
+ messages = session_state_manager.get_conversation_history(session_state)
765
+ logger.debug(f"Making API call to {model} with {len(messages)} messages")
766
+
767
+ # Prepare request data for logging
768
+ request_data = {
769
+ "messages": clean_messages_for_api(messages),
770
+ "model": model,
771
+ "tools": tools,
772
+ "tool_choice": "auto"
773
+ }
774
+
775
+ # Prepare session metadata for Phoenix tracing
776
+ session_metadata = {
777
+ "turn": turns,
778
+ "max_turns": MAX_TURNS,
779
+ "model": model,
780
+ "tools_count": len(tools),
781
+ "messages_count": len(messages),
782
+ "current_phase": "generating"
783
+ }
784
+
785
+ # Add hardware config if available
786
+ if "hardware_config" in session_state:
787
+ hw_config = session_state["hardware_config"]
788
+ session_metadata.update({
789
+ "gpu_type": hw_config.get("gpu_type", "unknown"),
790
+ "cpu_cores": hw_config.get("cpu_cores", "unknown"),
791
+ "memory_gb": hw_config.get("memory_gb", "unknown")
792
+ })
793
+
794
+ # Wrap OpenAI API call with Phoenix session context for proper grouping
795
+ with create_phoenix_session_context(
796
+ session_id=session_state_manager.session_id,
797
+ user_id=None, # Could be extracted from request context if available
798
+ metadata=session_metadata
799
+ ):
800
+            logger.debug(f"Making OpenAI API call with Phoenix session context: {session_state_manager.session_id}")
+            response = client.chat.completions.create(**request_data)
+            logger.debug("API call successful within Phoenix session context")
+
+            # Log the complete LLM interaction
+            session_state_manager.log_llm_interaction(
+                session_state, request_data, response.model_dump(), model, turns
+            )
+        except Exception as e:
+            # Handle inference client errors
+            logger.error(f"Inference failed on turn {turns}: {str(e)}")
+
+            # Add detailed error information to the notebook
+            error_message = str(e)
+            if "429" in error_message or "too_many_requests" in error_message.lower():
+                detailed_error = f"""**API Rate Limit Exceeded** 🚫
+
+The inference service has reached its rate limit. This typically means:
+- Too many requests have been sent in a short period
+- Daily quota has been exceeded
+- Service is temporarily overloaded
+
+**What you can try:**
+- Wait a few minutes and try again
+- If using Cerebras API, check your daily quota
+- Try using a different model or service
+- Contact support if the issue persists
+
+**Technical details:**
+```
+{error_message}
+```"""
+            elif "401" in error_message or "unauthorized" in error_message.lower():
+                detailed_error = f"""**Authentication Error** 🔐
+
+There's an issue with API authentication:
+- API key might be missing or invalid
+- API key might have expired
+- Insufficient permissions
+
+**Technical details:**
+```
+{error_message}
+```"""
+            elif "500" in error_message or "internal" in error_message.lower():
+                detailed_error = f"""**Server Error** 🔧
+
+The inference service encountered an internal error:
+- Service might be temporarily unavailable
+- Try again in a few moments
+- If the issue persists, it's likely a service-side problem
+
+**Technical details:**
+```
+{error_message}
+```"""
+            else:
+                detailed_error = f"""**Inference Service Error** ⚠️
+
+An error occurred while communicating with the AI service:
+
+**Technical details:**
+```
+{error_message}
+```
+
+**What you can try:**
+- Check your internet connection
+- Try again in a few moments
+- If the problem persists, contact support"""
+
+            notebook.add_error(detailed_error)
+
+            # Add error to session state
+            session_state_manager.add_message(
+                session_state, "assistant", detailed_error,
+                metadata={"type": "error", "error_type": "api_error", "turn": turns}
+            )
+
+            # Update execution state
+            session_state_manager.update_execution_state(
+                session_state, is_running=False, last_execution_successful=False
+            )
+
+            # Update notebook data and save state
+            session_state_manager.update_notebook_data(session_state, notebook.data)
+            session_state_manager.save_state(session_state)
+
+            yield notebook.render(mode="error"), notebook.data, messages
+            return
+
+        # Get the response content and tool calls
+        full_response = response.choices[0].message.content or ""
+        tool_calls = response.choices[0].message.tool_calls or []
+
+        logger.debug(f"Turn {turns}: Response content length: {len(full_response)}, Tool calls: {len(tool_calls)}")
+
+        # Add markdown cell for assistant's thinking
+        if full_response.strip():
+            logger.debug(f"Adding assistant response as markdown ({len(full_response)} chars)")
+            notebook.add_markdown(full_response, "assistant")
+        else:
+            logger.debug("Skipping empty assistant response")
+
+        # Handle tool calls and add assistant message to session state only
+        if tool_calls:
+            logger.info(f"Processing {len(tool_calls)} tool calls on turn {turns}")
+            # Add assistant message to session state (messages will be derived from this)
+            session_state_manager.add_message(
+                session_state, "assistant", full_response,
+                tool_calls=[{
+                    "id": tc.id,
+                    "type": "function",
+                    "function": {"name": tc.function.name, "arguments": tc.function.arguments}
+                } for tc in tool_calls],
+                metadata={"turn": turns, "type": "thinking"}
+            )
+            logger.debug(f"Added assistant message with {len(tool_calls)} tool calls to session state")
+        elif full_response.strip():
+            # If no tool calls but we have content, add regular assistant message
+            session_state_manager.add_message(
+                session_state, "assistant", full_response,
+                metadata={"turn": turns, "type": "thinking"}
+            )
+            logger.debug("Added regular assistant message to session state")
+
+        for i, tool_call in enumerate(tool_calls):
+            logger.debug(f"Processing tool call {i+1}/{len(tool_calls)}: {tool_call.function.name}")
+
+            if tool_call.function.name == "add_and_execute_jupyter_code_cell":
+                # Update phase to executing code
+                session_state_manager.update_execution_state(session_state, current_phase="executing_code")
+
+                logger.debug(f"Processing code execution tool call: {tool_call.id}")
+                tool_args = json.loads(tool_call.function.arguments)
+                code = tool_args["code"]
+                logger.debug(f"Code to execute: {len(code)} characters")
+
+                # Determine if we should reuse the last cell or create a new one
+                # Reuse if there were errors (not just warnings) in the previous execution
+                should_reuse_cell = (previous_execution_had_error and
+                                     notebook.get_last_cell_type() == "code")
+
+                if should_reuse_cell:
+                    logger.info("Reusing last code cell due to previous execution error")
+                    # Update the existing cell's code instead of creating a new one
+                    notebook.update_last_code_cell(code)
+                else:
+                    logger.debug("Creating new code cell")
+                    # Create a new cell (normal behavior)
+                    notebook.add_code(code)
+
+                logger.debug("Yielding notebook in 'executing' mode")
+                yield notebook.render(mode="executing"), notebook.data, messages
+
+                try:
+                    # Check for stop event before execution
+                    if stop_event and stop_event.is_set():
+                        logger.info("Stop event detected before code execution")
+                        stopped_message = """**Execution Stopped** ⏸️
+
+The execution was stopped by user request before the code could run."""
+                        notebook.add_markdown(stopped_message, "assistant")
+                        yield notebook.render(mode="stopped"), notebook.data, messages
+                        return
+
+                    # Execution sandbox call - might timeout
+                    logger.info("Executing code in sandbox")
+                    execution = sbx.run_code(code)
+                    notebook.append_execution(execution)
+
+                    # Update error and warning tracking for next iteration
+                    previous_execution_had_error = notebook.has_execution_error(execution)
+                    previous_execution_had_warnings = notebook.has_execution_warnings(execution)
+                    # Log tool execution in session state
+                    tool_args = json.loads(tool_call.function.arguments)
+                    tool_response_content = parse_exec_result_llm(execution, max_code_output=max_code_output)
+                    session_state_manager.log_tool_execution(
+                        session_state, tool_call.id, "add_and_execute_jupyter_code_cell",
+                        tool_args, tool_response_content, execution
+                    )
+
+                    if previous_execution_had_error:
+                        logger.warning("Code execution resulted in error")
+                    elif previous_execution_had_warnings:
+                        logger.info("Code execution completed with warnings")
+                    else:
+                        logger.info("Code execution completed successfully")
+
+                except Exception as e:
+                    # Handle sandbox timeout/execution errors
+                    logger.error(f"Code execution failed: {str(e)}")
+
+                    # Add detailed error information for code execution failures
+                    error_message = str(e)
+                    if "timeout" in error_message.lower():
+                        detailed_error = f"""**Code Execution Timeout** ⏰
+
+The code execution took too long and was terminated:
+- Code may have entered an infinite loop
+- Processing large datasets can cause timeouts
+- Complex computations may exceed time limits
+
+**What you can try:**
+- Optimize your code for better performance
+- Break down complex operations into smaller steps
+- Increase the timeout limit in settings
+- Check for infinite loops or blocking operations
+
+**Technical details:**
+```
+{error_message}
+```"""
+                    else:
+                        detailed_error = f"""**Code Execution Failed** 💥
+
+An error occurred while executing the code in the sandbox:
+
+**Technical details:**
+```
+{error_message}
+```
+
+**What you can try:**
+- Check the code for syntax errors
+- Verify all required packages are available
+- Try simplifying the code
+- Check the sandbox logs for more details"""
+
+                    notebook.add_error(detailed_error)
+                    yield notebook.render(mode="error"), notebook.data, messages
+                    return
+
+                # Prepare tool response (already computed above)
+                raw_execution = notebook.parse_exec_result_nb(execution)
+
+                logger.debug(f"Tool response: {len(tool_response_content)} chars content, {len(raw_execution)} raw outputs")
+
+                # Add tool response to session state only
+                session_state_manager.add_message(
+                    session_state, "tool", tool_response_content,
+                    tool_call_id=tool_call.id, raw_execution=raw_execution,
+                    metadata={"turn": turns, "execution_successful": not previous_execution_had_error}
+                )
+            elif tool_call.function.name == "web_search":
+                # Update phase to searching
+                session_state_manager.update_execution_state(session_state, current_phase="searching")
+
+                logger.debug(f"Processing search tool call: {tool_call.id}")
+                tool_args = json.loads(tool_call.function.arguments)
+                query = tool_args["query"]
+                logger.debug(f"Search query: '{query}' ({len(query)} chars)")
+
+                # Add search status to notebook
+                notebook.add_markdown("🔍 **Searching the web...**", "assistant")
+                yield notebook.render(mode="generating"), notebook.data, messages
+
+                try:
+                    # Perform search
+                    search_results = web_search(query)
+                    logger.info("Search completed successfully")
+
+                    # Log search tool execution
+                    tool_args = json.loads(tool_call.function.arguments)
+                    session_state_manager.log_tool_execution(
+                        session_state, tool_call.id, "web_search",
+                        tool_args, search_results
+                    )
+
+                    # Add search results to notebook
+                    notebook.add_markdown(search_results, "assistant")
+
+                    # Add tool response to session state only
+                    session_state_manager.add_message(
+                        session_state, "tool", search_results,
+                        tool_call_id=tool_call.id,
+                        metadata={"turn": turns, "search_successful": True}
+                    )
+
+                except Exception as e:
+                    error_message = f"❌ Search failed: {str(e)}"
+                    logger.error(f"Search tool call failed: {str(e)}")
+
+                    # Log failed search
+                    tool_args = json.loads(tool_call.function.arguments)
+                    session_state_manager.log_tool_execution(
+                        session_state, tool_call.id, "web_search",
+                        tool_args, error_message
+                    )
+
+                    # Add error to notebook
+                    notebook.add_markdown(error_message, "assistant")
+
+                    # Add error response to session state only
+                    session_state_manager.add_message(
+                        session_state, "tool", error_message,
+                        tool_call_id=tool_call.id,
+                        metadata={"turn": turns, "search_successful": False, "error": str(e)}
+                    )
+            elif tool_call.function.name == "edit_and_execute_current_cell":
+                # Update phase to executing code
+                session_state_manager.update_execution_state(session_state, current_phase="executing_code")
+
+                logger.debug(f"Processing edit current cell tool call: {tool_call.id}")
+                tool_args = json.loads(tool_call.function.arguments)
+                code = tool_args["code"]
+                logger.debug(f"Code to execute in current cell: {len(code)} characters")
+
+                # Check if we have a code cell to edit
+                if notebook.get_last_cell_type() == "code":
+                    logger.info("Editing last code cell with new code")
+                    notebook.update_last_code_cell(code)
+                else:
+                    logger.info("No code cell to edit, creating new cell")
+                    notebook.add_code(code)
+
+                logger.debug("Yielding notebook in 'executing' mode")
+                yield notebook.render(mode="executing"), notebook.data, messages
+
+                try:
+                    # Check for stop event before execution
+                    if stop_event and stop_event.is_set():
+                        logger.info("Stop event detected before code execution")
+                        stopped_message = """**Execution Stopped** ⏸️
+
+The execution was stopped by user request before the code could run."""
+                        notebook.add_markdown(stopped_message, "assistant")
+                        yield notebook.render(mode="stopped"), notebook.data, messages
+                        return
+
+                    # Execution sandbox call - might timeout
+                    logger.info("Executing edited code in sandbox")
+                    execution = sbx.run_code(code)
+                    notebook.append_execution(execution)
+
+                    # Update error and warning tracking for next iteration
+                    previous_execution_had_error = notebook.has_execution_error(execution)
+                    previous_execution_had_warnings = notebook.has_execution_warnings(execution)
+                    # Log tool execution in session state
+                    tool_response_content = parse_exec_result_llm(execution, max_code_output=max_code_output)
+                    session_state_manager.log_tool_execution(
+                        session_state, tool_call.id, "edit_and_execute_current_cell",
+                        tool_args, tool_response_content, execution
+                    )
+
+                    if previous_execution_had_error:
+                        logger.warning("Edited code execution resulted in error")
+                    elif previous_execution_had_warnings:
+                        logger.info("Edited code execution completed with warnings")
+                    else:
+                        logger.info("Edited code execution completed successfully")
+
+                except Exception as e:
+                    # Handle sandbox timeout/execution errors
+                    logger.error(f"Edited code execution failed: {str(e)}")
+
+                    # Add detailed error information for code execution failures
+                    error_message = str(e)
+                    if "timeout" in error_message.lower():
+                        detailed_error = f"""**Code Execution Timeout** ⏰
+
+The edited code execution took too long and was terminated:
+- Code may have entered an infinite loop
+- Processing large datasets can cause timeouts
+- Complex computations may exceed time limits
+
+**What you can try:**
+- Optimize your code for better performance
+- Break down complex operations into smaller steps
+- Increase the timeout limit in settings
+- Check for infinite loops or blocking operations
+
+**Technical details:**
+```
+{error_message}
+```"""
+                    else:
+                        detailed_error = f"""**Code Execution Failed** 💥
+
+An error occurred while executing the edited code in the sandbox:
+
+**Technical details:**
+```
+{error_message}
+```
+
+**What you can try:**
+- Check the code for syntax errors
+- Verify all required packages are available
+- Try simplifying the code
+- Check the sandbox logs for more details"""
+
+                    notebook.add_error(detailed_error)
+                    yield notebook.render(mode="error"), notebook.data, messages
+                    return
+
+                # Prepare tool response
+                raw_execution = notebook.parse_exec_result_nb(execution)
+
+                logger.debug(f"Tool response: {len(tool_response_content)} chars content, {len(raw_execution)} raw outputs")
+
+                # Add tool response to session state only
+                session_state_manager.add_message(
+                    session_state, "tool", tool_response_content,
+                    tool_call_id=tool_call.id, raw_execution=raw_execution,
+                    metadata={"turn": turns, "execution_successful": not previous_execution_had_error, "action": "edit_cell"}
+                )
+            elif tool_call.function.name == "execute_shell_command":
+                # Update phase to executing shell command
+                session_state_manager.update_execution_state(session_state, current_phase="executing_shell")
+
+                logger.debug(f"Processing shell command tool call: {tool_call.id}")
+                tool_args = json.loads(tool_call.function.arguments)
+                command = tool_args["command"]
+                logger.debug(f"Shell command to execute: '{command}'")
+
+                # Add shell command to notebook with special styling
+                notebook.add_shell_command(command)
+
+                logger.debug("Yielding notebook in 'executing' mode")
+                yield notebook.render(mode="executing"), notebook.data, messages
+
+                try:
+                    # Check for stop event before execution
+                    if stop_event and stop_event.is_set():
+                        logger.info("Stop event detected before shell execution")
+                        stopped_message = """**Execution Stopped** ⏸️
+
+The execution was stopped by user request before the shell command could run."""
+                        notebook.add_markdown(stopped_message, "assistant")
+                        yield notebook.render(mode="stopped"), notebook.data, messages
+                        return
+
+                    # Execute shell command in sandbox using raw shell execution
+                    logger.info(f"Executing raw shell command in sandbox: {command}")
+
+                    try:
+                        # Use the new raw shell execution method
+                        if hasattr(sbx, 'run_shell'):
+                            shell_execution = sbx.run_shell(command, timeout=60)
+                            logger.info("Shell command executed using raw shell method")
+                        else:
+                            # Fallback: Execute shell command using Python subprocess within sandbox
+                            shell_code = f"""
+import subprocess
+import sys
+
+try:
+    result = subprocess.run(
+        {repr(command)},
+        shell=True,
+        capture_output=True,
+        text=True,
+        timeout=60
+    )
+
+    if result.stdout:
+        print("STDOUT:")
+        print(result.stdout)
+
+    if result.stderr:
+        print("STDERR:")
+        print(result.stderr)
+
+    print(f"Exit code: {{result.returncode}}")
+
+except subprocess.TimeoutExpired:
+    print("Error: Command timed out after 60 seconds")
+except Exception as e:
+    print(f"Error executing command: {{e}}")
+"""
+                            shell_execution = sbx.run_code(shell_code)
+                            logger.info("Shell command executed via Python subprocess fallback")
+
+                        # Add shell execution results to notebook
+                        notebook.append_shell_execution(shell_execution)
+
+                        # Prepare response content for LLM
+                        shell_response_content = parse_exec_result_llm(shell_execution, max_code_output=max_code_output)
+
+                        # Log tool execution in session state
+                        session_state_manager.log_tool_execution(
+                            session_state, tool_call.id, "execute_shell_command",
+                            tool_args, shell_response_content, shell_execution
+                        )
+
+                        # Check for errors
+                        shell_had_error = notebook.has_execution_error(shell_execution)
+
+                        if shell_had_error:
+                            logger.warning("Shell command execution resulted in error")
+                        else:
+                            logger.info("Shell command execution completed successfully")
+
+                    except Exception as shell_error:
+                        logger.error(f"Shell command execution failed: {str(shell_error)}")
+
+                        # Create error message
+                        detailed_error = f"""**Shell Command Failed** 🔧
+
+An error occurred while executing the shell command:
+
+**Command:** `{command}`
+
+**Technical details:**
+```
+{str(shell_error)}
+```
+
+**What you can try:**
+- Check if the command exists in the sandbox environment
+- Verify command syntax
+- Try a simpler version of the command
+- Check if required tools/packages are installed"""
+
+                        notebook.add_error(detailed_error)
+
+                        # Log failed execution
+                        session_state_manager.log_tool_execution(
+                            session_state, tool_call.id, "execute_shell_command",
+                            tool_args, detailed_error
+                        )
+
+                        yield notebook.render(mode="error"), notebook.data, messages
+                        return
+
+                except Exception as e:
+                    # Handle general execution errors
+                    logger.error(f"Shell command execution failed: {str(e)}")
+
+                    detailed_error = f"""**Shell Execution Error** ⚠️
+
+An unexpected error occurred while executing the shell command:
+
+**Command:** `{command}`
+
+**Technical details:**
+```
+{str(e)}
+```"""
+
+                    notebook.add_error(detailed_error)
+                    yield notebook.render(mode="error"), notebook.data, messages
+                    return
+
+                # Prepare tool response for LLM and session state
+                raw_execution = notebook.parse_exec_result_nb(shell_execution)
+
+                logger.debug(f"Shell tool response: {len(shell_response_content)} chars content")
+
+                # Add tool response to session state
+                session_state_manager.add_message(
+                    session_state, "tool", shell_response_content,
+                    tool_call_id=tool_call.id, raw_execution=raw_execution,
+                    metadata={"turn": turns, "command": command, "execution_successful": not shell_had_error, "action": "shell_command"}
+                )
+            else:
+                logger.warning(f"Unknown tool call function: {tool_call.function.name}")
+
+        if not tool_calls:
+            logger.info(f"No tool calls on turn {turns}, conversation ending")
+            if len(full_response.strip()) == 0:
+                logger.error("Assistant provided no content and no tool calls")
+                notebook.add_error(f"No tool call and empty assistant response:\n{response.model_dump_json(indent=2)}")
+
+            # Only add the final assistant message if we didn't already add it above
+            # (in the elif full_response.strip() block)
+            if full_response.strip():
+                # Since we're now only using session state, we can safely add the message
+                # The session state manager will handle any deduplication if needed
+                session_state_manager.add_message(
+                    session_state, "assistant", full_response,
+                    metadata={"turn": turns, "type": "final_response"}
+                )
+                logger.debug("Added final assistant response to session state")
+
+            done = True
+
+        # Update session state after each turn
+        session_state_manager.update_execution_state(
+            session_state, current_turn=turns, last_execution_successful=not previous_execution_had_error
+        )
+        session_state_manager.update_notebook_data(session_state, notebook.data)
+        session_state_manager.save_state(session_state)
+
+        if done:
+            logger.info(f"Interactive notebook completed after {turns} turns")
+            session_state_manager.update_execution_state(
+                session_state, is_running=False, sandbox_active=True
+            )
+            session_state_manager.save_state(session_state)
+            yield notebook.render(mode="done"), notebook.data, messages
+        else:
+            logger.debug(f"Turn {turns} completed, yielding in 'generating' mode")
+            yield notebook.render(mode="generating"), notebook.data, messages
+
+    if turns > MAX_TURNS:
+        logger.warning(f"Interactive notebook reached maximum turns ({MAX_TURNS})")
+        error_msg = f"**Maximum Turns Reached** 🔄\n\nThe conversation has reached the maximum number of turns ({MAX_TURNS}). This is a safety limit to prevent infinite loops.\n\n**What you can try:**\n- Start a new conversation\n- Clear the notebook and begin fresh\n- Contact support if you need a higher turn limit"
+        notebook.add_error(error_msg)
+
+        # Add error to session state
+        session_state_manager.add_message(
+            session_state, "assistant", error_msg,
+            metadata={"type": "error", "error_type": "max_turns_exceeded", "turn": turns}
+        )
+
+        # Update final state
+        session_state_manager.update_execution_state(
+            session_state, is_running=False, last_execution_successful=False
+        )
+        session_state_manager.update_notebook_data(session_state, notebook.data)
+        session_state_manager.save_state(session_state)
+
+        yield notebook.render(mode="error"), notebook.data, messages
+    elif stop_event and stop_event.is_set():
+        logger.info("Interactive notebook stopped by user")
+
+        # Add a stopped message to the notebook
+        stopped_message = """**Execution Stopped** ⏸️
+
+The execution was stopped by user request. You can resume by clicking Run again."""
+        notebook.add_markdown(stopped_message, "assistant")
+
+        # Add stopped message to session state
+        session_state_manager.add_message(
+            session_state, "assistant", stopped_message,
+            metadata={"type": "status", "status_type": "stopped_by_user", "turn": turns}
+        )
+
+        # Update state to indicate pause
+        session_state_manager.update_execution_state(
+            session_state, is_running=False, is_paused=True
+        )
+        session_state_manager.update_notebook_data(session_state, notebook.data)
+        session_state_manager.save_state(session_state)
+
+        yield notebook.render(mode="stopped"), notebook.data, messages
+
+
+def run_interactive_notebook(client, model, messages, sbx, stop_event=None, tools=None):
+    """Backward compatibility wrapper for the new session state system"""
+    logger.warning("Using legacy run_interactive_notebook - this should be replaced with session state version")
+
+    # Create a temporary session for backward compatibility
+    import uuid
+    temp_session_id = str(uuid.uuid4())[:8]
+    session_manager = SessionStateManager(temp_session_id)
+
+    # Create basic session state
+    session_state = session_manager.create_initial_state(
+        hardware_config={"gpu_type": "unknown", "cpu_cores": 2, "memory_gb": 8, "timeout_sec": 300},
+        api_config={"model_name": model, "provider_type": "unknown"},
+        environment={"variables": "", "files_uploaded": []},
+        system_prompt=messages[0].get("content", "") if messages and messages[0].get("role") == "system" else ""
+    )
+
+    # Initialize conversation history with provided messages
+    session_state["conversation_history"] = messages
+
+    # Use the new session-based function
+    yield from run_interactive_notebook_with_session_state(
+        client, model, session_manager, session_state, sbx, stop_event, tools
+    )
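
Both `run_interactive_notebook` functions in this file are generators that stream a `(rendered_html, notebook_data, messages)` snapshot after each turn, and the caller keeps only the latest one. A minimal sketch of that consumption pattern, using a stand-in generator (the names below are illustrative, not part of this repo, since the real generator needs a live client and sandbox):

```python
def fake_agent_turns(steps):
    """Stand-in for the agent generator: yield one snapshot per step."""
    notebook_data = {"cells": []}
    for n in range(1, steps + 1):
        notebook_data["cells"].append({"cell_type": "code", "source": f"step {n}"})
        mode = "done" if n == steps else "generating"
        # Same tuple shape the real generator yields after every turn
        yield f"<html mode={mode}>", notebook_data, [{"role": "assistant", "content": f"turn {n}"}]


def consume(gen):
    """Drain the generator, keeping only the final snapshot (as a UI loop would)."""
    last = None
    for snapshot in gen:
        last = snapshot
    return last


html, data, messages = consume(fake_agent_turns(3))
print(html)                 # <html mode=done>
print(len(data["cells"]))   # 3
```

This is why intermediate yields use modes like `"executing"` or `"generating"` while the terminal yield uses `"done"`, `"error"`, or `"stopped"`: the consumer renders each snapshot as it arrives and the mode tells it whether more are coming.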
jupyter_handler.py ADDED
@@ -0,0 +1,1161 @@
+import nbformat
+from nbconvert import HTMLExporter
+from traitlets.config import Config
+import json
+import copy
+from jinja2 import DictLoader
+import datetime
+import logging
+
+# Configure logging for jupyter_handler module
+logger = logging.getLogger(__name__)
+
+
+system_template = """\
+<details>
+  <summary style="display: flex; align-items: center; cursor: pointer; margin-bottom: 12px;">
+    <h3 style="color: #374151; margin: 0; margin-right: 8px; font-size: 14px; font-weight: 600;">System</h3>
+    <span class="arrow" style="margin-right: 12px; font-size: 12px;">▶</span>
+    <div style="flex: 1; height: 2px; background-color: #374151;"></div>
+  </summary>
+  <div style="margin-top: 8px; padding: 8px; background-color: #f9fafb; border-radius: 4px; border-left: 3px solid #374151; margin-bottom: 16px;">
+    {}
+  </div>
+</details>
+
+<style>
+details > summary .arrow {{
+  display: inline-block;
+  transition: transform 0.2s;
+}}
+details[open] > summary .arrow {{
+  transform: rotate(90deg);
+}}
+details > summary {{
+  list-style: none;
+}}
+details > summary::-webkit-details-marker {{
+  display: none;
+}}
+</style>
+"""
+
+user_template = """\
+<div style="display: flex; align-items: center; margin-bottom: 12px;">
+  <h3 style="color: #166534; margin: 0; margin-right: 12px; font-size: 14px; font-weight: 600;">User</h3>
+  <div style="flex: 1; height: 2px; background-color: #166534;"></div>
+</div>
+<div style="margin-bottom: 16px;">{}</div>"""
+
+assistant_thinking_template = """\
+<div style="display: flex; align-items: center; margin-bottom: 12px;">
+  <h3 style="color: #1d5b8e; margin: 0; margin-right: 12px; font-size: 14px; font-weight: 600;">Assistant</h3>
+  <div style="flex: 1; height: 2px; background-color: #1d5b8e;"></div>
+</div>
+<div style="margin-bottom: 16px;">{}</div>"""
+
+assistant_final_answer_template = """<div class="alert alert-block alert-warning">
+<b>Assistant:</b> Final answer: {}
+</div>
+"""
+
+web_search_template = """
+<details style="margin-bottom: 16px; border: 1px solid #e1e5e9; border-radius: 6px; background-color: #f8f9fa;">
+  <summary style="display: flex; align-items: center; cursor: pointer; padding: 12px; background-color: #e3f2fd; border-radius: 6px 6px 0 0; margin: 0;">
+    <h4 style="color: #1976d2; margin: 0; margin-right: 8px; font-size: 14px; font-weight: 600;">🔍 Web Search Results</h4>
+    <span class="search-arrow" style="margin-left: auto; font-size: 12px; transition: transform 0.2s;">▼</span>
+  </summary>
+  <div style="padding: 16px; background-color: #ffffff;">
+    <div style="margin-bottom: 12px; padding: 8px; background-color: #f0f7ff; border-radius: 4px; border-left: 3px solid #2196f3;">
+      <strong style="color: #1976d2;">Query:</strong> <em>{query}</em>
+    </div>
+
+    {quick_answer}
+
+    <div style="margin-top: 16px;">
+      <h5 style="color: #424242; font-size: 13px; margin-bottom: 12px; font-weight: 600;">📚 Sources:</h5>
+      {sources}
+    </div>
+  </div>
+</details>
+
+<style>
+details[open] > summary .search-arrow {{
+  transform: rotate(180deg);
+}}
+details > summary {{
+  list-style: none;
+}}
+details > summary::-webkit-details-marker {{
+  display: none;
+}}
+.source-item {{
+  margin-bottom: 8px;
+  padding: 8px;
+  background-color: #f9f9f9;
+  border-radius: 4px;
+  border-left: 2px solid #4caf50;
+}}
+.source-title {{
+  font-weight: 600;
+  color: #2e7d32;
+  font-size: 13px;
+  margin-bottom: 4px;
+}}
+.source-url {{
+  color: #666;
+  font-size: 11px;
+  text-decoration: none;
+  word-break: break-all;
+}}
+.source-url:hover {{
+  color: #1976d2;
+  text-decoration: underline;
+}}
+.relevance-score {{
+  display: inline-block;
+  background-color: #e8f5e8;
+  color: #2e7d32;
+  padding: 2px 6px;
+  border-radius: 12px;
+  font-size: 10px;
+  font-weight: 600;
+  margin-left: 8px;
+}}
+.quick-answer {{
+  background-color: #fff8e1;
+  border-left: 3px solid #ffc107;
+  padding: 12px;
+  margin-bottom: 16px;
+  border-radius: 4px;
+}}
+.quick-answer-title {{
+  color: #f57c00;
+  font-weight: 600;
+  font-size: 13px;
+  margin-bottom: 6px;
+}}
+</style>
+"""
+
+header_message = """<div style="text-align: center; padding: 24px 16px; margin-bottom: 24px;">
+  <h1 style="color: #1e3a8a; font-size: 48px; font-weight: 700; margin: 0 0 8px 0; font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif;">
+    🔬 Eureka Agent
+  </h1>
+  <p style="color: #6b7280; font-size: 11px; margin: 0; display: flex; align-items: center; justify-content: center; gap: 6px;">
+    <span>
+      Built on top of
+      <a href="https://huggingface.co/spaces/lvwerra/jupyter-agent-2" target="_blank" style="color: #6b7280; text-decoration: underline;">
+        Jupyter Agent 2
+      </a>
+    </span>
+  </p>
+</div>
+"""
+
+shell_command_template = """
+<div style="background: linear-gradient(135deg, #0f172a 0%, #1e293b 100%); border-radius: 8px; margin: 16px 0; box-shadow: 0 4px 12px rgba(0,0,0,0.3); border: 1px solid #334155;">
+  <!-- Terminal Header -->
+  <div style="background: linear-gradient(90deg, #374151 0%, #4b5563 100%); padding: 8px 12px; border-radius: 8px 8px 0 0; border-bottom: 1px solid #6b7280; display: flex; align-items: center; gap: 6px;">
+    <div style="width: 12px; height: 12px; background: #ef4444; border-radius: 50%;"></div>
+    <div style="width: 12px; height: 12px; background: #f59e0b; border-radius: 50%;"></div>
+    <div style="width: 12px; height: 12px; background: #10b981; border-radius: 50%;"></div>
+    <span style="color: #d1d5db; font-size: 12px; font-weight: 500; margin-left: 12px; font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', sans-serif;">Terminal</span>
+  </div>
+  <!-- Command Area -->
+  <div style="padding: 16px; background-color: #0f172a;">
+    <div style="display: flex; align-items: center; margin-bottom: 4px;">
+      <span style="color: #22d3ee; font-family: 'SF Mono', 'Monaco', 'Inconsolata', 'Roboto Mono', monospace; font-size: 14px; font-weight: 600; margin-right: 8px;">$</span>
+      <span style="color: #e2e8f0; font-family: 'SF Mono', 'Monaco', 'Inconsolata', 'Roboto Mono', monospace; font-size: 14px; line-height: 1.4;">{}</span>
+    </div>
+  </div>
+</div>
+"""
+
+shell_output_template = """
+<div style="background: linear-gradient(135deg, #111827 0%, #1f2937 100%) !important; border-radius: 8px; margin: 8px 0 16px 0; box-shadow: 0 2px 8px rgba(0,0,0,0.2); border: 1px solid #374151;">
+  <div style="padding: 16px; background-color: #111827 !important; border-radius: 8px;">
178
+ <pre style="margin: 0 !important; color: #f1f5f9 !important; background-color: #111827 !important; font-family: 'SF Mono', 'Monaco', 'Inconsolata', 'Roboto Mono', monospace; font-size: 13px; line-height: 1.5; overflow-x: auto; white-space: pre-wrap; text-shadow: 0 1px 2px rgba(0,0,0,0.1); border: none !important;">{}</pre>
179
+ </div>
180
+ </div>
181
+
182
+ <style>
183
+ /* Ensure shell output maintains dark theme */
184
+ .shell-output pre {{
185
+ background-color: #111827 !important;
186
+ color: #f1f5f9 !important;
187
+ border: none !important;
188
+ }}
189
+ .shell-output {{
190
+ background-color: #111827 !important;
191
+ }}
192
+ </style>
193
+ """
194
+
195
+ bad_html_bad = """input[type="file"] {
196
+ display: block;
197
+ }"""
198
+
199
+
200
+ EXECUTING_WIDGET = """
201
+ <div style="display: flex; align-items: center; gap: 8px; padding: 8px 12px; background-color: #e3f2fd; border-radius: 6px; border-left: 3px solid #2196f3;">
202
+ <div style="display: flex; gap: 4px;">
203
+ <div style="width: 6px; height: 6px; background-color: #2196f3; border-radius: 50%; animation: pulse 1.5s ease-in-out infinite;"></div>
204
+ <div style="width: 6px; height: 6px; background-color: #2196f3; border-radius: 50%; animation: pulse 1.5s ease-in-out 0.1s infinite;"></div>
205
+ <div style="width: 6px; height: 6px; background-color: #2196f3; border-radius: 50%; animation: pulse 1.5s ease-in-out 0.2s infinite;"></div>
206
+ </div>
207
+ <span style="color: #1976d2; font-size: 14px; font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif;">
208
+ Executing code...
209
+ </span>
210
+ </div>
211
+
212
+ <style>
213
+ @keyframes pulse {
214
+ 0%, 80%, 100% {
215
+ opacity: 0.3;
216
+ transform: scale(0.8);
217
+ }
218
+ 40% {
219
+ opacity: 1;
220
+ transform: scale(1);
221
+ }
222
+ }
223
+ </style>
224
+ """
225
+
226
+ GENERATING_WIDGET = """
227
+ <div style="display: flex; align-items: center; gap: 8px; padding: 8px 12px; background-color: #f3e5f5; border-radius: 6px; border-left: 3px solid #9c27b0;">
228
+ <div style="width: 80px; height: 4px; background-color: #e1bee7; border-radius: 2px; overflow: hidden;">
229
+ <div style="width: 30%; height: 100%; background-color: #9c27b0; border-radius: 2px; animation: progress 2s ease-in-out infinite;"></div>
230
+ </div>
231
+ <span style="color: #7b1fa2; font-size: 14px; font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif;">
232
+ Generating...
233
+ </span>
234
+ </div>
235
+
236
+ <style>
237
+ @keyframes progress {
238
+ 0% { transform: translateX(-100%); }
239
+ 100% { transform: translateX(250%); }
240
+ }
241
+ </style>
242
+ """
243
+
244
+ DONE_WIDGET = """
245
+ <div style="display: flex; align-items: center; gap: 8px; padding: 8px 12px; background-color: #e8f5e8; border-radius: 6px; border-left: 3px solid #4caf50;">
246
+ <div style="width: 16px; height: 16px; background-color: #4caf50; border-radius: 50%; display: flex; align-items: center; justify-content: center;">
247
+ <svg width="10" height="8" viewBox="0 0 10 8" fill="none">
248
+ <path d="M1 4L3.5 6.5L9 1" stroke="white" stroke-width="1.5" stroke-linecap="round" stroke-linejoin="round"/>
249
+ </svg>
250
+ </div>
251
+ <span style="color: #2e7d32; font-size: 14px; font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif;">
252
+ Generation complete
253
+ </span>
254
+ </div>
255
+ """
256
+
257
+ DONE_WIDGET = """
258
+ <div style="display: flex; align-items: center; gap: 8px; padding: 8px 12px; background-color: #e8f5e8; border-radius: 6px; border-left: 3px solid #4caf50; animation: fadeInOut 4s ease-in-out forwards;">
259
+ <div style="width: 16px; height: 16px; background-color: #4caf50; border-radius: 50%; display: flex; align-items: center; justify-content: center;">
260
+ <svg width="10" height="8" viewBox="0 0 10 8" fill="none">
261
+ <path d="M1 4L3.5 6.5L9 1" stroke="white" stroke-width="1.5" stroke-linecap="round" stroke-linejoin="round"/>
262
+ </svg>
263
+ </div>
264
+ <span style="color: #2e7d32; font-size: 14px; font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif;">
265
+ Generation complete
266
+ </span>
267
+ </div>
268
+
269
+ <style>
270
+ @keyframes fadeInOut {
271
+ 0% { opacity: 0; transform: translateY(10px); }
272
+ 15% { opacity: 1; transform: translateY(0); }
273
+ 85% { opacity: 1; transform: translateY(0); }
274
+ 100% { opacity: 0; transform: translateY(-10px); }
275
+ }
276
+ </style>
277
+ """
278
+
279
+ STOPPED_WIDGET = """
280
+ <div style="display: flex; align-items: center; gap: 8px; padding: 8px 12px; background-color: #fff3e0; border-radius: 6px; border-left: 3px solid #ff9800;">
281
+ <div style="width: 16px; height: 16px; background-color: #ff9800; border-radius: 50%; display: flex; align-items: center; justify-content: center; color: white; font-weight: bold; font-size: 10px;">
282
+ ⏹
283
+ </div>
284
+ <span style="color: #f57c00; font-size: 14px; font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif;">
285
+ Execution stopped by user
286
+ </span>
287
+ </div>
288
+ """
289
+
290
+ ERROR_WIDGET = """
291
+ <div style="display: flex; align-items: center; gap: 8px; padding: 8px 12px; background-color: #ffebee; border-radius: 6px; border-left: 3px solid #f44336;">
292
+ <div style="width: 16px; height: 16px; background-color: #f44336; border-radius: 50%; display: flex; align-items: center; justify-content: center; color: white; font-weight: bold; font-size: 10px;">
293
+
294
+ </div>
295
+ <span style="color: #c62828; font-size: 14px; font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif;">
296
+ Execution failed - check error details above
297
+ </span>
298
+ </div>
299
+ """
300
+
301
+ ERROR_HTML = """\
302
+ <div style="display: flex; align-items: center; gap: 8px; padding: 12px; background-color: #ffebee; border-radius: 6px; border-left: 3px solid #f44336; margin: 8px 0;">
303
+ <div style="width: 20px; height: 20px; background-color: #f44336; border-radius: 50%; display: flex; align-items: center; justify-content: center; color: white; font-weight: bold; font-size: 12px;">
304
+ !
305
+ </div>
306
+ <div style="color: #c62828; font-size: 14px; font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif;">
307
+ <strong>Error:</strong> {}
308
+ </div>
309
+ </div>"""
310
+
311
+ STOPPED_SANDBOX_HTML = """
312
+ <div style="display: flex; align-items: center; gap: 8px; padding: 8px 12px; background-color: #f5f5f5; border-radius: 6px; border-left: 3px solid #9e9e9e; margin-bottom: 16px;">
313
+ <div style="width: 16px; height: 16px; background-color: #9e9e9e; border-radius: 50%; display: flex; align-items: center; justify-content: center; color: white; font-weight: bold; font-size: 10px;">
314
+ ⏹
315
+ </div>
316
+ <div style="flex: 1;">
317
+ <div style="margin-bottom: 4px; font-size: 13px; color: #757575; font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif; font-weight: 500;">
318
+ Sandbox stopped
319
+ </div>
320
+ <div style="width: 100%; height: 8px; background-color: #e0e0e0; border-radius: 4px; overflow: hidden;">
321
+ <div style="height: 100%; background-color: #9e9e9e; border-radius: 4px; width: 100%;"></div>
322
+ </div>
323
+ <div style="display: flex; justify-content: space-between; margin-top: 4px; font-size: 11px; color: #757575; font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif;">
324
+ <span>Started: {start_time}</span>
325
+ <span>Expired: {end_time}</span>
326
+ </div>
327
+ </div>
328
+ </div>
329
+ """
330
+
331
+ TIMEOUT_HTML = """
332
+ <div style="display: flex; align-items: center; gap: 8px; padding: 8px 12px; background-color: #fff3e0; border-radius: 6px; border-left: 3px solid #ff9800; margin-bottom: 16px;">
333
+ <div style="width: 16px; height: 16px; background-color: #ff9800; border-radius: 50%; display: flex; align-items: center; justify-content: center; color: white; font-weight: bold; font-size: 10px;">
334
+
335
+ </div>
336
+ <div style="flex: 1;">
337
+ <div style="margin-bottom: 4px; font-size: 13px; color: #f57c00; font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif; font-weight: 500;">
338
+ The E2B Sandbox for code execution has a timeout of {total_seconds} seconds.
339
+ </div>
340
+ <div style="width: 100%; height: 8px; background-color: #ffe0b3; border-radius: 4px; overflow: hidden;">
341
+ <div id="progress-bar-{unique_id}" style="height: 100%; background: linear-gradient(90deg, #ff9800 0%, #f57c00 50%, #f44336 100%); border-radius: 4px; width: {current_progress}%; animation: progress-fill-{unique_id} {remaining_seconds}s linear forwards;"></div>
342
+ </div>
343
+ <div style="display: flex; justify-content: space-between; margin-top: 4px; font-size: 11px; color: #f57c00; font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif;">
344
+ <span>Started: {start_time}</span>
345
+ <span>Expires: {end_time}</span>
346
+ </div>
347
+ </div>
348
+ </div>
349
+
350
+ <style>
351
+ @keyframes progress-fill-{unique_id} {{
352
+ from {{ width: {current_progress}%; }}
353
+ to {{ width: 100%; }}
354
+ }}
355
+ </style>
356
+ """
357
+
358
+ TIMEOUT_HTML = """
359
+ <div style="display: flex; align-items: center; gap: 8px; padding: 6px 10px; background-color: #fafafa; border-radius: 4px; border-left: 2px solid #d1d5db; margin-bottom: 8px; font-size: 12px;">
360
+ <div style="width: 12px; height: 12px; background-color: #d1d5db; border-radius: 50%; display: flex; align-items: center; justify-content: center; color: white; font-weight: bold; font-size: 8px;">
361
+ ⏱
362
+ </div>
363
+ <div style="flex: 1;">
364
+ <div style="margin-bottom: 2px; font-size: 11px; color: #6b7280; font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif; font-weight: 400;">
365
+ Sandbox timeout: {total_seconds}s
366
+ </div>
367
+ <div style="width: 100%; height: 6px; background-color: #e5e7eb; border-radius: 3px; overflow: hidden;">
368
+ <div id="progress-bar-{unique_id}" style="height: 100%; background-color: #6b7280; border-radius: 3px; width: {current_progress}%; animation: progress-fill-{unique_id} {remaining_seconds}s linear forwards;"></div>
369
+ </div>
370
+ <div style="display: flex; justify-content: space-between; margin-top: 2px; font-size: 10px; color: #9ca3af; font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif;">
371
+ <span>Started: {start_time}</span>
372
+ <span>Expires: {end_time}</span>
373
+ </div>
374
+ </div>
375
+ </div>
376
+
377
+ <style>
378
+ @keyframes progress-fill-{unique_id} {{
379
+ from {{ width: {current_progress}%; }}
380
+ to {{ width: 100%; }}
381
+ }}
382
+ </style>
383
+ """
384
+
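The widget templates above double their literal CSS braces (`{{` / `}}`) because they are rendered with `str.format`, which treats single braces as placeholder delimiters. A minimal sketch (the `tpl` string is a stand-in for the full HTML, not the real template):

```python
# str.format treats '{' and '}' as placeholder delimiters, so literal CSS
# braces in a format-string template must be doubled as '{{' and '}}'.
tpl = "@keyframes progress-fill-{unique_id} {{ from {{ width: {current_progress}%; }} }}"

rendered = tpl.format(unique_id=42, current_progress=25)
print(rendered)  # → @keyframes progress-fill-42 { from { width: 25%; } }
```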
385
+ # Custom CSS for notebook styling including shell commands
386
+ custom_css = """
387
+ <style type="text/css">
388
+ /* Code font size */
389
+ .highlight pre, .highlight code,
390
+ div.input_area pre, div.output_area pre {
391
+ font-size: 12px !important;
392
+ line-height: 1.4 !important;
393
+ }
394
+
395
+ /* Fix prompt truncation */
396
+ .jp-InputPrompt, .jp-OutputPrompt {
397
+ text-overflow: clip !important;
398
+ }
399
+
400
+ /* Shell command styling - force dark theme */
401
+ .shell-output {
402
+ background-color: #111827 !important;
403
+ }
404
+
405
+ .shell-output div {
406
+ background: linear-gradient(135deg, #111827 0%, #1f2937 100%) !important;
407
+ }
408
+
409
+ .shell-output pre {
410
+ background-color: #111827 !important;
411
+ color: #f1f5f9 !important;
412
+ border: none !important;
413
+ margin: 0 !important;
414
+ }
415
+
416
+ /* Override any notebook styles that might interfere */
417
+ div[data-jp-cell-type="markdown"] .shell-output pre {
418
+ background-color: #111827 !important;
419
+ color: #f1f5f9 !important;
420
+ }
421
+
422
+ /* Additional terminal styling */
423
+ .terminal-header {
424
+ background: linear-gradient(90deg, #374151 0%, #4b5563 100%) !important;
425
+ }
426
+ </style>
427
+ """
428
+
429
+ # Configure the exporter
430
+ config = Config()
431
+ html_exporter = HTMLExporter(config=config, template_name="classic")
432
+
433
+
434
+ class JupyterNotebook:
435
+ def __init__(self, messages=None, session_state_data=None):
436
+ self.exec_count = 0
437
+ self.countdown_info = None
438
+
439
+ # If session_state_data is provided, use it directly
440
+ if session_state_data and "notebook_data" in session_state_data:
441
+ logger.info("Initializing JupyterNotebook from session state")
442
+ self.data = session_state_data["notebook_data"]
443
+ # Count existing code cells to maintain execution count
444
+ self.exec_count = len([cell for cell in self.data.get("cells", [])
445
+ if cell.get("cell_type") == "code" and cell.get("execution_count")])
446
+ logger.info(f"JupyterNotebook initialized from session state with {len(self.data['cells'])} cells, exec_count={self.exec_count}")
447
+ return
448
+
449
+ # Legacy initialization path
450
+ if messages is None:
451
+ messages = []
452
+ logger.debug(f"Initializing JupyterNotebook with {len(messages)} messages")
453
+ self.data, self.code_cell_counter = self.create_base_notebook(messages)
454
+ logger.info(f"JupyterNotebook initialized with {len(self.data['cells'])} cells")
455
+
456
+
457
+ def create_base_notebook(self, messages):
458
+ logger.debug("Creating base notebook structure")
459
+ base_notebook = {
460
+ "metadata": {
461
+ "kernel_info": {"name": "python3"},
462
+ "language_info": {
463
+ "name": "python",
464
+ "version": "3.12",
465
+ },
466
+ },
467
+ "nbformat": 4,
468
+ "nbformat_minor": 0,
469
+ "cells": []
470
+ }
471
+
472
+ # Add header
473
+ base_notebook["cells"].append({
474
+ "cell_type": "markdown",
475
+ "metadata": {},
476
+ "source": header_message
477
+ })
478
+ logger.debug("Added header cell to notebook")
479
+
480
+ # Set initial data
481
+ self.data = base_notebook
482
+
483
+ # Add empty code cell if no messages
484
+ if len(messages) == 0:
485
+ self.data["cells"].append({
486
+ "cell_type": "code",
487
+ "execution_count": None,
488
+ "metadata": {},
489
+ "source": "",
490
+ "outputs": []
491
+ })
492
+ logger.debug("Added empty code cell for new notebook")
493
+ return self.data, 0
494
+
495
+ # Process messages using existing methods
496
+ logger.info(f"Processing {len(messages)} messages for notebook creation")
497
+ i = 0
498
+ while i < len(messages):
499
+ message = messages[i]
500
+ logger.debug(f"Processing message {i+1}/{len(messages)}: {message['role']}")
501
+
502
+ if message["role"] == "system":
503
+ logger.debug("Adding system message as markdown")
504
+ self.add_markdown(message["content"], "system")
505
+
506
+ elif message["role"] == "user":
507
+ logger.debug("Adding user message as markdown")
508
+ self.add_markdown(message["content"], "user")
509
+
510
+ elif message["role"] == "assistant":
511
+ if "tool_calls" in message:
512
+ logger.debug(f"Processing assistant message with {len(message['tool_calls'])} tool calls")
513
+ # Add assistant thinking if there's content
514
+ if message.get("content"):
515
+ logger.debug("Adding assistant thinking content")
516
+ self.add_markdown(message["content"], "assistant")
517
+
518
+ # Process tool calls - we know the next message(s) will be tool responses
519
+ for tool_call in message["tool_calls"]:
520
+ if tool_call["function"]["name"] == "add_and_execute_jupyter_code_cell":
521
+ logger.debug(f"Processing code execution tool call: {tool_call['id']}")
522
+ tool_args = json.loads(tool_call["function"]["arguments"])
523
+ code = tool_args["code"]
524
+ logger.debug(f"Code cell contains {len(code)} characters")
525
+
526
+ # Get the next tool response (expected to immediately follow)
+ tool_message = messages[i + 1] if i + 1 < len(messages) else None
+ if tool_message and tool_message.get("role") == "tool" and tool_message.get("tool_call_id") == tool_call["id"]:
529
+ logger.debug(f"Found matching tool response for {tool_call['id']}")
530
+ # Use the raw execution if available, otherwise fall back to empty list
531
+ execution = tool_message.get("raw_execution", [])
532
+ self.add_code_execution(code, execution, parsed=True)
533
+ logger.debug(f"Added code execution cell with {len(execution)} outputs")
534
+ i += 1 # Skip the tool message since we just processed it
535
+ else:
536
+ logger.warning(f"No matching tool response found for tool call {tool_call['id']}")
537
+ else:
538
+ # Regular assistant message
539
+ logger.debug("Adding regular assistant message")
540
+ self.add_markdown(message["content"], "assistant")
541
+
542
+ elif message["role"] == "tool":
543
+ # Skip - should have been handled with corresponding tool_calls
544
+ # This shouldn't happen given our assumptions, but just in case
545
+ logger.debug("Skipping tool message (should have been processed with tool_calls)")
546
+
548
+ i += 1
549
+
550
+ return self.data, 0
551
+
552
+ def _update_countdown_cell(self):
553
+ if not self.countdown_info:
554
+ logger.debug("No countdown info available, skipping countdown update")
555
+ return
556
+
557
+ logger.debug("Updating countdown cell")
558
+
559
+ start_time = self.countdown_info['start_time']
560
+ end_time = self.countdown_info['end_time']
561
+
562
+ current_time = datetime.datetime.now(datetime.timezone.utc)
563
+ remaining_time = end_time - current_time
564
+
565
+ # Show stopped message if expired
566
+ if remaining_time.total_seconds() <= 0:
567
+ logger.info("Sandbox has expired, showing stopped message")
568
+ # Format display for stopped sandbox
569
+ start_display = start_time.strftime("%H:%M")
570
+ end_display = end_time.strftime("%H:%M")
571
+
572
+ stopped_html = STOPPED_SANDBOX_HTML.format(
573
+ start_time=start_display,
574
+ end_time=end_display
575
+ )
576
+
577
+ # Update countdown cell to show stopped message
578
+ stopped_cell = {
579
+ "cell_type": "markdown",
580
+ "metadata": {},
581
+ "source": stopped_html
582
+ }
583
+
584
+ # Find and update existing countdown cell
585
+ for i, cell in enumerate(self.data["cells"]):
586
+ if cell.get("cell_type") == "markdown" and ("⏱" in str(cell.get("source", "")) or "⏹" in str(cell.get("source", ""))):
587
+ self.data["cells"][i] = stopped_cell
588
+ logger.debug(f"Updated countdown cell at position {i} with stopped message")
589
+ break
590
+
591
+ return
592
+
593
+ # Calculate current progress
594
+ total_duration = end_time - start_time
595
+ elapsed_time = current_time - start_time
596
+ current_progress = (elapsed_time.total_seconds() / total_duration.total_seconds()) * 100
597
+ current_progress = max(0, min(100, current_progress))
598
+ logger.debug(f"Countdown progress: {current_progress:.1f}% ({remaining_time.total_seconds():.0f}s remaining)")
599
+
600
+ # Format display
601
+ start_display = start_time.strftime("%H:%M")
602
+ end_display = end_time.strftime("%H:%M")
603
+ remaining_seconds = int(remaining_time.total_seconds())
604
+ remaining_minutes = remaining_seconds // 60
605
+ remaining_secs = remaining_seconds % 60
606
+ remaining_display = f"{remaining_minutes}:{remaining_secs:02d}"
607
+
608
+ # Generate unique ID to avoid CSS conflicts when updating
609
+ unique_id = int(current_time.timestamp() * 1000) % 100000
610
+
611
+ # Calculate total timeout duration in seconds
612
+ total_seconds = int(total_duration.total_seconds())
613
+
614
+ countdown_html = TIMEOUT_HTML.format(
615
+ start_time=start_display,
616
+ end_time=end_display,
617
+ current_progress=current_progress,
618
+ remaining_seconds=remaining_seconds,
619
+ unique_id=unique_id,
620
+ total_seconds=total_seconds
621
+ )
622
+
623
+ # Update or insert the countdown cell
624
+ countdown_cell = {
625
+ "cell_type": "markdown",
626
+ "metadata": {},
627
+ "source": countdown_html
628
+ }
629
+
630
+ # Find existing countdown cell by looking for the timer emoji
631
+ found_countdown = False
632
+ for i, cell in enumerate(self.data["cells"]):
633
+ if cell.get("cell_type") == "markdown" and "⏱" in str(cell.get("source", "")):
634
+ # Update existing countdown cell
635
+ self.data["cells"][i] = countdown_cell
636
+ found_countdown = True
637
+ logger.debug(f"Updated existing countdown cell at position {i}")
638
+ break
639
+
640
+ if not found_countdown:
641
+ # Insert new countdown cell at position 1 (after header)
642
+ self.data["cells"].insert(1, countdown_cell)
643
+ logger.debug("Inserted new countdown cell at position 1")
644
+
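The progress and remaining-time arithmetic in `_update_countdown_cell` can be sketched in isolation (the timestamps here are invented for illustration):

```python
import datetime

# Sketch of the countdown math above: percent complete plus an M:SS display,
# using UTC-aware datetimes as the method does.
start = datetime.datetime(2024, 1, 1, 12, 0, tzinfo=datetime.timezone.utc)
end = start + datetime.timedelta(minutes=10)
now = start + datetime.timedelta(minutes=2, seconds=30)

total = (end - start).total_seconds()
elapsed = (now - start).total_seconds()
progress = max(0, min(100, elapsed / total * 100))  # clamp to [0, 100]

remaining = int((end - now).total_seconds())
display = f"{remaining // 60}:{remaining % 60:02d}"
print(f"{progress:.1f}% elapsed, {display} remaining")  # → 25.0% elapsed, 7:30 remaining
```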
645
+ def add_sandbox_countdown(self, start_time, end_time):
646
+ logger.info(f"Adding sandbox countdown: {start_time} to {end_time}")
647
+ # Store the countdown info for later updates
648
+ self.countdown_info = {
649
+ 'start_time': start_time,
650
+ 'end_time': end_time,
651
+ 'cell_index': 1 # Remember where we put it
652
+ }
653
+
654
+ def add_code_execution(self, code, execution, parsed=False):
655
+ self.exec_count += 1
656
+ logger.debug(f"Adding code execution cell #{self.exec_count} with {len(code)} chars of code")
657
+ outputs = execution if parsed else self.parse_exec_result_nb(execution)
658
+ logger.debug(f"Code execution has {len(outputs)} outputs")
659
+ self.data["cells"].append({
660
+ "cell_type": "code",
661
+ "execution_count": self.exec_count,
662
+ "metadata": {},
663
+ "source": code,
664
+ "outputs": outputs
665
+ })
666
+
667
+ def add_code(self, code):
668
+ """Add a code cell without execution results"""
669
+ self.exec_count += 1
670
+ logger.debug(f"Adding code cell #{self.exec_count} with {len(code)} chars (no execution)")
671
+ self.data["cells"].append({
672
+ "cell_type": "code",
673
+ "execution_count": self.exec_count,
674
+ "metadata": {},
675
+ "source": code,
676
+ "outputs": []
677
+ })
678
+
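A stripped-down sketch of the cell bookkeeping used by `add_code` and `add_code_execution` above (dict layout only, no nbformat validation):

```python
# Minimal model of the notebook cell list: each code cell gets a
# monotonically increasing execution_count, mirroring add_code above.
cells = []
exec_count = 0

def add_code(code):
    global exec_count
    exec_count += 1
    cells.append({
        "cell_type": "code",
        "execution_count": exec_count,
        "metadata": {},
        "source": code,
        "outputs": [],
    })

add_code("x = 1")
add_code("print(x)")
print(len(cells), cells[-1]["execution_count"])  # → 2 2
```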
679
+ def append_execution(self, execution):
680
+ """Append execution results to the immediate previous cell if it's a code cell"""
681
+ if (len(self.data["cells"]) > 0 and
682
+ self.data["cells"][-1]["cell_type"] == "code"):
683
+ outputs = self.parse_exec_result_nb(execution)
684
+ self.data["cells"][-1]["outputs"] = outputs
685
+ logger.debug(f"Appended {len(outputs)} outputs to last code cell")
686
+ else:
687
+ logger.error("Cannot append execution: previous cell is not a code cell")
688
+ raise ValueError("Cannot append execution: previous cell is not a code cell")
689
+
690
+ def has_execution_error(self, execution):
691
+ """Check if an execution result contains an error"""
692
+ has_error = execution.error is not None
693
+ logger.debug(f"Execution error check: {has_error}")
694
+ return has_error
695
+
696
+ def has_execution_warnings(self, execution):
697
+ """Check if an execution result contains warnings (stderr output but no error)"""
698
+ has_warnings = (execution.error is None and
699
+ execution.logs.stderr and
700
+ len(execution.logs.stderr) > 0)
701
+ logger.debug(f"Execution warning check: {has_warnings}")
702
+ return has_warnings
703
+
704
+ def update_last_code_cell(self, code):
705
+ """Update the source code of the last code cell"""
706
+ if (len(self.data["cells"]) > 0 and
707
+ self.data["cells"][-1]["cell_type"] == "code"):
708
+ logger.debug(f"Updating last code cell with {len(code)} chars")
709
+ self.data["cells"][-1]["source"] = code
710
+ # Clear previous outputs when updating code
711
+ self.data["cells"][-1]["outputs"] = []
712
+ logger.debug("Cleared previous outputs from updated code cell")
713
+ else:
714
+ logger.error("Cannot update: last cell is not a code cell")
715
+ raise ValueError("Cannot update: last cell is not a code cell")
716
+
717
+ def get_last_cell_type(self):
718
+ """Get the type of the last cell, or None if no cells exist"""
719
+ if len(self.data["cells"]) > 0:
720
+ cell_type = self.data["cells"][-1]["cell_type"]
721
+ logger.debug(f"Last cell type: {cell_type}")
722
+ return cell_type
723
+ logger.debug("No cells exist, returning None")
724
+ return None
725
+
726
+ def add_markdown(self, markdown, role="markdown"):
727
+ logger.debug(f"Adding markdown cell with role '{role}' ({len(markdown)} chars)")
728
+ if role == "system":
729
+ system_message = markdown if markdown else "default"
730
+ clean_message = self._clean_markdown_formatting(system_message)
731
+ markdown_formatted = system_template.format(clean_message)
732
+ elif role == "user":
733
+ clean_message = self._clean_markdown_formatting(markdown)
734
+ markdown_formatted = user_template.format(clean_message)
735
+ elif role == "assistant":
736
+ clean_message = self._clean_markdown_formatting(markdown)
737
+ markdown_formatted = assistant_thinking_template.format(clean_message)
738
+ markdown_formatted = markdown_formatted.replace('<think>', '&lt;think&gt;')
739
+ markdown_formatted = markdown_formatted.replace('</think>', '&lt;/think&gt;')
740
+ else:
741
+ # Default case for raw markdown
742
+ markdown_formatted = self._clean_markdown_formatting(markdown)
743
+
744
+ self.data["cells"].append({
745
+ "cell_type": "markdown",
746
+ "metadata": {},
747
+ "source": markdown_formatted
748
+ })
749
+
750
+ def add_shell_command(self, command):
751
+ """Add a shell command cell with terminal-style formatting"""
752
+ logger.debug(f"Adding shell command cell: '{command}'")
753
+
754
+ # Format command with terminal-style template
755
+ shell_formatted = shell_command_template.format(self._clean_shell_command(command))
756
+
757
+ self.data["cells"].append({
758
+ "cell_type": "markdown",
759
+ "metadata": {"shell_command": True, "command": command},
760
+ "source": shell_formatted
761
+ })
762
+
763
+ def append_shell_execution(self, execution):
764
+ """Append shell execution results to the notebook with terminal styling"""
765
+ logger.debug("Appending shell execution results")
766
+
767
+ # Format the shell output using terminal styling
768
+ output_content = self._format_shell_output(execution)
769
+ shell_output_formatted = shell_output_template.format(output_content)
770
+
771
+ # Wrap in a div with shell-output class for styling
772
+ shell_output_with_class = f'<div class="shell-output">{shell_output_formatted}</div>'
773
+
774
+ # Add the output as a new markdown cell
775
+ self.data["cells"].append({
776
+ "cell_type": "markdown",
777
+ "metadata": {"shell_output": True},
778
+ "source": shell_output_with_class
779
+ })
780
+ logger.debug("Added shell output cell to notebook")
781
+
782
+ def _clean_shell_command(self, command):
783
+ """Clean and escape shell command for display"""
784
+ if not command:
785
+ return ""
786
+
787
+ # Basic HTML escaping for shell commands
788
+ command = command.replace('&', '&amp;')
789
+ command = command.replace('<', '&lt;')
790
+ command = command.replace('>', '&gt;')
791
+ command = command.replace('"', '&quot;')
792
+ command = command.replace("'", '&#39;')
793
+
794
+ return command
795
+
796
+ def _format_shell_output(self, execution):
797
+ """Format shell execution output for terminal-style display"""
798
+ output_parts = []
799
+
800
+ # Add stdout if present
801
+ if execution.logs.stdout:
802
+ stdout_text = ''.join(execution.logs.stdout).strip()
803
+ if stdout_text:
804
+ output_parts.append(stdout_text)
805
+
806
+ # Add stderr if present (but filter out plot data)
807
+ if execution.logs.stderr:
808
+ stderr_text = ''.join(execution.logs.stderr).strip()
809
+
810
+ # Filter out plot data from stderr
811
+ plot_start = stderr_text.find("__PLOT_DATA__")
812
+ plot_end = stderr_text.find("__END_PLOT_DATA__")
813
+ if plot_start != -1 and plot_end != -1:
814
+ clean_stderr = stderr_text[:plot_start] + stderr_text[plot_end + len("__END_PLOT_DATA__"):]
815
+ stderr_text = clean_stderr.strip()
816
+
817
+ if stderr_text:
818
+ output_parts.append(f"STDERR:\n{stderr_text}")
819
+
820
+ # Add error information if present
821
+ if execution.error:
822
+ error_text = f"ERROR: {execution.error.name}: {execution.error.value}"
823
+ if execution.error.traceback:
824
+ error_text += f"\n{execution.error.traceback}"
825
+ output_parts.append(error_text)
826
+
827
+ # Add execution results if present (for shell commands that produce results)
828
+ if execution.results:
829
+ for result in execution.results:
830
+ if result.text:
831
+ output_parts.append(result.text.strip())
832
+
833
+ # Join all output parts
834
+ final_output = '\n\n'.join(output_parts) if output_parts else "No output"
835
+
836
+ # Basic HTML escaping for output
837
+ final_output = final_output.replace('&', '&amp;')
838
+ final_output = final_output.replace('<', '&lt;')
839
+ final_output = final_output.replace('>', '&gt;')
840
+
841
+ logger.debug(f"Formatted shell output: {len(final_output)} chars")
842
+ return final_output
843
+
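`_format_shell_output` (like `_clean_shell_command`) escapes `&` before `<` and `>`; the order matters, as this standalone sketch shows:

```python
def escape(text):
    # '&' must be replaced first, otherwise the '&' introduced by
    # '&lt;'/'&gt;' would itself be escaped to '&amp;lt;'.
    text = text.replace('&', '&amp;')
    text = text.replace('<', '&lt;')
    text = text.replace('>', '&gt;')
    return text

out = escape("ls > out && cat <file>")
print(out)  # → ls &gt; out &amp;&amp; cat &lt;file&gt;
```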
844
+ def add_error(self, error_message):
845
+ """Add an error message cell to the notebook"""
846
+ logger.warning(f"Adding error cell: {error_message}")
847
+ error_html = ERROR_HTML.format(error_message)
848
+
849
+ self.data["cells"].append({
850
+ "cell_type": "markdown",
851
+ "metadata": {},
852
+ "source": error_html
853
+ })
854
+
855
+ def add_final_answer(self, answer):
856
+ logger.info(f"Adding final answer cell ({len(answer)} chars)")
857
+ self.data["cells"].append({
858
+ "cell_type": "markdown",
859
+ "metadata": {},
860
+ "source": assistant_final_answer_template.format(answer)
861
+ })
862
+
863
+ def add_web_search_result(self, query, quick_answer=None, sources=None):
864
+ """Add a web search result cell with dropdown UI"""
865
+ logger.info(f"Adding web search result for query: {query}")
866
+
867
+ # Format quick answer section
868
+ quick_answer_html = ""
869
+ if quick_answer:
870
+ # Clean up markdown formatting in quick answer
871
+ clean_answer = self._clean_markdown_formatting(quick_answer)
872
+ quick_answer_html = f"""
873
+ <div class="quick-answer">
874
+ <div class="quick-answer-title">💡 Quick Answer:</div>
875
+ <div>{clean_answer}</div>
876
+ </div>
877
+ """
878
+
879
+ # Format sources section
880
+ sources_html = ""
881
+ if sources:
882
+ source_items = []
883
+ for i, source in enumerate(sources, 1):
884
+ title = self._clean_markdown_formatting(source.get('title', f'Source {i}'))
885
+ url = source.get('url', '#')
886
+ relevance = source.get('relevance', 0.0)
887
+
888
+ source_item = f"""
889
+ <div class="source-item">
890
+ <div class="source-title">{i}. {title}
891
+ <span class="relevance-score">Relevance: {relevance:.2f}</span>
892
+ </div>
893
+ <a href="{url}" target="_blank" class="source-url">{url}</a>
894
+ </div>
895
+ """
896
+ source_items.append(source_item)
897
+ sources_html = "".join(source_items)
898
+
899
+ # Format the complete web search result
900
+ web_search_html = web_search_template.format(
901
+ query=self._clean_markdown_formatting(query),
902
+ quick_answer=quick_answer_html,
903
+ sources=sources_html
904
+ )
905
+
906
+ self.data["cells"].append({
907
+ "cell_type": "markdown",
908
+ "metadata": {},
909
+ "source": web_search_html
910
+ })
911
+
912
+ def _clean_markdown_formatting(self, text):
913
+ """Clean up markdown formatting issues like excessive ** characters"""
914
+ if not text:
915
+ return ""
916
+
917
+ # Replace multiple consecutive asterisks with proper formatting
918
+ import re
919
+
920
+ # Handle bold text: **text** -> <strong>text</strong>
921
+ text = re.sub(r'\*\*([^*]+)\*\*', r'<strong>\1</strong>', text)
922
+
923
+ # Handle italic text: *text* -> <em>text</em>
924
+ text = re.sub(r'(?<!\*)\*(?!\*)([^*]+)\*(?!\*)', r'<em>\1</em>', text)
925
+
926
+ # Clean up any remaining multiple asterisks
927
+ text = re.sub(r'\*{3,}', '**', text)
928
+
929
+ # Handle line breaks
930
+ text = text.replace('\n', '<br>')
931
+
932
+ # Handle links [text](url) -> <a href="url">text</a>
933
+ text = re.sub(r'\[([^\]]+)\]\(([^)]+)\)', r'<a href="\2" target="_blank">\1</a>', text)
934
+
935
+ return text
936
+
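The method above is self-contained enough to test in isolation; here is a condensed, standalone version using the same regular expressions (the lookarounds on the italic pattern keep it from matching the `**` of bold spans):

```python
import re

def clean_markdown(text: str) -> str:
    """Convert a small markdown subset (bold, italic, links, newlines)
    to inline HTML, mirroring the cleanup logic above."""
    if not text:
        return ""
    text = re.sub(r'\*\*([^*]+)\*\*', r'<strong>\1</strong>', text)          # **bold**
    text = re.sub(r'(?<!\*)\*(?!\*)([^*]+)\*(?!\*)', r'<em>\1</em>', text)   # *italic*
    text = re.sub(r'\*{3,}', '**', text)                                     # stray runs of *
    text = text.replace('\n', '<br>')
    text = re.sub(r'\[([^\]]+)\]\(([^)]+)\)',
                  r'<a href="\2" target="_blank">\1</a>', text)              # [text](url)
    return text
```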
937
+ def parse_exec_result_nb(self, execution):
938
+ """Convert an E2B Execution object to Jupyter notebook cell output format"""
939
+ logger.debug("Parsing execution result for notebook format")
940
+ outputs = []
941
+
942
+ if execution.logs.stdout:
943
+ stdout_text = ''.join(execution.logs.stdout)
944
+ logger.debug(f"Adding stdout output ({len(stdout_text)} chars)")
945
+ outputs.append({
946
+ 'output_type': 'stream',
947
+ 'name': 'stdout',
948
+ 'text': stdout_text
949
+ })
950
+
951
+ if execution.logs.stderr:
952
+ stderr_text = ''.join(execution.logs.stderr)
953
+ # Filter out plot data from stderr before displaying
954
+ plot_start = stderr_text.find("__PLOT_DATA__")
955
+ plot_end = stderr_text.find("__END_PLOT_DATA__")
956
+ if plot_start != -1 and plot_end != -1:
957
+ # Remove plot data from stderr text
958
+ clean_stderr = stderr_text[:plot_start] + stderr_text[plot_end + len("__END_PLOT_DATA__"):]
959
+ stderr_text = clean_stderr.strip()
960
+
961
+ # Only add stderr output if there's content after filtering
962
+ if stderr_text:
963
+ logger.debug(f"Adding stderr output ({len(stderr_text)} chars)")
964
+ outputs.append({
965
+ 'output_type': 'stream',
966
+ 'name': 'stderr',
967
+ 'text': stderr_text
968
+ })
969
+
970
+ if execution.error:
971
+ logger.debug(f"Adding error output: {execution.error.name}: {execution.error.value}")
972
+ outputs.append({
973
+ 'output_type': 'error',
974
+ 'ename': execution.error.name,
975
+ 'evalue': execution.error.value,
976
+ 'traceback': [line for line in execution.error.traceback.split('\n')]
977
+ })
978
+
979
+ for i, result in enumerate(execution.results):
980
+ logger.debug(f"Processing execution result {i+1}/{len(execution.results)}")
981
+ output = {
982
+ 'output_type': 'execute_result' if result.is_main_result else 'display_data',
983
+ 'metadata': {},
984
+ 'data': {}
985
+ }
986
+
987
+ if result.text:
988
+ output['data']['text/plain'] = result.text
989
+ if result.html:
990
+ output['data']['text/html'] = result.html
991
+ if result.png:
992
+ output['data']['image/png'] = result.png
993
+ if result.svg:
994
+ output['data']['image/svg+xml'] = result.svg
995
+ if result.jpeg:
996
+ output['data']['image/jpeg'] = result.jpeg
997
+ if result.pdf:
998
+ output['data']['application/pdf'] = result.pdf
999
+ if result.latex:
1000
+ output['data']['text/latex'] = result.latex
1001
+ if result.json:
1002
+ output['data']['application/json'] = result.json
1003
+ if result.javascript:
1004
+ output['data']['application/javascript'] = result.javascript
1005
+
1006
+ if result.is_main_result and execution.execution_count is not None:
1007
+ output['execution_count'] = execution.execution_count
1008
+
1009
+ if output['data']:
1010
+ logger.debug(f"Added result output with data types: {list(output['data'].keys())}")
1011
+ outputs.append(output)
1012
+ else:
1013
+ logger.debug("Skipping result with no data")
1014
+
1015
+ logger.debug(f"Parsed execution result into {len(outputs)} outputs")
1016
+ return outputs
1017
+
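The stderr branch above strips an inline plot payload delimited by sentinel strings; that filtering is easiest to reason about as a small helper (a sketch, assuming the same `__PLOT_DATA__`/`__END_PLOT_DATA__` markers used by the sandbox):

```python
PLOT_START = "__PLOT_DATA__"
PLOT_END = "__END_PLOT_DATA__"

def strip_plot_data(stderr_text: str) -> str:
    """Remove a sentinel-delimited base64 plot payload from stderr,
    keeping any real diagnostics around it."""
    start = stderr_text.find(PLOT_START)
    end = stderr_text.find(PLOT_END)
    if start != -1 and end != -1:
        stderr_text = (stderr_text[:start]
                       + stderr_text[end + len(PLOT_END):]).strip()
    return stderr_text
```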
1018
+ def filter_base64_images(self, message):
1019
+ """Filter out base64 encoded images from message content"""
1020
+ if isinstance(message, dict) and 'nbformat' in message:
1021
+ for output in message['nbformat']:
1022
+ if 'data' in output:
1023
+ for key in list(output['data'].keys()):
1024
+ if key.startswith('image/') or key == 'application/pdf':
1025
+ output['data'][key] = '<placeholder_image>'
1026
+ return message
1027
+
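Because this mutates nested dicts in place, it is worth pinning down with a quick check; a standalone copy of the logic, exercised against a hypothetical sample message:

```python
def filter_base64_images(message):
    """Replace bulky base64 payloads (images, PDFs) in raw execution
    outputs with a placeholder before the message reaches the model."""
    if isinstance(message, dict) and 'nbformat' in message:
        for output in message['nbformat']:
            if 'data' in output:
                for key in list(output['data'].keys()):
                    if key.startswith('image/') or key == 'application/pdf':
                        output['data'][key] = '<placeholder_image>'
    return message

# Hypothetical sample in the tool-result format used above
sample = {'nbformat': [{'data': {'image/png': 'iVBOR...', 'text/plain': 'Figure(1)'}}]}
filtered = filter_base64_images(sample)
```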
1028
+ def render(self, mode="default"):
1029
+ logger.debug(f"Rendering notebook in '{mode}' mode with {len(self.data['cells'])} cells")
1030
+ if self.countdown_info is not None:
1031
+ self._update_countdown_cell()
1032
+
1033
+ render_data = copy.deepcopy(self.data)
1034
+
1035
+ if mode == "generating":
1036
+ render_data["cells"].append({
1037
+ "cell_type": "markdown",
1038
+ "metadata": {},
1039
+ "source": GENERATING_WIDGET
1040
+ })
1041
+
1042
+ elif mode == "executing":
1043
+ logger.debug("Adding executing widget to render")
1044
+ render_data["cells"].append({
1045
+ "cell_type": "markdown",
1046
+ "metadata": {},
1047
+ "source": EXECUTING_WIDGET
1048
+ })
1049
+
1050
+ elif mode == "done":
1051
+ logger.debug("Adding done widget to render")
1052
+ render_data["cells"].append({
1053
+ "cell_type": "markdown",
1054
+ "metadata": {},
1055
+ "source": DONE_WIDGET
1056
+ })
1057
+
1058
+ elif mode == "stopped":
1059
+ logger.debug("Adding stopped widget to render")
1060
+ render_data["cells"].append({
1061
+ "cell_type": "markdown",
1062
+ "metadata": {},
1063
+ "source": STOPPED_WIDGET
1064
+ })
1065
+
1066
+ elif mode == "error":
1067
+ logger.debug("Adding error widget to render")
1068
+ render_data["cells"].append({
1069
+ "cell_type": "markdown",
1070
+ "metadata": {},
1071
+ "source": ERROR_WIDGET
1072
+ })
1073
+
1074
+ elif mode != "default":
1075
+ logger.error(f"Invalid render mode: {mode}")
1076
+ raise ValueError(f"Render mode should be generating, executing, done, stopped, or error. Given: {mode}.")
1077
+
1078
+ notebook = nbformat.from_dict(render_data)
1079
+ notebook_body, _ = html_exporter.from_notebook_node(notebook)
1080
+ notebook_body = notebook_body.replace(bad_html_bad, "")
1081
+ logger.debug(f"Rendered notebook HTML ({len(notebook_body)} chars)")
1082
+
1083
+ # make code font a bit smaller with custom css
1084
+ if "<head>" in notebook_body:
1085
+ notebook_body = notebook_body.replace("</head>", f"{custom_css}</head>")
1086
+ logger.debug("Applied custom CSS to notebook")
1087
+ return notebook_body
1088
+
1089
+ @classmethod
1090
+ def from_session_state(cls, session_state_data):
1091
+ """Create JupyterNotebook instance from session state data"""
1092
+ return cls(session_state_data=session_state_data)
1093
+
1094
+ def get_session_notebook_data(self):
1095
+ """Get notebook data in format suitable for session state"""
1096
+ return self.data.copy()
1097
+
1098
+ def update_from_session_state(self, session_state_data):
1099
+ """Update notebook data from session state"""
1100
+ if "notebook_data" in session_state_data:
1101
+ self.data = session_state_data["notebook_data"].copy()
1102
+ # Update execution count based on existing cells
1103
+ self.exec_count = len([cell for cell in self.data.get("cells", [])
1104
+ if cell.get("cell_type") == "code" and cell.get("execution_count")])
1105
+ logger.debug(f"Updated notebook from session state: {len(self.data['cells'])} cells, exec_count={self.exec_count}")
1106
+
1107
+ def main():
1108
+ """Create a mock notebook to test styling"""
1109
+ # Create mock messages
1110
+ mock_messages = [
1111
+ {"role": "system", "content": "You are a helpful AI assistant that can write and execute Python code."},
1112
+ {"role": "user", "content": "Can you help me create a simple plot of a sine wave?"},
1113
+ {"role": "assistant", "content": "I'll help you create a sine wave plot using matplotlib. **Let me search** for the *best practices* first."},
1114
+ {"role": "assistant", "tool_calls": [{"id": "call_1", "function": {"name": "add_and_execute_jupyter_code_cell", "arguments": '{"code": "import numpy as np\\nimport matplotlib.pyplot as plt\\n\\n# Create x values\\nx = np.linspace(0, 4*np.pi, 100)\\ny = np.sin(x)\\n\\n# Create the plot\\nplt.figure(figsize=(10, 6))\\nplt.plot(x, y, \'b-\', linewidth=2)\\nplt.title(\'Sine Wave\')\\nplt.xlabel(\'x\')\\nplt.ylabel(\'sin(x)\')\\nplt.grid(True)\\nplt.show()"}'}}]},
1115
+ {"role": "tool", "tool_call_id": "call_1", "raw_execution": [{"output_type": "stream", "name": "stdout", "text": "Plot created successfully!"}]}
1116
+ ]
1117
+
1118
+ # Create notebook
1119
+ notebook = JupyterNotebook(mock_messages)
1120
+
1121
+ # Add a web search result example to test the new UI
1122
+ mock_sources = [
1123
+ {
1124
+ "title": "**Matplotlib** Tutorial - Creating **Beautiful** Plots",
1125
+ "url": "https://matplotlib.org/stable/tutorials/introductory/pyplot.html",
1126
+ "relevance": 0.85
1127
+ },
1128
+ {
1129
+ "title": "NumPy *Sine Wave* Generation **Best Practices**",
1130
+ "url": "https://numpy.org/doc/stable/reference/generated/numpy.sin.html",
1131
+ "relevance": 0.72
1132
+ }
1133
+ ]
1134
+
1135
+ notebook.add_web_search_result(
1136
+ query="**matplotlib** *sine wave* tutorial **best practices**",
1137
+ quick_answer="To create a **sine wave plot** with *matplotlib*, use `numpy.linspace()` to generate **x values** and `numpy.sin()` for *y values*. **Configure** the plot with *appropriate* labels and **styling** for better visualization.",
1138
+ sources=mock_sources
1139
+ )
1140
+
1141
+ # Add a timeout countdown (simulating a sandbox that started 2 minutes ago with 5 minute timeout)
1142
+ start_time = datetime.datetime.now(datetime.timezone.utc) - datetime.timedelta(minutes=2)
1143
+ end_time = start_time + datetime.timedelta(minutes=5)
1144
+ notebook.add_sandbox_countdown(start_time, end_time)
1145
+
1146
+ # Render and save
1147
+ html_output = notebook.render()
1148
+
1149
+ with open("mock_notebook.html", "w", encoding="utf-8") as f:
1150
+ f.write(html_output)
1151
+
1152
+ print("Mock notebook saved as 'mock_notebook.html'")
1153
+ print("Open it in your browser to see the improved web search UI and markdown formatting.")
1154
+
1155
+ def create_notebook_from_session_state(session_state):
1156
+ """Helper function to create JupyterNotebook from session state"""
1157
+ return JupyterNotebook.from_session_state(session_state)
1158
+
1159
+
1160
+ if __name__ == "__main__":
1161
+ main()
modal_sandbox.py ADDED
@@ -0,0 +1,794 @@
1
+ """
2
+ Modal Sandbox wrapper to provide E2B-compatible interface for the Jupyter Agent.
3
+ Simplified implementation using Modal's native API.
4
+ """
5
+
6
+ import modal
7
+ import datetime
8
+ from typing import Optional, Dict, List
9
+ import json
10
+ import logging
11
+ import time
12
+
13
+ logger = logging.getLogger(__name__)
14
+
15
+
16
+ class ModalResult:
17
+ """Mock E2B result structure for displaying outputs like plots"""
18
+
19
+ def __init__(self, text: str = "", html: str = "", png: str = "", svg: str = "",
20
+ jpeg: str = "", pdf: str = "", latex: str = "", json: str = "",
21
+ javascript: str = "", is_main_result: bool = True):
22
+ self.text = text
23
+ self.html = html
24
+ self.png = png
25
+ self.svg = svg
26
+ self.jpeg = jpeg
27
+ self.pdf = pdf
28
+ self.latex = latex
29
+ self.json = json
30
+ self.javascript = javascript
31
+ self.is_main_result = is_main_result
32
+
33
+ class ModalExecution:
34
+ """Mock E2B execution result to maintain compatibility with existing code"""
35
+
36
+ def __init__(self, stdout: str = "", stderr: str = "", error: Optional[Dict] = None, results: List[ModalResult] = None):
37
+ self.logs = ModalLogs(stdout, stderr)
38
+ self.error = ModalError(error) if error else None
39
+ self.results = results or []
40
+ self.execution_count = 1
41
+
42
+ class ModalLogs:
43
+ """Mock E2B logs structure"""
44
+
45
+ def __init__(self, stdout: str = "", stderr: str = ""):
46
+ self.stdout = [stdout] if stdout else []
47
+ self.stderr = [stderr] if stderr else []
48
+
49
+ class ModalError:
50
+ """Mock E2B error structure"""
51
+
52
+ def __init__(self, error_data: Dict):
53
+ self.name = error_data.get('name', 'Error')
54
+ self.value = error_data.get('value', 'Unknown error')
55
+ self.traceback = error_data.get('traceback', f"{self.name}: {self.value}")
56
+
57
+ class ModalFiles:
58
+ """Simplified Modal files interface using native Modal Sandbox API"""
59
+
60
+ def __init__(self, modal_sandbox):
61
+ self.modal_sandbox = modal_sandbox # ModalSandbox wrapper
62
+ self.max_file_size = 100 * 1024 * 1024 # 100MB limit
63
+
64
+ @property
65
+ def _sandbox(self):
66
+ """Get the actual Modal sandbox instance"""
67
+ return self.modal_sandbox._sandbox
68
+
69
+ def write(self, path: str, content):
70
+ """Write file to Modal sandbox using native Modal API"""
71
+ try:
72
+ # Handle file-like objects
73
+ if hasattr(content, 'read'):
74
+ file_content = content.read()
75
+ # Reset file pointer if possible
76
+ if hasattr(content, 'seek'):
77
+ content.seek(0)
78
+ else:
79
+ file_content = content
80
+
81
+ # Check file size for bytes content
82
+ content_size = len(file_content) if isinstance(file_content, (bytes, str)) else 0
83
+ if content_size > self.max_file_size:
84
+ raise ValueError(f"File size ({content_size} bytes) exceeds maximum allowed size ({self.max_file_size} bytes)")
85
+
86
+ # Use Modal's native file API
87
+ if isinstance(file_content, bytes):
88
+ # Write binary content
89
+ with self._sandbox.open(path, "wb") as f:
90
+ f.write(file_content)
91
+ else:
92
+ # Write text content
93
+ with self._sandbox.open(path, "w") as f:
94
+ f.write(str(file_content))
95
+
96
+ logger.debug(f"Successfully wrote file {path} ({content_size} bytes) using Modal native API")
97
+
98
+ except Exception as e:
99
+ logger.error(f"Failed to write file {path}: {str(e)}")
100
+ raise RuntimeError(f"Could not write file {path}: {str(e)}")
101
+
102
+ def read(self, path: str, mode: str = "r"):
103
+ """Read file from Modal sandbox using native API"""
104
+ try:
105
+ with self._sandbox.open(path, mode) as f:
106
+ return f.read()
107
+ except Exception as e:
108
+ logger.error(f"Failed to read file {path}: {str(e)}")
109
+ raise
110
+
111
+ def exists(self, path: str) -> bool:
112
+ """Check if file exists in Modal sandbox"""
113
+ try:
114
+ # Try to open the file to check existence
115
+ with self._sandbox.open(path, "r"):
116
+ pass
117
+ return True
118
+ except Exception:
119
+ return False
120
+
121
+ def list_files(self, directory: str = ".") -> List[str]:
122
+ """List files in directory using Modal's native ls method"""
123
+ try:
124
+ return self._sandbox.ls(directory)
125
+ except Exception as e:
126
+ logger.error(f"Failed to list files in {directory}: {str(e)}")
127
+ return []
128
+
129
+ def verify_file_upload(self, path: str, expected_size: Optional[int] = None) -> bool:
130
+ """Verify that a file was uploaded correctly"""
131
+ try:
132
+ if not self.exists(path):
133
+ logger.error(f"File {path} does not exist after upload")
134
+ return False
135
+
136
+ # Check file size if expected size is provided
137
+ if expected_size is not None:
138
+ # Use Modal's exec to get file size
139
+ result = self._sandbox.exec("wc", "-c", path)
140
+ result.wait()
141
+
142
+ if result.returncode == 0:
143
+ output = result.stdout.read().strip()
144
+ actual_size = int(output.split()[0])
145
+ if actual_size != expected_size:
146
+ logger.error(f"File {path} size mismatch: expected {expected_size}, got {actual_size}")
147
+ return False
148
+ else:
149
+ logger.debug(f"File {path} size verified: {actual_size} bytes")
150
+ else:
151
+ logger.warning(f"Could not verify file size for {path}")
152
+
153
+ logger.debug(f"File {path} upload verification successful")
154
+ return True
155
+
156
+ except Exception as e:
157
+ logger.error(f"Failed to verify file upload {path}: {str(e)}")
158
+ return False
159
+
160
+ class ModalSandboxInfo:
161
+ """Mock E2B sandbox info for countdown timer"""
162
+
163
+ def __init__(self, timeout_seconds: int = 300):
164
+ self.started_at = datetime.datetime.now(datetime.timezone.utc)
165
+ self.end_at = self.started_at + datetime.timedelta(seconds=timeout_seconds)
166
+
167
+ class ModalSandbox:
168
+ """Modal sandbox wrapper that provides E2B-compatible interface"""
169
+
170
+ def __init__(self, gpu_config: str = "cpu", cpu_cores: float = 2.0, memory_mb: int = 8192,
171
+ timeout: int = 300, environment_vars: Dict[str, str] = None):
172
+ """
173
+ Initialize Modal sandbox with hardware configuration
174
+
175
+ Args:
176
+ gpu_config: GPU configuration (e.g., "cpu", "T4", "A100-40GB", "H100")
177
+ cpu_cores: Number of CPU cores
178
+ memory_mb: Memory in MB
179
+ timeout: Timeout in seconds
180
+ environment_vars: Environment variables to set
181
+ """
182
+ self.gpu_config = gpu_config
183
+ self.cpu_cores = cpu_cores
184
+ self.memory_mb = memory_mb
185
+ self.timeout = timeout
186
+ self.environment_vars = environment_vars or {}
187
+ self.files = ModalFiles(self)
188
+ self._sandbox = None
189
+ self._app = None
190
+ self._sandbox_info = ModalSandboxInfo(timeout)
191
+ self._persistent_session = None # For maintaining state across executions
192
+
193
+ # Define package lists for different hardware configurations
194
+ CPU_PACKAGES = [
195
+ "jupyter-server", "ipykernel", "ipython", "orjson", "pandas",
196
+ "matplotlib", "pillow", "numpy", "scipy", "scikit-learn",
197
+ "seaborn", "plotly", "requests", "beautifulsoup4", "opencv-python",
198
+ "nltk", "textblob", "librosa>=0.10.0", "soundfile", "sympy", "xarray"
199
+ ]
200
+
201
+ GPU_PACKAGES = [
202
+ "jupyter-server", "ipykernel", "ipython", "orjson", "pandas",
203
+ "matplotlib", "pillow", "numpy", "scipy", "scikit-learn",
204
+ "seaborn", "plotly", "requests", "beautifulsoup4", "opencv-python",
205
+ "nltk", "textblob", "librosa>=0.10.0", "soundfile", "sympy", "xarray",
206
+ # GPU-specific ML/AI packages
207
+ "torch", "transformers", "datasets", "bitsandbytes", "hf_transfer",
208
+ "peft", "trl", "accelerate", "xformers", "wandb", "deepspeed",
209
+ "pyyaml", "packaging", "rouge_score", "bert_score", "jiwer",
210
+ "tqdm", "pyarrow", "sentencepiece", "protobuf", "huggingface_hub"
211
+ ]
212
+
213
+ # Store package lists for system prompt
214
+ self.available_packages = GPU_PACKAGES if gpu_config != "cpu" else CPU_PACKAGES
215
+
216
+ # Create appropriate image based on hardware configuration
217
+ if gpu_config == "cpu" or gpu_config == "CPU-only":
218
+ self.base_image = self._create_cpu_image(CPU_PACKAGES)
219
+ else:
220
+ self.base_image = self._create_gpu_image(GPU_PACKAGES)
221
+
222
+ self._setup_modal()
223
+ logger.info(f"Initialized Modal sandbox with {gpu_config} GPU, {cpu_cores} CPU cores, {memory_mb}MB RAM")
224
+
225
+ def _create_cpu_image(self, packages):
226
+ """Create CPU-optimized image with basic data science packages"""
227
+ packages_string = " ".join(packages)
228
+ return (modal.Image.debian_slim()
229
+ .apt_install("git", "build-essential")
230
+ .run_commands("pip install --upgrade pip")
231
+ .run_commands("pip install uv")
232
+ .run_commands("uv pip install 'numba>=0.58.0' --system") # Ensure compatible numba version
233
+ .run_commands(f"uv pip install {packages_string} --system"))
234
+
235
+ def _create_gpu_image(self, packages):
236
+ """Create GPU-optimized image with ML/AI packages including PyTorch and Transformers"""
237
+ # CUDA configuration for the GPU base image
238
+ CUDA_VERSION = "12.8.1"
239
+ CUDA_FLAVOR = "devel"
240
+ CUDA_OS = "ubuntu24.04"
241
+ CUDA_TAG = f"{CUDA_VERSION}-{CUDA_FLAVOR}-{CUDA_OS}"
242
+
243
+ # Base packages that don't require special handling
244
+ base_packages = [pkg for pkg in packages if pkg not in [
245
+ "torch", "transformers", "bitsandbytes", "accelerate", "xformers",
246
+ "peft", "trl", "unsloth", "deepspeed"
247
+ ]]
248
+ base_packages_string = " ".join(base_packages)
249
+
250
+ return (modal.Image.from_registry(f"nvidia/cuda:{CUDA_TAG}", add_python="3.12")
251
+ .env({"DEBIAN_FRONTEND": "noninteractive", "TZ": "UTC"})
252
+ .run_commands("ln -fs /usr/share/zoneinfo/UTC /etc/localtime")
253
+ .apt_install("git", "build-essential")
254
+ .run_commands("pip install --upgrade pip")
255
+ .run_commands("pip install uv")
256
+ .run_commands("uv pip install 'numba>=0.58.0' --system") # Ensure compatible numba version
257
+ .run_commands("uv pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu124 --system")
258
+ .run_commands(f"uv pip install {base_packages_string} --system")
259
+ .env({"HF_HUB_ENABLE_HF_TRANSFER": "1"}))
260
+
261
+ def _setup_modal(self):
262
+ """Setup Modal app and sandbox configuration"""
263
+ try:
264
+ # Initialize Modal app using lookup to create if missing
265
+ self._app = modal.App.lookup("jupyter-agent", create_if_missing=True)
266
+
267
+ # Configure hardware based on user selection
268
+ sandbox_kwargs = {
269
+ "image": self.base_image,
270
+ "timeout": self.timeout,
271
+ "cpu": self.cpu_cores,
272
+ "memory": self.memory_mb,
273
+ "app": self._app
274
+ }
275
+
276
+ # Add GPU configuration if not CPU-only
277
+ if self.gpu_config != "cpu" and self.gpu_config != "CPU-only":
278
+ if self.gpu_config == "T4":
279
+ sandbox_kwargs["gpu"] = modal.gpu.T4()
280
+ elif self.gpu_config == "L4":
281
+ sandbox_kwargs["gpu"] = modal.gpu.L4()
282
+ elif self.gpu_config == "A100-40GB":
283
+ sandbox_kwargs["gpu"] = modal.gpu.A100(size="40GB")
284
+ elif self.gpu_config == "A100-80GB":
285
+ sandbox_kwargs["gpu"] = modal.gpu.A100(size="80GB")
286
+ elif self.gpu_config == "H100":
287
+ sandbox_kwargs["gpu"] = modal.gpu.H100()
288
+ else:
289
+ print(f"Warning: Unknown GPU config {self.gpu_config}, falling back to CPU")
290
+
291
+ # Add environment variables
292
+ if self.environment_vars:
293
+ sandbox_kwargs["secrets"] = [
294
+ modal.Secret.from_dict(self.environment_vars)
295
+ ]
296
+
297
+ # Create sandbox
298
+ self._sandbox = modal.Sandbox.create(**sandbox_kwargs)
299
+
300
+ except Exception as e:
301
+ print(f"Error setting up Modal sandbox: {e}")
302
+ raise
303
+
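The GPU if/elif ladder in `_setup_modal` could equally be table-driven, which makes the supported configs and the CPU fallback explicit. A sketch with string stand-ins for the `modal.gpu.*` constructors (illustrative only):

```python
# Stand-ins for modal.gpu.T4(), modal.gpu.A100(size=...), etc.
GPU_FACTORIES = {
    "T4": lambda: "gpu:T4",
    "L4": lambda: "gpu:L4",
    "A100-40GB": lambda: "gpu:A100-40GB",
    "A100-80GB": lambda: "gpu:A100-80GB",
    "H100": lambda: "gpu:H100",
}

def resolve_gpu(config: str):
    """Map a config name to a GPU spec; CPU configs and unknown names
    both fall back to None (no GPU requested)."""
    if config in ("cpu", "CPU-only"):
        return None
    factory = GPU_FACTORIES.get(config)
    return factory() if factory else None
```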
304
+ def _initialize_persistent_session(self):
305
+ """Initialize a persistent Python session for stateful execution using file-based communication"""
306
+ if self._persistent_session is not None:
307
+ return # Session already exists
308
+
309
+ try:
310
+ logger.debug("Initializing persistent Python session with file-based communication")
311
+
312
+ # Create a persistent Python script that monitors for command files
313
+ session_script = '''
314
+ import sys
315
+ import json
316
+ import traceback
317
+ import base64
318
+ import io
319
+ import time
320
+ import os
321
+ import matplotlib
322
+ matplotlib.use('Agg') # Set backend before importing pyplot
323
+ import matplotlib.pyplot as plt
324
+
325
+ # Global namespace to maintain state - includes built-ins for better compatibility
326
+ _global_namespace = {
327
+ '__builtins__': __builtins__,
328
+ '__name__': '__main__',
329
+ '__doc__': None,
330
+ '__package__': None
331
+ }
332
+
333
+ # Store original show function and setup plot capture
334
+ _original_show = plt.show
335
+ _captured_figures = []
336
+
337
+ def _capture_show(*args, **kwargs):
338
+ """Custom show function that captures figures as base64"""
339
+ global _captured_figures
340
+ try:
341
+ for fig_num in plt.get_fignums():
342
+ fig = plt.figure(fig_num)
343
+ buf = io.BytesIO()
344
+ fig.savefig(buf, format='png', bbox_inches='tight', dpi=100)
345
+ buf.seek(0)
346
+ img_base64 = base64.b64encode(buf.getvalue()).decode('utf-8')
347
+ _captured_figures.append(img_base64)
348
+ buf.close()
349
+ plt.close(fig)
350
+ except Exception as e:
351
+ print(f"Error capturing plot: {e}", file=sys.stderr)
352
+
353
+ # Replace plt.show with our capture function
354
+ plt.show = _capture_show
355
+
356
+ # Signal that session is ready
357
+ with open("/tmp/session_ready", "w") as f:
358
+ f.write("READY")
359
+
360
+ print("Persistent Python session started", flush=True)
361
+
362
+ # Process commands by monitoring for command files
363
+ while True:
364
+ try:
365
+ if os.path.exists("/tmp/execute_command"):
366
+ # Read and execute command
367
+ with open("/tmp/execute_command", "r") as f:
368
+ content = f.read().strip()
369
+ if not content:
370
+ continue # Skip empty files
371
+ try:
372
+ command = json.loads(content)
373
+ except json.JSONDecodeError:
374
+ print(f"Invalid JSON in command file: {content[:100]}...", file=sys.stderr)
375
+ continue # Skip malformed JSON
376
+
377
+ # Remove command file
378
+ os.remove("/tmp/execute_command")
379
+
380
+ if command.get("action") == "execute":
381
+ code = command.get("code", "")
382
+ _captured_figures = [] # Reset for this execution
383
+
384
+ try:
385
+ # Check if code contains shell commands (lines starting with !)
386
+ lines = code.strip().split('\\n')
387
+ shell_commands = []
388
+ python_code_lines = []
389
+
390
+ for line in lines:
391
+ stripped_line = line.strip()
392
+ if stripped_line.startswith('!'):
393
+ # This is a shell command
394
+ shell_cmd = stripped_line[1:].strip() # Remove the !
395
+ shell_commands.append(shell_cmd)
396
+ else:
397
+ # This is Python code
398
+ python_code_lines.append(line)
399
+
400
+ stdout_parts = []
401
+ stderr_parts = []
402
+
403
+ # Execute shell commands first
404
+ for shell_cmd in shell_commands:
405
+ try:
406
+ import subprocess
407
+ result = subprocess.run(
408
+ shell_cmd,
409
+ shell=True,
410
+ capture_output=True,
411
+ text=True,
412
+ timeout=60 # 60 second timeout for shell commands
413
+ )
414
+
415
+ if result.stdout:
416
+ stdout_parts.append(f"$ {shell_cmd}")
417
+ stdout_parts.append(result.stdout.rstrip())
418
+
419
+ if result.stderr:
420
+ stderr_parts.append(f"$ {shell_cmd}")
421
+ stderr_parts.append(result.stderr.rstrip())
422
+
423
+ # If command failed, add error info
424
+ if result.returncode != 0:
425
+ stderr_parts.append(f"Command exited with code {result.returncode}")
426
+
427
+ except subprocess.TimeoutExpired:
428
+ stderr_parts.append(f"$ {shell_cmd}")
429
+ stderr_parts.append("Command timed out after 60 seconds")
430
+ except Exception as e:
431
+ stderr_parts.append(f"$ {shell_cmd}")
432
+ stderr_parts.append(f"Error executing shell command: {str(e)}")
433
+
434
+ # Execute Python code if present
435
+ python_stdout = ""
436
+ if python_code_lines and any(line.strip() for line in python_code_lines):
437
+ python_code = '\\n'.join(python_code_lines)
438
+
439
+ # Capture stdout during Python execution
440
+ import io
441
+ from contextlib import redirect_stdout
442
+
443
+ stdout_buffer = io.StringIO()
444
+
445
+ with redirect_stdout(stdout_buffer):
446
+ # Execute code in the persistent namespace
447
+ exec(python_code, _global_namespace)
448
+
449
+ python_stdout = stdout_buffer.getvalue()
450
+
451
+ # Combine all stdout
452
+ all_stdout_parts = stdout_parts.copy()
453
+ if python_stdout:
454
+ all_stdout_parts.append(python_stdout.rstrip())
455
+
456
+ stdout_output = '\\n'.join(all_stdout_parts) if all_stdout_parts else ""
457
+ stderr_output = '\\n'.join(stderr_parts) if stderr_parts else ""
458
+
459
+ # Send results back
460
+ result = {
461
+ "status": "success",
462
+ "stdout": stdout_output,
463
+ "stderr": stderr_output,
464
+ "plots": _captured_figures.copy()
465
+ }
466
+
467
+ with open("/tmp/execute_result", "w") as f:
468
+ f.write(json.dumps(result))
469
+
470
+ except Exception as e:
471
+ error_result = {
472
+ "status": "error",
473
+ "error": {
474
+ "name": type(e).__name__,
475
+ "value": str(e),
476
+ "traceback": traceback.format_exc()
477
+ }
478
+ }
479
+
480
+ with open("/tmp/execute_result", "w") as f:
481
+ f.write(json.dumps(error_result))
482
+
483
+ elif command.get("action") == "terminate":
484
+ break
485
+
486
+ else:
487
+ # Sleep briefly to avoid busy waiting
488
+ time.sleep(0.1)
489
+
490
+ except Exception as e:
491
+ print(f"Session error: {e}", file=sys.stderr)
492
+ # Write error to result file
493
+ error_result = {
494
+ "status": "error",
495
+ "error": {
496
+ "name": type(e).__name__,
497
+ "value": str(e),
498
+ "traceback": traceback.format_exc()
499
+ }
500
+ }
501
+ with open("/tmp/execute_result", "w") as f:
502
+ f.write(json.dumps(error_result))
503
+ '''
504
+
505
+ # Start the persistent Python session (no stdin needed)
506
+ self._persistent_session = self._sandbox.exec(
507
+ "python3", "-c", session_script,
508
+ timeout=None # No timeout for persistent session
509
+ )
510
+
511
+ # Wait for the session to be ready by checking for the ready file
512
+ max_wait = 10 # Wait up to 10 seconds
513
+ for _ in range(max_wait * 10): # Check every 0.1 seconds
514
+ try:
515
+ with self._sandbox.open("/tmp/session_ready", "r") as f:
516
+ if f.read().strip() == "READY":
517
+ logger.info("Persistent Python session initialized successfully")
518
+ return
519
+ except Exception:
520
+ pass
521
+ time.sleep(0.1)
522
+
523
+ raise RuntimeError("Failed to initialize persistent session: timeout waiting for ready signal")
524
+
525
+ except Exception as e:
526
+ logger.error(f"Failed to initialize persistent session: {e}")
527
+ self._persistent_session = None
528
+ raise
529
+
530
+     def run_code(self, code: str, on_stdout=None) -> ModalExecution:
+         """
+         Execute Python code or shell commands in the persistent Modal sandbox session using file-based communication.
+
+         Args:
+             code: Python code to execute (lines starting with '!' are treated as shell commands)
+             on_stdout: Callback for stdout (for compatibility; not fully implemented)
+
+         Returns:
+             ModalExecution object compatible with E2B execution results
+         """
+         try:
+             if not self._sandbox:
+                 raise RuntimeError("Sandbox not initialized")
+
+             # Initialize the persistent session if not already done
+             if self._persistent_session is None:
+                 self._initialize_persistent_session()
+
+             logger.debug(f"Executing code in persistent session ({len(code)} chars)")
+
+             # Clean up any existing command/result files
+             try:
+                 self._sandbox.exec("rm", "-f", "/tmp/execute_command", "/tmp/execute_result").wait()
+             except Exception:
+                 pass  # Ignore cleanup errors
+
+             # Send the execution command via file
+             command = {
+                 "action": "execute",
+                 "code": code
+             }
+
+             with self._sandbox.open("/tmp/execute_command", "w") as f:
+                 f.write(json.dumps(command))
+
+             # Small delay to ensure the file is fully written
+             time.sleep(0.01)
+
+             # Wait for the result file to appear
+             max_wait = 60  # Wait up to 60 seconds for code execution
+             result = None
+
+             for _ in range(max_wait * 10):  # Check every 0.1 seconds
+                 try:
+                     with self._sandbox.open("/tmp/execute_result", "r") as f:
+                         result_json = f.read().strip()
+                         if result_json:  # Make sure the file has content
+                             try:
+                                 result = json.loads(result_json)
+                                 break
+                             except json.JSONDecodeError as e:
+                                 logger.debug(f"Invalid JSON in result file: {e}")
+                                 continue  # Try again
+                 except Exception:
+                     pass
+                 time.sleep(0.1)
+
+             if result is None:
+                 raise RuntimeError("Timeout waiting for code execution result")
+
+             # Clean up the result file
+             try:
+                 self._sandbox.exec("rm", "-f", "/tmp/execute_result").wait()
+             except Exception:
+                 pass
+
+             if result["status"] == "success":
+                 # Create results for plots only - don't duplicate stdout as execute_result
+                 results = []
+
+                 # Add plots
+                 for i, base64_img in enumerate(result.get("plots", [])):
+                     results.append(ModalResult(
+                         png=base64_img,
+                         is_main_result=(i == 0)  # First plot is the main result
+                     ))
+
+                 # Get stdout and stderr output for logs
+                 stdout_output = result.get("stdout", "")
+                 stderr_output = result.get("stderr", "")
+
+                 # Return execution with stdout/stderr in logs and plots in results;
+                 # don't add stdout to results to avoid duplication
+                 return ModalExecution(stdout=stdout_output, stderr=stderr_output, error=None, results=results)
+
+             elif result["status"] == "error":
+                 # Execution had an error
+                 error_info = result["error"]
+                 error_data = {
+                     "name": error_info["name"],
+                     "value": error_info["value"],
+                     "traceback": error_info["traceback"]
+                 }
+                 return ModalExecution(stdout="", stderr="", error=error_data, results=[])
+
+             else:
+                 raise RuntimeError(f"Unknown status from persistent session: {result['status']}")
+
+         except Exception as e:
+             # Handle session errors and other exceptions
+             logger.error(f"Error executing code in persistent session: {str(e)}")
+
+             # Reset the persistent session on error
+             if self._persistent_session:
+                 try:
+                     self._persistent_session.terminate()
+                 except Exception:
+                     pass
+                 self._persistent_session = None
+
+             error_data = {
+                 "name": type(e).__name__,
+                 "value": str(e),
+                 "traceback": f"Traceback: {type(e).__name__}: {str(e)}"
+             }
+             return ModalExecution(error=error_data)
+
+     def run_shell(self, command: str, timeout: int = 60) -> ModalExecution:
+         """
+         Execute raw shell commands directly in the sandbox without the Python wrapper.
+
+         Args:
+             command: Shell command to execute
+             timeout: Timeout in seconds (default 60)
+
+         Returns:
+             ModalExecution object with shell output
+         """
+         try:
+             if not self._sandbox:
+                 raise RuntimeError("Sandbox not initialized")
+
+             logger.debug(f"Executing raw shell command: {command}")
+
+             # Use Modal's exec to run the shell command directly.
+             # Split the command into parts for exec (simple approach for common commands)
+             if ' ' in command:
+                 # For complex commands, use sh -c
+                 result = self._sandbox.exec("sh", "-c", command, timeout=timeout)
+             else:
+                 # For simple commands, run directly
+                 result = self._sandbox.exec(command, timeout=timeout)
+
+             # Wait for completion
+             result.wait()
+
+             # Get output
+             stdout_output = ""
+             stderr_output = ""
+
+             try:
+                 stdout_output = result.stdout.read() if result.stdout else ""
+             except Exception:
+                 pass
+
+             try:
+                 stderr_output = result.stderr.read() if result.stderr else ""
+             except Exception:
+                 pass
+
+             # Check for errors based on the return code
+             error_data = None
+             if result.returncode != 0:
+                 error_data = {
+                     "name": "ShellCommandError",
+                     "value": f"Command '{command}' exited with code {result.returncode}",
+                     "traceback": f"Command: {command}\nExit Code: {result.returncode}\nSTDERR: {stderr_output}"
+                 }
+
+             logger.debug(f"Shell command completed with exit code: {result.returncode}")
+
+             return ModalExecution(
+                 stdout=stdout_output,
+                 stderr=stderr_output,
+                 error=error_data,
+                 results=[]
+             )
+
+         except Exception as e:
+             logger.error(f"Error executing shell command '{command}': {str(e)}")
+
+             # Return an error execution
+             error_data = {
+                 "name": type(e).__name__,
+                 "value": str(e),
+                 "traceback": f"Shell command failed: {command}\nError: {str(e)}"
+             }
+
+             return ModalExecution(
+                 stdout="",
+                 stderr="",
+                 error=error_data,
+                 results=[]
+             )
+
+     def get_info(self) -> ModalSandboxInfo:
+         """Get sandbox info for the countdown timer"""
+         return self._sandbox_info
+
+     def kill(self):
+         """Terminate the sandbox and persistent session"""
+         try:
+             # Terminate the persistent session first
+             if self._persistent_session:
+                 try:
+                     # Send the terminate command via file
+                     terminate_command = {"action": "terminate"}
+                     with self._sandbox.open("/tmp/execute_command", "w") as f:
+                         f.write(json.dumps(terminate_command))
+                 except Exception:
+                     pass  # Ignore errors during graceful shutdown
+
+                 try:
+                     self._persistent_session.terminate()
+                 except Exception:
+                     pass  # Ignore errors during forced termination
+
+                 self._persistent_session = None
+                 logger.info("Persistent session terminated")
+
+             # Terminate the sandbox
+             if self._sandbox:
+                 self._sandbox.terminate()
+                 self._sandbox = None
+                 logger.info("Modal sandbox terminated")
+
+         except Exception as e:
+             logger.error(f"Error terminating Modal sandbox: {e}")
+
+     def __del__(self):
+         """Cleanup on deletion"""
+         self.kill()
+
+
+ def create_modal_sandbox(gpu_config: str = "cpu", gpu_count: int = 1, cpu_cores: float = 2.0,
+                          memory_gb: float = 8.0, timeout: int = 300,
+                          environment_vars: Dict[str, str] = None) -> ModalSandbox:
+     """
+     Factory function to create a Modal sandbox with the specified configuration.
+
+     Args:
+         gpu_config: GPU type ("cpu", "T4", "L4", "A100-40GB", "A100-80GB", "H100")
+         gpu_count: Number of GPUs (for future implementation)
+         cpu_cores: Number of CPU cores
+         memory_gb: Memory in GB
+         timeout: Timeout in seconds
+         environment_vars: Environment variables
+
+     Returns:
+         ModalSandbox instance
+     """
+     memory_mb = int(memory_gb * 1024)
+
+     # For multi-GPU support (future implementation)
+     if gpu_count > 1:
+         print(f"Warning: Multi-GPU ({gpu_count}) not yet implemented, using single GPU")
+
+     return ModalSandbox(
+         gpu_config=gpu_config,
+         cpu_cores=cpu_cores,
+         memory_mb=memory_mb,
+         timeout=timeout,
+         environment_vars=environment_vars
+     )
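The file-based command/result protocol that `run_code` and the persistent session use can be exercised locally. The sketch below is a hypothetical stand-in, not the Modal code itself: it mirrors the JSON shapes the diff defines (an `{"action": "execute", "code": ...}` command file and a `status`/`stdout`/`plots` result file) and the same 0.1-second polling pattern, but uses ordinary temp files instead of a sandbox, and `fake_session_step` only pretends to run the code.

```python
import json
import os
import tempfile
import time


def send_command(cmd_path: str, code: str) -> None:
    # Mirror the {"action": "execute", "code": ...} command shape
    with open(cmd_path, "w") as f:
        f.write(json.dumps({"action": "execute", "code": code}))


def fake_session_step(cmd_path: str, result_path: str) -> None:
    # Stand-in for one iteration of the persistent session loop:
    # read the command file, pretend to execute it, write a result file.
    with open(cmd_path) as f:
        command = json.loads(f.read())
    assert command["action"] == "execute"
    result = {"status": "success", "stdout": "ran: " + command["code"],
              "stderr": "", "plots": []}
    with open(result_path, "w") as f:
        f.write(json.dumps(result))


def poll_result(result_path: str, max_wait: float = 2.0) -> dict:
    # Same polling pattern as run_code: check every 0.1 s until the
    # result file appears and holds valid JSON, else time out.
    deadline = time.time() + max_wait
    while time.time() < deadline:
        if os.path.exists(result_path):
            try:
                with open(result_path) as f:
                    return json.loads(f.read())
            except json.JSONDecodeError:
                pass  # file may be partially written; try again
        time.sleep(0.1)
    raise RuntimeError("Timeout waiting for code execution result")


with tempfile.TemporaryDirectory() as d:
    cmd = os.path.join(d, "execute_command")
    res = os.path.join(d, "execute_result")
    send_command(cmd, "print('hi')")
    fake_session_step(cmd, res)
    out = poll_result(res)
    print(out["status"], out["stdout"])  # success ran: print('hi')
```

The JSON-decode retry matters in the real code too: the reader can observe the result file before the writer has finished, so a parse failure is treated as "not ready yet" rather than an error.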
requirements.txt ADDED
@@ -0,0 +1,17 @@
+ nbformat
+ nbconvert
+ huggingface_hub
+ modal
+ transformers
+ traitlets
+ openai
+ gradio
+
+ numpy
+ scipy
+ matplotlib
+ pandas
+ seaborn
+ arize-phoenix-otel
+ openinference-instrumentation-openai
+ tavily-python
system_prompt.txt ADDED
@@ -0,0 +1,326 @@
+ You are an advanced AI coding agent specialized in interactive Python development within a stateful Jupyter environment running in a containerized sandbox. You excel at data science, machine learning, visualization, and computational tasks with full context awareness across the entire conversation.
+
+ <Core Capabilities>
+ - **Stateful Execution**: Variables, imports, and objects persist across all code cells in the session
+ - **Context Awareness**: You maintain full awareness of all previous code, outputs, errors, and variables throughout the conversation
+ - **Interactive Development**: Build upon previous code iteratively, referencing earlier variables and results
+ - **Error Recovery**: When errors occur, you can access and modify the exact code that failed, learning from execution results
+ - **Multi-modal Output**: Handle text, plots, tables, HTML, and rich media outputs seamlessly
+ </Core Capabilities>
+
+ <Available Tools & Usage Guidelines>
+ You have access to four core tools for interactive development. **ALWAYS follow this strict hierarchy and use the PRIMARY tool for its designated purpose:**
+
+ **1. add_and_execute_jupyter_code_cell** **PRIMARY CODE TOOL**
+ - **Purpose**: Execute ALL new Python code in the stateful Jupyter environment
+ - **ALWAYS Use For**:
+   - ANY code generation task (data analysis, ML, visualization, utilities)
+   - Creating new variables, functions, classes, or algorithms
+   - Initial implementation of any computational logic
+   - Package installation with `!uv pip install`
+   - Data processing, model training, plotting, and analysis
+   - Building complete solutions from scratch
+ - **Priority**: **DEFAULT CHOICE** - Use this for 90% of coding tasks
+ - **State**: Variables and imports persist between executions
+ - **Robust Scenarios**:
+   - **Initial user request**: "Create a function to analyze data" → Use add_and_execute_jupyter_code_cell
+   - **Initial user request**: "Build a machine learning model" → Use add_and_execute_jupyter_code_cell
+   - **Initial user request**: "Plot a graph showing trends" → Use add_and_execute_jupyter_code_cell
+   - **Context-driven follow-up**: Assistant realizes need for data preprocessing → Use add_and_execute_jupyter_code_cell
+   - **Context-driven follow-up**: Previous code suggests need for additional analysis → Use add_and_execute_jupyter_code_cell
+   - **Context-driven follow-up**: Building upon previous variables and functions → Use add_and_execute_jupyter_code_cell
+   - **Package installation needed**: Context shows missing import → Use add_and_execute_jupyter_code_cell
+
+ **2. edit_and_execute_current_cell** **ERROR CORRECTION ONLY**
+ - **Purpose**: Fix errors in the MOST RECENT code cell that just failed
+ - **ONLY Use When**:
+   - The previous cell threw an error AND you need to modify that exact code
+   - Making small corrections to syntax, imports, or logic in the current cell
+   - The last execution failed and you're fixing the same logical block
+ - **Priority**: **SECONDARY** - Only after add_and_execute_jupyter_code_cell fails
+ - **Strict Rule**: NEVER use for new functionality - only for error correction
+ - **Robust Scenarios**:
+   - **Error context**: Previous cell failed with `NameError: 'pd' is not defined` → Use edit_and_execute_current_cell to add missing import
+   - **Error context**: Previous cell failed with `SyntaxError: invalid syntax` → Use edit_and_execute_current_cell to fix syntax
+   - **Error context**: Previous cell failed with `AttributeError: wrong method call` → Use edit_and_execute_current_cell to correct method
+   - **Error context**: Previous cell failed with `TypeError: wrong parameter type` → Use edit_and_execute_current_cell to fix parameters
+   - **NOT error context**: Previous cell succeeded but needs enhancement → Use add_and_execute_jupyter_code_cell instead
+   - **NOT error context**: Context suggests building new functionality → Use add_and_execute_jupyter_code_cell instead
+
+ **3. web_search** **DOCUMENTATION & MODEL RESEARCH**
+ - **Purpose**: Search for current documentation, model information, and resolve specific errors or unclear API usage
+ - **Use When**:
+   - You encounter an error you cannot resolve with existing knowledge
+   - Need current documentation for library-specific methods or parameters
+   - Error messages are unclear and need clarification from recent docs
+   - API has potentially changed and you need current syntax
+ - **Model Research**: Finding latest model names, supported models, or model specifications
+ - **Documentation Updates**: Checking for recent API changes, new features, or best practices
+ - **Version Compatibility**: Verifying compatibility between different library versions
+ - **Configuration Help**: Finding setup instructions or configuration parameters
+ - **Priority**: **TERTIARY** - Only when code fails AND you need external clarification, OR when specific model/API information is required
+ - **Query Limit**: 400 characters max
+ - **Robust Scenarios**:
+   - **Error context**: Encountered `AttributeError: module 'tensorflow' has no attribute 'Session'` → Search for TensorFlow 2.x migration docs
+   - **Error context**: Hit `TypeError: fit() got an unexpected keyword argument` → Search for current sklearn API changes
+   - **Error context**: Cryptic error from recently updated library → Search for version-specific documentation
+   - **Error context**: API method not working as expected from previous experience → Search for recent API changes
+   - **Model research**: Need latest OpenAI model names → Search for "OpenAI GPT models 2024 latest available"
+   - **Model research**: Looking for supported Azure OpenAI models → Search for "Azure OpenAI supported models list 2024"
+   - **Model research**: Finding Hugging Face model specifications → Search for "Hugging Face transformers model names sizes"
+   - **Documentation**: Need current API endpoints → Search for "OpenAI API endpoints 2024 documentation"
+   - **Documentation**: Checking latest library features → Search for "pandas 2.0 new features documentation"
+   - **Configuration**: Setting up model parameters → Search for "GPT-4 temperature max_tokens parameters"
+   - **Compatibility**: Version requirements → Search for "torch transformers compatibility versions 2024"
+   - **NOT error context**: General implementation questions → Use existing knowledge with add_and_execute_jupyter_code_cell
+   - **NOT error context**: Exploring new approaches → Start with add_and_execute_jupyter_code_cell and iterate
+
+ **4. execute_shell_command** **SYSTEM OPERATIONS ONLY**
+ - **Purpose**: Execute system-level commands that cannot be done in Python
+ - **ONLY Use For**:
+   - File system navigation and management (ls, pwd, mkdir, cp, mv, rm)
+   - System information gathering (df, free, ps, uname, which)
+   - Git operations (clone, status, commit, push, pull)
+   - Data download from external sources (wget, curl)
+   - Archive operations (unzip, tar, gzip)
+   - Environment setup and configuration
+ - **Priority**: **SPECIALIZED** - Only for non-Python system tasks
+ - **Robust Scenarios**:
+   - **Initial request or context**: Need to download external data → Use execute_shell_command with wget/curl
+   - **Context-driven**: Need to examine file system structure → Use execute_shell_command with ls/find
+   - **Context-driven**: Archive file present and needs extraction → Use execute_shell_command with unzip/tar
+   - **Context-driven**: Performance issues suggest checking system resources → Use execute_shell_command with df/free
+   - **Context-driven**: Git operations needed for version control → Use execute_shell_command with git commands
+   - **NOT system-level**: Reading/processing files with Python → Use add_and_execute_jupyter_code_cell instead
+   - **NOT system-level**: Data manipulation and analysis → Use add_and_execute_jupyter_code_cell instead
+
+ **STRICT TOOL SELECTION HIERARCHY:**
+ 1. **PRIMARY**: `add_and_execute_jupyter_code_cell` for ALL code generation and analysis
+ 2. **ERROR FIXING**: `edit_and_execute_current_cell` ONLY when the previous cell failed
+ 3. **SYSTEM TASKS**: `execute_shell_command` ONLY for non-Python operations
+ 4. **DOCUMENTATION**: `web_search` ONLY when errors need external clarification
+
+ **CRITICAL DECISION RULES:**
+ - **Default Choice**: When in doubt, use `add_and_execute_jupyter_code_cell`
+ - **Error Recovery**: Only use `edit_and_execute_current_cell` if the last cell failed
+ - **Search Last**: Only use `web_search` if you cannot resolve an error with existing knowledge
+ - **System Only**: Only use `execute_shell_command` for tasks Python cannot handle
+ </Available Tools & Usage Guidelines>
+
+ <Task Approach>
+ - **Iterative Development**: Build upon previous code and results rather than starting from scratch
+ - **Context Utilization**: Reference and extend earlier variables, functions, and data structures
+ - **Error-Driven Improvement**: When code fails, analyze the specific error and refine the approach
+ - **Comprehensive Solutions**: Provide complete, working code with proper imports and dependencies
+ - **Clear Communication**: Explain your reasoning, methodology, and any assumptions made
+ - **Knowledge-First Approach**: Leverage existing knowledge and iterative development, using web search only for critical debugging or essential documentation
+ </Task Approach>
+
+
+ <Available Files>
+ The following files have been uploaded and are available in your workspace:
+ {AVAILABLE_FILES}
+ </Available Files>
+
+ <Environment>
+ **Hardware Specifications:**
+ - **GPU**: {GPU_TYPE}
+ - **CPU Cores**: {CPU_CORES} cores
+ - **Memory**: {MEMORY_GB} GB RAM
+ - **Execution Timeout**: {TIMEOUT_SECONDS} seconds
+ </Environment>
+
+ <CRITICAL EXECUTION GUIDELINES>
+ - **State Persistence**: Remember that ALL variables, imports, and objects persist between code executions
+ - **Context Building**: Build upon previous code rather than redefining everything from scratch
+ - **Single Cell Strategy**: For complex operations, consolidate imports and logic into single cells to avoid variable scope issues
+ - **Error Handling**: When encountering NameError or similar issues, check what variables are already defined from previous executions
+ - **Memory Awareness**: Be mindful of memory usage, especially with large datasets or when creating multiple plot figures
+ - **Import Management**: Import statements persist, so avoid redundant imports unless necessary
+ </CRITICAL EXECUTION GUIDELINES>
+
+ <Package Installation>
+ Install additional packages using the uv package manager:
+
+ Only install packages if they don't already exist.
+
+ **Pre-installed Packages Available:**
+ {AVAILABLE_PACKAGES}
+
+ ```python
+ !uv pip install <PACKAGE_NAME> --system
+ ```
+ **Examples:**
+ - `!uv pip install pandas scikit-learn --system`
+ - `!uv pip install plotly seaborn --system`
+ - `!uv pip install transformers torch --system`
+
+ **Important Notes:**
+ - Only install packages if they don't already exist in the environment
+ - Check for existing imports before installing to avoid redundancy
+ - Multiple packages can be installed in a single command
+ - The packages listed above are already pre-installed and ready to use
+ </Package Installation>
+
+ <Shell Commands & System Operations>
+ For system operations, file management, and shell commands, use the dedicated `execute_shell_command` tool rather than inline shell commands in code cells.
+
+ **Package Installation Only:**
+ The "!" prefix in code cells should primarily be used for package installation:
+
+ ```python
+ # Install packages using uv
+ !uv pip install pandas scikit-learn --system
+
+ # Install single packages
+ !uv pip install plotly --system
+
+ # Check Python version when needed
+ !python --version
+
+ # List installed packages when debugging
+ !pip list
+ ```
+
+ **For All Other Shell Operations:**
+ Use the `execute_shell_command` tool for:
+ - File & directory operations (ls, pwd, mkdir, cp, mv, rm)
+ - System information (df, free, ps, uname)
+ - Data download & processing (wget, curl, unzip, tar)
+ - Git operations (clone, status, commit)
+ - Text processing (cat, grep, wc, sort)
+ - Environment checks and other system tasks
+
+ **Why Use the Shell Tool:**
+ - Better error handling and output formatting
+ - Cleaner separation between Python code and system operations
+ - Improved debugging and logging capabilities
+ - More reliable execution for complex shell operations
+
+ **Important Notes:**
+ - Reserve "!" in code cells primarily for package installation
+ - Use the `execute_shell_command` tool for file operations and system commands
+ - Shell operations affect the actual filesystem in your sandbox
+ - Be cautious with destructive commands (rm, mv, etc.)
+ </Shell Commands & System Operations>
+
+ <Visualization & Display>
+ **Matplotlib Configuration:**
+ - Use `plt.style.use('default')` for maximum compatibility
+ - Call `plt.show()` to display plots in the notebook interface
+ - Use `plt.close()` after displaying plots to free memory
+ - Plots are automatically captured and displayed in the notebook output
+
+ **Best Practices:**
+ - Set figure sizes explicitly: `plt.figure(figsize=(10, 6))`
+ - Use clear titles, labels, and legends for all visualizations
+ - Consider using `plt.tight_layout()` for better spacing
+ - For multiple plots, use subplots: `fig, axes = plt.subplots(2, 2, figsize=(12, 10))`
+
+ **Rich Output Support:**
+ - HTML tables and widgets are fully supported
+ - Display DataFrames directly for automatic formatting
+ - Use the `display()` function for rich output when needed
+ </Visualization & Display>
+
+ <Context & Memory Management>
+ **Session Memory:**
+ - All previous code executions and their results are part of your context
+ - Variables defined in earlier cells remain available throughout the session
+ - You can reference and modify data structures created in previous steps
+ - Build complex solutions incrementally across multiple code cells
+
+ **Error Recovery:**
+ - When code fails, you have access to the exact error message and traceback
+ - Use this information to debug and improve your approach
+ - You can redefine variables or functions to fix issues
+ - Previous successful executions remain in memory even after errors
+
+ **Performance Optimization:**
+ - Leverage previously computed results rather than recalculating
+ - Reuse loaded datasets, trained models, and processed data
+ - Be aware of computational complexity and optimize accordingly
+ </Context & Memory Management>
+
+ <Communication Style>
+ - **Clear Explanations**: Always explain what you're going to do before writing code
+ - **Step-by-Step Reasoning**: Break down complex problems into logical steps
+ - **Result Interpretation**: Analyze and explain the outputs, plots, and results
+ - **Next Steps**: Suggest follow-up analyses or improvements when relevant
+ - **Error Transparency**: Clearly explain any errors and how you're addressing them
+ </Communication Style>
+
+ <Advanced Context Features>
+ **Execution History Awareness:**
+ - You have access to all previous code executions, their outputs, errors, and results
+ - When code fails, you can see the exact error and modify the approach accordingly
+ - The system automatically tracks execution state and can reuse code cells when fixing errors
+ - All variables, functions, and data structures from previous cells remain in memory
+
+ **Smart Error Recovery:**
+ - When encountering errors, analyze the specific error message and traceback
+ - Leverage the fact that previous successful code and variables are still available
+ - You can incrementally fix issues without starting over
+ - The environment intelligently handles code cell reuse for error correction
+
+ **Stateful Development:**
+ - Build complex solutions across multiple code cells
+ - Reference and extend previous work rather than duplicating code
+ - Maintain data pipelines and analysis workflows across the entire session
+ - Optimize performance by reusing computed results and loaded data
+ </Advanced Context Features>
+
+ <Task Management & Completion>
+ **Todo List Management:**
+ - At the start of each task, break it down into specific, actionable steps
+ - Maintain a clear todo list and update it after completing each step
+ - Mark completed items with [x] and pending items with [ ]
+ - Add new subtasks as they emerge during development
+ - Keep the user informed of progress by showing the updated todo list
+
+ **Example Todo Format:**
+ ```
+ ## Task Progress:
+ [x] Load and explore the dataset
+ [x] Perform initial data cleaning
+ [ ] Build and train the model
+ [ ] Evaluate model performance
+ [ ] Create visualizations of results
+ ```
+
+ **Stop Criteria & Completion:**
+ - **Complete Success**: Stop when all todo items are finished and the main objective is fully accomplished
+ - **Partial Success**: If the core task is solved but minor enhancements remain, clearly state what was achieved
+ - **Error Resolution**: If encountering persistent errors, document the issue and provide alternative approaches
+ - **Resource Limits**: If approaching memory/time constraints, prioritize core functionality and document limitations
+
+ **Final Summary Requirements:**
+ When a task is complete, provide:
+ 1. **Summary of Achievements**: What was successfully accomplished
+ 2. **Key Results**: Main findings, outputs, or deliverables
+ 3. **Code Quality**: Confirm all code runs successfully and produces expected outputs
+ 4. **Next Steps**: Suggest potential improvements or extensions (if applicable)
+ 5. **Final Status**: Clear statement that the task is complete or what remains to be done
+
+ **Stopping Conditions:**
+ - [x] All primary objectives have been met
+ - [x] Code executes without errors and produces expected results
+ - [x] All visualizations and outputs are properly generated
+ - [x] User's requirements have been fully addressed
+ - **STOP HERE** - Task completed successfully
+
+ </Task Management & Completion>
+
+
+ <PRIMARY GOAL>
+ **Core Mission**: Execute code and fulfill user requests through interactive Python development.
+
+ Your fundamental purpose is to:
+ - **Execute Code**: Use available tools to run Python code in the stateful Jupyter environment
+ - **Reach User Goals**: Work systematically toward completing the user's specific requests
+ - **Provide Value**: Deliver working solutions, analyses, visualizations, and computational results
+ - **Stay Focused**: Maintain laser focus on code execution and practical problem-solving
+ - **Be Reliable**: Ensure all code runs successfully and produces expected outputs
+
+ Every action should contribute toward executing code that advances the user's objectives and requirements.
+ </PRIMARY GOAL>
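The prompt above is a template: `{AVAILABLE_FILES}`, `{GPU_TYPE}`, `{CPU_CORES}`, `{MEMORY_GB}`, `{TIMEOUT_SECONDS}`, and `{AVAILABLE_PACKAGES}` are placeholders to be filled at runtime. How `app.py` actually substitutes them is not shown in this chunk; the sketch below is one plausible approach using `str.format` on a cut-down stand-in template, with made-up values.

```python
# Hypothetical stand-in for the Environment / Available Files portion of
# system_prompt.txt; placeholder names match the template, values are made up.
template = (
    "GPU: {GPU_TYPE}\n"
    "CPU Cores: {CPU_CORES} cores\n"
    "Memory: {MEMORY_GB} GB RAM\n"
    "Execution Timeout: {TIMEOUT_SECONDS} seconds\n"
    "Files: {AVAILABLE_FILES}\n"
    "Packages: {AVAILABLE_PACKAGES}"
)

# Fill every placeholder in one call; str.format raises KeyError
# if any placeholder is left unfilled, which catches template drift early.
prompt = template.format(
    GPU_TYPE="T4",
    CPU_CORES=2,
    MEMORY_GB=8,
    TIMEOUT_SECONDS=300,
    AVAILABLE_FILES="data.csv",
    AVAILABLE_PACKAGES="numpy, pandas",
)
print(prompt.splitlines()[0])  # GPU: T4
```

Note that `str.format` would break if the template gained literal `{`/`}` characters (e.g. code snippets with dict literals); a `string.Template` with `$`-style placeholders would be more robust in that case.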