AdithyaSK committed
Commit 744e5e2 · 1 Parent(s): 2ea19ae

Eureka agent init - Adithya S K

Files changed (10)
  1. .env.example +29 -0
  2. .gitattributes +2 -0
  3. LICENSE +200 -0
  4. README.md +153 -13
  5. app.py +1761 -0
  6. jupyter_agent.py +1463 -0
  7. jupyter_handler.py +1161 -0
  8. modal_sandbox.py +794 -0
  9. requirements.txt +17 -0
  10. system_prompt.txt +326 -0
.env.example ADDED
@@ -0,0 +1,29 @@
+ # OpenAI Configuration (choose ONE of the options below)
+
+ # Option 1: Standard OpenAI
+ # OPENAI_API_KEY=sk-your-openai-api-key-here
+ # MODEL_NAME=gpt-4
+
+ # Option 2: Azure OpenAI
+ # AZURE_OPENAI_ENDPOINT=https://your-resource-name.openai.azure.com/
+ # AZURE_OPENAI_API_KEY=your-azure-openai-api-key-here
+ # MODEL_NAME=gpt-4 # This should be your deployment name in Azure
+
+ # Option 3: Custom Provider (e.g., Cerebras)
+ # PROVIDER_API_ENDPOINT=https://api.cerebras.ai/v1
+ # PROVIDER_API_KEY=your-cerebras-api-key-here
+ # MODEL_NAME=llama3.1-70b
+
+ # Phoenix Tracing (Optional)
+ # PHOENIX_API_KEY=your-phoenix-api-key-here
+ # PHOENIX_COLLECTOR_ENDPOINT=https://app.phoenix.arize.com/v1/traces
+
+ # Modal Configuration (for sandbox execution)
+ # MODAL_TOKEN_ID=your-modal-token-id
+ # MODAL_TOKEN_SECRET=your-modal-token-secret
+
+ # Hugging Face Token (Optional)
+ # HF_TOKEN=your-huggingface-token
+
+ # Tavily API Key (for web search)
+ # TAVILY_API_KEY=
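
Only one of the three provider blocks above needs to be uncommented. A minimal sketch of how an app might pick whichever option is fully configured (the function name and the precedence order are illustrative assumptions, not the app's actual logic):

```python
import os

def resolve_provider():
    # Hypothetical helper: return the first provider option that is
    # fully configured, in the order the options are listed above.
    if os.environ.get("OPENAI_API_KEY"):
        return "openai"
    if os.environ.get("AZURE_OPENAI_ENDPOINT") and os.environ.get("AZURE_OPENAI_API_KEY"):
        return "azure"
    if os.environ.get("PROVIDER_API_ENDPOINT") and os.environ.get("PROVIDER_API_KEY"):
        return "custom"
    return None  # no complete configuration found

os.environ["OPENAI_API_KEY"] = "sk-example"
print(resolve_provider())  # → openai
```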
.gitattributes CHANGED
@@ -33,3 +33,5 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
33   *.zip filter=lfs diff=lfs merge=lfs -text
34   *.zst filter=lfs diff=lfs merge=lfs -text
35   *tfevents* filter=lfs diff=lfs merge=lfs -text
36 + jupyter-agent-2.png filter=lfs diff=lfs merge=lfs -text
37 + powered-by.png filter=lfs diff=lfs merge=lfs -text
LICENSE ADDED
@@ -0,0 +1,200 @@
+ Apache License
+ Version 2.0, January 2004
+ http://www.apache.org/licenses/
+
+ TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
+
+ 1. Definitions.
+
+ "License" shall mean the terms and conditions for use, reproduction,
+ and distribution as defined by Sections 1 through 9 of this document.
+
+ "Licensor" shall mean the copyright owner or entity granting the License.
+
+ "Legal Entity" shall mean the union of the acting entity and all
+ other entities that control, are controlled by, or are under common
+ control with that entity. For the purposes of this definition,
+ "control" means (i) the power, direct or indirect, to cause the
+ direction or management of such entity, whether by contract or
+ otherwise, or (ii) ownership of fifty percent (50%) or more of the
+ outstanding shares, or (iii) beneficial ownership of such entity.
+
+ "You" (or "Your") shall mean an individual or Legal Entity
+ exercising permissions granted by this License.
+
+ "Source" form shall mean the preferred form for making modifications,
+ including but not limited to software source code, documentation
+ source, and configuration files.
+
+ "Object" form shall mean any form resulting from mechanical
+ transformation or translation of a Source form, including but
+ not limited to compiled object code, generated documentation,
+ and conversions to other media types.
+
+ "Work" shall mean the work of authorship, whether in Source or
+ Object form, made available under the License, as indicated by a
+ copyright notice that is included in or attached to the work
+ (which may not be construed as modifying the License).
+
+ "Derivative Works" shall mean any work, whether in Source or Object
+ form, that is based upon (or derived from) the Work and for which the
+ editorial revisions, annotations, elaborations, or other modifications
+ represent, as a whole, an original work of authorship. For the purposes
+ of this License, Derivative Works shall not include works that remain
+ separable from, or merely link (or bind by name) to the interfaces of,
+ the Work.
+
+ "Contribution" shall mean any work of authorship, including
+ the original version of the Work and any modifications or additions
+ to that Work or Derivative Works thereof, that is intentionally
+ submitted to Licensor for inclusion in the Work by the copyright owner
+ or by an individual or Legal Entity authorized to submit on behalf of
+ the copyright owner. For the purposes of this definition, "submitted"
+ means any form of electronic, verbal, or written communication sent
+ to the Licensor or its representatives, including but not limited to
+ communication on electronic mailing lists, source code control systems,
+ and issue tracking systems that are managed by, or on behalf of, the
+ Licensor for the purpose of discussing and improving the Work, but
+ excluding communication that is conspicuously marked or otherwise
+ designated in writing by the copyright owner as "Not a Contribution."
+
+ "Contributor" shall mean Licensor and any individual or Legal Entity
+ on behalf of whom a Contribution has been received by Licensor and
+ subsequently incorporated within the Work.
+
+ 2. Grant of Copyright License. Subject to the terms and conditions of
+ this License, each Contributor hereby grants to You a perpetual,
+ worldwide, non-exclusive, no-charge, royalty-free, irrevocable
+ copyright license to use, reproduce, prepare Derivative Works of,
+ publicly perform, publicly display, sublicense, and distribute the
+ Work and Derivative Works thereof in Source or Object form.
+
+ 3. Grant of Patent License. Subject to the terms and conditions of
+ this License, each Contributor hereby grants to You a perpetual,
+ worldwide, non-exclusive, no-charge, royalty-free, irrevocable
+ (except as stated in this section) patent license to make, have made,
+ use, offer to sell, sell, import, and otherwise transfer the Work,
+ where such license applies only to those patent claims licensable
+ by such Contributor that are necessarily infringed by their
+ Contribution(s) alone or by combination of their Contribution(s)
+ with the Work to which such Contribution(s) was submitted. If You
+ institute patent litigation against any entity (including a
+ cross-claim or counterclaim in a lawsuit) alleging that the Work
+ or a Contribution incorporated within the Work constitutes direct
+ or contributory patent infringement, then any patent licenses
+ granted to You under this License for that Work shall terminate
+ as of the date such litigation is filed.
+
+ 4. Redistribution. You may reproduce and distribute copies of the
+ Work or Derivative Works thereof in any medium, with or without
+ modifications, and in Source or Object form, provided that You
+ meet the following conditions:
+
+ (a) You must give any other recipients of the Work or
+ Derivative Works a copy of this License; and
+
+ (b) You must cause any modified files to carry prominent notices
+ stating that You changed the files; and
+
+ (c) You must retain, in the Source form of any Derivative Works
+ that You distribute, all copyright, trademark, patent,
+ attribution and other notices from the Source form of the Work,
+ excluding those notices that do not pertain to any part of
+ the Derivative Works; and
+
+ (d) If the Work includes a "NOTICE" file as part of its
+ distribution, then any Derivative Works that You distribute must
+ include a readable copy of the attribution notices contained
+ within such NOTICE file, excluding those notices that do not
+ pertain to any part of the Derivative Works, in at least one
+ of the following places: within a NOTICE file distributed
+ as part of the Derivative Works; within the Source form or
+ documentation, if provided along with the Derivative Works; or,
+ within a display generated by the Derivative Works, if and
+ wherever such third-party notices normally appear. The contents
+ of the NOTICE file are for informational purposes only and
+ do not modify the License. You may add Your own attribution
+ notices within Derivative Works that You distribute, alongside
+ or as an addendum to the NOTICE text from the Work, provided
+ that such additional attribution notices cannot be construed
+ as modifying the License.
+
+ You may add Your own copyright statement for Your modifications
+ and may provide additional or different license terms and conditions
+ for use, reproduction, or distribution of Your modifications, or
+ for any such Derivative Works as a whole, provided Your use,
+ reproduction, and distribution of the Work otherwise complies with
+ the conditions stated in this License.
+
+ 5. Submission of Contributions. Unless You explicitly state otherwise,
+ any Contribution intentionally submitted for inclusion in the Work
+ by You to the Licensor shall be under the terms and conditions of
+ this License, without any additional terms or conditions.
+ Notwithstanding the above, nothing herein shall supersede or modify
+ the terms of any separate license agreement you may have executed
+ with Licensor regarding such Contributions.
+
+ 6. Trademarks. This License does not grant permission to use the trade
+ names, trademarks, service marks, or product names of the Licensor,
+ except as required for reasonable and customary use in describing the
+ origin of the Work and reproducing the content of the NOTICE file.
+
+ 7. Disclaimer of Warranty. Unless required by applicable law or
+ agreed to in writing, Licensor provides the Work (and each
+ Contributor provides its Contributions) on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
+ implied, including, without limitation, any warranties or conditions
+ of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
+ PARTICULAR PURPOSE. You are solely responsible for determining the
+ appropriateness of using or redistributing the Work and assume any
+ risks associated with Your exercise of permissions under this License.
+
+ 8. Limitation of Liability. In no event and under no legal theory,
+ whether in tort (including negligence), contract, or otherwise,
+ unless required by applicable law (such as deliberate and grossly
+ negligent acts) or agreed to in writing, shall any Contributor be
+ liable to You for damages, including any direct, indirect, special,
+ incidental, or consequential damages of any character arising as a
+ result of this License or out of the use or inability to use the
+ Work (including but not limited to damages for loss of goodwill,
+ work stoppage, computer failure or malfunction, or any and all
+ other commercial damages or losses), even if such Contributor
+ has been advised of the possibility of such damages.
+
+ 9. Accepting Support, Warranty or Additional Liability. While redistributing
+ the Work or Derivative Works thereof, You may choose to offer,
+ and charge a fee for, acceptance of support, warranty, indemnity,
+ or other liability obligations and/or rights consistent with this
+ License. However, in accepting such obligations, You may act only
+ on Your own behalf and on Your sole responsibility, not on behalf
+ of any other Contributor, and only if You agree to indemnify,
+ defend, and hold each Contributor harmless for any liability
+ incurred by, or claims asserted against, such Contributor by reason
+ of your accepting any such warranty or additional liability.
+
+ END OF TERMS AND CONDITIONS
+
+ APPENDIX: How to apply the Apache License to your work.
+
+ To apply the Apache License to your work, attach the following
+ boilerplate notice, with the fields enclosed by brackets "[]"
+ replaced with your own identifying information. (Don't include
+ the brackets!) The text should be enclosed in the appropriate
+ comment syntax for the file format. We also recommend that a
+ file or class name and description of purpose be included on the
+ same "printed page" as the copyright notice for easier
+ identification within third-party archives.
+
+ Copyright 2024 adithya-s-k
+
+ Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License.
+ You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
README.md CHANGED
@@ -1,13 +1,153 @@
- ---
- title: EurekaAgent
- emoji: 😻
- colorFrom: pink
- colorTo: pink
- sdk: gradio
- sdk_version: 5.43.1
- app_file: app.py
- pinned: false
- license: apache-2.0
- ---
-
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+ # Eureka Agent
+
+ An AI-powered research automation system that can execute Python code, analyze data, and generate insights through an interactive Jupyter-like interface.
+
+ <img width="1936" height="855" alt="Screenshot 2025-08-22 at 11 45 12 PM" src="https://github.com/user-attachments/assets/8d4ea793-4027-4aa3-8d6f-cbebbbd6e0c2" />
+
+ ## 🎯 What it does
+
+ Eureka Agent automates research workflows by:
+
+ - **Executing Python code** in a secure containerized environment
+ - **Analyzing data** with full context awareness across conversations
+ - **Generating visualizations** and interactive outputs
+ - **Developing iteratively**, building on previous code and results
+ - **Recovering from errors**, learning from execution failures to improve
+
+ ## ⚡ Key Features
+
+ - **Stateful Jupyter Environment**: Variables and imports persist across all code executions
+ - **GPU/CPU Support**: Configurable hardware (CPU, T4, L4, A100, H100)
+ - **Interactive Development**: Build complex solutions incrementally
+ - **Rich Output Support**: Plots, tables, HTML, and multimedia content
+ - **Error Handling**: Intelligent error recovery and debugging assistance
+ - **File Upload**: Process your own datasets and documents
+
+ ## 🚀 Quick Start
+
+ ### Prerequisites
+
+ - Python 3.8+
+ - A Modal account (for containerized execution)
+ - An OpenAI API key or compatible LLM provider
+
+ ### Installation
+
+ 1. Clone the repository:
+
+ ```bash
+ git clone https://github.com/adithya-s-k/EurekaAgent
+ cd EurekaAgent
+ ```
+
+ 2. Install dependencies:
+
+ ```bash
+ pip install -r requirements.txt
+ ```
+
+ 3. Set up environment variables:
+
+ ```bash
+ export OPENAI_API_KEY="your-api-key"
+ export MODAL_TOKEN_ID="your-modal-token-id"
+ export MODAL_TOKEN_SECRET="your-modal-token-secret"
+ ```
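
Alternatively, the same variables can be kept in a `.env` file based on `.env.example`; `app.py` loads one via `python-dotenv` when not running on a Space. A rough stdlib-only sketch of what that loading step amounts to (illustrative; the real app simply calls `load_dotenv()`):

```python
import os
from pathlib import Path

def load_env_file(path=".env"):
    # Minimal stand-in for python-dotenv's load_dotenv: parse KEY=value
    # lines, skip blanks and comments, and keep already-set variables.
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        os.environ.setdefault(key.strip(), value.strip())
```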
+
+ ### Running the Application
+
+ ```bash
+ python app.py
+ ```
+
+ The application will launch a Gradio interface accessible via your web browser.
+
+ ## 🔧 Configuration
+
+ ### Environment Variables
+
+ | Variable                     | Description                   | Required | Format/Example                  |
+ | ---------------------------- | ----------------------------- | -------- | ------------------------------- |
+ | `MODAL_TOKEN_ID`             | Modal token ID                | Yes      | `ak-...`                        |
+ | `MODAL_TOKEN_SECRET`         | Modal token secret            | Yes      | `as-...`                        |
+ | `PROVIDER_API_KEY`           | AI Provider API key           | Yes\*    | `sk-...`, `gsk_...`, `csk-...`  |
+ | `PROVIDER_API_ENDPOINT`      | AI Provider API endpoint      | Yes\*    | `https://api.anthropic.com/v1/` |
+ | `MODEL_NAME`                 | Model to use                  | Yes\*    | `claude-sonnet-4-20250514`      |
+ | `HF_TOKEN`                   | Hugging Face token (optional) | No       | `hf_...`                        |
+ | `TAVILY_API_KEY`             | Tavily API key for web search | No       | `tvly-...`                      |
+ | `PHOENIX_API_KEY`            | Phoenix tracing API key       | No       | -                               |
+ | `PHOENIX_COLLECTOR_ENDPOINT` | Phoenix collector endpoint    | No       | -                               |
+ | `ENVIRONMENT`                | Environment mode              | No       | `dev`/`prod`                    |
+
+ \*At least one complete AI provider configuration must be provided
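
The formats in the table can be sanity-checked up front; a simplified sketch of the prefix checks that `app.py`'s `validate_api_key_format` performs (the helper name here is illustrative):

```python
def check_key_prefix(name, value):
    # Expected prefixes per the table above; PROVIDER_API_KEY accepts
    # several vendor prefixes (sk-, gsk_, csk-).
    prefixes = {
        "MODAL_TOKEN_ID": ("ak-",),
        "MODAL_TOKEN_SECRET": ("as-",),
        "HF_TOKEN": ("hf_",),
        "TAVILY_API_KEY": ("tvly-",),
        "PROVIDER_API_KEY": ("sk-", "gsk_", "csk-"),
    }
    expected = prefixes.get(name)
    if expected is None:
        return True  # no prefix rule for this key
    return value.startswith(expected)

print(check_key_prefix("PROVIDER_API_KEY", "sk-demo"))  # → True
```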
83
+
84
+ **Legacy OpenAI Support:**
85
+ | Variable | Description | Required |
86
+ | ----------------------- | ----------------------------- | -------- |
87
+ | `OPENAI_API_KEY` | OpenAI API key | No |
88
+ | `AZURE_OPENAI_ENDPOINT` | Azure OpenAI endpoint | No |
89
+ | `AZURE_OPENAI_API_KEY` | Azure OpenAI API key | No |
90
+
91
+ ### Hardware Options
92
+
93
+ - **CPU Only**: Free, suitable for basic tasks
94
+ - **NVIDIA T4**: Low-cost GPU for small models
95
+ - **NVIDIA L4**: Mid-range GPU for better performance
96
+ - **NVIDIA A100**: High-end GPU for large models (40GB/80GB variants)
97
+ - **NVIDIA H100**: Latest flagship GPU for maximum performance
98
+
99
+ ## 💡 Usage Examples
100
+
101
+ ### Basic Data Analysis
102
+
103
+ ```
104
+ "Analyze the uploaded CSV file and create visualizations showing key trends"
105
+ ```
106
+
107
+ ### Machine Learning
108
+
109
+ ```
110
+ "Train a neural network to classify the iris dataset and evaluate its performance"
111
+ ```
112
+
113
+ ### Research Tasks
114
+
115
+ ```
116
+ "Download stock price data for the last year and perform technical analysis"
117
+ ```
118
+
119
+ ## 🏗️ Architecture
120
+
121
+ - **Frontend**: Gradio web interface with real-time status updates
122
+ - **Backend**: Python application with multi-provider AI integration
123
+ - **Execution Environment**: Modal containerized sandboxes with GPU support
124
+ - **Code Execution**: Persistent Jupyter-like stateful environment
125
+ - **Session Management**: Comprehensive session state tracking with Phoenix tracing
126
+ - **Storage**: File-based session persistence with notebook compatibility
127
+ - **Web Search**: Integrated Tavily search for current information
128
+ - **Hardware Support**: CPU, T4, L4, A100, H100 configurations
129
+
130
+ ## 📁 Project Structure
131
+
132
+ ```
133
+ EurekaAgent/
134
+ ├── app.py # Main Gradio application with API key management
135
+ ├── jupyter_handler.py # Jupyter notebook management and rendering
136
+ ├── jupyter_agent.py # Utility functions, execution logic, and session management
137
+ ├── modal_sandbox.py # Modal sandbox configuration with GPU support
138
+ ├── system_prompt.txt # System prompt for the AI agent
139
+ ├── requirements.txt # Python dependencies
140
+ └── temp/ # Temporary files, notebooks, and session states
141
+ ├── <session_id>/
142
+ │ ├── session_state.json # Complete session state and history
143
+ │ └── jupyter-agent.ipynb # Legacy notebook file for UI compatibility
144
+ └── jupyter-agent.ipynb # Default notebook template
145
+ ```
146
+
147
+ ## 🤝 Contributing
148
+
149
+ This project is a fork of [Jupyter Agent 2](https://huggingface.co/spaces/lvwerra/jupyter-agent-2) by Hugging Face. Contributions are welcome!
150
+
151
+ ## 📄 License
152
+
153
+ See [LICENSE](./LICENSE) file for details.
app.py ADDED
@@ -0,0 +1,1761 @@
+ import os
+ import logging
+ import gradio as gr
+ from gradio.utils import get_space
+ from modal_sandbox import create_modal_sandbox
+ from pathlib import Path
+ import json
+ from datetime import datetime
+ import threading
+ import re
+ from openai import OpenAI, AzureOpenAI
+ from jupyter_handler import JupyterNotebook
+
+ if not get_space():
+     try:
+         from dotenv import load_dotenv
+
+         load_dotenv()
+     except (ImportError, ModuleNotFoundError):
+         pass
+ from jupyter_agent import (
+     run_interactive_notebook_with_session_state,
+     SessionStateManager,
+ )
+
+ TMP_DIR = './temp/'
+
+ # Module-level logger (used by the helpers below)
+ logger = logging.getLogger(__name__)
+
+ # Environment and API key management utilities
+ def get_environment():
+     """Get the current environment (dev/prod)"""
+     return os.environ.get("ENVIRONMENT", "prod").lower()
+
+ def is_dev_environment():
+     """Check if running in development environment"""
+     return get_environment() == "dev"
+
+ def get_required_api_keys():
+     """Get dictionary of required API keys and their current status"""
+     required_keys = {
+         "MODAL_TOKEN_ID": {
+             "value": os.environ.get("MODAL_TOKEN_ID"),
+             "required": True,
+             "description": "Modal Token ID for sandbox access"
+         },
+         "MODAL_TOKEN_SECRET": {
+             "value": os.environ.get("MODAL_TOKEN_SECRET"),
+             "required": True,
+             "description": "Modal Token Secret for sandbox access"
+         },
+         "HF_TOKEN": {
+             "value": os.environ.get("HF_TOKEN"),
+             "required": False,
+             "description": "Hugging Face Token for model access"
+         },
+         "PROVIDER_API_KEY": {
+             "value": os.environ.get("PROVIDER_API_KEY"),
+             "required": True,
+             "description": "AI Provider API Key (Anthropic, OpenAI, etc.)"
+         },
+         "PROVIDER_API_ENDPOINT": {
+             "value": os.environ.get("PROVIDER_API_ENDPOINT"),
+             "required": True,
+             "description": "AI Provider API Endpoint"
+         },
+         "MODEL_NAME": {
+             "value": os.environ.get("MODEL_NAME"),
+             "required": True,
+             "description": "Model name to use"
+         },
+         "TAVILY_API_KEY": {
+             "value": os.environ.get("TAVILY_API_KEY"),
+             "required": False,
+             "description": "Tavily API Key for web search functionality"
+         }
+     }
+     return required_keys
+
+ def get_missing_api_keys():
+     """Get list of missing required API keys"""
+     required_keys = get_required_api_keys()
+     missing_keys = {}
+
+     for key, config in required_keys.items():
+         if config["required"] and not config["value"]:
+             missing_keys[key] = config
+
+     return missing_keys
+
+ def validate_api_key_format(key_name, key_value):
+     """Basic validation for API key formats"""
+     if not key_value or not key_value.strip():
+         return False, "API key cannot be empty"
+
+     key_value = key_value.strip()
+
+     # Basic format validation
+     if key_name == "MODAL_TOKEN_ID" and not key_value.startswith("ak-"):
+         return False, "Modal Token ID should start with 'ak-'"
+     elif key_name == "MODAL_TOKEN_SECRET" and not key_value.startswith("as-"):
+         return False, "Modal Token Secret should start with 'as-'"
+     elif key_name == "HF_TOKEN" and not key_value.startswith("hf_"):
+         return False, "Hugging Face token should start with 'hf_'"
+     elif key_name == "PROVIDER_API_KEY":
+         # Check for common API key prefixes
+         valid_prefixes = ["sk-", "gsk_", "csk-"]
+         if not any(key_value.startswith(prefix) for prefix in valid_prefixes):
+             return False, "API key format may be invalid (expected prefixes: sk-, gsk_, csk-)"
+     elif key_name == "PROVIDER_API_ENDPOINT" and not (key_value.startswith("http://") or key_value.startswith("https://")):
+         return False, "API endpoint should start with http:// or https://"
+     elif key_name == "TAVILY_API_KEY" and not key_value.startswith("tvly-"):
+         return False, "Tavily API key should start with 'tvly-'"
+
+     return True, "Valid format"
+
+ def apply_user_api_keys(api_keys_dict):
+     """Apply user-provided API keys to environment"""
+     for key, value in api_keys_dict.items():
+         if value and value.strip():
+             os.environ[key] = value.strip()
+             logger.info(f"Applied user-provided API key: {key}")
+
+ def get_previous_notebooks():
+     """Get list of previous notebook sessions (dev only)"""
+     if not is_dev_environment():
+         return []
+
+     notebooks = []
+     tmp_dir = Path(TMP_DIR)
+
+     if not tmp_dir.exists():
+         return notebooks
+
+     for session_dir in tmp_dir.iterdir():
+         if session_dir.is_dir() and session_dir.name != ".":
+             notebook_file = session_dir / "jupyter-agent.ipynb"
+             if notebook_file.exists():
+                 try:
+                     # Get creation time and basic info
+                     stat = notebook_file.stat()
+                     size = stat.st_size
+                     modified = stat.st_mtime
+
+                     # Try to read basic notebook info
+                     with open(notebook_file, 'r') as f:
+                         notebook_data = json.load(f)
+                         cell_count = len(notebook_data.get('cells', []))
+
+                     # Format timestamp
+                     formatted_time = datetime.fromtimestamp(modified).strftime("%Y-%m-%d %H:%M")
+
+                     # Try to load session state for additional info
+                     config_info = ""
+                     try:
+                         session_manager = SessionStateManager(session_dir.name, TMP_DIR)
+                         session_state = session_manager.load_state()
+                         if session_state:
+                             hardware = session_state.get("hardware_config", {})
+                             gpu = hardware.get("gpu_type", "unknown")
+                             config_info = f", {gpu}"
+                     except Exception:
+                         pass
+
+                     notebooks.append({
+                         'session_id': session_dir.name,
+                         'path': str(notebook_file),
+                         'modified': modified,
+                         'size': size,
+                         'cell_count': cell_count,
+                         'display_name': f"{session_dir.name} ({cell_count} cells{config_info}, {formatted_time})"
+                     })
+                 except Exception as e:
+                     logger.warning(f"Failed to read notebook info for {session_dir.name}: {e}")
+
+     # Sort by modification time (newest first)
+     notebooks.sort(key=lambda x: x['modified'], reverse=True)
+     return notebooks
+
+ def parse_environment_variables(env_vars_text):
+     """
+     Parse environment variables from text input
+
+     Args:
+         env_vars_text: String containing environment variables in KEY=value format, one per line
+
+     Returns:
+         dict: Dictionary of parsed environment variables
+     """
+     env_dict = {}
+     if not env_vars_text or not env_vars_text.strip():
+         return env_dict
+
+     for line in env_vars_text.strip().split('\n'):
+         line = line.strip()
+         if not line or line.startswith('#'):  # Skip empty lines and comments
+             continue
+
+         if '=' in line:
+             key, value = line.split('=', 1)  # Split only on first =
+             key = key.strip()
+             value = value.strip()
+             if key:  # Only add if key is not empty
+                 env_dict[key] = value
+         else:
+             logger.warning(f"Skipping invalid environment variable format: {line}")
+
+     return env_dict
+
+ def create_notification_html(message, notification_type="info", show_spinner=False):
+     """
+     Create HTML for notification messages
+
+     Args:
+         message: The notification message
+         notification_type: Type of notification ('info', 'success', 'warning', 'error')
+         show_spinner: Whether to show a loading spinner
+     """
+     colors = {
+         'info': '#3498db',
+         'success': '#27ae60',
+         'warning': '#f39c12',
+         'error': '#e74c3c',
+         'loading': '#6c5ce7'
+     }
+
+     icons = {
+         'info': '🔄',
+         'success': '✅',
+         'warning': '⚠️',
+         'error': '❌',
+         'loading': '⏳'
+     }
+
+     color = colors.get(notification_type, colors['info'])
+     icon = icons.get(notification_type, icons['info'])
+
+     spinner_html = ""
+     if show_spinner or notification_type == 'loading':
+         spinner_html = """
+         <div style="
+             display: inline-block;
+             width: 20px;
+             height: 20px;
+             border: 2px solid #f3f3f3;
+             border-top: 2px solid {color};
+             border-radius: 50%;
+             animation: spin 1s linear infinite;
+             margin-right: 8px;
+         "></div>
+         <style>
+         @keyframes spin {{
+             0% {{ transform: rotate(0deg); }}
+             100% {{ transform: rotate(360deg); }}
+         }}
+         </style>
+         """.format(color=color)
+
+     return f"""
+     <div style="
+         background-color: {color}20;
+         border-left: 4px solid {color};
+         padding: 12px 16px;
+         margin: 10px 0;
+         border-radius: 4px;
+         font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif;
+         font-size: 14px;
+         color: #2c3e50;
+         display: flex;
+         align-items: center;
+     ">
+         {spinner_html}
+         <strong>{icon} {message}</strong>
+     </div>
+     """
+
+ def create_progress_notification(message, progress_percent=None):
+     """Create a progress notification with optional progress bar"""
+     progress_html = ""
+     if progress_percent is not None:
+         progress_html = f"""
+         <div style="
+             width: 100%;
+             background-color: #e0e0e0;
+             border-radius: 5px;
+             margin-top: 8px;
+             height: 8px;
+         ">
+             <div style="
+                 width: {progress_percent}%;
+                 background-color: #3498db;
+                 height: 8px;
+                 border-radius: 5px;
+                 transition: width 0.3s ease;
+             "></div>
+         </div>
+         <small style="color: #666; margin-top: 4px; display: block;">{progress_percent}% complete</small>
+         """
+
+     return create_notification_html(message, "loading", show_spinner=True) + progress_html
+
+
+ def initialize_phoenix_tracing():
+     """Initialize Phoenix tracing with proper error handling and session support"""
+     try:
+         from phoenix.otel import register
+
+         phoenix_api_key = os.getenv("PHOENIX_API_KEY")
+         collector_endpoint = os.getenv("PHOENIX_COLLECTOR_ENDPOINT")
+
+         if not phoenix_api_key:
+             logger.info("Phoenix API key not found, skipping Phoenix tracing initialization")
311
+ return None
312
+
313
+ if not collector_endpoint:
314
+ logger.info("Phoenix collector endpoint not found, skipping Phoenix tracing initialization")
315
+ return None
316
+
317
+ logger.info("Initializing Phoenix tracing with session support...")
318
+
319
+ # Set required environment variables
320
+ os.environ["PHOENIX_API_KEY"] = phoenix_api_key
321
+ os.environ["PHOENIX_COLLECTOR_ENDPOINT"] = collector_endpoint
322
+ os.environ["OTEL_EXPORTER_OTLP_HEADERS"] = f"api_key={phoenix_api_key}"
323
+ os.environ["PHOENIX_CLIENT_HEADERS"] = f"api_key={phoenix_api_key}"
324
+
325
+ # Configure the Phoenix tracer with OpenAI instrumentation enabled
326
+ tracer_provider = register(
327
+ project_name="eureka-agent",
328
+ auto_instrument=True, # Keep auto-instrument enabled for OpenAI tracing
329
+ set_global_tracer_provider=True
330
+ )
331
+
332
+ # Additional instrumentation setup for session tracking
333
+ try:
334
+ from openinference.instrumentation.openai import OpenAIInstrumentor
335
+
336
+ # Ensure OpenAI instrumentation is properly configured
337
+ if not OpenAIInstrumentor().is_instrumented_by_opentelemetry:
338
+ OpenAIInstrumentor().instrument()
339
+ logger.info("OpenAI instrumentation configured for Phoenix session tracking")
340
+ else:
341
+ logger.info("OpenAI instrumentation already active")
342
+
343
+ except ImportError:
344
+ logger.warning("OpenAI instrumentation not available - session grouping may not work optimally")
345
+ except Exception as e:
346
+ logger.warning(f"Failed to configure OpenAI instrumentation: {str(e)}")
347
+
348
+ logger.info("Phoenix tracing initialized successfully with session support")
349
+ return tracer_provider
350
+
351
+ except ImportError:
352
+ logger.info("Phoenix not installed, skipping tracing initialization")
353
+ return None
354
+ except Exception as e:
355
+ logger.warning(f"Failed to initialize Phoenix tracer (non-critical): {str(e)}")
356
+ return None
357
+
358
+
359
+
360
+ # Configure logging
361
+ logging.basicConfig(
362
+ level=logging.INFO,
363
+ format='%(asctime)s - %(name)s - %(levelname)s - %(message)s',
364
+ handlers=[
365
+ logging.FileHandler('jupyter_agent.log'),
366
+ logging.StreamHandler()
367
+ ]
368
+ )
369
+ logger = logging.getLogger(__name__)
370
+
371
+ # Initialize Phoenix tracing
372
+ tracer_provider = initialize_phoenix_tracing()
373
+
374
+
375
+ MODAL_TOKEN_ID = os.environ.get("MODAL_TOKEN_ID")
376
+ MODAL_TOKEN_SECRET = os.environ.get("MODAL_TOKEN_SECRET")
377
+ HF_TOKEN = os.environ.get("HF_TOKEN")
378
+ SANDBOXES = {}
379
+ SANDBOX_TIMEOUT = 300
380
+ STOP_EVENTS = {} # Store stop events for each session
381
+ EXECUTION_STATES = {} # Store execution states for each session
382
+
383
+ # GPU configuration options for the UI
384
+ GPU_OPTIONS = [
385
+ ("CPU Only", "cpu"),
386
+ ("NVIDIA T4 (16GB)", "T4"),
387
+ ("NVIDIA L4 (24GB)", "L4"),
388
+ ("NVIDIA A100 (40GB)", "A100-40GB"),
389
+ ("NVIDIA A100 (80GB)", "A100-80GB"),
390
+ ("NVIDIA H100 (80GB)", "H100")
391
+ ]
392
+
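The display-label half of `GPU_OPTIONS` is also rebuilt inline in several handlers below. A hedged sketch of a shared helper (`format_gpu_label` is an assumption, not app code) that captures that repeated logic:

```python
# Hypothetical helper consolidating the GPU-label formatting that the
# handlers below repeat inline.
def format_gpu_label(gpu_type):
    if gpu_type == "cpu":
        return "CPU Only"
    if gpu_type in ("T4", "L4", "A100-40GB", "A100-80GB", "H100"):
        return f"NVIDIA {gpu_type}"
    return gpu_type.upper()  # fallback for unrecognized configs

print(format_gpu_label("T4"))   # → NVIDIA T4
print(format_gpu_label("cpu"))  # → CPU Only
```

If adopted, the three inline copies of this branch in `execute_jupyter_agent` could call the helper instead.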
393
+
394
+ def initialize_openai_client():
395
+ """Initialize OpenAI client with proper error handling and fallbacks"""
396
+ client = None
397
+ model_name = None
398
+
399
+ # Check if we have any API keys configured
400
+ has_azure = os.environ.get("AZURE_OPENAI_ENDPOINT") and os.environ.get("AZURE_OPENAI_API_KEY")
401
+ has_provider = os.environ.get("PROVIDER_API_ENDPOINT") and os.environ.get("PROVIDER_API_KEY")
402
+ has_openai = os.environ.get("OPENAI_API_KEY")
403
+
404
+ if not (has_azure or has_provider or has_openai):
405
+ logger.warning("No API keys found in environment - client will be initialized later when user provides keys")
406
+ return None, None
407
+
408
+ try:
409
+ # Option 1: Azure OpenAI
410
+ if has_azure:
411
+ logger.info("Initializing Azure OpenAI client")
412
+ client = AzureOpenAI(
413
+ api_version="2024-12-01-preview",
414
+ azure_endpoint=os.environ.get("AZURE_OPENAI_ENDPOINT"),
415
+ api_key=os.environ.get("AZURE_OPENAI_API_KEY")
416
+ )
417
+ model_name = os.environ.get("MODEL_NAME", "gpt-4") # Default fallback
418
+ logger.info(f"Azure OpenAI client initialized with model: {model_name}")
419
+
420
+ # Option 2: Custom Provider (Cerebras, etc.)
421
+ elif has_provider:
422
+ logger.info("Initializing custom provider OpenAI client")
423
+ client = OpenAI(
424
+ base_url=os.environ.get("PROVIDER_API_ENDPOINT"),
425
+ api_key=os.environ.get("PROVIDER_API_KEY")
426
+ )
427
+ model_name = os.environ.get("MODEL_NAME", "gpt-4") # Default fallback
428
+ logger.info(f"Custom provider client initialized with model: {model_name}")
429
+
430
+ # Option 3: Standard OpenAI
431
+ elif has_openai:
432
+ logger.info("Initializing standard OpenAI client")
433
+ client = OpenAI(
434
+ api_key=os.environ.get("OPENAI_API_KEY")
435
+ )
436
+ model_name = os.environ.get("MODEL_NAME", "gpt-4") # Default fallback
437
+ logger.info(f"OpenAI client initialized with model: {model_name}")
438
+
439
+ # Test the client with a simple request (optional - skip if client initialization should be fast)
440
+ if client:
441
+ logger.info("Testing client connection...")
442
+ try:
443
+ # Simple test to verify the client works
444
+ _ = client.chat.completions.create(
445
+ model=model_name,
446
+ messages=[{"role": "user", "content": "Hello"}],
447
+ max_tokens=5
448
+ )
449
+ logger.info("Client connection test successful")
450
+ except Exception as test_error:
451
+ logger.error(f"Client connection test failed: {str(test_error)}")
452
+ # Don't raise here, let the main application handle it
453
+
454
+ return client, model_name
455
+
456
+ except Exception as e:
457
+ logger.error(f"Failed to initialize OpenAI client: {str(e)}")
458
+ logger.warning("Client will be initialized later when user provides valid API keys")
459
+ return None, None
460
+
461
+ client, model_name = initialize_openai_client()
462
+
463
+ # If no client was initialized, it means no API keys are available
464
+ if client is None:
465
+ logger.info("No OpenAI client initialized - waiting for user to provide API keys through UI")
466
+
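`initialize_openai_client` resolves credentials in a fixed priority order: Azure OpenAI first, then a custom OpenAI-compatible provider, then standard OpenAI. A standalone sketch of just that resolution step (`pick_provider` is illustrative, not part of the app):

```python
# Sketch of the provider-priority resolution used above:
# Azure > custom OpenAI-compatible provider > standard OpenAI.
def pick_provider(env):
    if env.get("AZURE_OPENAI_ENDPOINT") and env.get("AZURE_OPENAI_API_KEY"):
        return "azure"
    if env.get("PROVIDER_API_ENDPOINT") and env.get("PROVIDER_API_KEY"):
        return "custom"
    if env.get("OPENAI_API_KEY"):
        return "openai"
    return None  # no credentials: defer until the user supplies keys in the UI

print(pick_provider({"OPENAI_API_KEY": "sk-test"}))  # → openai
```

Note the Azure and custom branches require both an endpoint and a key; a key alone is not enough to select them.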
467
+
468
+
469
+ init_notebook = JupyterNotebook()
470
+
471
+ if not os.path.exists(TMP_DIR):
472
+ os.makedirs(TMP_DIR)
473
+ logger.info(f"Created temporary directory: {TMP_DIR}")
474
+ else:
475
+ logger.info(f"Using existing temporary directory: {TMP_DIR}")
476
+
477
+ with open(TMP_DIR+"jupyter-agent.ipynb", 'w', encoding='utf-8') as f:
478
+ json.dump(JupyterNotebook().data, f, indent=2)
479
+ logger.info(f"Initialized default notebook file: {TMP_DIR}jupyter-agent.ipynb")
480
+
481
+ try:
482
+ with open("system_prompt.txt", "r") as f:
483
+ DEFAULT_SYSTEM_PROMPT = f.read()
484
+ logger.info("Loaded system prompt from system_prompt.txt")
485
+ except FileNotFoundError:
486
+ logger.warning("system_prompt.txt not found, using fallback system prompt")
+ # Fallback so DEFAULT_SYSTEM_PROMPT is always defined (this except path previously left it unset)
+ DEFAULT_SYSTEM_PROMPT = "You are a helpful data-science agent. Write and execute code in a Jupyter notebook to solve the user's task."
487
+
488
+
489
+ def execute_jupyter_agent(
490
+ user_input, files, message_history, gpu_type, cpu_cores, memory_gb, timeout_sec, env_vars_text,
491
+ modal_token_id, modal_token_secret, hf_token, provider_api_key, provider_api_endpoint, user_model_name,
492
+ tavily_api_key, enable_web_search, request: gr.Request
493
+ ):
494
+ session_id = request.session_hash
495
+ logger.info(f"Starting execution for session {session_id}")
496
+ logger.info(f"Hardware config: GPU={gpu_type}, CPU={cpu_cores}, Memory={memory_gb}GB, Timeout={timeout_sec}s")
497
+ logger.info(f"User input length: {len(user_input)} characters")
498
+
499
+ # Check if execution is already running for this session
500
+ if session_id in EXECUTION_STATES and EXECUTION_STATES[session_id].get("running", False):
501
+ error_message = "❌ Execution already in progress for this session. Please wait for it to complete or stop it first."
502
+ error_notification = create_notification_html(error_message, "warning")
503
+
504
+ # Return current state without starting new execution
505
+ session_dir = os.path.join(TMP_DIR, session_id)
506
+ save_dir = os.path.join(session_dir, 'jupyter-agent.ipynb')
507
+ if os.path.exists(save_dir):
508
+ yield error_notification, message_history, save_dir
509
+ else:
510
+ yield error_notification, message_history, TMP_DIR + "jupyter-agent.ipynb"
511
+ return
512
+
513
+ # Initialize session state manager
514
+ session_manager = SessionStateManager(session_id, TMP_DIR)
515
+
516
+ # Check if this is a continuing session
517
+ existing_session_state = session_manager.load_state()
518
+ is_continuing_session = existing_session_state is not None
519
+
520
+ if is_continuing_session:
521
+ logger.info(f"Found existing session state for {session_id} - continuing from previous state")
522
+ else:
523
+ logger.info(f"No existing session state found for {session_id} - starting new session")
524
+
525
+ # Apply user-provided API keys if any are provided
526
+ user_api_keys = {}
527
+ if modal_token_id:
528
+ user_api_keys["MODAL_TOKEN_ID"] = modal_token_id
529
+ if modal_token_secret:
530
+ user_api_keys["MODAL_TOKEN_SECRET"] = modal_token_secret
531
+ if hf_token:
532
+ user_api_keys["HF_TOKEN"] = hf_token
533
+ if provider_api_key:
534
+ user_api_keys["PROVIDER_API_KEY"] = provider_api_key
535
+ if provider_api_endpoint:
536
+ user_api_keys["PROVIDER_API_ENDPOINT"] = provider_api_endpoint
537
+ if user_model_name:
538
+ user_api_keys["MODEL_NAME"] = user_model_name
539
+ if tavily_api_key:
540
+ user_api_keys["TAVILY_API_KEY"] = tavily_api_key
541
+
542
+ # Check if we have a client or need to initialize one with user keys
543
+ global client, model_name
544
+ if client is None and not user_api_keys:
545
+ missing_keys = get_missing_api_keys()
546
+ if missing_keys:
547
+ error_message = f"""❌ Missing Required API Keys
548
+
549
+ Please provide the following API keys to continue:
550
+ {chr(10).join([f"• {key}: {config['description']}" for key, config in missing_keys.items()])}
551
+
552
+ You can either:
553
+ 1. Add them to your .env file, or
554
+ 2. Enter them in the API Keys section above"""
555
+ error_notification = create_notification_html(error_message, "error")
556
+ yield error_notification, message_history, TMP_DIR + "jupyter-agent.ipynb"
557
+ return
558
+
559
+ # Validate user-provided API keys
560
+ if user_api_keys:
561
+ validation_message = "🔍 Validating API keys..."
562
+ validation_notification = create_progress_notification(validation_message)
563
+ yield validation_notification, message_history, TMP_DIR + "jupyter-agent.ipynb"
564
+
565
+ validation_errors = []
566
+ for key, value in user_api_keys.items():
567
+ is_valid, message = validate_api_key_format(key, value)
568
+ if not is_valid:
569
+ validation_errors.append(f"{key}: {message}")
570
+
571
+ if validation_errors:
572
+ error_message = "❌ API Key Validation Failed:\n" + "\n".join(f"• {error}" for error in validation_errors)
573
+ error_notification = create_notification_html(error_message, "error")
574
+ yield error_notification, message_history, TMP_DIR + "jupyter-agent.ipynb"
575
+ return
576
+
577
+ logger.info(f"Applying user-provided API keys: {list(user_api_keys.keys())}")
578
+ apply_user_api_keys(user_api_keys)
579
+
580
+ # Reinitialize OpenAI client with new keys if provider keys were updated
581
+ if any(key in user_api_keys for key in ["PROVIDER_API_KEY", "PROVIDER_API_ENDPOINT", "MODEL_NAME"]):
582
+ try:
583
+ reinit_message = "🔄 Reinitializing AI client with new credentials..."
584
+ reinit_notification = create_progress_notification(reinit_message)
585
+ yield reinit_notification, message_history, TMP_DIR + "jupyter-agent.ipynb"
586
+
587
+ client, model_name = initialize_openai_client()
588
+ if client is None:
589
+ error_message = "Failed to initialize client with provided API keys. Please check your credentials."
590
+ logger.error(error_message)
591
+ error_notification = create_notification_html(error_message, "error")
592
+ yield error_notification, message_history, TMP_DIR + "jupyter-agent.ipynb"
593
+ return
594
+ logger.info("Reinitialized OpenAI client with user-provided keys")
595
+
596
+ success_message = "✅ API credentials validated and applied successfully!"
597
+ success_notification = create_notification_html(success_message, "success")
598
+ yield success_notification, message_history, TMP_DIR + "jupyter-agent.ipynb"
599
+ except Exception as e:
600
+ error_message = f"Failed to initialize client with provided API keys: {str(e)}"
601
+ logger.error(error_message)
602
+ error_notification = create_notification_html(error_message, "error")
603
+ yield error_notification, message_history, TMP_DIR + "jupyter-agent.ipynb"
604
+ return
605
+
606
+ # Initialize or reset stop event for this session
607
+ STOP_EVENTS[session_id] = threading.Event()
608
+ EXECUTION_STATES[session_id] = {"running": True, "paused": False, "current_phase": "initializing"}
609
+
610
+ # Set up save directory early for notifications
611
+ session_dir = os.path.join(TMP_DIR, request.session_hash)
612
+ os.makedirs(session_dir, exist_ok=True)
613
+ save_dir = os.path.join(session_dir, 'jupyter-agent.ipynb')
614
+
615
+ # Create initial notebook file so it exists for Gradio
616
+ with open(save_dir, 'w', encoding='utf-8') as f:
617
+ json.dump(init_notebook.data, f, indent=2)
618
+ logger.info(f"Initialized notebook for session {session_id}")
619
+
620
+ # Session configuration is now handled by SessionStateManager
621
+
622
+ if request.session_hash not in SANDBOXES:
623
+ logger.info(f"Creating new Modal sandbox for session {session_id}")
624
+
625
+ # Show initialization notification with spinner
626
+ gpu_info = gpu_type.upper() if gpu_type != "cpu" else "CPU Only"
627
+ if gpu_type in ["T4", "L4", "A100-40GB", "A100-80GB", "H100"]:
628
+ gpu_info = f"NVIDIA {gpu_type}"
629
+
630
+ init_message = f"Initializing {gpu_info} sandbox with {cpu_cores} CPU cores and {memory_gb}GB RAM..."
631
+ notification_html = create_progress_notification(init_message)
632
+ yield notification_html, message_history, save_dir
633
+
634
+ # Create Modal sandbox with user-specified configuration
635
+ environment_vars = {}
636
+ if MODAL_TOKEN_ID and MODAL_TOKEN_SECRET:
637
+ environment_vars.update({
638
+ "MODAL_TOKEN_ID": MODAL_TOKEN_ID,
639
+ "MODAL_TOKEN_SECRET": MODAL_TOKEN_SECRET
640
+ })
641
+ logger.debug(f"Modal credentials configured for session {session_id}")
642
+
643
+ # Parse and add user-provided environment variables
644
+ user_env_vars = parse_environment_variables(env_vars_text)
645
+ if user_env_vars:
646
+ environment_vars.update(user_env_vars)
647
+ logger.info(f"Added {len(user_env_vars)} custom environment variables for session {session_id}")
648
+ logger.debug(f"Custom environment variables: {list(user_env_vars.keys())}")
649
+
650
+ try:
651
+ SANDBOXES[request.session_hash] = create_modal_sandbox(
652
+ gpu_config=gpu_type,
653
+ cpu_cores=cpu_cores,
654
+ memory_gb=memory_gb,
655
+ timeout=int(timeout_sec),
656
+ environment_vars=environment_vars
657
+ )
658
+ logger.info(f"Successfully created Modal sandbox for session {session_id}")
659
+
660
+ # Show success notification
661
+ success_message = f"✨ {gpu_info} sandbox ready! Environment initialized with all packages."
662
+ success_notification_html = create_notification_html(success_message, "success")
663
+ yield success_notification_html, message_history, save_dir
664
+
665
+ except Exception as e:
666
+ logger.error(f"Failed to create Modal sandbox for session {session_id}: {str(e)}")
667
+ # Show error notification
668
+ error_message = f"Failed to initialize sandbox: {str(e)}"
669
+ error_notification_html = create_notification_html(error_message, "error")
670
+ yield error_notification_html, message_history, save_dir
671
+ raise
672
+ else:
673
+ logger.info(f"Reusing existing Modal sandbox for session {session_id}")
674
+ # Show reuse notification
675
+ gpu_info = gpu_type.upper() if gpu_type != "cpu" else "CPU Only"
676
+ if gpu_type in ["T4", "L4", "A100-40GB", "A100-80GB", "H100"]:
677
+ gpu_info = f"NVIDIA {gpu_type}"
678
+ reuse_message = f"Using existing {gpu_info} sandbox - ready to execute!"
679
+ reuse_notification_html = create_notification_html(reuse_message, "success")
680
+ yield reuse_notification_html, message_history, save_dir
681
+
682
+ sbx = SANDBOXES[request.session_hash]
683
+ logger.debug(f"Notebook will be saved to: {save_dir}")
684
+
685
+ # Initial notebook render
686
+ yield init_notebook.render(), message_history, save_dir
687
+
688
+
689
+
690
+ filenames = []
691
+ if files is not None:
692
+ logger.info(f"Processing {len(files)} uploaded files for session {session_id}")
693
+ for filepath in files:
694
+ file_path = Path(filepath)
695
+ try:
696
+ # Get file size for verification
697
+ file_size = os.path.getsize(filepath)
698
+
699
+ with open(filepath, "rb") as file:
700
+ logger.info(f"Uploading file {filepath} ({file_size} bytes) to session {session_id}")
701
+ sbx.files.write(file_path.name, file)
702
+
703
+ # Verify upload succeeded
704
+ if sbx.files.verify_file_upload(file_path.name, file_size):
705
+ filenames.append(file_path.name)
706
+ logger.debug(f"Successfully uploaded and verified {file_path.name}")
707
+ else:
708
+ logger.error(f"File upload verification failed for {file_path.name}")
709
+ raise RuntimeError(f"File upload verification failed for {file_path.name}")
710
+
711
+ except Exception as e:
712
+ logger.error(f"Failed to upload file {filepath} for session {session_id}: {str(e)}")
713
+ raise
714
+ else:
715
+ logger.info(f"No files to upload for session {session_id}")
716
+
717
+ # Initialize or continue session state
718
+ if is_continuing_session:
719
+ # Load existing session state
720
+ session_state = existing_session_state
721
+
722
+ # Validate and repair conversation history to prevent API errors
723
+ session_manager.validate_and_repair_conversation(session_state)
724
+
725
+ message_history = session_manager.get_conversation_history(session_state)
726
+ logger.info(f"Continuing session {session_id} with {len(message_history)} existing messages")
727
+
728
+ # Add new user input if provided
729
+ if user_input and user_input.strip():
730
+ # Check if this input was already added by comparing with the last message
731
+ last_message = message_history[-1] if message_history else None
732
+ should_add_input = True
733
+
734
+ if last_message and last_message.get("role") == "user":
735
+ # If the last message is from user and has the same content, don't add duplicate
736
+ if last_message.get("content") == user_input:
737
+ should_add_input = False
738
+ logger.debug(f"User input already present in session {session_id}")
739
+
740
+ if should_add_input:
741
+ session_manager.add_message(session_state, "user", user_input)
742
+ message_history = session_manager.get_conversation_history(session_state)
743
+ logger.info(f"Added new user input to existing session {session_id}")
744
+
745
+ # Show notification that we're continuing the conversation
746
+ continue_message = "🔄 Continuing conversation with new input..."
747
+ continue_notification = create_progress_notification(continue_message)
748
+ yield continue_notification, message_history, save_dir
749
+ else:
750
+ # Create new session state
751
+ logger.info(f"Initializing new session {session_id}")
752
+
753
+ # Format files section
754
+ if files is None:
755
+ files_section = "- None"
756
+ else:
757
+ files_section = "- " + "\n- ".join(filenames)
758
+ logger.info(f"System prompt includes {len(filenames)} files: {filenames}")
759
+
760
+ # Format GPU information
761
+ gpu_info = gpu_type.upper() if gpu_type != "cpu" else "CPU Only"
762
+ if gpu_type in ["T4", "L4", "A100-40GB", "A100-80GB", "H100"]:
763
+ gpu_info = f"NVIDIA {gpu_type}"
764
+
765
+ # Format available packages based on hardware configuration
766
+ packages_list = sbx.available_packages
767
+ packages_section = "\n".join([f"- {package}" for package in packages_list])
768
+
769
+ # Format the complete system prompt with named placeholders
770
+ system_prompt = DEFAULT_SYSTEM_PROMPT.replace("{AVAILABLE_FILES}", files_section)
771
+ system_prompt = system_prompt.replace("{GPU_TYPE}", gpu_info)
772
+ system_prompt = system_prompt.replace("{CPU_CORES}", str(cpu_cores))
773
+ system_prompt = system_prompt.replace("{MEMORY_GB}", str(memory_gb))
774
+ system_prompt = system_prompt.replace("{TIMEOUT_SECONDS}", str(timeout_sec))
775
+ system_prompt = system_prompt.replace("{AVAILABLE_PACKAGES}", packages_section)
776
+
777
+ # Create session state with configuration
778
+ hardware_config = {
779
+ "gpu_type": gpu_type,
780
+ "cpu_cores": cpu_cores,
781
+ "memory_gb": memory_gb,
782
+ "timeout_sec": timeout_sec
783
+ }
784
+
785
+ api_config = {
786
+ "model_name": model_name or user_model_name or "unknown",
787
+ "provider_endpoint": os.environ.get("PROVIDER_API_ENDPOINT") or provider_api_endpoint,
788
+ "provider_type": "openai_compatible"
789
+ }
790
+
791
+ environment_config = {
792
+ "variables": env_vars_text or "",
793
+ "files_uploaded": filenames if filenames else []
794
+ }
795
+
796
+ # Create initial session state
797
+ session_state = session_manager.create_initial_state(
798
+ hardware_config, api_config, environment_config, system_prompt
799
+ )
800
+
801
+ # Add user input if provided
802
+ if user_input and user_input.strip():
803
+ session_manager.add_message(session_state, "user", user_input)
804
+
805
+ # Get conversation history
806
+ message_history = session_manager.get_conversation_history(session_state)
807
+
808
+ # Save initial state
809
+ session_manager.save_state(session_state)
810
+
811
+ logger.info(f"Created new session {session_id} with {len(message_history)} messages")
812
+
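The chain of `str.replace` calls that builds the system prompt can be expressed as a small loop. A hedged sketch (`fill_placeholders` is illustrative, not app code); plain `replace` is used rather than `str.format`, which would choke on any other literal braces in a large prompt template:

```python
# Illustrative condensation of the placeholder substitution above.
def fill_placeholders(template, values):
    for placeholder, value in values.items():
        template = template.replace(placeholder, value)
    return template

prompt = fill_placeholders(
    "Hardware: {GPU_TYPE}, {CPU_CORES} cores, {MEMORY_GB}GB RAM",
    {"{GPU_TYPE}": "NVIDIA T4", "{CPU_CORES}": "4", "{MEMORY_GB}": "16"},
)
print(prompt)  # → Hardware: NVIDIA T4, 4 cores, 16GB RAM
```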
813
+ logger.debug(f"Session {session_id} ready with {len(message_history)} messages")
814
+
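The duplicate-input guard in the continuing-session branch above (skip appending when the new input matches the last user turn) can be sketched standalone; `should_add` is a hypothetical name:

```python
# Sketch of the duplicate-input guard: append the new user message only
# when the last message is not an identical user turn.
def should_add(history, text):
    last = history[-1] if history else None
    return not (last and last.get("role") == "user" and last.get("content") == text)

history = [{"role": "user", "content": "hi"}]
print(should_add(history, "hi"))            # → False
print(should_add(history, "new question"))  # → True
```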
815
+ # Determine which tools to use based on web search toggle
816
+ from jupyter_agent import TOOLS
817
+ if enable_web_search:
818
+ # Check if Tavily API key is available
819
+ tavily_key = os.environ.get("TAVILY_API_KEY") or tavily_api_key
820
+ if tavily_key:
821
+ selected_tools = TOOLS # Use all tools (code + search)
822
+ logger.info(f"Web search enabled for session {session_id} - using all tools")
823
+ else:
824
+ selected_tools = TOOLS[:1] # Use only code execution tool
825
+ logger.warning(f"Web search enabled but no Tavily API key found for session {session_id} - using code tool only")
826
+ else:
827
+ selected_tools = TOOLS[:1] # Use only code execution tool
828
+ logger.info(f"Web search disabled for session {session_id} - using code tool only")
829
+
830
+ logger.info(f"Starting interactive notebook execution for session {session_id}")
831
+
832
+ # Import Phoenix session context if available
833
+ try:
834
+ from jupyter_agent import create_phoenix_session_context
835
+ phoenix_available = True
836
+ except ImportError:
837
+ phoenix_available = False
838
+
839
+ # Prepare session metadata for Phoenix tracing at the session level
840
+ if phoenix_available:
841
+ session_level_metadata = {
842
+ "agent_type": "eureka-agent",
843
+ "session_type": "jupyter_execution",
844
+ "gpu_type": gpu_type,
845
+ "cpu_cores": cpu_cores,
846
+ "memory_gb": memory_gb,
847
+ "timeout_sec": timeout_sec,
848
+ "web_search_enabled": enable_web_search,
849
+ "tools_available": len(selected_tools)
850
+ }
851
+
852
+ # Add API provider info if available
853
+ if model_name:
854
+ session_level_metadata["model"] = model_name
855
+
856
+ session_context = create_phoenix_session_context(
857
+ session_id=session_id,
858
+ user_id=None, # Could add user identification if available
859
+ metadata=session_level_metadata
860
+ )
861
+ else:
862
+ from contextlib import nullcontext
863
+ session_context = nullcontext()
864
+
865
+ # Wrap the entire execution in a Phoenix session context
866
+ with session_context:
867
+ logger.debug(f"Starting session-level Phoenix tracing for {session_id}")
868
+ # Defaults so the final save/yield below cannot hit unbound locals if the generator yields nothing
+ notebook_html, notebook_data = init_notebook.render(), init_notebook.data
+ try:
869
+ for notebook_html, notebook_data, messages in run_interactive_notebook_with_session_state(
870
+ client, model_name, session_manager, session_state, sbx, STOP_EVENTS[session_id], selected_tools
871
+ ):
872
+ message_history = messages
873
+ logger.debug(f"Interactive notebook yield for session {session_id}")
874
+ # Update session state and yield with legacy notebook file for UI compatibility
875
+ session_manager.update_notebook_data(session_state, notebook_data)
876
+ session_manager.save_state(session_state)
877
+
878
+ # Create legacy notebook file for UI download compatibility
879
+ with open(save_dir, 'w', encoding='utf-8') as f:
880
+ json.dump(notebook_data, f, indent=2)
881
+
882
+ yield notebook_html, message_history, save_dir
883
+
884
+ except Exception as e:
885
+ logger.error(f"Error during interactive notebook execution for session {session_id}: {str(e)}")
886
+ # Save error state
887
+ session_manager.update_execution_state(session_state, is_running=False, last_execution_successful=False)
888
+ session_manager.save_state(session_state)
889
+ raise
890
+
891
+ # Final save and cleanup
892
+ try:
893
+ session_manager.update_execution_state(session_state, is_running=False)
894
+ session_manager.save_state(session_state)
895
+ logger.info(f"Final session state saved for session {session_id}")
896
+
897
+ # Create final legacy notebook file for UI
898
+ with open(save_dir, 'w', encoding='utf-8') as f:
899
+ json.dump(notebook_data, f, indent=2)
900
+
901
+ except Exception as e:
902
+ logger.error(f"Failed to save final session state for session {session_id}: {str(e)}")
903
+ raise
904
+
905
+ yield notebook_html, message_history, save_dir
906
+ logger.info(f"Completed execution for session {session_id}")
907
+
908
+ # Update legacy execution state for compatibility
909
+ if session_id in EXECUTION_STATES:
910
+ EXECUTION_STATES[session_id]["running"] = False
911
+
912
+ def clear(msg_state, request: gr.Request):
913
+ """Clear notebook but keep session data (less destructive than shutdown)"""
914
+ session_id = request.session_hash
915
+ logger.info(f"Clearing notebook for session {session_id}")
916
+
917
+ # Stop any running execution
918
+ if session_id in STOP_EVENTS:
919
+ STOP_EVENTS[session_id].set()
920
+
921
+ # Clear execution states but keep session data
922
+ if session_id in EXECUTION_STATES:
923
+ EXECUTION_STATES[session_id]["running"] = False
924
+ EXECUTION_STATES[session_id]["paused"] = False
925
+ EXECUTION_STATES[session_id]["current_phase"] = "ready"
926
+
927
+ # Reset message state for UI
928
+ msg_state = []
929
+ logger.info(f"Reset notebook display for session {session_id}")
930
+
931
+ return init_notebook.render(), msg_state
932
+
933
+ def stop_execution(request: gr.Request):
934
+ """Stop the current execution for this session"""
935
+ session_id = request.session_hash
936
+ logger.info(f"Stopping execution for session {session_id}")
937
+
938
+ if session_id in STOP_EVENTS and session_id in EXECUTION_STATES:
939
+ # Check if execution is actually running
940
+ if EXECUTION_STATES[session_id].get("running", False):
941
+ STOP_EVENTS[session_id].set()
942
+ logger.info(f"Stop signal sent for session {session_id}")
943
+
944
+ # Update execution state
945
+ EXECUTION_STATES[session_id]["running"] = False
946
+ EXECUTION_STATES[session_id]["paused"] = True
947
+ EXECUTION_STATES[session_id]["current_phase"] = "stopping"
948
+
949
+ # Also update session state if available
950
+ session_manager = SessionStateManager(session_id, TMP_DIR)
951
+ session_state = session_manager.load_state()
952
+ if session_state:
953
+ session_manager.update_execution_state(
954
+ session_state, is_running=False, is_paused=True, current_phase="stopping"
955
+ )
956
+ session_manager.save_state(session_state)
957
+
958
+ return "⏸️ Execution stopped - click Run to resume with new input"
959
+ else:
960
+ logger.info(f"No active execution to stop for session {session_id}")
961
+ return "⚪ No active execution to stop"
962
+ else:
963
+ logger.warning(f"No execution session found for {session_id}")
964
+ return "❌ No execution session found"
965
+
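`stop_execution` relies on cooperative cancellation: setting a `threading.Event` that the running worker polls, rather than killing the thread. A minimal self-contained sketch of that pattern (names here are illustrative, mirroring how `STOP_EVENTS` is used above):

```python
import threading
import time

# Cooperative-cancellation sketch: the worker polls the event and exits
# on its own once it is set.
stop = threading.Event()
result = {}

def worker():
    steps = 0
    while not stop.is_set() and steps < 1000:
        steps += 1
        time.sleep(0.001)
    result["steps"] = steps  # record how far we got before stopping

t = threading.Thread(target=worker)
t.start()
stop.set()       # equivalent to STOP_EVENTS[session_id].set()
t.join(timeout=2)
```

The worker always reaches its cleanup code, which is why the app can persist session state after a stop instead of losing it to a hard kill.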
966
+ def shutdown_sandbox(request: gr.Request):
967
+ """Shutdown the sandbox while preserving all session data and files"""
968
+ session_id = request.session_hash
969
+ logger.info(f"Shutting down sandbox for {session_id} (preserving all session data and files)")
970
+
971
+ try:
972
+ # 1. Stop any running execution first
973
+ if session_id in STOP_EVENTS:
974
+ STOP_EVENTS[session_id].set()
975
+ logger.info(f"Stopped execution for session {session_id}")
976
+
977
+ # 2. Shutdown Modal sandbox only
978
+ if session_id in SANDBOXES:
979
+ logger.info(f"Killing Modal sandbox for session {session_id}")
980
+ SANDBOXES[session_id].kill()
981
+ SANDBOXES.pop(session_id)
982
+ logger.info(f"Successfully shutdown sandbox for session {session_id}")
983
+
984
+ # 3. Log what's being preserved (but don't remove anything)
985
+ session_manager = SessionStateManager(session_id, TMP_DIR)
986
+ if session_manager.session_exists():
987
+ logger.info(f"Preserving session data for {session_id}")
988
+
989
+ # Load session state to show what's being preserved
990
+ session_state = session_manager.load_state()
991
+ if session_state:
992
+ # Log what we're preserving
993
+ stats = session_state.get("session_stats", {})
994
+ llm_interactions = len(session_state.get("llm_interactions", []))
995
+ tool_executions = len(session_state.get("tool_executions", []))
996
+
997
+ logger.info(f"Preserving session {session_id}: "
998
+ f"{stats.get('total_messages', 0)} messages, "
999
+ f"{llm_interactions} LLM interactions, "
1000
+ f"{tool_executions} tool executions, "
1001
+ f"{stats.get('total_code_executions', 0)} code runs")
1002
+
1003
+ # Log all preserved files
1004
+ if session_manager.session_dir.exists():
1005
+ try:
1006
+ preserved_files = []
1007
+ for file_path in session_manager.session_dir.iterdir():
1008
+ if file_path.is_file():
1009
+ preserved_files.append(file_path.name)
1010
+
1011
+ if preserved_files:
1012
+ logger.info(f"Preserving {len(preserved_files)} files in {session_id}: {preserved_files}")
1013
+ else:
1014
+ logger.info(f"No files found in session {session_id}")
1015
+
1016
+ except OSError as e:
1017
+ logger.warning(f"Could not check session directory {session_id}: {e}")
1018
+
1019
+ # 4. Keep execution tracking data (don't clear anything)
1020
+ logger.info(f"Preserving execution state and stop events for {session_id}")
1021
+
1022
+ logger.info(f"Sandbox shutdown completed for session {session_id} (all data preserved)")
1023
+ return gr.Button(visible=False)
1024
+
1025
+ except Exception as e:
1026
+ logger.error(f"Error during shutdown for session {session_id}: {str(e)}")
1027
+ return gr.Button(visible=True)  # error already logged above; click handler expects a single Button output
1028
+
1029
+ # continue_execution function removed - functionality integrated into execute_jupyter_agent
1030
+
1031
+ def get_execution_status(request: gr.Request):
1032
+ """Get the current execution status for UI updates"""
1033
+ session_id = request.session_hash
1034
+
1035
+ if session_id not in EXECUTION_STATES:
1036
+ return "⚪ Ready"
1037
+
1038
+ state = EXECUTION_STATES[session_id]
1039
+ if state["running"]:
1040
+ if session_id in STOP_EVENTS and STOP_EVENTS[session_id].is_set():
1041
+ return "⏸️ Stopping..."
1042
+ else:
1043
+ # Check if we have more detailed phase information
1044
+ phase = state.get("current_phase", "running")
1045
+ if phase == "generating":
1046
+ return "🟢 Generating response..."
1047
+ elif phase == "executing_code":
1048
+ return "🟢 Executing code..."
1049
+ elif phase == "searching":
1050
+ return "🟢 Searching web..."
1051
+ else:
1052
+ return "🟢 Running"
1053
+ elif state.get("paused", False):
1054
+ return "⏸️ Paused - Click Run to continue"
1055
+ else:
1056
+ return "⚪ Ready"
1057
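The branching above is easiest to verify as a pure function. The sketch below mirrors the same state-to-status mapping without the module-level `EXECUTION_STATES`/`STOP_EVENTS` dictionaries (the phase names are taken from the code above; the function name is illustrative):

```python
# Pure-function sketch of the execution-status logic above.
def status_for(state: dict, stop_requested: bool = False) -> str:
    if state.get("running"):
        if stop_requested:
            return "⏸️ Stopping..."
        # Map the finer-grained phase to a user-facing label
        return {
            "generating": "🟢 Generating response...",
            "executing_code": "🟢 Executing code...",
            "searching": "🟢 Searching web...",
        }.get(state.get("current_phase", "running"), "🟢 Running")
    if state.get("paused"):
        return "⏸️ Paused - Click Run to continue"
    return "⚪ Ready"

print(status_for({"running": True, "current_phase": "executing_code"}))
```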
+
1058
+ def is_sandbox_active(request: gr.Request):
1059
+ """Check if sandbox is active for the current session"""
1060
+ session_id = request.session_hash
1061
+ return session_id in SANDBOXES
1062
+
1063
+ def get_sandbox_status_and_visibility(request: gr.Request):
1064
+ """Get sandbox status message and button visibility"""
1065
+ session_id = request.session_hash
1066
+ if session_id in SANDBOXES:
1067
+ return "🟢 Sandbox active", gr.Button(visible=True)
1068
+ else:
1069
+ return "⚪ No sandbox active", gr.Button(visible=False)
1070
+
1071
+ def update_sandbox_button_visibility(request: gr.Request):
1072
+ """Update only the button visibility based on sandbox status"""
1073
+ session_id = request.session_hash
1074
+ return gr.Button(visible=session_id in SANDBOXES)
1075
+
1076
+ def reset_ui_after_shutdown(request: gr.Request):
1077
+ """Reset UI components after complete shutdown"""
1078
+ session_id = request.session_hash
1079
+
1080
+ # Check if session is truly cleared
1081
+ is_cleared = (session_id not in SANDBOXES and
1082
+ session_id not in EXECUTION_STATES and
1083
+ session_id not in STOP_EVENTS)
1084
+
1085
+ if is_cleared:
1086
+ # Return reset state for all UI components
1087
+ return (
1088
+ init_notebook.render(), # Reset notebook display
1089
+ [], # Clear message state
1090
+ "⚪ Ready", # Reset status
1091
+ "⚪ No sandbox active", # Reset sandbox status
1092
+ gr.Button(visible=False) # Hide shutdown button
1093
+ )
1094
+ else:
1095
+ # Return current state if not fully cleared
1096
+ status = get_execution_status(request)
1097
+ sandbox_status, button_vis = get_sandbox_status_and_visibility(request)
1098
+ return (
1099
+ init_notebook.render(), # Still reset notebook display
1100
+ [], # Still clear message state
1101
+ status,
1102
+ sandbox_status,
1103
+ button_vis
1104
+ )
1105
+
1106
+ def reconstruct_message_history_from_notebook(notebook_data):
1107
+ """Reconstruct message history from notebook cells"""
1108
+ message_history = []
1109
+ cells = notebook_data.get('cells', [])
1110
+
1111
+ system_prompt = None
1112
+ current_conversation = []
1113
+
1114
+ for cell in cells:
1115
+ cell_type = cell.get('cell_type', '')
1116
+
1117
+ if cell_type == 'markdown':
1118
+ content = cell.get('source', '')
1119
+ if isinstance(content, list):
1120
+ content = ''.join(content)
1121
+
1122
+ # Check if this is a system message
1123
+ if 'System' in content and 'IMPORTANT EXECUTION GUIDELINES' in content:
1124
+ # Extract the system prompt content
1125
+ system_content = content
1126
+ # Clean up the HTML and extract the actual content
1127
+ # Remove HTML tags and extract the text content
1128
+ clean_content = re.sub(r'<[^>]+>', '', system_content)
1129
+ clean_content = re.sub(r'\n+', '\n', clean_content).strip()
1130
+ system_prompt = clean_content
1131
+
1132
+ elif 'User' in content and not any(word in content for word in ['Assistant', 'System']):
1133
+ # This is a user message
1134
+ # Extract the user content after the User header
1135
+ user_content = content.split('User')[1] if 'User' in content else content
1136
+ # Clean up HTML and formatting
1137
+ user_content = re.sub(r'<[^>]+>', '', user_content)
1138
+ user_content = re.sub(r'-{3,}', '', user_content)
1139
+ user_content = user_content.strip()
1140
+
1141
+ if user_content:
1142
+ current_conversation.append({
1143
+ "role": "user",
1144
+ "content": user_content
1145
+ })
1146
+
1147
+ elif 'Assistant' in content:
1148
+ # This is an assistant message
1149
+ assistant_content = content.split('Assistant')[1] if 'Assistant' in content else content
1150
+ # Clean up HTML and formatting
1151
+ assistant_content = re.sub(r'<[^>]+>', '', assistant_content)
1152
+ assistant_content = re.sub(r'-{3,}', '', assistant_content)
1153
+ assistant_content = assistant_content.strip()
1154
+
1155
+ if assistant_content:
1156
+ current_conversation.append({
1157
+ "role": "assistant",
1158
+ "content": assistant_content
1159
+ })
1160
+
1161
+ # Build the final message history
1162
+ if system_prompt:
1163
+ message_history.append({
1164
+ "role": "system",
1165
+ "content": system_prompt
1166
+ })
1167
+
1168
+ # Add the conversation messages
1169
+ message_history.extend(current_conversation)
1170
+
1171
+ return message_history
1172
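The reconstruction above can be exercised on a miniature notebook. The sketch below is a simplified, self-contained version of the same parsing (the HTML headers and separator dashes are assumptions about how the app renders markdown cells, and the cell contents are hypothetical):

```python
import re

# Hypothetical two-cell notebook mimicking the structure parsed above.
notebook = {
    "cells": [
        {"cell_type": "markdown", "source": "<h3>User</h3>\ntrain a model\n---"},
        {"cell_type": "markdown", "source": "<h3>Assistant</h3>\nSure, starting now.\n---"},
    ]
}

messages = []
for cell in notebook["cells"]:
    src = cell["source"]
    src = "".join(src) if isinstance(src, list) else src
    role = "user" if "User" in src else "assistant" if "Assistant" in src else None
    if role:
        # Take the text after the role header, then strip HTML tags and rules
        text = re.sub(r"<[^>]+>", "", src.split(role.capitalize())[1])
        text = re.sub(r"-{3,}", "", text).strip()
        messages.append({"role": role, "content": text})

print(messages)
```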
+
1173
+ def load_previous_notebook(notebook_choice, request: gr.Request):
1174
+ """Load a previous notebook with complete session configuration (dev only)"""
1175
+ if not is_dev_environment():
1176
+ return (init_notebook.render(), [], "Load previous notebooks is only available in development mode",
1177
+ None, None, None, None, None, "", "", "", "", "", "", "", False)
1178
+
1179
+ if not notebook_choice or notebook_choice == "None":
1180
+ return (init_notebook.render(), [], "Please select a notebook to load",
1181
+ None, None, None, None, None, "", "", "", "", "", "", "", False)
1182
+
1183
+ try:
1184
+ # Parse the notebook choice to get the session ID
1185
+ session_id = notebook_choice.split(" ")[0]
1186
+ notebook_path = Path(TMP_DIR) / session_id / "jupyter-agent.ipynb"
1187
+
1188
+ if not notebook_path.exists():
1189
+ return (init_notebook.render(), [], f"Notebook file not found: {notebook_path}",
1190
+ None, None, None, None, None, "", "", "", "", "", "", "", False)
1191
+
1192
+ # Load the notebook
1193
+ with open(notebook_path, 'r') as f:
1194
+ notebook_data = json.load(f)
1195
+
1196
+ # Load session state
1197
+ temp_session_manager = SessionStateManager(session_id, TMP_DIR)
1198
+ session_state = temp_session_manager.load_state()
1199
+ session_config = None # For backward compatibility
1200
+
1201
+ # Extract config from session state for UI restoration
1202
+ if session_state:
1203
+ session_config = {
1204
+ "hardware": session_state.get("hardware_config", {}),
1205
+ "environment_vars": session_state.get("environment", {}).get("variables", ""),
1206
+ "api_keys": {
1207
+ "model_name": session_state.get("api_config", {}).get("model_name", "")
1208
+ }
1209
+ }
1210
+
1211
+ # Create a new JupyterNotebook instance with the loaded data
1212
+ loaded_notebook = JupyterNotebook()
1213
+ loaded_notebook.data = notebook_data
1214
+
1215
+ # Reconstruct message history from notebook cells
1216
+ message_history = reconstruct_message_history_from_notebook(notebook_data)
1217
+
1218
+ # Store the loaded notebook info in session for continue functionality
1219
+ session_id_hash = request.session_hash
1220
+ if session_id_hash not in EXECUTION_STATES:
1221
+ EXECUTION_STATES[session_id_hash] = {}
1222
+
1223
+ EXECUTION_STATES[session_id_hash]["loaded_notebook"] = {
1224
+ "notebook_data": notebook_data,
1225
+ "message_history": message_history,
1226
+ "original_session": session_id,
1227
+ "session_config": session_config
1228
+ }
1229
+
1230
+ logger.info(f"Successfully loaded notebook from {notebook_path}")
1231
+ logger.info(f"Reconstructed message history with {len(message_history)} messages")
1232
+
1233
+ # Prepare configuration values to restore UI state
1234
+ config_loaded = ""
1235
+ gpu_type = None
1236
+ cpu_cores = None
1237
+ memory_gb = None
1238
+ timeout_sec = None
1239
+ env_vars = ""
1240
+ modal_token_id = ""
1241
+ modal_token_secret = ""
1242
+ hf_token = ""
1243
+ provider_api_key = ""
1244
+ provider_api_endpoint = ""
1245
+ model_name = ""
1246
+
1247
+ if session_config:
1248
+ hardware = session_config.get("hardware", {})
1249
+ gpu_type = hardware.get("gpu_type")
1250
+ cpu_cores = hardware.get("cpu_cores")
1251
+ memory_gb = hardware.get("memory_gb")
1252
+ timeout_sec = hardware.get("timeout_sec")
1253
+ env_vars = session_config.get("environment_vars", "")
1254
+
1255
+ api_keys = session_config.get("api_keys", {})
1256
+ modal_token_id = api_keys.get("modal_token_id", "")
1257
+ modal_token_secret = api_keys.get("modal_token_secret", "")
1258
+ hf_token = api_keys.get("hf_token", "")
1259
+ provider_api_key = api_keys.get("provider_api_key", "")
1260
+ provider_api_endpoint = api_keys.get("provider_api_endpoint", "")
1261
+ model_name = api_keys.get("model_name", "")
1262
+
1263
+ config_loaded = f"✅ Configuration restored: GPU={gpu_type}, CPU={cpu_cores}, Memory={memory_gb}GB, Timeout={timeout_sec}s"
1264
+
1265
+ success_message = f"✅ Loaded notebook: {session_id} ({len(notebook_data.get('cells', []))} cells, {len(message_history)} messages)"
1266
+ if config_loaded:
1267
+ success_message += f"\n{config_loaded}"
1268
+
1269
+ return (loaded_notebook.render(), message_history, success_message,
1270
+ gpu_type, cpu_cores, memory_gb, timeout_sec, env_vars,
1271
+ modal_token_id, modal_token_secret, hf_token, provider_api_key, provider_api_endpoint, model_name,
1272
+ "", False) # Default empty tavily_api_key and False for enable_web_search
1273
+
1274
+ except Exception as e:
1275
+ logger.error(f"Failed to load notebook {notebook_choice}: {str(e)}")
1276
+ error_message = f"❌ Failed to load notebook: {str(e)}"
1277
+ return (init_notebook.render(), [], error_message,
1278
+ None, None, None, None, None, "", "", "", "", "", "", "", False)
1279
+
1280
+ def get_notebook_options():
1281
+ """Get options for notebook dropdown (dev only)"""
1282
+ if not is_dev_environment():
1283
+ return ["Load previous notebooks is only available in development mode"]
1284
+
1285
+ notebooks = get_previous_notebooks()
1286
+ if not notebooks:
1287
+ return ["No previous notebooks found"]
1288
+
1289
+ options = ["None"] + [nb['display_name'] for nb in notebooks[:20]] # Limit to 20 most recent
1290
+ return options
1291
+
1292
+ def refresh_notebook_options():
1293
+ """Refresh the notebook options dropdown"""
1294
+ return gr.Dropdown(choices=get_notebook_options(), value="None")
1295
+
1296
+ # Legacy session configuration functions removed - replaced by SessionStateManager
1297
+ # All session data is now stored in a single comprehensive session_state.json file
1298
+
1299
+
1300
+ css = """
1301
+ #component-0 {
1302
+ height: 100vh;
1303
+ overflow-y: auto;
1304
+ padding: 20px;
1305
+ }
1306
+
1307
+ .gradio-container {
1308
+ height: 100vh !important;
1309
+ }
1310
+
1311
+ .contain {
1312
+ height: 100vh !important;
1313
+ }
1314
+
1315
+ /* Button states for execution control */
1316
+ .button-executing {
1317
+ opacity: 0.6 !important;
1318
+ pointer-events: none !important;
1319
+ cursor: not-allowed !important;
1320
+ }
1321
+
1322
+ .button-executing::after {
1323
+ content: " ⏳";
1324
+ }
1325
+
1326
+ .status-running {
1327
+ animation: pulse 2s infinite;
1328
+ }
1329
+
1330
+ @keyframes pulse {
1331
+ 0% { opacity: 1; }
1332
+ 50% { opacity: 0.5; }
1333
+ 100% { opacity: 1; }
1334
+ }
1335
+ """
1336
+
1337
+
1338
+ # Create the interface
1339
+ with gr.Blocks(css=css) as demo:
1340
+ msg_state = gr.State(value=[])
1341
+
1342
+ # Environment info display
1343
+ env_info = gr.Markdown(f"""
1344
+ **Environment**: {get_environment().upper()} | **Features**: {"Development features enabled" if is_dev_environment() else "Production mode"}
1345
+ """)
1346
+
1347
+ html_output = gr.HTML(value=JupyterNotebook().render())
1348
+
1349
+ user_input = gr.Textbox(
1350
+ # value="train a 5 neuron neural network to classify the iris dataset",
1351
+ value="can you finetune llama 3.2 1b on tiny stories dataset and using unsloth",
1352
+ lines=3,
1353
+ label="Agent task"
1354
+ )
1355
+
1356
+ with gr.Accordion("Upload files ⬆ | Download notebook ⬇", open=False):
1357
+ files = gr.File(label="Upload files to use", file_count="multiple")
1358
+ file = gr.File(os.path.join(TMP_DIR, "jupyter-agent.ipynb"), label="Download Jupyter Notebook")
1359
+
1360
+
1361
+ with gr.Row():
1362
+ # Web Search Configuration
1363
+ with gr.Accordion("🔍 Web Search Settings", open=False):
1364
+ with gr.Row():
1365
+ enable_web_search = gr.Checkbox(
1366
+ label="Enable Web Search",
1367
+ value=bool(os.environ.get("TAVILY_API_KEY")), # Default to True if API key is available
1368
+ info="Allow the agent to search the web for current information and documentation"
1369
+ )
1370
+
1371
+ # Show web search status with better formatting
1372
+ tavily_status = "✅ Available" if os.environ.get("TAVILY_API_KEY") else "❌ API Key Required"
1373
+ gr.Markdown(f"**Status:** {tavily_status}")
1374
+
1375
+ gr.Markdown("""
1376
+ **Web Search Features:**
1377
+ - 🌐 Search for current tutorials, documentation, and best practices
1378
+ - 🐛 Find solutions to error messages and debugging help
1379
+ - 📚 Access up-to-date library documentation and examples
1380
+ - 💡 Get recent examples and code snippets from the web
1381
+
1382
+ ⚠️ **Note**: Web search requires a Tavily API key. Get one free at [tavily.com](https://tavily.com)
1383
+ """)
1384
+ # Previous notebooks section (dev only)
1385
+ if is_dev_environment():
1386
+ with gr.Accordion("📂 Load Previous Notebook (Dev Only)", open=False):
1387
+ notebook_dropdown = gr.Dropdown(
1388
+ choices=get_notebook_options(),
1389
+ value="None",
1390
+ label="Select Previous Notebook",
1391
+ info="Load a previously created notebook session"
1392
+ )
1393
+ with gr.Row():
1394
+ load_notebook_btn = gr.Button("📖 Load Selected", variant="secondary")
1395
+ refresh_notebooks_btn = gr.Button("🔄 Refresh List", variant="secondary")
1396
+
1397
+ load_status = gr.Textbox(
1398
+ label="Load Status",
1399
+ interactive=False,
1400
+ visible=False
1401
+ )
1402
+ # Check for missing API keys and show input fields conditionally
1403
+ missing_keys = get_missing_api_keys()
1404
+
1405
+ # API Key Configuration (shown only if keys are missing)
1406
+ if missing_keys:
1407
+ with gr.Accordion("🔑 Required API Keys (Missing from .env)", open=True):
1408
+ gr.Markdown("""
1409
+ **⚠️ Some required API keys are missing from your .env file.**
1410
+ Please provide them below to use the application:
1411
+ """)
1412
+
1413
+ api_key_components = {}
1414
+
1415
+ if "MODAL_TOKEN_ID" in missing_keys:
1416
+ api_key_components["modal_token_id"] = gr.Textbox(
1417
+ label="Modal Token ID",
1418
+ placeholder="ak-...",
1419
+ info="Modal Token ID for sandbox access",
1420
+ type="password"
1421
+ )
1422
+ else:
1423
+ api_key_components["modal_token_id"] = gr.Textbox(visible=False)
1424
+
1425
+ if "MODAL_TOKEN_SECRET" in missing_keys:
1426
+ api_key_components["modal_token_secret"] = gr.Textbox(
1427
+ label="Modal Token Secret",
1428
+ placeholder="as-...",
1429
+ info="Modal Token Secret for sandbox access",
1430
+ type="password"
1431
+ )
1432
+ else:
1433
+ api_key_components["modal_token_secret"] = gr.Textbox(visible=False)
1434
+
1435
+ if "HF_TOKEN" in missing_keys:
1436
+ api_key_components["hf_token"] = gr.Textbox(
1437
+ label="Hugging Face Token (Optional)",
1438
+ placeholder="hf_...",
1439
+ info="Hugging Face Token for model access",
1440
+ type="password"
1441
+ )
1442
+ else:
1443
+ api_key_components["hf_token"] = gr.Textbox(visible=False)
1444
+
1445
+ if "PROVIDER_API_KEY" in missing_keys:
1446
+ api_key_components["provider_api_key"] = gr.Textbox(
1447
+ label="AI Provider API Key",
1448
+ placeholder="sk-, gsk_, or csk-...",
1449
+ info="API Key for your AI provider (Anthropic, OpenAI, Cerebras, etc.)",
1450
+ type="password"
1451
+ )
1452
+ else:
1453
+ api_key_components["provider_api_key"] = gr.Textbox(visible=False)
1454
+
1455
+ if "PROVIDER_API_ENDPOINT" in missing_keys:
1456
+ api_key_components["provider_api_endpoint"] = gr.Textbox(
1457
+ label="AI Provider API Endpoint",
1458
+ placeholder="https://api.anthropic.com/v1/",
1459
+ info="API endpoint for your AI provider"
1460
+ )
1461
+ else:
1462
+ api_key_components["provider_api_endpoint"] = gr.Textbox(visible=False)
1463
+
1464
+ if "MODEL_NAME" in missing_keys:
1465
+ api_key_components["model_name"] = gr.Textbox(
1466
+ label="Model Name",
1467
+ placeholder="claude-sonnet-4-20250514",
1468
+ info="Name of the model to use"
1469
+ )
1470
+ else:
1471
+ api_key_components["model_name"] = gr.Textbox(visible=False)
1472
+
1473
+ if "TAVILY_API_KEY" in missing_keys:
1474
+ api_key_components["tavily_api_key"] = gr.Textbox(
1475
+ label="Tavily API Key (Optional)",
1476
+ placeholder="tvly-...",
1477
+ info="Tavily API Key for web search functionality",
1478
+ type="password"
1479
+ )
1480
+ else:
1481
+ api_key_components["tavily_api_key"] = gr.Textbox(visible=False)
1482
+ else:
1483
+ # Create hidden components when no keys are missing
1484
+ api_key_components = {
1485
+ "modal_token_id": gr.Textbox(visible=False),
1486
+ "modal_token_secret": gr.Textbox(visible=False),
1487
+ "hf_token": gr.Textbox(visible=False),
1488
+ "provider_api_key": gr.Textbox(visible=False),
1489
+ "provider_api_endpoint": gr.Textbox(visible=False),
1490
+ "model_name": gr.Textbox(visible=False),
1491
+ "tavily_api_key": gr.Textbox(visible=False)
1492
+ }
1493
+
1494
+
1495
+
1496
+
1497
+
1498
+ with gr.Accordion("Hardware Configuration ⚙️", open=False):
1499
+ with gr.Row():
1500
+ with gr.Column():
1501
+ env_vars = gr.Textbox(
1502
+ label="Environment Variables",
1503
+ placeholder="Enter environment variables (one per line):\nAPI_KEY=your_key_here\nDATA_PATH=/path/to/data\nDEBUG=true",
1504
+ lines=5,
1505
+ info="Add custom environment variables for the sandbox. Format: KEY=value (one per line)"
1506
+ )
1507
+
1508
+ env_vars_info = gr.Markdown("""
1509
+ **Environment Variables Info:**
1510
+ - Variables will be available in the sandbox environment
1511
+ - Use KEY=value format, one per line
1512
+ - Common examples: API keys, data paths, configuration flags
1513
+ - Variables are session-specific and not persisted between sessions
1514
+
1515
+ ⚠️ **Security**: Avoid sensitive credentials in shared environments
1516
+ """)
1517
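The `KEY=value, one per line` format described in the box above can be parsed with a small helper. This is a sketch; the function name is illustrative, not the app's actual implementation:

```python
# Hypothetical parser for the textbox's "KEY=value, one per line" format.
def parse_env_vars(text: str) -> dict:
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blanks, comments, and malformed lines
        key, _, value = line.partition("=")  # split on the first '=' only
        env[key.strip()] = value.strip()
    return env

print(parse_env_vars("API_KEY=abc123\nDEBUG=true\n# comment\nbad-line"))
```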
+ with gr.Column():
1518
+ with gr.Row():
1519
+ gpu_type = gr.Dropdown(
1520
+ choices=GPU_OPTIONS,
1521
+ value="cpu",
1522
+ label="GPU Type",
1523
+ info="Select hardware acceleration"
1524
+ )
1525
+ cpu_cores = gr.Slider(
1526
+ minimum=0.25,
1527
+ maximum=16,
1528
+ value=2.0,
1529
+ step=0.25,
1530
+ label="CPU Cores",
1531
+ info="Number of CPU cores"
1532
+ )
1533
+ with gr.Row():
1534
+ memory_gb = gr.Slider(
1535
+ minimum=0.5,
1536
+ maximum=64,
1537
+ value=8.0,
1538
+ step=0.5,
1539
+ label="Memory (GB)",
1540
+ info="RAM allocation"
1541
+ )
1542
+ timeout_sec = gr.Slider(
1543
+ minimum=60,
1544
+ maximum=1800,
1545
+ value=300,
1546
+ step=60,
1547
+ label="Timeout (seconds)",
1548
+ info="Maximum execution time"
1549
+ )
1550
+
1551
+ hardware_info = gr.Markdown("""
1552
+ **Hardware Options:**
1553
+ - **CPU Only**: Free, good for basic tasks
1554
+ - **T4**: Low-cost GPU, good for small models
1555
+ - **L4**: Mid-range GPU, better performance
1556
+ - **A100 40/80GB**: High-end GPU for large models
1557
+ - **H100**: Latest flagship GPU for maximum performance
1558
+
1559
+ ⚠️ **Note**: GPU instances cost more. Choose based on your workload.
1560
+ """)
1561
+
1562
+ # with gr.Accordion("Environment Variables 🔧", open=False):
1563
+
1564
+
1565
+ with gr.Row():
1566
+ generate_btn = gr.Button("Run!", variant="primary")
1567
+ stop_btn = gr.Button("⏸️ Stop", variant="secondary")
1568
+ # continue_btn removed - Run button handles continuation automatically
1569
+ clear_btn = gr.Button("Clear Notebook", variant="stop")
1570
+ shutdown_btn = gr.Button("🔴 Shutdown Sandbox", variant="stop", visible=False)
1571
+
1572
+ # Status display
1573
+ status_display = gr.Textbox(
1574
+ value="⚪ Ready",
1575
+ label="Execution Status",
1576
+ interactive=False,
1577
+ max_lines=1
1578
+ )
1579
+
1580
+ generate_btn.click(
1581
+ fn=execute_jupyter_agent,
1582
+ inputs=[
1583
+ user_input, files, msg_state, gpu_type, cpu_cores, memory_gb, timeout_sec, env_vars,
1584
+ api_key_components["modal_token_id"], api_key_components["modal_token_secret"],
1585
+ api_key_components["hf_token"], api_key_components["provider_api_key"],
1586
+ api_key_components["provider_api_endpoint"], api_key_components["model_name"],
1587
+ api_key_components["tavily_api_key"], enable_web_search
1588
+ ],
1589
+ outputs=[html_output, msg_state, file],
1590
+ show_progress="hidden",
1591
+ )
1592
+
1593
+ stop_btn.click(
1594
+ fn=stop_execution,
1595
+ outputs=[status_display],
1596
+ show_progress="hidden",
1597
+ )
1598
+
1599
+ # continue_btn.click handler removed - Run button handles continuation automatically
1600
+
1601
+ clear_btn.click(fn=clear, inputs=[msg_state], outputs=[html_output, msg_state])
1602
+
1603
+ shutdown_btn.click(
1604
+ fn=shutdown_sandbox,
1605
+ outputs=[shutdown_btn],
1606
+ show_progress="hidden",
1607
+ )
1608
+
1609
+ # Add event handlers for notebook loading (dev only)
1610
+ if is_dev_environment():
1611
+ load_notebook_btn.click(
1612
+ fn=load_previous_notebook,
1613
+ inputs=[notebook_dropdown],
1614
+ outputs=[
1615
+ html_output, msg_state, load_status,
1616
+ gpu_type, cpu_cores, memory_gb, timeout_sec, env_vars,
1617
+ api_key_components["modal_token_id"], api_key_components["modal_token_secret"],
1618
+ api_key_components["hf_token"], api_key_components["provider_api_key"],
1619
+ api_key_components["provider_api_endpoint"], api_key_components["model_name"],
1620
+ api_key_components["tavily_api_key"], enable_web_search
1621
+ ],
1622
+ show_progress="hidden"
1623
+ )
1624
+
1625
+ refresh_notebooks_btn.click(
1626
+ fn=refresh_notebook_options,
1627
+ outputs=[notebook_dropdown],
1628
+ show_progress="hidden"
1629
+ )
1630
+
1631
+ # Show/hide load status based on selection
1632
+ notebook_dropdown.change(
1633
+ fn=lambda choice: gr.Textbox(visible=choice != "None"),
1634
+ inputs=[notebook_dropdown],
1635
+ outputs=[load_status]
1636
+ )
1637
+
1638
+ # Periodic status update using timer
1639
+ status_timer = gr.Timer(2.0) # Update every 2 seconds
1640
+ status_timer.tick(
1641
+ fn=get_execution_status,
1642
+ outputs=[status_display],
1643
+ show_progress="hidden"
1644
+ )
1645
+
1646
+ # Update button visibility periodically
1647
+ button_timer = gr.Timer(3.0) # Check every 3 seconds
1648
+ button_timer.tick(
1649
+ fn=update_sandbox_button_visibility,
1650
+ outputs=[shutdown_btn],
1651
+ show_progress="hidden"
1652
+ )
1653
+
1654
+ demo.load(
1655
+ fn=None,
1656
+ inputs=None,
1657
+ outputs=None,
1658
+ js=""" () => {
1659
+ if (document.querySelectorAll('.dark').length) {
1660
+ document.querySelectorAll('.dark').forEach(el => el.classList.remove('dark'));
1661
+ }
1662
+
1663
+ // Add execution state management functions
1664
+ window.setExecutionState = function(isExecuting) {
1665
+ // Find Run button by text content since variant attribute might not be reliable
1666
+ const buttons = document.querySelectorAll('button');
1667
+ let runButton = null;
1668
+ let stopButton = null;
1669
+
1670
+ buttons.forEach(button => {
1671
+ const text = button.textContent.trim().toLowerCase();
1672
+ if (text.includes('run') && !text.includes('stop')) {
1673
+ runButton = button;
1674
+ } else if (text.includes('stop') || text.includes('⏸️')) {
1675
+ stopButton = button;
1676
+ }
1677
+ });
1678
+
1679
+ if (runButton) {
1680
+ if (isExecuting) {
1681
+ runButton.classList.add('button-executing');
1682
+ runButton.disabled = true;
1683
+ runButton.style.opacity = '0.6';
1684
+ runButton.style.cursor = 'not-allowed';
1685
+ runButton.style.pointerEvents = 'none';
1686
+ if (runButton.textContent.indexOf('⏳') === -1) {
1687
+ runButton.textContent = runButton.textContent.replace('!', '! ⏳');
1688
+ }
1689
+ } else {
1690
+ runButton.classList.remove('button-executing');
1691
+ runButton.disabled = false;
1692
+ runButton.style.opacity = '1';
1693
+ runButton.style.cursor = 'pointer';
1694
+ runButton.style.pointerEvents = 'auto';
1695
+ runButton.textContent = runButton.textContent.replace(' ⏳', '');
1696
+ }
1697
+ }
1698
+
1699
+ // Also update stop button visibility/state
1700
+ if (stopButton) {
1701
+ stopButton.style.display = isExecuting ? 'block' : 'inline-block';
1702
+ }
1703
+ };
1704
+
1705
+ // Monitor for status changes and update button states
1706
+ window.monitorExecutionStatus = function() {
1707
+ // Try multiple ways to find the status element
1708
+ let statusElement = document.querySelector('input[label*="Execution Status"], input[label*="Status"], textarea[label*="Status"]');
1709
+
1710
+ if (!statusElement) {
1711
+ // Fallback: look for any input that might contain status
1712
+ const allInputs = document.querySelectorAll('input, textarea');
1713
+ allInputs.forEach(input => {
1714
+ if (input.value && (input.value.includes('🟢') || input.value.includes('⚪') || input.value.includes('⏸️'))) {
1715
+ statusElement = input;
1716
+ }
1717
+ });
1718
+ }
1719
+
1720
+ if (statusElement) {
1721
+ const status = statusElement.value || '';
1722
+ const isRunning = status.includes('🟢') || status.includes('Running') || status.includes('Generating') || status.includes('Executing');
1723
+ const isReady = status.includes('⚪') || status.includes('Ready');
1724
+
1725
+ window.setExecutionState(isRunning);
1726
+
1727
+ // Add visual indicator to status element
1728
+ if (isRunning) {
1729
+ statusElement.style.background = '#e3f2fd';
1730
+ statusElement.style.borderColor = '#2196f3';
1731
+ } else if (isReady) {
1732
+ statusElement.style.background = '#f5f5f5';
1733
+ statusElement.style.borderColor = '#ccc';
1734
+ } else {
1735
+ statusElement.style.background = '#fff3e0';
1736
+ statusElement.style.borderColor = '#ff9800';
1737
+ }
1738
+ }
1739
+ };
1740
+
1741
+ // Set up mutation observer to watch for status changes
1742
+ const observer = new MutationObserver(function(mutations) {
1743
+ mutations.forEach(function(mutation) {
1744
+ if (mutation.type === 'childList' || mutation.type === 'attributes') {
1745
+ setTimeout(window.monitorExecutionStatus, 100);
1746
+ }
1747
+ });
1748
+ });
1749
+
1750
+ // Start observing
1751
+ observer.observe(document.body, {
1752
+ childList: true,
1753
+ subtree: true,
1754
+ attributes: true
1755
+ });
1756
+ }
1757
+ """
1758
+ )
1759
+
1760
+ logger.info("Starting Gradio application")
1761
+ demo.launch(ssr_mode=False)
jupyter_agent.py ADDED
@@ -0,0 +1,1463 @@
1
+ from jupyter_handler import JupyterNotebook
2
+ import json
3
+ import logging
4
+ import os
5
+ import datetime
6
+ from pathlib import Path
7
+ from typing import Dict, List, Any, Optional
8
+ try:
+ from tavily import TavilyClient
+ except ImportError:
+ TavilyClient = None  # web search disabled when tavily-python isn't installed
9
+
10
+ # Phoenix tracing imports
11
+ try:
12
+ from openinference.instrumentation import using_session
13
+ PHOENIX_AVAILABLE = True
14
+ print("Phoenix session tracking imports successful")
15
+ except ImportError:
16
+ PHOENIX_AVAILABLE = False
17
+ print("Phoenix session tracking not available - missing openinference packages")
18
+
19
+ # Configure logging for utils module
20
+ logger = logging.getLogger(__name__)
21
+
22
+ # Initialize Tavily client
23
+ TAVILY_API_KEY = os.getenv("TAVILY_API_KEY")
24
+ tavily_client = TavilyClient(api_key=TAVILY_API_KEY) if (TAVILY_API_KEY and TavilyClient) else None
25
+
26
+
27
+ TOOLS = [
28
+ {
29
+ "type": "function",
30
+ "function": {
31
+ "name": "add_and_execute_jupyter_code_cell",
32
+ "description": "A Python code execution environment that runs code in a Jupyter notebook interface. This is stateful - variables and imports persist between executions.",
33
+ "parameters": {
34
+ "type": "object",
35
+ "properties": {
36
+ "code": {
37
+ "type": "string",
38
+ "description": "The Python code to execute."
39
+ }
40
+ },
41
+ "required": ["code"]
42
+ }
43
+ }
44
+ },
45
+ {
46
+ "type": "function",
47
+ "function": {
48
+ "name": "edit_and_execute_current_cell",
49
+ "description": "Edit the current/last code cell and execute the new code. Use this to fix errors or modify the previous code instead of creating a new cell.",
50
+ "parameters": {
51
+ "type": "object",
52
+ "properties": {
53
+ "code": {
54
+ "type": "string",
55
+ "description": "The updated Python code to replace the current cell with and execute."
56
+ }
57
+ },
58
+ "required": ["code"]
59
+ }
60
+ }
61
+ },
62
+ {
63
+ "type": "function",
64
+ "function": {
65
+ "name": "execute_shell_command",
66
+ "description": "Execute shell/system commands like ls, cat, mkdir, etc. This runs independently of Python and provides terminal-style output.",
67
+ "parameters": {
68
+ "type": "object",
69
+ "properties": {
70
+ "command": {
71
+ "type": "string",
72
+ "description": "The shell command to execute (e.g., 'ls -la', 'cat file.txt', 'mkdir new_folder')."
73
+ }
74
+ },
75
+ "required": ["command"]
76
+ }
77
+ }
78
+ },
79
+ {
80
+ "type": "function",
81
+ "function": {
82
+ "name": "web_search",
83
+ "description": "Search the web for current information, documentation, tutorials, and solutions to coding problems. Use this to get context before starting tasks or when encountering errors.",
84
+ "parameters": {
85
+ "type": "object",
86
+ "properties": {
87
+ "query": {
88
+ "type": "string",
89
+ "description": "Search query (max 400 characters). Be specific and include relevant keywords."
90
+ }
91
+ },
92
+ "required": ["query"]
93
+ }
94
+ }
95
+ },
96
+ ]
97
+
98
+ # TOOLS = TOOLS[:1]
99
+
100
+ MAX_TURNS = 20
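A model response that selects one of these tools arrives as a tool call carrying a JSON-encoded `arguments` string that must be decoded before dispatch. A minimal dispatch sketch — the handlers here are hypothetical stand-ins, not the executors defined later in this module:

```python
import json

# Hypothetical handlers standing in for the real executors in this module.
def run_code(code):
    return f"executed {len(code)} chars"

def run_search(query):
    return f"searched: {query}"

DISPATCH = {
    "add_and_execute_jupyter_code_cell": lambda args: run_code(args["code"]),
    "web_search": lambda args: run_search(args["query"]),
}

def handle_tool_call(tool_call):
    # tool_call mirrors the OpenAI chat-completions shape:
    # {"id": ..., "function": {"name": ..., "arguments": "<json string>"}}
    name = tool_call["function"]["name"]
    args = json.loads(tool_call["function"]["arguments"])
    return DISPATCH[name](args)

result = handle_tool_call({
    "id": "call_1",
    "function": {"name": "web_search", "arguments": '{"query": "pandas merge"}'},
})
print(result)  # → searched: pandas merge
```

The `name` field in each schema is the routing key, which is why the tool names above must stay in sync with whatever dispatch table consumes them.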
101
+
102
+
103
+ def create_phoenix_session_context(session_id: str, user_id: str = None, metadata: Dict = None):
104
+ """
105
+ Create a Phoenix session context for tracing LLM interactions.
106
+
107
+ Args:
108
+ session_id: Unique identifier for the session
109
+         user_id: Optional user identifier (accepted for future use; not currently attached to traces)
110
+         metadata: Additional metadata (accepted for future use; not currently attached to traces)
111
+
112
+ Returns:
113
+ Context manager for Phoenix session tracking
114
+ """
115
+ if not PHOENIX_AVAILABLE:
116
+ # Return a no-op context manager if Phoenix is not available
117
+ from contextlib import nullcontext
118
+ return nullcontext()
119
+
120
+ try:
121
+ # Use using_session for proper session grouping in Phoenix
122
+ # This ensures all LLM calls within this context are grouped under the same session
123
+ logger.debug(f"Creating Phoenix session context for session_id: {session_id}")
124
+ return using_session(session_id)
125
+ except Exception as e:
126
+ logger.warning(f"Failed to create Phoenix session context for {session_id}: {e}")
127
+ # Fallback to no-op context if Phoenix session creation fails
128
+ from contextlib import nullcontext
129
+ return nullcontext()
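The degrade-gracefully pattern above — return the real tracing context manager when it is available, `nullcontext()` otherwise — keeps every call site unchanged whether or not Phoenix is installed. An illustrative reduction of the same idea (`make_session_context` and its `factory` parameter are hypothetical, not part of this module):

```python
from contextlib import nullcontext

def make_session_context(session_id, factory=None):
    """Return factory(session_id) when a tracer is available, else a no-op."""
    if factory is None:
        return nullcontext()
    try:
        return factory(session_id)
    except Exception:
        # Tracing must never break the main execution path.
        return nullcontext()

# With no tracer installed, the `with` block still runs unchanged:
calls = []
with make_session_context("session-123"):
    calls.append("llm-call")
print(calls)  # → ['llm-call']
```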
130
+
131
+
132
+ class SessionStateManager:
133
+ """Manages comprehensive session state in a single JSON file"""
134
+
135
+ def __init__(self, session_id: str, base_dir: str = './temp/'):
136
+ self.session_id = session_id
137
+ self.base_dir = Path(base_dir)
138
+ self.session_dir = self.base_dir / session_id
139
+ self.state_file = self.session_dir / 'session_state.json'
140
+ self.session_dir.mkdir(parents=True, exist_ok=True)
141
+ logger.info(f"SessionStateManager initialized for {session_id}")
142
+
143
+ def create_initial_state(self, hardware_config: Dict, api_config: Dict,
144
+ environment: Dict, system_prompt: str) -> Dict:
145
+ """Create initial session state structure"""
146
+ timestamp = datetime.datetime.now(datetime.timezone.utc).isoformat()
147
+
148
+ initial_state = {
149
+ "session_id": self.session_id,
150
+ "created_at": timestamp,
151
+ "last_updated": timestamp,
152
+ "version": "1.0",
153
+
154
+ "hardware_config": hardware_config,
155
+ "api_config": api_config,
156
+ "environment": environment,
157
+
158
+ "conversation_history": [
159
+ {
160
+ "role": "system",
161
+ "content": system_prompt,
162
+ "timestamp": timestamp,
163
+ "metadata": {"type": "system_initialization"}
164
+ }
165
+ ],
166
+
167
+ "llm_interactions": [], # Complete API call logs
168
+ "tool_executions": [], # All tool calls and results
169
+
170
+ "notebook_data": {
171
+ "cells": [],
172
+ "metadata": {
173
+ "kernel_info": {"name": "python3"},
174
+ "language_info": {"name": "python", "version": "3.12"},
175
+ },
176
+ "nbformat": 4,
177
+ "nbformat_minor": 0
178
+ },
179
+
180
+ "execution_state": {
181
+ "current_turn": 0,
182
+ "max_turns": MAX_TURNS,
183
+ "is_running": False,
184
+ "is_paused": False,
185
+ "last_execution_successful": None,
186
+ "sandbox_active": False,
187
+ "sandbox_info": None
188
+ },
189
+
190
+ "session_stats": {
191
+ "total_messages": 1,
192
+ "total_code_executions": 0,
193
+ "total_searches": 0,
194
+ "total_errors": 0,
195
+ "session_duration_seconds": 0
196
+ }
197
+ }
198
+
199
+ logger.info("Created initial session state for %s", self.session_id)
200
+ return initial_state
201
+
202
+ def load_state(self) -> Optional[Dict]:
203
+ """Load session state from file with improved error handling"""
204
+ if not self.state_file.exists():
205
+ logger.info(f"No existing session state found for {self.session_id}")
206
+ return None
207
+
208
+ try:
209
+ with open(self.state_file, 'r', encoding='utf-8') as f:
210
+ state = json.load(f)
211
+ logger.info(f"Loaded session state for {self.session_id} with {len(state.get('conversation_history', []))} messages")
212
+ return state
213
+ except json.JSONDecodeError as e:
214
+ logger.error(f"JSON corruption in session state for {self.session_id}: {str(e)}")
215
+ logger.info(f"Creating backup of corrupted file: {self.state_file}.corrupted")
216
+ try:
217
+ import shutil
218
+ shutil.copy2(self.state_file, str(self.state_file) + ".corrupted")
219
+ logger.info(f"Backup created successfully")
220
+ except Exception as backup_error:
221
+ logger.warning(f"Failed to create backup: {backup_error}")
222
+ return None
223
+ except Exception as e:
224
+ logger.error(f"Failed to load session state for {self.session_id}: {str(e)}")
225
+ return None
226
+
227
+ def save_state(self, state: Dict) -> bool:
228
+ """Save session state to file with improved error handling"""
229
+ try:
230
+ # Update last_updated timestamp
231
+ state["last_updated"] = datetime.datetime.now(datetime.timezone.utc).isoformat()
232
+
233
+ # Update session stats
234
+ if "session_stats" not in state:
235
+ state["session_stats"] = {}
236
+
237
+ created_at = datetime.datetime.fromisoformat(state["created_at"])
238
+ current_time = datetime.datetime.now(datetime.timezone.utc)
239
+ state["session_stats"]["session_duration_seconds"] = int((current_time - created_at).total_seconds())
240
+ state["session_stats"]["total_messages"] = len(state.get("conversation_history", []))
241
+
242
+ # Validate JSON serializability before writing
243
+ try:
244
+ json.dumps(state, ensure_ascii=False)
245
+ except (TypeError, ValueError) as e:
246
+ logger.error(f"State contains non-serializable data: {e}")
247
+ logger.info("Attempting to clean non-serializable data...")
248
+ state = self._clean_non_serializable_data(state)
249
+
250
+ # Write to temporary file first, then rename for atomic operation
251
+ temp_file = self.state_file.with_suffix('.tmp')
252
+ with open(temp_file, 'w', encoding='utf-8') as f:
253
+ json.dump(state, f, indent=2, ensure_ascii=False)
254
+
255
+ # Atomic rename
256
+ temp_file.replace(self.state_file)
257
+
258
+ logger.debug(f"Saved session state for {self.session_id} ({len(json.dumps(state))} characters)")
259
+ return True
260
+ except Exception as e:
261
+ logger.error(f"Failed to save session state for {self.session_id}: {str(e)}")
262
+ # Clean up temp file if it exists
263
+ temp_file = self.state_file.with_suffix('.tmp')
264
+ if temp_file.exists():
265
+ try:
266
+ temp_file.unlink()
267
+ except Exception:
268
+ pass
269
+ return False
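The temp-file-plus-rename dance in `save_state` is what makes the write atomic: a crash mid-write leaves the previous `session_state.json` intact, because `Path.replace` swaps the files in a single filesystem operation (atomic on POSIX when both paths are on the same filesystem). A stripped-down sketch of the same pattern:

```python
import json
import tempfile
from pathlib import Path

def atomic_write_json(path: Path, data: dict) -> None:
    # Write the full payload to a sibling temp file first...
    tmp = path.with_suffix(".tmp")
    tmp.write_text(json.dumps(data, indent=2, ensure_ascii=False), encoding="utf-8")
    # ...then swap it in; readers never observe a half-written file.
    tmp.replace(path)

with tempfile.TemporaryDirectory() as d:
    target = Path(d) / "session_state.json"
    atomic_write_json(target, {"session_id": "abc", "turn": 3})
    loaded = json.loads(target.read_text(encoding="utf-8"))

print(loaded["turn"])  # → 3
```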
270
+
271
+ def _clean_non_serializable_data(self, obj):
272
+ """Recursively clean non-serializable data from objects"""
273
+ if isinstance(obj, dict):
274
+ cleaned = {}
275
+ for key, value in obj.items():
276
+ try:
277
+ json.dumps(value)
278
+ cleaned[key] = self._clean_non_serializable_data(value)
279
+ except (TypeError, ValueError):
280
+ logger.warning(f"Removing non-serializable field: {key}")
281
+ cleaned[key] = f"<non-serializable: {type(value).__name__}>"
282
+ return cleaned
283
+ elif isinstance(obj, list):
284
+ cleaned = []
285
+ for item in obj:
286
+ try:
287
+ json.dumps(item)
288
+ cleaned.append(self._clean_non_serializable_data(item))
289
+ except (TypeError, ValueError):
290
+ cleaned.append(f"<non-serializable: {type(item).__name__}>")
291
+ return cleaned
292
+ else:
293
+ return obj
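The recursive cleaner replaces anything `json.dumps` rejects with a typed placeholder instead of failing the whole save. A self-contained sketch of the same idea (this `clean` is a simplified stand-in, not the method above):

```python
import json

def clean(obj):
    """Replace values json.dumps cannot encode with a typed placeholder."""
    if isinstance(obj, dict):
        return {k: clean(v) for k, v in obj.items()}
    if isinstance(obj, list):
        return [clean(v) for v in obj]
    try:
        json.dumps(obj)
        return obj
    except (TypeError, ValueError):
        return f"<non-serializable: {type(obj).__name__}>"

# Sets and functions are not JSON-encodable, so they become placeholders:
state = {"turn": 1, "payload": {"ids": {1, 2}}, "cb": (lambda x: x)}
cleaned = clean(state)
print(cleaned["payload"]["ids"])  # → <non-serializable: set>
```

After cleaning, the entire structure round-trips through `json.dumps` without raising, which is exactly the property `save_state` needs before its atomic write.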
294
+
295
+ def log_llm_interaction(self, state: Dict, request_data: Dict, response_data: Dict,
296
+ model: str, turn: int) -> None:
297
+ """Log complete LLM API interaction"""
298
+ timestamp = datetime.datetime.now(datetime.timezone.utc).isoformat()
299
+
300
+ interaction = {
301
+ "timestamp": timestamp,
302
+ "turn": turn,
303
+ "model": model,
304
+ "request": {
305
+ "messages_count": len(request_data.get("messages", [])),
306
+ "tools_count": len(request_data.get("tools", [])),
307
+ "model": request_data.get("model"),
308
+ "tool_choice": request_data.get("tool_choice")
309
+ },
310
+ "response": {
311
+ "content": response_data.get("choices", [{}])[0].get("message", {}).get("content"),
312
+ "tool_calls": response_data.get("choices", [{}])[0].get("message", {}).get("tool_calls"),
313
+ "finish_reason": response_data.get("choices", [{}])[0].get("finish_reason"),
314
+ "usage": response_data.get("usage")
315
+ }
316
+ }
317
+
318
+ if "llm_interactions" not in state:
319
+ state["llm_interactions"] = []
320
+ state["llm_interactions"].append(interaction)
321
+
322
+ # Log Phoenix session information for easy debugging
323
+ logger.debug(f"Logged LLM interaction for turn {turn} in session {self.session_id}")
324
+ logger.debug(f"Phoenix session tracking: session_id={self.session_id}, turn={turn}, model={model}")
325
+
326
+ # Log usage information if available for monitoring
327
+ usage = response_data.get("usage")
328
+ if usage:
329
+ logger.info(f"Session {self.session_id} turn {turn}: "
330
+ f"prompt_tokens={usage.get('prompt_tokens', 0)}, "
331
+ f"completion_tokens={usage.get('completion_tokens', 0)}, "
332
+ f"total_tokens={usage.get('total_tokens', 0)}")
333
+
334
+ def log_tool_execution(self, state: Dict, tool_call_id: str, tool_name: str,
335
+ tool_args: Dict, result: str, execution_data: Any = None) -> None:
336
+ """Log tool execution with full details"""
337
+ timestamp = datetime.datetime.now(datetime.timezone.utc).isoformat()
338
+
339
+ # Safely serialize execution_data to prevent JSON corruption
340
+ safe_execution_data = None
341
+ if execution_data is not None:
342
+ try:
343
+ # Convert execution_data to a safe, serializable format
344
+ if hasattr(execution_data, '__dict__'):
345
+ safe_execution_data = {
346
+ "type": type(execution_data).__name__,
347
+ "error": str(execution_data.error) if hasattr(execution_data, 'error') and execution_data.error else None,
348
+ "has_results": hasattr(execution_data, 'results') and bool(execution_data.results),
349
+ "has_stdout": hasattr(execution_data, 'logs') and hasattr(execution_data.logs, 'stdout') and bool(execution_data.logs.stdout),
350
+ "has_stderr": hasattr(execution_data, 'logs') and hasattr(execution_data.logs, 'stderr') and bool(execution_data.logs.stderr)
351
+ }
352
+ else:
353
+ # For simple types, convert to string safely
354
+ safe_execution_data = str(execution_data)[:200] # Limit length
355
+ except Exception as e:
356
+ logger.warning(f"Failed to serialize execution_data for {tool_call_id}: {e}")
357
+ safe_execution_data = {"serialization_error": str(e)}
358
+
359
+ tool_execution = {
360
+ "timestamp": timestamp,
361
+ "tool_call_id": tool_call_id,
362
+ "tool_name": tool_name,
363
+ "arguments": tool_args,
364
+ "result_summary": result[:500] + "..." if len(result) > 500 else result,
365
+ "result_length": len(result),
366
+ "execution_data": safe_execution_data,
367
+ "success": execution_data is None or (hasattr(execution_data, 'error') and execution_data.error is None) if execution_data else True
368
+ }
369
+
370
+ if "tool_executions" not in state:
371
+ state["tool_executions"] = []
372
+ state["tool_executions"].append(tool_execution)
373
+
374
+ # Update stats
375
+ if tool_name == "add_and_execute_jupyter_code_cell":
376
+ state["session_stats"]["total_code_executions"] = state["session_stats"].get("total_code_executions", 0) + 1
377
+ elif tool_name == "web_search":
378
+ state["session_stats"]["total_searches"] = state["session_stats"].get("total_searches", 0) + 1
379
+
380
+ if not tool_execution["success"]:
381
+ state["session_stats"]["total_errors"] = state["session_stats"].get("total_errors", 0) + 1
382
+
383
+ logger.debug(f"Logged tool execution {tool_name} ({tool_call_id}) in session {self.session_id}")
384
+
385
+ def add_message(self, state: Dict, role: str, content: str,
386
+ tool_calls: List = None, tool_call_id: str = None,
387
+ raw_execution: Any = None, metadata: Dict = None) -> None:
388
+ """Add message to conversation history with full context"""
389
+ timestamp = datetime.datetime.now(datetime.timezone.utc).isoformat()
390
+
391
+ message = {
392
+ "role": role,
393
+ "content": content,
394
+ "timestamp": timestamp
395
+ }
396
+
397
+ if tool_calls:
398
+ message["tool_calls"] = tool_calls
399
+ if tool_call_id:
400
+ message["tool_call_id"] = tool_call_id
401
+ if raw_execution:
402
+ message["raw_execution"] = raw_execution
403
+ if metadata:
404
+ message["metadata"] = metadata
405
+
406
+ state["conversation_history"].append(message)
407
+ logger.debug(f"Added {role} message to session {self.session_id} conversation history")
408
+
409
+ def update_execution_state(self, state: Dict, **kwargs) -> None:
410
+ """Update execution state fields"""
411
+ for key, value in kwargs.items():
412
+ if key in state["execution_state"]:
413
+ state["execution_state"][key] = value
414
+ logger.debug(f"Updated execution state {key}={value} for session {self.session_id}")
415
+
416
+ # Try to sync with global EXECUTION_STATES for UI consistency (if available)
417
+ try:
418
+ import sys
419
+ if 'app' in sys.modules:
420
+ execution_states = getattr(sys.modules['app'], 'EXECUTION_STATES', None)
421
+ if execution_states and self.session_id in execution_states:
422
+ for key, value in kwargs.items():
423
+ execution_states[self.session_id][key] = value
424
+ except (ImportError, AttributeError):
425
+ pass # Ignore if we can't sync with global state
426
+
427
+ def update_notebook_data(self, state: Dict, notebook_data: Dict) -> None:
428
+ """Update notebook data in session state"""
429
+ state["notebook_data"] = notebook_data
430
+ logger.debug(f"Updated notebook data for session {self.session_id} ({len(notebook_data.get('cells', []))} cells)")
431
+
432
+ def get_conversation_history(self, state: Dict) -> List[Dict]:
433
+ """Get conversation history suitable for LLM API calls"""
434
+ return state.get("conversation_history", [])
435
+
436
+ def validate_and_repair_conversation(self, state: Dict) -> None:
437
+ """Validate and repair conversation history to ensure tool calls have responses"""
438
+ conversation = state.get("conversation_history", [])
439
+ if not conversation:
440
+ return
441
+
442
+ pending_tool_calls = set()
443
+ valid_messages = []
444
+
445
+ for message in conversation:
446
+ if message.get("role") == "assistant" and message.get("tool_calls"):
447
+ # Track tool calls
448
+ for tool_call in message["tool_calls"]:
449
+ pending_tool_calls.add(tool_call["id"])
450
+ valid_messages.append(message)
451
+
452
+ elif message.get("role") == "tool" and message.get("tool_call_id"):
453
+ # Remove from pending when we find a response
454
+ pending_tool_calls.discard(message["tool_call_id"])
455
+ valid_messages.append(message)
456
+
457
+ else:
458
+ # Regular message (system, user, assistant without tool calls)
459
+ valid_messages.append(message)
460
+
461
+ # If there are incomplete tool calls, remove the assistant messages that created them
462
+ if pending_tool_calls:
463
+ logger.warning(f"Found incomplete tool calls in conversation: {pending_tool_calls}")
464
+ logger.warning("Removing incomplete assistant messages to repair conversation")
465
+
466
+ repaired_messages = []
467
+ for message in valid_messages:
468
+ if (message.get("role") == "assistant" and
469
+ message.get("tool_calls") and
470
+ any(tc["id"] in pending_tool_calls for tc in message["tool_calls"])):
471
+ logger.debug("Removing assistant message with incomplete tool calls")
472
+ continue
473
+ repaired_messages.append(message)
474
+
475
+ # Update conversation history
476
+ state["conversation_history"] = repaired_messages
477
+ logger.info(f"Repaired conversation: {len(conversation)} -> {len(repaired_messages)} messages")
478
+
479
+ # Save the repaired state
480
+ self.save_state(state)
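The repair pass enforces one invariant: every id in an assistant message's `tool_calls` must be answered by a later `tool` message with the matching `tool_call_id`. The pairing check on its own, as a sketch:

```python
def pending_tool_call_ids(conversation):
    """Return ids of tool calls that never received a tool response."""
    pending = set()
    for msg in conversation:
        if msg.get("role") == "assistant" and msg.get("tool_calls"):
            pending.update(tc["id"] for tc in msg["tool_calls"])
        elif msg.get("role") == "tool" and msg.get("tool_call_id"):
            pending.discard(msg["tool_call_id"])
    return pending

history = [
    {"role": "assistant", "tool_calls": [{"id": "a"}, {"id": "b"}]},
    {"role": "tool", "tool_call_id": "a", "content": "ok"},
]
print(pending_tool_call_ids(history))  # → {'b'}
```

Any assistant message whose ids remain in the pending set is the one the repair step removes, since chat-completions APIs reject histories with unanswered tool calls.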
481
+
482
+ def session_exists(self) -> bool:
483
+ """Check if session state file exists"""
484
+ return self.state_file.exists()
485
+
486
+ def get_session_summary(self, state: Dict) -> str:
487
+ """Get human-readable session summary"""
488
+ stats = state.get("session_stats", {})
489
+ created = datetime.datetime.fromisoformat(state["created_at"])
490
+
491
+ return f"""Session {self.session_id}:
492
+ - Created: {created.strftime('%Y-%m-%d %H:%M:%S UTC')}
493
+ - Messages: {stats.get('total_messages', 0)}
494
+ - Code Executions: {stats.get('total_code_executions', 0)}
495
+ - Web Searches: {stats.get('total_searches', 0)}
496
+ - Errors: {stats.get('total_errors', 0)}
497
+ - Duration: {stats.get('session_duration_seconds', 0)}s
498
+ - Hardware: {state.get('hardware_config', {}).get('gpu_type', 'unknown')}
499
+ - Model: {state.get('api_config', {}).get('model_name', 'unknown')}"""
500
+
501
+
502
+ def execute_code(sbx, code):
503
+ logger.debug(f"Executing code in sandbox ({len(code)} characters)")
504
+ execution = sbx.run_code(code, on_stdout=lambda data: logger.debug(f'stdout: {data}'))
505
+ output = ""
506
+ if len(execution.logs.stdout) > 0:
507
+ output += "\n".join(execution.logs.stdout)
508
+ logger.debug(f"Execution produced {len(execution.logs.stdout)} stdout lines")
509
+     if len(execution.logs.stderr) > 0:
510
+         if output:
+             output += "\n"  # keep stderr from running into the stdout text
+         output += "\n".join(execution.logs.stderr)
511
+         logger.debug(f"Execution produced {len(execution.logs.stderr)} stderr lines")
512
+     if execution.error is not None:
513
+         if output:
+             output += "\n"
+         output += execution.error.traceback
514
+ logger.warning(f"Execution error: {execution.error.name}: {execution.error.value}")
515
+ logger.debug(f"Code execution completed, output length: {len(output)}")
516
+ return output, execution
517
+
518
+
519
+ def parse_exec_result_llm(execution, max_code_output=1000):
520
+ logger.debug(f"Parsing execution result for LLM (max_output: {max_code_output})")
521
+ output = []
522
+
523
+ def truncate_if_needed(text):
524
+ if len(text) > max_code_output:
525
+ return (text[:max_code_output] + f"\n[Output is truncated as it is more than {max_code_output} characters]")
526
+ return text
527
+
528
+ if execution.results:
529
+ results_text_parts = []
530
+ plot_count = 0
531
+
532
+ for result in execution.results:
533
+ if hasattr(result, 'text') and result.text:
534
+ results_text_parts.append(result.text)
535
+ elif hasattr(result, 'png') and result.png:
536
+ plot_count += 1
537
+ results_text_parts.append(f"[Plot {plot_count} generated and displayed]")
538
+ elif hasattr(result, 'html') and result.html:
539
+ results_text_parts.append("[HTML output generated]")
540
+
541
+ if results_text_parts:
542
+ results_text = "\n".join(results_text_parts)
543
+ output.append(truncate_if_needed(results_text))
544
+
545
+ logger.debug(f"Added {len(execution.results)} execution results (including {plot_count} plots)")
546
+ if execution.logs.stdout:
547
+ stdout_text = "\n".join(execution.logs.stdout)
548
+ output.append(truncate_if_needed(stdout_text))
549
+ logger.debug(f"Added stdout output ({len(execution.logs.stdout)} lines)")
550
+ if execution.logs.stderr:
551
+ stderr_text = "\n".join(execution.logs.stderr)
552
+ output.append(truncate_if_needed(stderr_text))
553
+ logger.debug(f"Added stderr output ({len(execution.logs.stderr)} lines)")
554
+ if execution.error is not None:
555
+ output.append(truncate_if_needed(execution.error.traceback))
556
+ logger.debug(f"Added error traceback: {execution.error.name}")
557
+
558
+ final_output = "\n".join(output)
559
+ logger.debug(f"Parsed execution result for LLM: {len(final_output)} characters")
560
+ return final_output
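The truncation rule above is character-based with an explicit marker, so the model knows output was cut rather than naturally short. Exercised in isolation with a small limit:

```python
def truncate_if_needed(text, max_code_output=1000):
    # Cut long output and append an explicit marker so the consumer knows
    # the text was clipped rather than naturally short.
    if len(text) > max_code_output:
        return (text[:max_code_output]
                + f"\n[Output is truncated as it is more than {max_code_output} characters]")
    return text

short = truncate_if_needed("x" * 10, max_code_output=20)
clipped = truncate_if_needed("x" * 50, max_code_output=20)
print(clipped.splitlines()[-1])  # → [Output is truncated as it is more than 20 characters]
```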
561
+
562
+ def clean_messages_for_api(messages):
563
+ """
564
+ Create a clean copy of messages without raw_execution fields and metadata for API calls.
565
+ Also validates that tool calls have corresponding tool responses.
566
+ This prevents 413 errors and API validation errors.
567
+ """
568
+ logger.debug(f"Cleaning {len(messages)} messages for API call")
569
+ cleaned_messages = []
570
+ raw_execution_count = 0
571
+ metadata_count = 0
572
+ pending_tool_calls = set()
573
+
574
+ for message in messages:
575
+ cleaned_message = message.copy()
576
+
577
+ # Remove raw_execution data
578
+ if "raw_execution" in cleaned_message:
579
+ cleaned_message.pop("raw_execution")
580
+ raw_execution_count += 1
581
+
582
+ # Remove metadata and timestamp
583
+ if "metadata" in cleaned_message:
584
+ cleaned_message.pop("metadata")
585
+ metadata_count += 1
586
+ if "timestamp" in cleaned_message:
587
+ cleaned_message.pop("timestamp")
588
+
589
+ # Track tool calls and responses for validation
590
+ if cleaned_message.get("role") == "assistant" and cleaned_message.get("tool_calls"):
591
+ for tool_call in cleaned_message["tool_calls"]:
592
+ pending_tool_calls.add(tool_call["id"])
593
+ elif cleaned_message.get("role") == "tool" and cleaned_message.get("tool_call_id"):
594
+ pending_tool_calls.discard(cleaned_message["tool_call_id"])
595
+
596
+ cleaned_messages.append(cleaned_message)
597
+
598
+ # If there are pending tool calls without responses, remove the assistant message with tool calls
599
+ if pending_tool_calls:
600
+ logger.warning(f"Found {len(pending_tool_calls)} tool calls without responses: {pending_tool_calls}")
601
+ logger.warning("Removing incomplete tool call messages to prevent API errors")
602
+
603
+ # Remove messages with incomplete tool calls
604
+ filtered_messages = []
605
+ for message in cleaned_messages:
606
+ if (message.get("role") == "assistant" and
607
+ message.get("tool_calls") and
608
+ any(tc["id"] in pending_tool_calls for tc in message["tool_calls"])):
609
+ logger.debug("Removing assistant message with incomplete tool calls")
610
+ continue
611
+ filtered_messages.append(message)
612
+
613
+ cleaned_messages = filtered_messages
614
+
615
+ logger.debug(f"Cleaned messages: removed raw_execution from {raw_execution_count}, metadata from {metadata_count}")
616
+ logger.debug(f"Final cleaned message count: {len(cleaned_messages)}")
617
+ return cleaned_messages
618
+
619
+
620
+ def web_search(query):
621
+ """
622
+ Perform web search using Tavily API with automatic year addition and formatting.
623
+
624
+ Args:
625
+ query (str): Search query (max 400 characters)
626
+
627
+ Returns:
628
+ str: Formatted search results for LLM consumption
629
+ """
630
+ if not tavily_client:
631
+ logger.error("Tavily client not initialized - API key missing")
632
+ return "❌ Search unavailable: Tavily API key not configured"
633
+
634
+ # Validate query length
635
+ if len(query) > 400:
636
+ logger.warning(f"Query too long ({len(query)} chars), truncating to 400")
637
+ query = query[:400]
638
+
639
+ # Add current year to query for more recent results
640
+ current_year = datetime.datetime.now().year
641
+ if str(current_year) not in query:
642
+ # Only add year if query has room for it
643
+ year_addition = f" {current_year}"
644
+ if len(query + year_addition) <= 400:
645
+ query += year_addition
646
+ logger.debug(f"Added current year to query: {current_year}")
647
+
648
+ logger.info(f"Performing Tavily search: '{query}' ({len(query)} chars)")
649
+
650
+ try:
651
+ # Perform search with optimized parameters
652
+ response = tavily_client.search(
653
+ query=query,
654
+ search_depth="basic", # Use basic for faster results
655
+ max_results=5, # Limit results to avoid overwhelming context
656
+ include_answer=True, # Include AI-generated answer
657
+ include_raw_content=False, # Don't include raw content to save tokens
658
+ include_images=False # Don't include images
659
+ )
660
+
661
+ logger.info(f"Search completed: {len(response.get('results', []))} results found")
662
+
663
+ # Format results for LLM consumption
664
+ formatted_results = format_search_results_for_llm(response)
665
+
666
+ logger.debug(f"Formatted search results: {len(formatted_results)} characters")
667
+ return formatted_results
668
+
669
+ except Exception as e:
670
+ logger.error(f"Tavily search failed: {str(e)}")
671
+ return f"❌ Search failed: {str(e)}"
672
+
673
+
674
+ def format_search_results_for_llm(response):
675
+ """Format Tavily search results for LLM consumption"""
676
+
677
+ query = response.get('query', 'Unknown query')
678
+ results = response.get('results', [])
679
+ answer = response.get('answer', '')
680
+
681
+ formatted = f"🔍 **Web Search Results for:** {query}\n\n"
682
+
683
+ if answer:
684
+ formatted += f"**Quick Answer:** {answer}\n\n"
685
+
686
+ if results:
687
+ formatted += f"**Found {len(results)} relevant sources:**\n\n"
688
+
689
+ for i, result in enumerate(results, 1):
690
+ title = result.get('title', 'Untitled')
691
+ url = result.get('url', '')
692
+ content = result.get('content', '')
693
+ score = result.get('score', 0)
694
+
695
+             # Content is passed through untruncated; re-enable this cap if
696
+             # long results start to crowd the model's context window:
697
+             # if len(content) > 300:
+             #     content = content[:300] + "..."
698
+
699
+ formatted += f"**{i}. {title}** (Relevance: {score:.2f})\n"
700
+ formatted += f" 🔗 {url}\n"
701
+ formatted += f" 📄 {content}\n\n"
702
+ else:
703
+ formatted += "No results found.\n"
704
+
705
+ return formatted
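Given the response shape Tavily returns (`query`, `answer`, and `results` entries with `title`/`url`/`content`/`score`), the formatter can be exercised with a stubbed payload. This standalone sketch mirrors its structure in plain-text form (names and layout here are illustrative, not the function above):

```python
def format_results(response):
    lines = [f"Web search results for: {response.get('query', 'Unknown query')}"]
    if response.get("answer"):
        lines.append(f"Quick answer: {response['answer']}")
    for i, r in enumerate(response.get("results", []), 1):
        lines.append(f"{i}. {r.get('title', 'Untitled')} "
                     f"(score {r.get('score', 0):.2f}) {r.get('url', '')}")
    return "\n".join(lines)

stub = {
    "query": "pandas merge how",
    "answer": "Use DataFrame.merge.",
    "results": [{"title": "Merging", "url": "https://example.com", "score": 0.91}],
}
print(format_results(stub))
```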
706
+
707
+
708
+ def run_interactive_notebook_with_session_state(client, model, session_state_manager, session_state, sbx, stop_event=None, tools=None):
709
+ logger.info(f"Starting interactive notebook with session state for {session_state_manager.session_id}")
710
+
711
+ # Get conversation history from session state
712
+ messages = session_state_manager.get_conversation_history(session_state)
713
+ notebook = JupyterNotebook(messages)
714
+
715
+ # Update execution state
716
+ session_state_manager.update_execution_state(session_state, is_running=True, sandbox_active=True, current_phase="initializing")
717
+
718
+ # Use provided tools or default to all tools
719
+ if tools is None:
720
+ tools = TOOLS
721
+
722
+ try:
723
+ sbx_info = sbx.get_info()
724
+ notebook.add_sandbox_countdown(sbx_info.started_at, sbx_info.end_at)
725
+
726
+ # Store sandbox info in session state
727
+ session_state["execution_state"]["sandbox_info"] = {
728
+ "started_at": sbx_info.started_at.isoformat(),
729
+ "end_at": sbx_info.end_at.isoformat(),
730
+ "timeout_seconds": int((sbx_info.end_at - sbx_info.started_at).total_seconds())
731
+ }
732
+
733
+ logger.debug(f"Added sandbox countdown: {sbx_info.started_at} to {sbx_info.end_at}")
734
+ except Exception as e:
735
+ logger.warning(f"Failed to get sandbox info: {str(e)}")
736
+
737
+ logger.debug("Initial notebook yield in 'generating' mode")
738
+
739
+ # Update notebook data in session state
740
+ session_state_manager.update_notebook_data(session_state, notebook.data)
741
+
742
+ # Save initial state
743
+ session_state_manager.save_state(session_state)
744
+
745
+ yield notebook.render(mode="generating"), notebook.data, messages
746
+
747
+ max_code_output = 1000
748
+ turns = session_state["execution_state"]["current_turn"]
749
+ done = False
750
+ previous_execution_had_error = False
751
+ previous_execution_had_warnings = False
752
+
753
+ logger.info(f"Starting interactive loop from turn {turns} with max_output={max_code_output}, max_turns={MAX_TURNS}")
754
+
755
+     while not done and (turns < MAX_TURNS) and (stop_event is None or not stop_event.is_set()):
756
+ turns += 1
757
+ logger.info(f"Starting turn {turns}/{MAX_TURNS}")
758
+
759
+ try:
760
+ # Update phase to generating
761
+ session_state_manager.update_execution_state(session_state, current_phase="generating")
762
+
763
+ # Refresh messages from session state before API call
764
+ messages = session_state_manager.get_conversation_history(session_state)
765
+ logger.debug(f"Making API call to {model} with {len(messages)} messages")
766
+
767
+ # Prepare request data for logging
768
+ request_data = {
769
+ "messages": clean_messages_for_api(messages),
770
+ "model": model,
771
+ "tools": tools,
772
+ "tool_choice": "auto"
773
+ }
774
+
775
+ # Prepare session metadata for Phoenix tracing
776
+ session_metadata = {
777
+ "turn": turns,
778
+ "max_turns": MAX_TURNS,
779
+ "model": model,
780
+ "tools_count": len(tools),
781
+ "messages_count": len(messages),
782
+ "current_phase": "generating"
783
+ }
784
+
785
+ # Add hardware config if available
786
+ if "hardware_config" in session_state:
787
+ hw_config = session_state["hardware_config"]
788
+ session_metadata.update({
789
+ "gpu_type": hw_config.get("gpu_type", "unknown"),
790
+ "cpu_cores": hw_config.get("cpu_cores", "unknown"),
791
+ "memory_gb": hw_config.get("memory_gb", "unknown")
792
+ })
793
+
794
+ # Wrap OpenAI API call with Phoenix session context for proper grouping
795
+ with create_phoenix_session_context(
796
+ session_id=session_state_manager.session_id,
797
+ user_id=None, # Could be extracted from request context if available
798
+ metadata=session_metadata
799
+ ):
800
+            logger.debug(f"Making OpenAI API call with Phoenix session context: {session_state_manager.session_id}")
+            response = client.chat.completions.create(**request_data)
+            logger.debug("API call successful within Phoenix session context")
+
+            # Log the complete LLM interaction
+            session_state_manager.log_llm_interaction(
+                session_state, request_data, response.model_dump(), model, turns
+            )
+        except Exception as e:
+            # Handle inference client errors
+            logger.error(f"Inference failed on turn {turns}: {str(e)}")
+
+            # Add detailed error information to the notebook
+            error_message = str(e)
+            if "429" in error_message or "too_many_requests" in error_message.lower():
+                detailed_error = f"""**API Rate Limit Exceeded** 🚫
+
+The inference service has reached its rate limit. This typically means:
+- Too many requests have been sent in a short period
+- Daily quota has been exceeded
+- Service is temporarily overloaded
+
+**What you can try:**
+- Wait a few minutes and try again
+- If using Cerebras API, check your daily quota
+- Try using a different model or service
+- Contact support if the issue persists
+
+**Technical details:**
+```
+{error_message}
+```"""
+            elif "401" in error_message or "unauthorized" in error_message.lower():
+                detailed_error = f"""**Authentication Error** 🔐
+
+There's an issue with API authentication:
+- API key might be missing or invalid
+- API key might have expired
+- Insufficient permissions
+
+**Technical details:**
+```
+{error_message}
+```"""
+            elif "500" in error_message or "internal" in error_message.lower():
+                detailed_error = f"""**Server Error** 🔧
+
+The inference service encountered an internal error:
+- Service might be temporarily unavailable
+- Try again in a few moments
+- If the issue persists, it's likely a service-side problem
+
+**Technical details:**
+```
+{error_message}
+```"""
+            else:
+                detailed_error = f"""**Inference Service Error** ⚠️
+
+An error occurred while communicating with the AI service:
+
+**Technical details:**
+```
+{error_message}
+```
+
+**What you can try:**
+- Check your internet connection
+- Try again in a few moments
+- If the problem persists, contact support"""
+
+            notebook.add_error(detailed_error)
+
+            # Add error to session state
+            session_state_manager.add_message(
+                session_state, "assistant", detailed_error,
+                metadata={"type": "error", "error_type": "api_error", "turn": turns}
+            )
+
+            # Update execution state
+            session_state_manager.update_execution_state(
+                session_state, is_running=False, last_execution_successful=False
+            )
+
+            # Update notebook data and save state
+            session_state_manager.update_notebook_data(session_state, notebook.data)
+            session_state_manager.save_state(session_state)
+
+            yield notebook.render(mode="error"), notebook.data, messages
+            return
+
+        # Get the response content and tool calls
+        full_response = response.choices[0].message.content or ""
+        tool_calls = response.choices[0].message.tool_calls or []
+
+        logger.debug(f"Turn {turns}: Response content length: {len(full_response)}, Tool calls: {len(tool_calls)}")
+
+        # Add markdown cell for assistant's thinking
+        if full_response.strip():
+            logger.debug(f"Adding assistant response as markdown ({len(full_response)} chars)")
+            notebook.add_markdown(full_response, "assistant")
+        else:
+            logger.debug("Skipping empty assistant response")
+
+        # Handle tool calls and add assistant message to session state only
+        if tool_calls:
+            logger.info(f"Processing {len(tool_calls)} tool calls on turn {turns}")
+            # Add assistant message to session state (messages will be derived from this)
+            session_state_manager.add_message(
+                session_state, "assistant", full_response,
+                tool_calls=[{
+                    "id": tc.id,
+                    "type": "function",
+                    "function": {"name": tc.function.name, "arguments": tc.function.arguments}
+                } for tc in tool_calls],
+                metadata={"turn": turns, "type": "thinking"}
+            )
+            logger.debug(f"Added assistant message with {len(tool_calls)} tool calls to session state")
+        elif full_response.strip():
+            # If no tool calls but we have content, add regular assistant message
+            session_state_manager.add_message(
+                session_state, "assistant", full_response,
+                metadata={"turn": turns, "type": "thinking"}
+            )
+            logger.debug("Added regular assistant message to session state")
+
+        for i, tool_call in enumerate(tool_calls):
+            logger.debug(f"Processing tool call {i+1}/{len(tool_calls)}: {tool_call.function.name}")
+
+            if tool_call.function.name == "add_and_execute_jupyter_code_cell":
+                # Update phase to executing code
+                session_state_manager.update_execution_state(session_state, current_phase="executing_code")
+
+                logger.debug(f"Processing code execution tool call: {tool_call.id}")
+                tool_args = json.loads(tool_call.function.arguments)
+                code = tool_args["code"]
+                logger.debug(f"Code to execute: {len(code)} characters")
+
+                # Determine if we should reuse the last cell or create a new one
+                # Reuse if there were errors (not just warnings) in the previous execution
+                should_reuse_cell = (previous_execution_had_error and
+                                     notebook.get_last_cell_type() == "code")
+
+                if should_reuse_cell:
+                    logger.info("Reusing last code cell due to previous execution error")
+                    # Update the existing cell's code instead of creating a new one
+                    notebook.update_last_code_cell(code)
+                else:
+                    logger.debug("Creating new code cell")
+                    # Create a new cell (normal behavior)
+                    notebook.add_code(code)
+
+                logger.debug("Yielding notebook in 'executing' mode")
+                yield notebook.render(mode="executing"), notebook.data, messages
+
+                try:
+                    # Check for stop event before execution
+                    if stop_event and stop_event.is_set():
+                        logger.info("Stop event detected before code execution")
+                        stopped_message = """**Execution Stopped** ⏸️
+
+The execution was stopped by user request before the code could run."""
+                        notebook.add_markdown(stopped_message, "assistant")
+                        yield notebook.render(mode="stopped"), notebook.data, messages
+                        return
+
+                    # Execution sandbox call - might timeout
+                    logger.info("Executing code in sandbox")
+                    execution = sbx.run_code(code)
+                    notebook.append_execution(execution)
+
+                    # Update error and warning tracking for next iteration
+                    previous_execution_had_error = notebook.has_execution_error(execution)
+                    previous_execution_had_warnings = notebook.has_execution_warnings(execution)
+                    # Log tool execution in session state
+                    tool_args = json.loads(tool_call.function.arguments)
+                    tool_response_content = parse_exec_result_llm(execution, max_code_output=max_code_output)
+                    session_state_manager.log_tool_execution(
+                        session_state, tool_call.id, "add_and_execute_jupyter_code_cell",
+                        tool_args, tool_response_content, execution
+                    )
+
+                    if previous_execution_had_error:
+                        logger.warning("Code execution resulted in error")
+                    elif previous_execution_had_warnings:
+                        logger.info("Code execution completed with warnings")
+                    else:
+                        logger.info("Code execution completed successfully")
+
+                except Exception as e:
+                    # Handle sandbox timeout/execution errors
+                    logger.error(f"Code execution failed: {str(e)}")
+
+                    # Add detailed error information for code execution failures
+                    error_message = str(e)
+                    if "timeout" in error_message.lower():
+                        detailed_error = f"""**Code Execution Timeout** ⏰
+
+The code execution took too long and was terminated:
+- Code may have entered an infinite loop
+- Processing large datasets can cause timeouts
+- Complex computations may exceed time limits
+
+**What you can try:**
+- Optimize your code for better performance
+- Break down complex operations into smaller steps
+- Increase the timeout limit in settings
+- Check for infinite loops or blocking operations
+
+**Technical details:**
+```
+{error_message}
+```"""
+                    else:
+                        detailed_error = f"""**Code Execution Failed** 💥
+
+An error occurred while executing the code in the sandbox:
+
+**Technical details:**
+```
+{error_message}
+```
+
+**What you can try:**
+- Check the code for syntax errors
+- Verify all required packages are available
+- Try simplifying the code
+- Check the sandbox logs for more details"""
+
+                    notebook.add_error(detailed_error)
+                    yield notebook.render(mode="error"), notebook.data, messages
+                    return
+
+                # Prepare tool response (already computed above)
+                raw_execution = notebook.parse_exec_result_nb(execution)
+
+                logger.debug(f"Tool response: {len(tool_response_content)} chars content, {len(raw_execution)} raw outputs")
+
+                # Add tool response to session state only
+                session_state_manager.add_message(
+                    session_state, "tool", tool_response_content,
+                    tool_call_id=tool_call.id, raw_execution=raw_execution,
+                    metadata={"turn": turns, "execution_successful": not previous_execution_had_error}
+                )
+            elif tool_call.function.name == "web_search":
+                # Update phase to searching
+                session_state_manager.update_execution_state(session_state, current_phase="searching")
+
+                logger.debug(f"Processing search tool call: {tool_call.id}")
+                tool_args = json.loads(tool_call.function.arguments)
+                query = tool_args["query"]
+                logger.debug(f"Search query: '{query}' ({len(query)} chars)")
+
+                # Add search status to notebook
+                notebook.add_markdown("🔍 **Searching the web...**", "assistant")
+                yield notebook.render(mode="generating"), notebook.data, messages
+
+                try:
+                    # Perform search
+                    search_results = web_search(query)
+                    logger.info("Search completed successfully")
+
+                    # Log search tool execution
+                    tool_args = json.loads(tool_call.function.arguments)
+                    session_state_manager.log_tool_execution(
+                        session_state, tool_call.id, "web_search",
+                        tool_args, search_results
+                    )
+
+                    # Add search results to notebook
+                    notebook.add_markdown(search_results, "assistant")
+
+                    # Add tool response to session state only
+                    session_state_manager.add_message(
+                        session_state, "tool", search_results,
+                        tool_call_id=tool_call.id,
+                        metadata={"turn": turns, "search_successful": True}
+                    )
+
+                except Exception as e:
+                    error_message = f"❌ Search failed: {str(e)}"
+                    logger.error(f"Search tool call failed: {str(e)}")
+
+                    # Log failed search
+                    tool_args = json.loads(tool_call.function.arguments)
+                    session_state_manager.log_tool_execution(
+                        session_state, tool_call.id, "web_search",
+                        tool_args, error_message
+                    )
+
+                    # Add error to notebook
+                    notebook.add_markdown(error_message, "assistant")
+
+                    # Add error response to session state only
+                    session_state_manager.add_message(
+                        session_state, "tool", error_message,
+                        tool_call_id=tool_call.id,
+                        metadata={"turn": turns, "search_successful": False, "error": str(e)}
+                    )
+            elif tool_call.function.name == "edit_and_execute_current_cell":
+                # Update phase to executing code
+                session_state_manager.update_execution_state(session_state, current_phase="executing_code")
+
+                logger.debug(f"Processing edit current cell tool call: {tool_call.id}")
+                tool_args = json.loads(tool_call.function.arguments)
+                code = tool_args["code"]
+                logger.debug(f"Code to execute in current cell: {len(code)} characters")
+
+                # Check if we have a code cell to edit
+                if notebook.get_last_cell_type() == "code":
+                    logger.info("Editing last code cell with new code")
+                    notebook.update_last_code_cell(code)
+                else:
+                    logger.info("No code cell to edit, creating new cell")
+                    notebook.add_code(code)
+
+                logger.debug("Yielding notebook in 'executing' mode")
+                yield notebook.render(mode="executing"), notebook.data, messages
+
+                try:
+                    # Check for stop event before execution
+                    if stop_event and stop_event.is_set():
+                        logger.info("Stop event detected before code execution")
+                        stopped_message = """**Execution Stopped** ⏸️
+
+The execution was stopped by user request before the code could run."""
+                        notebook.add_markdown(stopped_message, "assistant")
+                        yield notebook.render(mode="stopped"), notebook.data, messages
+                        return
+
+                    # Execution sandbox call - might timeout
+                    logger.info("Executing edited code in sandbox")
+                    execution = sbx.run_code(code)
+                    notebook.append_execution(execution)
+
+                    # Update error and warning tracking for next iteration
+                    previous_execution_had_error = notebook.has_execution_error(execution)
+                    previous_execution_had_warnings = notebook.has_execution_warnings(execution)
+                    # Log tool execution in session state
+                    tool_response_content = parse_exec_result_llm(execution, max_code_output=max_code_output)
+                    session_state_manager.log_tool_execution(
+                        session_state, tool_call.id, "edit_and_execute_current_cell",
+                        tool_args, tool_response_content, execution
+                    )
+
+                    if previous_execution_had_error:
+                        logger.warning("Edited code execution resulted in error")
+                    elif previous_execution_had_warnings:
+                        logger.info("Edited code execution completed with warnings")
+                    else:
+                        logger.info("Edited code execution completed successfully")
+
+                except Exception as e:
+                    # Handle sandbox timeout/execution errors
+                    logger.error(f"Edited code execution failed: {str(e)}")
+
+                    # Add detailed error information for code execution failures
+                    error_message = str(e)
+                    if "timeout" in error_message.lower():
+                        detailed_error = f"""**Code Execution Timeout** ⏰
+
+The edited code execution took too long and was terminated:
+- Code may have entered an infinite loop
+- Processing large datasets can cause timeouts
+- Complex computations may exceed time limits
+
+**What you can try:**
+- Optimize your code for better performance
+- Break down complex operations into smaller steps
+- Increase the timeout limit in settings
+- Check for infinite loops or blocking operations
+
+**Technical details:**
+```
+{error_message}
+```"""
+                    else:
+                        detailed_error = f"""**Code Execution Failed** 💥
+
+An error occurred while executing the edited code in the sandbox:
+
+**Technical details:**
+```
+{error_message}
+```
+
+**What you can try:**
+- Check the code for syntax errors
+- Verify all required packages are available
+- Try simplifying the code
+- Check the sandbox logs for more details"""
+
+                    notebook.add_error(detailed_error)
+                    yield notebook.render(mode="error"), notebook.data, messages
+                    return
+
+                # Prepare tool response
+                raw_execution = notebook.parse_exec_result_nb(execution)
+
+                logger.debug(f"Tool response: {len(tool_response_content)} chars content, {len(raw_execution)} raw outputs")
+
+                # Add tool response to session state only
+                session_state_manager.add_message(
+                    session_state, "tool", tool_response_content,
+                    tool_call_id=tool_call.id, raw_execution=raw_execution,
+                    metadata={"turn": turns, "execution_successful": not previous_execution_had_error, "action": "edit_cell"}
+                )
+            elif tool_call.function.name == "execute_shell_command":
+                # Update phase to executing shell command
+                session_state_manager.update_execution_state(session_state, current_phase="executing_shell")
+
+                logger.debug(f"Processing shell command tool call: {tool_call.id}")
+                tool_args = json.loads(tool_call.function.arguments)
+                command = tool_args["command"]
+                logger.debug(f"Shell command to execute: '{command}'")
+
+                # Add shell command to notebook with special styling
+                notebook.add_shell_command(command)
+
+                logger.debug("Yielding notebook in 'executing' mode")
+                yield notebook.render(mode="executing"), notebook.data, messages
+
+                try:
+                    # Check for stop event before execution
+                    if stop_event and stop_event.is_set():
+                        logger.info("Stop event detected before shell execution")
+                        stopped_message = """**Execution Stopped** ⏸️
+
+The execution was stopped by user request before the shell command could run."""
+                        notebook.add_markdown(stopped_message, "assistant")
+                        yield notebook.render(mode="stopped"), notebook.data, messages
+                        return
+
+                    # Execute shell command in sandbox using raw shell execution
+                    logger.info(f"Executing raw shell command in sandbox: {command}")
+
+                    try:
+                        # Use the new raw shell execution method
+                        if hasattr(sbx, 'run_shell'):
+                            shell_execution = sbx.run_shell(command, timeout=60)
+                            logger.info("Shell command executed using raw shell method")
+                        else:
+                            # Fallback: Execute shell command using Python subprocess within sandbox
+                            shell_code = f"""
+import subprocess
+import sys
+
+try:
+    result = subprocess.run(
+        {repr(command)},
+        shell=True,
+        capture_output=True,
+        text=True,
+        timeout=60
+    )
+
+    if result.stdout:
+        print("STDOUT:")
+        print(result.stdout)
+
+    if result.stderr:
+        print("STDERR:")
+        print(result.stderr)
+
+    print(f"Exit code: {{result.returncode}}")
+
+except subprocess.TimeoutExpired:
+    print("Error: Command timed out after 60 seconds")
+except Exception as e:
+    print(f"Error executing command: {{e}}")
+"""
+                            shell_execution = sbx.run_code(shell_code)
+                            logger.info("Shell command executed via Python subprocess fallback")
+
+                        # Add shell execution results to notebook
+                        notebook.append_shell_execution(shell_execution)
+
+                        # Prepare response content for LLM
+                        shell_response_content = parse_exec_result_llm(shell_execution, max_code_output=max_code_output)
+
+                        # Log tool execution in session state
+                        session_state_manager.log_tool_execution(
+                            session_state, tool_call.id, "execute_shell_command",
+                            tool_args, shell_response_content, shell_execution
+                        )
+
+                        # Check for errors
+                        shell_had_error = notebook.has_execution_error(shell_execution)
+
+                        if shell_had_error:
+                            logger.warning("Shell command execution resulted in error")
+                        else:
+                            logger.info("Shell command execution completed successfully")
+
+                    except Exception as shell_error:
+                        logger.error(f"Shell command execution failed: {str(shell_error)}")
+
+                        # Create error message
+                        detailed_error = f"""**Shell Command Failed** 🔧
+
+An error occurred while executing the shell command:
+
+**Command:** `{command}`
+
+**Technical details:**
+```
+{str(shell_error)}
+```
+
+**What you can try:**
+- Check if the command exists in the sandbox environment
+- Verify command syntax
+- Try a simpler version of the command
+- Check if required tools/packages are installed"""
+
+                        notebook.add_error(detailed_error)
+
+                        # Log failed execution
+                        session_state_manager.log_tool_execution(
+                            session_state, tool_call.id, "execute_shell_command",
+                            tool_args, detailed_error
+                        )
+
+                        yield notebook.render(mode="error"), notebook.data, messages
+                        return
+
+                except Exception as e:
+                    # Handle general execution errors
+                    logger.error(f"Shell command execution failed: {str(e)}")
+
+                    detailed_error = f"""**Shell Execution Error** ⚠️
+
+An unexpected error occurred while executing the shell command:
+
+**Command:** `{command}`
+
+**Technical details:**
+```
+{str(e)}
+```"""
+
+                    notebook.add_error(detailed_error)
+                    yield notebook.render(mode="error"), notebook.data, messages
+                    return
+
+                # Prepare tool response for LLM and session state
+                raw_execution = notebook.parse_exec_result_nb(shell_execution)
+
+                logger.debug(f"Shell tool response: {len(shell_response_content)} chars content")
+
+                # Add tool response to session state
+                session_state_manager.add_message(
+                    session_state, "tool", shell_response_content,
+                    tool_call_id=tool_call.id, raw_execution=raw_execution,
+                    metadata={"turn": turns, "command": command, "execution_successful": not shell_had_error, "action": "shell_command"}
+                )
+            else:
+                logger.warning(f"Unknown tool call function: {tool_call.function.name}")
+
+        if not tool_calls:
+            logger.info(f"No tool calls on turn {turns}, conversation ending")
+            if len(full_response.strip()) == 0:
+                logger.error("Assistant provided no content and no tool calls")
+                notebook.add_error(f"No tool call and empty assistant response:\n{response.model_dump_json(indent=2)}")
+
+            # Only add the final assistant message if we didn't already add it above
+            # (in the elif full_response.strip() block)
+            if full_response.strip():
+                # Since we're now only using session state, we can safely add the message
+                # The session state manager will handle any deduplication if needed
+                session_state_manager.add_message(
+                    session_state, "assistant", full_response,
+                    metadata={"turn": turns, "type": "final_response"}
+                )
+                logger.debug("Added final assistant response to session state")
+
+            done = True
+
+        # Update session state after each turn
+        session_state_manager.update_execution_state(
+            session_state, current_turn=turns, last_execution_successful=not previous_execution_had_error
+        )
+        session_state_manager.update_notebook_data(session_state, notebook.data)
+        session_state_manager.save_state(session_state)
+
+        if done:
+            logger.info(f"Interactive notebook completed after {turns} turns")
+            session_state_manager.update_execution_state(
+                session_state, is_running=False, sandbox_active=True
+            )
+            session_state_manager.save_state(session_state)
+            yield notebook.render(mode="done"), notebook.data, messages
+        else:
+            logger.debug(f"Turn {turns} completed, yielding in 'generating' mode")
+            yield notebook.render(mode="generating"), notebook.data, messages
+
+    if turns > MAX_TURNS:
+        logger.warning(f"Interactive notebook reached maximum turns ({MAX_TURNS})")
+        error_msg = f"**Maximum Turns Reached** 🔄\n\nThe conversation has reached the maximum number of turns ({MAX_TURNS}). This is a safety limit to prevent infinite loops.\n\n**What you can try:**\n- Start a new conversation\n- Clear the notebook and begin fresh\n- Contact support if you need a higher turn limit"
+        notebook.add_error(error_msg)
+
+        # Add error to session state
+        session_state_manager.add_message(
+            session_state, "assistant", error_msg,
+            metadata={"type": "error", "error_type": "max_turns_exceeded", "turn": turns}
+        )
+
+        # Update final state
+        session_state_manager.update_execution_state(
+            session_state, is_running=False, last_execution_successful=False
+        )
+        session_state_manager.update_notebook_data(session_state, notebook.data)
+        session_state_manager.save_state(session_state)
+
+        yield notebook.render(mode="error"), notebook.data, messages
+    elif stop_event and stop_event.is_set():
+        logger.info("Interactive notebook stopped by user")
+
+        # Add a stopped message to the notebook
+        stopped_message = """**Execution Stopped** ⏸️
+
+The execution was stopped by user request. You can resume by clicking Run again."""
+        notebook.add_markdown(stopped_message, "assistant")
+
+        # Add stopped message to session state
+        session_state_manager.add_message(
+            session_state, "assistant", stopped_message,
+            metadata={"type": "status", "status_type": "stopped_by_user", "turn": turns}
+        )
+
+        # Update state to indicate pause
+        session_state_manager.update_execution_state(
+            session_state, is_running=False, is_paused=True
+        )
+        session_state_manager.update_notebook_data(session_state, notebook.data)
+        session_state_manager.save_state(session_state)
+
+        yield notebook.render(mode="stopped"), notebook.data, messages
+
+
+def run_interactive_notebook(client, model, messages, sbx, stop_event=None, tools=None):
+    """Backward compatibility wrapper for the new session state system"""
+    logger.warning("Using legacy run_interactive_notebook - this should be replaced with session state version")
+
+    # Create a temporary session for backward compatibility
+    import uuid
+    temp_session_id = str(uuid.uuid4())[:8]
+    session_manager = SessionStateManager(temp_session_id)
+
+    # Create basic session state
+    session_state = session_manager.create_initial_state(
+        hardware_config={"gpu_type": "unknown", "cpu_cores": 2, "memory_gb": 8, "timeout_sec": 300},
+        api_config={"model_name": model, "provider_type": "unknown"},
+        environment={"variables": "", "files_uploaded": []},
+        system_prompt=messages[0].get("content", "") if messages and messages[0].get("role") == "system" else ""
+    )
+
+    # Initialize conversation history with provided messages
+    session_state["conversation_history"] = messages
+
+    # Use the new session-based function
+    yield from run_interactive_notebook_with_session_state(
+        client, model, session_manager, session_state, sbx, stop_event, tools
+    )
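
Both `run_interactive_notebook` functions in this file are generators that stream a `(rendered_html, notebook_data, messages)` snapshot after each turn, and the caller keeps only the latest one. A minimal sketch of that consumption pattern, using a stand-in generator (the names below are illustrative, not part of this repo, since the real generator needs a live client and sandbox):

```python
def fake_agent_turns(steps):
    """Stand-in for the agent generator: yield one snapshot per step."""
    notebook_data = {"cells": []}
    for n in range(1, steps + 1):
        notebook_data["cells"].append({"cell_type": "code", "source": f"step {n}"})
        mode = "done" if n == steps else "generating"
        # Same tuple shape the real generator yields after every turn
        yield f"<html mode={mode}>", notebook_data, [{"role": "assistant", "content": f"turn {n}"}]


def consume(gen):
    """Drain the generator, keeping only the final snapshot (as a UI loop would)."""
    last = None
    for snapshot in gen:
        last = snapshot
    return last


html, data, messages = consume(fake_agent_turns(3))
print(html)                 # <html mode=done>
print(len(data["cells"]))   # 3
```

This is why intermediate yields use modes like `"executing"` or `"generating"` while the terminal yield uses `"done"`, `"error"`, or `"stopped"`: the consumer renders each snapshot as it arrives and the mode tells it whether more are coming.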
jupyter_handler.py ADDED
@@ -0,0 +1,1161 @@
+import nbformat
+from nbconvert import HTMLExporter
+from traitlets.config import Config
+import json
+import copy
+from jinja2 import DictLoader
+import datetime
+import logging
+
+# Configure logging for jupyter_handler module
+logger = logging.getLogger(__name__)
+
+
+system_template = """\
+<details>
+  <summary style="display: flex; align-items: center; cursor: pointer; margin-bottom: 12px;">
+    <h3 style="color: #374151; margin: 0; margin-right: 8px; font-size: 14px; font-weight: 600;">System</h3>
+    <span class="arrow" style="margin-right: 12px; font-size: 12px;">▶</span>
+    <div style="flex: 1; height: 2px; background-color: #374151;"></div>
+  </summary>
+  <div style="margin-top: 8px; padding: 8px; background-color: #f9fafb; border-radius: 4px; border-left: 3px solid #374151; margin-bottom: 16px;">
+    {}
+  </div>
+</details>
+
+<style>
+details > summary .arrow {{
+  display: inline-block;
+  transition: transform 0.2s;
+}}
+details[open] > summary .arrow {{
+  transform: rotate(90deg);
+}}
+details > summary {{
+  list-style: none;
+}}
+details > summary::-webkit-details-marker {{
+  display: none;
+}}
+</style>
+"""
+
+user_template = """\
+<div style="display: flex; align-items: center; margin-bottom: 12px;">
+  <h3 style="color: #166534; margin: 0; margin-right: 12px; font-size: 14px; font-weight: 600;">User</h3>
+  <div style="flex: 1; height: 2px; background-color: #166534;"></div>
+</div>
+<div style="margin-bottom: 16px;">{}</div>"""
+
+assistant_thinking_template = """\
+<div style="display: flex; align-items: center; margin-bottom: 12px;">
+  <h3 style="color: #1d5b8e; margin: 0; margin-right: 12px; font-size: 14px; font-weight: 600;">Assistant</h3>
+  <div style="flex: 1; height: 2px; background-color: #1d5b8e;"></div>
+</div>
+<div style="margin-bottom: 16px;">{}</div>"""
+
+assistant_final_answer_template = """<div class="alert alert-block alert-warning">
+<b>Assistant:</b> Final answer: {}
+</div>
+"""
+
+web_search_template = """
+<details style="margin-bottom: 16px; border: 1px solid #e1e5e9; border-radius: 6px; background-color: #f8f9fa;">
+  <summary style="display: flex; align-items: center; cursor: pointer; padding: 12px; background-color: #e3f2fd; border-radius: 6px 6px 0 0; margin: 0;">
+    <h4 style="color: #1976d2; margin: 0; margin-right: 8px; font-size: 14px; font-weight: 600;">🔍 Web Search Results</h4>
+    <span class="search-arrow" style="margin-left: auto; font-size: 12px; transition: transform 0.2s;">▼</span>
+  </summary>
+  <div style="padding: 16px; background-color: #ffffff;">
+    <div style="margin-bottom: 12px; padding: 8px; background-color: #f0f7ff; border-radius: 4px; border-left: 3px solid #2196f3;">
+      <strong style="color: #1976d2;">Query:</strong> <em>{query}</em>
+    </div>
+
+    {quick_answer}
+
+    <div style="margin-top: 16px;">
+      <h5 style="color: #424242; font-size: 13px; margin-bottom: 12px; font-weight: 600;">📚 Sources:</h5>
+      {sources}
+    </div>
+  </div>
+</details>
+
+<style>
+details[open] > summary .search-arrow {{
+  transform: rotate(180deg);
+}}
+details > summary {{
+  list-style: none;
+}}
+details > summary::-webkit-details-marker {{
+  display: none;
+}}
+.source-item {{
+  margin-bottom: 8px;
+  padding: 8px;
+  background-color: #f9f9f9;
+  border-radius: 4px;
+  border-left: 2px solid #4caf50;
+}}
+.source-title {{
+  font-weight: 600;
+  color: #2e7d32;
+  font-size: 13px;
+  margin-bottom: 4px;
+}}
+.source-url {{
+  color: #666;
+  font-size: 11px;
+  text-decoration: none;
+  word-break: break-all;
+}}
+.source-url:hover {{
+  color: #1976d2;
+  text-decoration: underline;
+}}
+.relevance-score {{
+  display: inline-block;
+  background-color: #e8f5e8;
+  color: #2e7d32;
+  padding: 2px 6px;
+  border-radius: 12px;
+  font-size: 10px;
+  font-weight: 600;
+  margin-left: 8px;
+}}
+.quick-answer {{
+  background-color: #fff8e1;
+  border-left: 3px solid #ffc107;
+  padding: 12px;
+  margin-bottom: 16px;
+  border-radius: 4px;
+}}
+.quick-answer-title {{
+  color: #f57c00;
+  font-weight: 600;
+  font-size: 13px;
+  margin-bottom: 6px;
+}}
+</style>
+"""
+
+header_message = """<div style="text-align: center; padding: 24px 16px; margin-bottom: 24px;">
+  <h1 style="color: #1e3a8a; font-size: 48px; font-weight: 700; margin: 0 0 8px 0; font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif;">
+    🔬 Eureka Agent
+  </h1>
+  <p style="color: #6b7280; font-size: 11px; margin: 0; display: flex; align-items: center; justify-content: center; gap: 6px;">
+    <span>
+      Built on top of
+      <a href="https://huggingface.co/spaces/lvwerra/jupyter-agent-2" target="_blank" style="color: #6b7280; text-decoration: underline;">
+        Jupyter Agent 2
+      </a>
+    </span>
+  </p>
+</div>
+"""
+
+shell_command_template = """
+<div style="background: linear-gradient(135deg, #0f172a 0%, #1e293b 100%); border-radius: 8px; margin: 16px 0; box-shadow: 0 4px 12px rgba(0,0,0,0.3); border: 1px solid #334155;">
+  <!-- Terminal Header -->
+  <div style="background: linear-gradient(90deg, #374151 0%, #4b5563 100%); padding: 8px 12px; border-radius: 8px 8px 0 0; border-bottom: 1px solid #6b7280; display: flex; align-items: center; gap: 6px;">
+    <div style="width: 12px; height: 12px; background: #ef4444; border-radius: 50%;"></div>
+    <div style="width: 12px; height: 12px; background: #f59e0b; border-radius: 50%;"></div>
+    <div style="width: 12px; height: 12px; background: #10b981; border-radius: 50%;"></div>
+    <span style="color: #d1d5db; font-size: 12px; font-weight: 500; margin-left: 12px; font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', sans-serif;">Terminal</span>
+  </div>
+  <!-- Command Area -->
+  <div style="padding: 16px; background-color: #0f172a;">
+    <div style="display: flex; align-items: center; margin-bottom: 4px;">
+      <span style="color: #22d3ee; font-family: 'SF Mono', 'Monaco', 'Inconsolata', 'Roboto Mono', monospace; font-size: 14px; font-weight: 600; margin-right: 8px;">$</span>
+      <span style="color: #e2e8f0; font-family: 'SF Mono', 'Monaco', 'Inconsolata', 'Roboto Mono', monospace; font-size: 14px; line-height: 1.4;">{}</span>
+    </div>
+  </div>
+</div>
+"""
+
+shell_output_template = """
+<div style="background: linear-gradient(135deg, #111827 0%, #1f2937 100%) !important; border-radius: 8px; margin: 8px 0 16px 0; box-shadow: 0 2px 8px rgba(0,0,0,0.2); border: 1px solid #374151;">
+  <div style="padding: 16px; background-color: #111827 !important; border-radius: 8px;">
178
+ <pre style="margin: 0 !important; color: #f1f5f9 !important; background-color: #111827 !important; font-family: 'SF Mono', 'Monaco', 'Inconsolata', 'Roboto Mono', monospace; font-size: 13px; line-height: 1.5; overflow-x: auto; white-space: pre-wrap; text-shadow: 0 1px 2px rgba(0,0,0,0.1); border: none !important;">{}</pre>
179
+ </div>
180
+ </div>
181
+
182
+ <style>
183
+ /* Ensure shell output maintains dark theme */
184
+ .shell-output pre {{
185
+ background-color: #111827 !important;
186
+ color: #f1f5f9 !important;
187
+ border: none !important;
188
+ }}
189
+ .shell-output {{
190
+ background-color: #111827 !important;
191
+ }}
192
+ </style>
193
+ """
194
+
195
+ bad_html_bad = """input[type="file"] {
196
+ display: block;
197
+ }"""
198
+
199
+
200
+ EXECUTING_WIDGET = """
201
+ <div style="display: flex; align-items: center; gap: 8px; padding: 8px 12px; background-color: #e3f2fd; border-radius: 6px; border-left: 3px solid #2196f3;">
202
+ <div style="display: flex; gap: 4px;">
203
+ <div style="width: 6px; height: 6px; background-color: #2196f3; border-radius: 50%; animation: pulse 1.5s ease-in-out infinite;"></div>
204
+ <div style="width: 6px; height: 6px; background-color: #2196f3; border-radius: 50%; animation: pulse 1.5s ease-in-out 0.1s infinite;"></div>
205
+ <div style="width: 6px; height: 6px; background-color: #2196f3; border-radius: 50%; animation: pulse 1.5s ease-in-out 0.2s infinite;"></div>
206
+ </div>
207
+ <span style="color: #1976d2; font-size: 14px; font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif;">
208
+ Executing code...
209
+ </span>
210
+ </div>
211
+
212
+ <style>
213
+ @keyframes pulse {
214
+ 0%, 80%, 100% {
215
+ opacity: 0.3;
216
+ transform: scale(0.8);
217
+ }
218
+ 40% {
219
+ opacity: 1;
220
+ transform: scale(1);
221
+ }
222
+ }
223
+ </style>
224
+ """
225
+
226
+ GENERATING_WIDGET = """
227
+ <div style="display: flex; align-items: center; gap: 8px; padding: 8px 12px; background-color: #f3e5f5; border-radius: 6px; border-left: 3px solid #9c27b0;">
228
+ <div style="width: 80px; height: 4px; background-color: #e1bee7; border-radius: 2px; overflow: hidden;">
229
+ <div style="width: 30%; height: 100%; background-color: #9c27b0; border-radius: 2px; animation: progress 2s ease-in-out infinite;"></div>
230
+ </div>
231
+ <span style="color: #7b1fa2; font-size: 14px; font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif;">
232
+ Generating...
233
+ </span>
234
+ </div>
235
+
236
+ <style>
237
+ @keyframes progress {
238
+ 0% { transform: translateX(-100%); }
239
+ 100% { transform: translateX(250%); }
240
+ }
241
+ </style>
242
+ """
243
+
244
+ DONE_WIDGET = """
245
+ <div style="display: flex; align-items: center; gap: 8px; padding: 8px 12px; background-color: #e8f5e8; border-radius: 6px; border-left: 3px solid #4caf50;">
246
+ <div style="width: 16px; height: 16px; background-color: #4caf50; border-radius: 50%; display: flex; align-items: center; justify-content: center;">
247
+ <svg width="10" height="8" viewBox="0 0 10 8" fill="none">
248
+ <path d="M1 4L3.5 6.5L9 1" stroke="white" stroke-width="1.5" stroke-linecap="round" stroke-linejoin="round"/>
249
+ </svg>
250
+ </div>
251
+ <span style="color: #2e7d32; font-size: 14px; font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif;">
252
+ Generation complete
253
+ </span>
254
+ </div>
255
+ """
256
+
257
+ DONE_WIDGET = """
258
+ <div style="display: flex; align-items: center; gap: 8px; padding: 8px 12px; background-color: #e8f5e8; border-radius: 6px; border-left: 3px solid #4caf50; animation: fadeInOut 4s ease-in-out forwards;">
259
+ <div style="width: 16px; height: 16px; background-color: #4caf50; border-radius: 50%; display: flex; align-items: center; justify-content: center;">
260
+ <svg width="10" height="8" viewBox="0 0 10 8" fill="none">
261
+ <path d="M1 4L3.5 6.5L9 1" stroke="white" stroke-width="1.5" stroke-linecap="round" stroke-linejoin="round"/>
262
+ </svg>
263
+ </div>
264
+ <span style="color: #2e7d32; font-size: 14px; font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif;">
265
+ Generation complete
266
+ </span>
267
+ </div>
268
+
269
+ <style>
270
+ @keyframes fadeInOut {
271
+ 0% { opacity: 0; transform: translateY(10px); }
272
+ 15% { opacity: 1; transform: translateY(0); }
273
+ 85% { opacity: 1; transform: translateY(0); }
274
+ 100% { opacity: 0; transform: translateY(-10px); }
275
+ }
276
+ </style>
277
+ """
278
+
279
+ STOPPED_WIDGET = """
280
+ <div style="display: flex; align-items: center; gap: 8px; padding: 8px 12px; background-color: #fff3e0; border-radius: 6px; border-left: 3px solid #ff9800;">
281
+ <div style="width: 16px; height: 16px; background-color: #ff9800; border-radius: 50%; display: flex; align-items: center; justify-content: center; color: white; font-weight: bold; font-size: 10px;">
282
+ ⏹
283
+ </div>
284
+ <span style="color: #f57c00; font-size: 14px; font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif;">
285
+ Execution stopped by user
286
+ </span>
287
+ </div>
288
+ """
289
+
290
+ ERROR_WIDGET = """
291
+ <div style="display: flex; align-items: center; gap: 8px; padding: 8px 12px; background-color: #ffebee; border-radius: 6px; border-left: 3px solid #f44336;">
292
+ <div style="width: 16px; height: 16px; background-color: #f44336; border-radius: 50%; display: flex; align-items: center; justify-content: center; color: white; font-weight: bold; font-size: 10px;">
293
+
294
+ </div>
295
+ <span style="color: #c62828; font-size: 14px; font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif;">
296
+ Execution failed - check error details above
297
+ </span>
298
+ </div>
299
+ """
300
+
301
+ ERROR_HTML = """\
302
+ <div style="display: flex; align-items: center; gap: 8px; padding: 12px; background-color: #ffebee; border-radius: 6px; border-left: 3px solid #f44336; margin: 8px 0;">
303
+ <div style="width: 20px; height: 20px; background-color: #f44336; border-radius: 50%; display: flex; align-items: center; justify-content: center; color: white; font-weight: bold; font-size: 12px;">
304
+ !
305
+ </div>
306
+ <div style="color: #c62828; font-size: 14px; font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif;">
307
+ <strong>Error:</strong> {}
308
+ </div>
309
+ </div>"""
310
+
311
+ STOPPED_SANDBOX_HTML = """
312
+ <div style="display: flex; align-items: center; gap: 8px; padding: 8px 12px; background-color: #f5f5f5; border-radius: 6px; border-left: 3px solid #9e9e9e; margin-bottom: 16px;">
313
+ <div style="width: 16px; height: 16px; background-color: #9e9e9e; border-radius: 50%; display: flex; align-items: center; justify-content: center; color: white; font-weight: bold; font-size: 10px;">
314
+ ⏹
315
+ </div>
316
+ <div style="flex: 1;">
317
+ <div style="margin-bottom: 4px; font-size: 13px; color: #757575; font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif; font-weight: 500;">
318
+ Sandbox stopped
319
+ </div>
320
+ <div style="width: 100%; height: 8px; background-color: #e0e0e0; border-radius: 4px; overflow: hidden;">
321
+ <div style="height: 100%; background-color: #9e9e9e; border-radius: 4px; width: 100%;"></div>
322
+ </div>
323
+ <div style="display: flex; justify-content: space-between; margin-top: 4px; font-size: 11px; color: #757575; font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif;">
324
+ <span>Started: {start_time}</span>
325
+ <span>Expired: {end_time}</span>
326
+ </div>
327
+ </div>
328
+ </div>
329
+ """
330
+
331
+ TIMEOUT_HTML = """
332
+ <div style="display: flex; align-items: center; gap: 8px; padding: 8px 12px; background-color: #fff3e0; border-radius: 6px; border-left: 3px solid #ff9800; margin-bottom: 16px;">
333
+ <div style="width: 16px; height: 16px; background-color: #ff9800; border-radius: 50%; display: flex; align-items: center; justify-content: center; color: white; font-weight: bold; font-size: 10px;">
334
+
335
+ </div>
336
+ <div style="flex: 1;">
337
+ <div style="margin-bottom: 4px; font-size: 13px; color: #f57c00; font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif; font-weight: 500;">
338
+ The E2B Sandbox for code execution has a timeout of {total_seconds} seconds.
339
+ </div>
340
+ <div style="width: 100%; height: 8px; background-color: #ffe0b3; border-radius: 4px; overflow: hidden;">
341
+ <div id="progress-bar-{unique_id}" style="height: 100%; background: linear-gradient(90deg, #ff9800 0%, #f57c00 50%, #f44336 100%); border-radius: 4px; width: {current_progress}%; animation: progress-fill-{unique_id} {remaining_seconds}s linear forwards;"></div>
342
+ </div>
343
+ <div style="display: flex; justify-content: space-between; margin-top: 4px; font-size: 11px; color: #f57c00; font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif;">
344
+ <span>Started: {start_time}</span>
345
+ <span>Expires: {end_time}</span>
346
+ </div>
347
+ </div>
348
+ </div>
349
+
350
+ <style>
351
+ @keyframes progress-fill-{unique_id} {{
352
+ from {{ width: {current_progress}%; }}
353
+ to {{ width: 100%; }}
354
+ }}
355
+ </style>
356
+ """
357
+
358
+ TIMEOUT_HTML = """
359
+ <div style="display: flex; align-items: center; gap: 8px; padding: 6px 10px; background-color: #fafafa; border-radius: 4px; border-left: 2px solid #d1d5db; margin-bottom: 8px; font-size: 12px;">
360
+ <div style="width: 12px; height: 12px; background-color: #d1d5db; border-radius: 50%; display: flex; align-items: center; justify-content: center; color: white; font-weight: bold; font-size: 8px;">
361
+ ⏱
362
+ </div>
363
+ <div style="flex: 1;">
364
+ <div style="margin-bottom: 2px; font-size: 11px; color: #6b7280; font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif; font-weight: 400;">
365
+ Sandbox timeout: {total_seconds}s
366
+ </div>
367
+ <div style="width: 100%; height: 6px; background-color: #e5e7eb; border-radius: 3px; overflow: hidden;">
368
+ <div id="progress-bar-{unique_id}" style="height: 100%; background-color: #6b7280; border-radius: 3px; width: {current_progress}%; animation: progress-fill-{unique_id} {remaining_seconds}s linear forwards;"></div>
369
+ </div>
370
+ <div style="display: flex; justify-content: space-between; margin-top: 2px; font-size: 10px; color: #9ca3af; font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif;">
371
+ <span>Started: {start_time}</span>
372
+ <span>Expires: {end_time}</span>
373
+ </div>
374
+ </div>
375
+ </div>
376
+
377
+ <style>
378
+ @keyframes progress-fill-{unique_id} {{
379
+ from {{ width: {current_progress}%; }}
380
+ to {{ width: 100%; }}
381
+ }}
382
+ </style>
383
+ """
384
+
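The widget templates above double their literal CSS braces (`{{` / `}}`) because they are rendered with `str.format`, which treats single braces as placeholder delimiters. A minimal sketch (the `tpl` string is a stand-in for the full HTML, not the real template):

```python
# str.format treats '{' and '}' as placeholder delimiters, so literal CSS
# braces in a format-string template must be doubled as '{{' and '}}'.
tpl = "@keyframes progress-fill-{unique_id} {{ from {{ width: {current_progress}%; }} }}"

rendered = tpl.format(unique_id=42, current_progress=25)
print(rendered)  # → @keyframes progress-fill-42 { from { width: 25%; } }
```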
385
+ # Custom CSS for notebook styling including shell commands
386
+ custom_css = """
387
+ <style type="text/css">
388
+ /* Code font size */
389
+ .highlight pre, .highlight code,
390
+ div.input_area pre, div.output_area pre {
391
+ font-size: 12px !important;
392
+ line-height: 1.4 !important;
393
+ }
394
+
395
+ /* Fix prompt truncation */
396
+ .jp-InputPrompt, .jp-OutputPrompt {
397
+ text-overflow: clip !important;
398
+ }
399
+
400
+ /* Shell command styling - force dark theme */
401
+ .shell-output {
402
+ background-color: #111827 !important;
403
+ }
404
+
405
+ .shell-output div {
406
+ background: linear-gradient(135deg, #111827 0%, #1f2937 100%) !important;
407
+ }
408
+
409
+ .shell-output pre {
410
+ background-color: #111827 !important;
411
+ color: #f1f5f9 !important;
412
+ border: none !important;
413
+ margin: 0 !important;
414
+ }
415
+
416
+ /* Override any notebook styles that might interfere */
417
+ div[data-jp-cell-type="markdown"] .shell-output pre {
418
+ background-color: #111827 !important;
419
+ color: #f1f5f9 !important;
420
+ }
421
+
422
+ /* Additional terminal styling */
423
+ .terminal-header {
424
+ background: linear-gradient(90deg, #374151 0%, #4b5563 100%) !important;
425
+ }
426
+ </style>
427
+ """
428
+
429
+ # Configure the exporter
430
+ config = Config()
431
+ html_exporter = HTMLExporter(config=config, template_name="classic")
432
+
433
+
434
+ class JupyterNotebook:
435
+ def __init__(self, messages=None, session_state_data=None):
436
+ self.exec_count = 0
437
+ self.countdown_info = None
438
+
439
+ # If session_state_data is provided, use it directly
440
+ if session_state_data and "notebook_data" in session_state_data:
441
+ logger.info("Initializing JupyterNotebook from session state")
442
+ self.data = session_state_data["notebook_data"]
443
+ # Count existing code cells to maintain execution count
444
+ self.exec_count = len([cell for cell in self.data.get("cells", [])
445
+ if cell.get("cell_type") == "code" and cell.get("execution_count")])
446
+ logger.info(f"JupyterNotebook initialized from session state with {len(self.data['cells'])} cells, exec_count={self.exec_count}")
447
+ return
448
+
449
+ # Legacy initialization path
450
+ if messages is None:
451
+ messages = []
452
+ logger.debug(f"Initializing JupyterNotebook with {len(messages)} messages")
453
+ self.data, self.code_cell_counter = self.create_base_notebook(messages)
454
+ logger.info(f"JupyterNotebook initialized with {len(self.data['cells'])} cells")
455
+
456
+
457
+ def create_base_notebook(self, messages):
458
+ logger.debug("Creating base notebook structure")
459
+ base_notebook = {
460
+ "metadata": {
461
+ "kernel_info": {"name": "python3"},
462
+ "language_info": {
463
+ "name": "python",
464
+ "version": "3.12",
465
+ },
466
+ },
467
+ "nbformat": 4,
468
+ "nbformat_minor": 0,
469
+ "cells": []
470
+ }
471
+
472
+ # Add header
473
+ base_notebook["cells"].append({
474
+ "cell_type": "markdown",
475
+ "metadata": {},
476
+ "source": header_message
477
+ })
478
+ logger.debug("Added header cell to notebook")
479
+
480
+ # Set initial data
481
+ self.data = base_notebook
482
+
483
+ # Add empty code cell if no messages
484
+ if len(messages) == 0:
485
+ self.data["cells"].append({
486
+ "cell_type": "code",
487
+ "execution_count": None,
488
+ "metadata": {},
489
+ "source": "",
490
+ "outputs": []
491
+ })
492
+ logger.debug("Added empty code cell for new notebook")
493
+ return self.data, 0
494
+
495
+ # Process messages using existing methods
496
+ logger.info(f"Processing {len(messages)} messages for notebook creation")
497
+ i = 0
498
+ while i < len(messages):
499
+ message = messages[i]
500
+ logger.debug(f"Processing message {i+1}/{len(messages)}: {message['role']}")
501
+
502
+ if message["role"] == "system":
503
+ logger.debug("Adding system message as markdown")
504
+ self.add_markdown(message["content"], "system")
505
+
506
+ elif message["role"] == "user":
507
+ logger.debug("Adding user message as markdown")
508
+ self.add_markdown(message["content"], "user")
509
+
510
+ elif message["role"] == "assistant":
511
+ if "tool_calls" in message:
512
+ logger.debug(f"Processing assistant message with {len(message['tool_calls'])} tool calls")
513
+ # Add assistant thinking if there's content
514
+ if message.get("content"):
515
+ logger.debug("Adding assistant thinking content")
516
+ self.add_markdown(message["content"], "assistant")
517
+
518
+ # Process tool calls - we know the next message(s) will be tool responses
519
+ for tool_call in message["tool_calls"]:
520
+ if tool_call["function"]["name"] == "add_and_execute_jupyter_code_cell":
521
+ logger.debug(f"Processing code execution tool call: {tool_call['id']}")
522
+ tool_args = json.loads(tool_call["function"]["arguments"])
523
+ code = tool_args["code"]
524
+ logger.debug(f"Code cell contains {len(code)} characters")
525
+
526
+ # Get the next tool response (expected to immediately follow)
+ tool_message = messages[i + 1] if i + 1 < len(messages) else None
+ if tool_message and tool_message.get("role") == "tool" and tool_message.get("tool_call_id") == tool_call["id"]:
529
+ logger.debug(f"Found matching tool response for {tool_call['id']}")
530
+ # Use the raw execution if available, otherwise fall back to empty list
531
+ execution = tool_message.get("raw_execution", [])
532
+ self.add_code_execution(code, execution, parsed=True)
533
+ logger.debug(f"Added code execution cell with {len(execution)} outputs")
534
+ i += 1 # Skip the tool message since we just processed it
535
+ else:
536
+ logger.warning(f"No matching tool response found for tool call {tool_call['id']}")
537
+ else:
538
+ # Regular assistant message
539
+ logger.debug("Adding regular assistant message")
540
+ self.add_markdown(message["content"], "assistant")
541
+
542
+ elif message["role"] == "tool":
543
+ # Skip - should have been handled with corresponding tool_calls
544
+ # This shouldn't happen given our assumptions, but just in case
545
+ logger.debug("Skipping tool message (should have been processed with tool_calls)")
546
+
548
+ i += 1
549
+
550
+ return self.data, 0
551
+
552
+ def _update_countdown_cell(self):
553
+ if not self.countdown_info:
554
+ logger.debug("No countdown info available, skipping countdown update")
555
+ return
556
+
557
+ logger.debug("Updating countdown cell")
558
+
559
+ start_time = self.countdown_info['start_time']
560
+ end_time = self.countdown_info['end_time']
561
+
562
+ current_time = datetime.datetime.now(datetime.timezone.utc)
563
+ remaining_time = end_time - current_time
564
+
565
+ # Show stopped message if expired
566
+ if remaining_time.total_seconds() <= 0:
567
+ logger.info("Sandbox has expired, showing stopped message")
568
+ # Format display for stopped sandbox
569
+ start_display = start_time.strftime("%H:%M")
570
+ end_display = end_time.strftime("%H:%M")
571
+
572
+ stopped_html = STOPPED_SANDBOX_HTML.format(
573
+ start_time=start_display,
574
+ end_time=end_display
575
+ )
576
+
577
+ # Update countdown cell to show stopped message
578
+ stopped_cell = {
579
+ "cell_type": "markdown",
580
+ "metadata": {},
581
+ "source": stopped_html
582
+ }
583
+
584
+ # Find and update existing countdown cell
585
+ for i, cell in enumerate(self.data["cells"]):
586
+ if cell.get("cell_type") == "markdown" and ("⏱" in str(cell.get("source", "")) or "⏹" in str(cell.get("source", ""))):
587
+ self.data["cells"][i] = stopped_cell
588
+ logger.debug(f"Updated countdown cell at position {i} with stopped message")
589
+ break
590
+
591
+ return
592
+
593
+ # Calculate current progress
594
+ total_duration = end_time - start_time
595
+ elapsed_time = current_time - start_time
596
+ current_progress = (elapsed_time.total_seconds() / total_duration.total_seconds()) * 100
597
+ current_progress = max(0, min(100, current_progress))
598
+ logger.debug(f"Countdown progress: {current_progress:.1f}% ({remaining_time.total_seconds():.0f}s remaining)")
599
+
600
+ # Format display
601
+ start_display = start_time.strftime("%H:%M")
602
+ end_display = end_time.strftime("%H:%M")
603
+ remaining_seconds = int(remaining_time.total_seconds())
604
+ remaining_minutes = remaining_seconds // 60
605
+ remaining_secs = remaining_seconds % 60
606
+ remaining_display = f"{remaining_minutes}:{remaining_secs:02d}"
607
+
608
+ # Generate unique ID to avoid CSS conflicts when updating
609
+ unique_id = int(current_time.timestamp() * 1000) % 100000
610
+
611
+ # Calculate total timeout duration in seconds
612
+ total_seconds = int(total_duration.total_seconds())
613
+
614
+ countdown_html = TIMEOUT_HTML.format(
615
+ start_time=start_display,
616
+ end_time=end_display,
617
+ current_progress=current_progress,
618
+ remaining_seconds=remaining_seconds,
619
+ unique_id=unique_id,
620
+ total_seconds=total_seconds
621
+ )
622
+
623
+ # Update or insert the countdown cell
624
+ countdown_cell = {
625
+ "cell_type": "markdown",
626
+ "metadata": {},
627
+ "source": countdown_html
628
+ }
629
+
630
+ # Find existing countdown cell by looking for the timer emoji
631
+ found_countdown = False
632
+ for i, cell in enumerate(self.data["cells"]):
633
+ if cell.get("cell_type") == "markdown" and "⏱" in str(cell.get("source", "")):
634
+ # Update existing countdown cell
635
+ self.data["cells"][i] = countdown_cell
636
+ found_countdown = True
637
+ logger.debug(f"Updated existing countdown cell at position {i}")
638
+ break
639
+
640
+ if not found_countdown:
641
+ # Insert new countdown cell at position 1 (after header)
642
+ self.data["cells"].insert(1, countdown_cell)
643
+ logger.debug("Inserted new countdown cell at position 1")
644
+
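The progress and remaining-time arithmetic in `_update_countdown_cell` can be sketched in isolation (the timestamps here are invented for illustration):

```python
import datetime

# Sketch of the countdown math above: percent complete plus an M:SS display,
# using UTC-aware datetimes as the method does.
start = datetime.datetime(2024, 1, 1, 12, 0, tzinfo=datetime.timezone.utc)
end = start + datetime.timedelta(minutes=10)
now = start + datetime.timedelta(minutes=2, seconds=30)

total = (end - start).total_seconds()
elapsed = (now - start).total_seconds()
progress = max(0, min(100, elapsed / total * 100))  # clamp to [0, 100]

remaining = int((end - now).total_seconds())
display = f"{remaining // 60}:{remaining % 60:02d}"
print(f"{progress:.1f}% elapsed, {display} remaining")  # → 25.0% elapsed, 7:30 remaining
```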
645
+ def add_sandbox_countdown(self, start_time, end_time):
646
+ logger.info(f"Adding sandbox countdown: {start_time} to {end_time}")
647
+ # Store the countdown info for later updates
648
+ self.countdown_info = {
649
+ 'start_time': start_time,
650
+ 'end_time': end_time,
651
+ 'cell_index': 1 # Remember where we put it
652
+ }
653
+
654
+ def add_code_execution(self, code, execution, parsed=False):
655
+ self.exec_count += 1
656
+ logger.debug(f"Adding code execution cell #{self.exec_count} with {len(code)} chars of code")
657
+ outputs = execution if parsed else self.parse_exec_result_nb(execution)
658
+ logger.debug(f"Code execution has {len(outputs)} outputs")
659
+ self.data["cells"].append({
660
+ "cell_type": "code",
661
+ "execution_count": self.exec_count,
662
+ "metadata": {},
663
+ "source": code,
664
+ "outputs": outputs
665
+ })
666
+
667
+ def add_code(self, code):
668
+ """Add a code cell without execution results"""
669
+ self.exec_count += 1
670
+ logger.debug(f"Adding code cell #{self.exec_count} with {len(code)} chars (no execution)")
671
+ self.data["cells"].append({
672
+ "cell_type": "code",
673
+ "execution_count": self.exec_count,
674
+ "metadata": {},
675
+ "source": code,
676
+ "outputs": []
677
+ })
678
+
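A stripped-down sketch of the cell bookkeeping used by `add_code` and `add_code_execution` above (dict layout only, no nbformat validation):

```python
# Minimal model of the notebook cell list: each code cell gets a
# monotonically increasing execution_count, mirroring add_code above.
cells = []
exec_count = 0

def add_code(code):
    global exec_count
    exec_count += 1
    cells.append({
        "cell_type": "code",
        "execution_count": exec_count,
        "metadata": {},
        "source": code,
        "outputs": [],
    })

add_code("x = 1")
add_code("print(x)")
print(len(cells), cells[-1]["execution_count"])  # → 2 2
```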
679
+ def append_execution(self, execution):
680
+ """Append execution results to the immediate previous cell if it's a code cell"""
681
+ if (len(self.data["cells"]) > 0 and
682
+ self.data["cells"][-1]["cell_type"] == "code"):
683
+ outputs = self.parse_exec_result_nb(execution)
684
+ self.data["cells"][-1]["outputs"] = outputs
685
+ logger.debug(f"Appended {len(outputs)} outputs to last code cell")
686
+ else:
687
+ logger.error("Cannot append execution: previous cell is not a code cell")
688
+ raise ValueError("Cannot append execution: previous cell is not a code cell")
689
+
690
+ def has_execution_error(self, execution):
691
+ """Check if an execution result contains an error"""
692
+ has_error = execution.error is not None
693
+ logger.debug(f"Execution error check: {has_error}")
694
+ return has_error
695
+
696
+ def has_execution_warnings(self, execution):
697
+ """Check if an execution result contains warnings (stderr output but no error)"""
698
+ has_warnings = (execution.error is None and
699
+ execution.logs.stderr and
700
+ len(execution.logs.stderr) > 0)
701
+ logger.debug(f"Execution warning check: {has_warnings}")
702
+ return has_warnings
703
+
704
+ def update_last_code_cell(self, code):
705
+ """Update the source code of the last code cell"""
706
+ if (len(self.data["cells"]) > 0 and
707
+ self.data["cells"][-1]["cell_type"] == "code"):
708
+ logger.debug(f"Updating last code cell with {len(code)} chars")
709
+ self.data["cells"][-1]["source"] = code
710
+ # Clear previous outputs when updating code
711
+ self.data["cells"][-1]["outputs"] = []
712
+ logger.debug("Cleared previous outputs from updated code cell")
713
+ else:
714
+ logger.error("Cannot update: last cell is not a code cell")
715
+ raise ValueError("Cannot update: last cell is not a code cell")
716
+
717
+ def get_last_cell_type(self):
718
+ """Get the type of the last cell, or None if no cells exist"""
719
+ if len(self.data["cells"]) > 0:
720
+ cell_type = self.data["cells"][-1]["cell_type"]
721
+ logger.debug(f"Last cell type: {cell_type}")
722
+ return cell_type
723
+ logger.debug("No cells exist, returning None")
724
+ return None
725
+
726
+ def add_markdown(self, markdown, role="markdown"):
727
+ logger.debug(f"Adding markdown cell with role '{role}' ({len(markdown)} chars)")
728
+ if role == "system":
729
+ system_message = markdown if markdown else "default"
730
+ clean_message = self._clean_markdown_formatting(system_message)
731
+ markdown_formatted = system_template.format(clean_message)
732
+ elif role == "user":
733
+ clean_message = self._clean_markdown_formatting(markdown)
734
+ markdown_formatted = user_template.format(clean_message)
735
+ elif role == "assistant":
736
+ clean_message = self._clean_markdown_formatting(markdown)
737
+ markdown_formatted = assistant_thinking_template.format(clean_message)
738
+ markdown_formatted = markdown_formatted.replace('<think>', '&lt;think&gt;')
739
+ markdown_formatted = markdown_formatted.replace('</think>', '&lt;/think&gt;')
740
+ else:
741
+ # Default case for raw markdown
742
+ markdown_formatted = self._clean_markdown_formatting(markdown)
743
+
744
+ self.data["cells"].append({
745
+ "cell_type": "markdown",
746
+ "metadata": {},
747
+ "source": markdown_formatted
748
+ })
749
+
750
+ def add_shell_command(self, command):
751
+ """Add a shell command cell with terminal-style formatting"""
752
+ logger.debug(f"Adding shell command cell: '{command}'")
753
+
754
+ # Format command with terminal-style template
755
+ shell_formatted = shell_command_template.format(self._clean_shell_command(command))
756
+
757
+ self.data["cells"].append({
758
+ "cell_type": "markdown",
759
+ "metadata": {"shell_command": True, "command": command},
760
+ "source": shell_formatted
761
+ })
762
+
763
+ def append_shell_execution(self, execution):
764
+ """Append shell execution results to the notebook with terminal styling"""
765
+ logger.debug("Appending shell execution results")
766
+
767
+ # Format the shell output using terminal styling
768
+ output_content = self._format_shell_output(execution)
769
+ shell_output_formatted = shell_output_template.format(output_content)
770
+
771
+ # Wrap in a div with shell-output class for styling
772
+ shell_output_with_class = f'<div class="shell-output">{shell_output_formatted}</div>'
773
+
774
+ # Add the output as a new markdown cell
775
+ self.data["cells"].append({
776
+ "cell_type": "markdown",
777
+ "metadata": {"shell_output": True},
778
+ "source": shell_output_with_class
779
+ })
780
+ logger.debug("Added shell output cell to notebook")
781
+
782
+ def _clean_shell_command(self, command):
783
+ """Clean and escape shell command for display"""
784
+ if not command:
785
+ return ""
786
+
787
+ # Basic HTML escaping for shell commands
788
+ command = command.replace('&', '&amp;')
789
+ command = command.replace('<', '&lt;')
790
+ command = command.replace('>', '&gt;')
791
+ command = command.replace('"', '&quot;')
792
+ command = command.replace("'", '&#39;')
793
+
794
+ return command
795
+
796
+ def _format_shell_output(self, execution):
797
+ """Format shell execution output for terminal-style display"""
798
+ output_parts = []
799
+
800
+ # Add stdout if present
801
+ if execution.logs.stdout:
802
+ stdout_text = ''.join(execution.logs.stdout).strip()
803
+ if stdout_text:
804
+ output_parts.append(stdout_text)
805
+
806
+ # Add stderr if present (but filter out plot data)
807
+ if execution.logs.stderr:
808
+ stderr_text = ''.join(execution.logs.stderr).strip()
809
+
810
+ # Filter out plot data from stderr
811
+ plot_start = stderr_text.find("__PLOT_DATA__")
812
+ plot_end = stderr_text.find("__END_PLOT_DATA__")
813
+ if plot_start != -1 and plot_end != -1:
814
+ clean_stderr = stderr_text[:plot_start] + stderr_text[plot_end + len("__END_PLOT_DATA__"):]
815
+ stderr_text = clean_stderr.strip()
816
+
817
+ if stderr_text:
818
+ output_parts.append(f"STDERR:\n{stderr_text}")
819
+
820
+ # Add error information if present
821
+ if execution.error:
822
+ error_text = f"ERROR: {execution.error.name}: {execution.error.value}"
823
+ if execution.error.traceback:
824
+ error_text += f"\n{execution.error.traceback}"
825
+ output_parts.append(error_text)
826
+
827
+ # Add execution results if present (for shell commands that produce results)
828
+ if execution.results:
829
+ for result in execution.results:
830
+ if result.text:
831
+ output_parts.append(result.text.strip())
832
+
833
+ # Join all output parts
834
+ final_output = '\n\n'.join(output_parts) if output_parts else "No output"
835
+
836
+ # Basic HTML escaping for output
837
+ final_output = final_output.replace('&', '&amp;')
838
+ final_output = final_output.replace('<', '&lt;')
839
+ final_output = final_output.replace('>', '&gt;')
840
+
841
+ logger.debug(f"Formatted shell output: {len(final_output)} chars")
842
+ return final_output
843
+
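`_format_shell_output` (like `_clean_shell_command`) escapes `&` before `<` and `>`; the order matters, as this standalone sketch shows:

```python
def escape(text):
    # '&' must be replaced first, otherwise the '&' introduced by
    # '&lt;'/'&gt;' would itself be escaped to '&amp;lt;'.
    text = text.replace('&', '&amp;')
    text = text.replace('<', '&lt;')
    text = text.replace('>', '&gt;')
    return text

out = escape("ls > out && cat <file>")
print(out)  # → ls &gt; out &amp;&amp; cat &lt;file&gt;
```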
844
+ def add_error(self, error_message):
845
+ """Add an error message cell to the notebook"""
846
+ logger.warning(f"Adding error cell: {error_message}")
847
+ error_html = ERROR_HTML.format(error_message)
848
+
849
+ self.data["cells"].append({
850
+ "cell_type": "markdown",
851
+ "metadata": {},
852
+ "source": error_html
853
+ })
854
+
855
+ def add_final_answer(self, answer):
856
+ logger.info(f"Adding final answer cell ({len(answer)} chars)")
857
+ self.data["cells"].append({
858
+ "cell_type": "markdown",
859
+ "metadata": {},
860
+ "source": assistant_final_answer_template.format(answer)
861
+ })
862
+
863
+ def add_web_search_result(self, query, quick_answer=None, sources=None):
864
+ """Add a web search result cell with dropdown UI"""
865
+ logger.info(f"Adding web search result for query: {query}")
866
+
867
+ # Format quick answer section
868
+ quick_answer_html = ""
869
+ if quick_answer:
870
+ # Clean up markdown formatting in quick answer
871
+ clean_answer = self._clean_markdown_formatting(quick_answer)
872
+ quick_answer_html = f"""
873
+ <div class="quick-answer">
874
+ <div class="quick-answer-title">💡 Quick Answer:</div>
875
+ <div>{clean_answer}</div>
876
+ </div>
877
+ """
878
+
879
+ # Format sources section
880
+ sources_html = ""
881
+ if sources:
882
+ source_items = []
883
+ for i, source in enumerate(sources, 1):
884
+ title = self._clean_markdown_formatting(source.get('title', f'Source {i}'))
885
+ url = source.get('url', '#')
886
+ relevance = source.get('relevance', 0.0)
887
+
888
+ source_item = f"""
889
+ <div class="source-item">
890
+ <div class="source-title">{i}. {title}
891
+ <span class="relevance-score">Relevance: {relevance:.2f}</span>
892
+ </div>
893
+ <a href="{url}" target="_blank" class="source-url">{url}</a>
894
+ </div>
895
+ """
896
+ source_items.append(source_item)
897
+ sources_html = "".join(source_items)
898
+
899
+ # Format the complete web search result
900
+ web_search_html = web_search_template.format(
901
+ query=self._clean_markdown_formatting(query),
902
+ quick_answer=quick_answer_html,
903
+ sources=sources_html
904
+ )
905
+
906
+ self.data["cells"].append({
907
+ "cell_type": "markdown",
908
+ "metadata": {},
909
+ "source": web_search_html
910
+ })
911
+
912
+ def _clean_markdown_formatting(self, text):
913
+ """Clean up markdown formatting issues like excessive ** characters"""
914
+ if not text:
915
+ return ""
916
+
917
+ # Replace multiple consecutive asterisks with proper formatting
918
+ import re
919
+
920
+ # Handle bold text: **text** -> <strong>text</strong>
921
+ text = re.sub(r'\*\*([^*]+)\*\*', r'<strong>\1</strong>', text)
922
+
923
+ # Handle italic text: *text* -> <em>text</em>
924
+ text = re.sub(r'(?<!\*)\*(?!\*)([^*]+)\*(?!\*)', r'<em>\1</em>', text)
925
+
926
+ # Clean up any remaining multiple asterisks
927
+ text = re.sub(r'\*{3,}', '**', text)
928
+
929
+ # Handle line breaks
930
+ text = text.replace('\n', '<br>')
931
+
932
+ # Handle links [text](url) -> <a href="url">text</a>
933
+ text = re.sub(r'\[([^\]]+)\]\(([^)]+)\)', r'<a href="\2" target="_blank">\1</a>', text)
934
+
935
+ return text
936
+
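The method above is self-contained enough to test in isolation; here is a condensed, standalone version using the same regular expressions (the lookarounds on the italic pattern keep it from matching the `**` of bold spans):

```python
import re

def clean_markdown(text: str) -> str:
    """Convert a small markdown subset (bold, italic, links, newlines)
    to inline HTML, mirroring the cleanup logic above."""
    if not text:
        return ""
    text = re.sub(r'\*\*([^*]+)\*\*', r'<strong>\1</strong>', text)          # **bold**
    text = re.sub(r'(?<!\*)\*(?!\*)([^*]+)\*(?!\*)', r'<em>\1</em>', text)   # *italic*
    text = re.sub(r'\*{3,}', '**', text)                                     # stray runs of *
    text = text.replace('\n', '<br>')
    text = re.sub(r'\[([^\]]+)\]\(([^)]+)\)',
                  r'<a href="\2" target="_blank">\1</a>', text)              # [text](url)
    return text
```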
937
+ def parse_exec_result_nb(self, execution):
938
+ """Convert an E2B Execution object to Jupyter notebook cell output format"""
939
+ logger.debug("Parsing execution result for notebook format")
940
+ outputs = []
941
+
942
+ if execution.logs.stdout:
943
+ stdout_text = ''.join(execution.logs.stdout)
944
+ logger.debug(f"Adding stdout output ({len(stdout_text)} chars)")
945
+ outputs.append({
946
+ 'output_type': 'stream',
947
+ 'name': 'stdout',
948
+ 'text': stdout_text
949
+ })
950
+
951
+ if execution.logs.stderr:
952
+ stderr_text = ''.join(execution.logs.stderr)
953
+ # Filter out plot data from stderr before displaying
954
+ plot_start = stderr_text.find("__PLOT_DATA__")
955
+ plot_end = stderr_text.find("__END_PLOT_DATA__")
956
+ if plot_start != -1 and plot_end != -1:
957
+ # Remove plot data from stderr text
958
+ clean_stderr = stderr_text[:plot_start] + stderr_text[plot_end + len("__END_PLOT_DATA__"):]
959
+ stderr_text = clean_stderr.strip()
960
+
961
+ # Only add stderr output if there's content after filtering
962
+ if stderr_text:
963
+ logger.debug(f"Adding stderr output ({len(stderr_text)} chars)")
964
+ outputs.append({
965
+ 'output_type': 'stream',
966
+ 'name': 'stderr',
967
+ 'text': stderr_text
968
+ })
969
+
970
+ if execution.error:
971
+ logger.debug(f"Adding error output: {execution.error.name}: {execution.error.value}")
972
+ outputs.append({
973
+ 'output_type': 'error',
974
+ 'ename': execution.error.name,
975
+ 'evalue': execution.error.value,
976
+ 'traceback': [line for line in execution.error.traceback.split('\n')]
977
+ })
978
+
979
+ for i, result in enumerate(execution.results):
980
+ logger.debug(f"Processing execution result {i+1}/{len(execution.results)}")
981
+ output = {
982
+ 'output_type': 'execute_result' if result.is_main_result else 'display_data',
983
+ 'metadata': {},
984
+ 'data': {}
985
+ }
986
+
987
+ if result.text:
988
+ output['data']['text/plain'] = result.text
989
+ if result.html:
990
+ output['data']['text/html'] = result.html
991
+ if result.png:
992
+ output['data']['image/png'] = result.png
993
+ if result.svg:
994
+ output['data']['image/svg+xml'] = result.svg
995
+ if result.jpeg:
996
+ output['data']['image/jpeg'] = result.jpeg
997
+ if result.pdf:
998
+ output['data']['application/pdf'] = result.pdf
999
+ if result.latex:
1000
+ output['data']['text/latex'] = result.latex
1001
+ if result.json:
1002
+ output['data']['application/json'] = result.json
1003
+ if result.javascript:
1004
+ output['data']['application/javascript'] = result.javascript
1005
+
1006
+ if result.is_main_result and execution.execution_count is not None:
1007
+ output['execution_count'] = execution.execution_count
1008
+
1009
+ if output['data']:
1010
+ logger.debug(f"Added result output with data types: {list(output['data'].keys())}")
1011
+ outputs.append(output)
1012
+ else:
1013
+ logger.debug("Skipping result with no data")
1014
+
1015
+ logger.debug(f"Parsed execution result into {len(outputs)} outputs")
1016
+ return outputs
1017
+
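The stderr branch above strips an inline plot payload delimited by sentinel strings; that filtering is easiest to reason about as a small helper (a sketch, assuming the same `__PLOT_DATA__`/`__END_PLOT_DATA__` markers used by the sandbox):

```python
PLOT_START = "__PLOT_DATA__"
PLOT_END = "__END_PLOT_DATA__"

def strip_plot_data(stderr_text: str) -> str:
    """Remove a sentinel-delimited base64 plot payload from stderr,
    keeping any real diagnostics around it."""
    start = stderr_text.find(PLOT_START)
    end = stderr_text.find(PLOT_END)
    if start != -1 and end != -1:
        stderr_text = (stderr_text[:start]
                       + stderr_text[end + len(PLOT_END):]).strip()
    return stderr_text
```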
1018
+ def filter_base64_images(self, message):
1019
+ """Filter out base64 encoded images from message content"""
1020
+ if isinstance(message, dict) and 'nbformat' in message:
1021
+ for output in message['nbformat']:
1022
+ if 'data' in output:
1023
+ for key in list(output['data'].keys()):
1024
+ if key.startswith('image/') or key == 'application/pdf':
1025
+ output['data'][key] = '<placeholder_image>'
1026
+ return message
1027
+
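Because this mutates nested dicts in place, it is worth pinning down with a quick check; a standalone copy of the logic, exercised against a hypothetical sample message:

```python
def filter_base64_images(message):
    """Replace bulky base64 payloads (images, PDFs) in raw execution
    outputs with a placeholder before the message reaches the model."""
    if isinstance(message, dict) and 'nbformat' in message:
        for output in message['nbformat']:
            if 'data' in output:
                for key in list(output['data'].keys()):
                    if key.startswith('image/') or key == 'application/pdf':
                        output['data'][key] = '<placeholder_image>'
    return message

# Hypothetical sample in the tool-result format used above
sample = {'nbformat': [{'data': {'image/png': 'iVBOR...', 'text/plain': 'Figure(1)'}}]}
filtered = filter_base64_images(sample)
```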
1028
+ def render(self, mode="default"):
1029
+ logger.debug(f"Rendering notebook in '{mode}' mode with {len(self.data['cells'])} cells")
1030
+ if self.countdown_info is not None:
1031
+ self._update_countdown_cell()
1032
+
1033
+ render_data = copy.deepcopy(self.data)
1034
+
1035
+ if mode == "generating":
1036
+ render_data["cells"].append({
1037
+ "cell_type": "markdown",
1038
+ "metadata": {},
1039
+ "source": GENERATING_WIDGET
1040
+ })
1041
+
1042
+ elif mode == "executing":
1043
+ logger.debug("Adding executing widget to render")
1044
+ render_data["cells"].append({
1045
+ "cell_type": "markdown",
1046
+ "metadata": {},
1047
+ "source": EXECUTING_WIDGET
1048
+ })
1049
+
1050
+ elif mode == "done":
1051
+ logger.debug("Adding done widget to render")
1052
+ render_data["cells"].append({
1053
+ "cell_type": "markdown",
1054
+ "metadata": {},
1055
+ "source": DONE_WIDGET
1056
+ })
1057
+
1058
+ elif mode == "stopped":
1059
+ logger.debug("Adding stopped widget to render")
1060
+ render_data["cells"].append({
1061
+ "cell_type": "markdown",
1062
+ "metadata": {},
1063
+ "source": STOPPED_WIDGET
1064
+ })
1065
+
1066
+ elif mode == "error":
1067
+ logger.debug("Adding error widget to render")
1068
+ render_data["cells"].append({
1069
+ "cell_type": "markdown",
1070
+ "metadata": {},
1071
+ "source": ERROR_WIDGET
1072
+ })
1073
+
1074
+ elif mode != "default":
1075
+ logger.error(f"Invalid render mode: {mode}")
1076
+ raise ValueError(f"Render mode should be generating, executing, done, stopped, or error. Given: {mode}.")
1077
+
1078
+ notebook = nbformat.from_dict(render_data)
1079
+ notebook_body, _ = html_exporter.from_notebook_node(notebook)
1080
+ notebook_body = notebook_body.replace(bad_html_bad, "")
1081
+ logger.debug(f"Rendered notebook HTML ({len(notebook_body)} chars)")
1082
+
1083
+ # make code font a bit smaller with custom css
1084
+ if "<head>" in notebook_body:
1085
+ notebook_body = notebook_body.replace("</head>", f"{custom_css}</head>")
1086
+ logger.debug("Applied custom CSS to notebook")
1087
+ return notebook_body
1088
+
1089
+ @classmethod
1090
+ def from_session_state(cls, session_state_data):
1091
+ """Create JupyterNotebook instance from session state data"""
1092
+ return cls(session_state_data=session_state_data)
1093
+
1094
+ def get_session_notebook_data(self):
1095
+ """Get notebook data in format suitable for session state"""
1096
+ return self.data.copy()
1097
+
1098
+ def update_from_session_state(self, session_state_data):
1099
+ """Update notebook data from session state"""
1100
+ if "notebook_data" in session_state_data:
1101
+ self.data = session_state_data["notebook_data"].copy()
1102
+ # Update execution count based on existing cells
1103
+ self.exec_count = len([cell for cell in self.data.get("cells", [])
1104
+ if cell.get("cell_type") == "code" and cell.get("execution_count")])
1105
+ logger.debug(f"Updated notebook from session state: {len(self.data['cells'])} cells, exec_count={self.exec_count}")
1106
+
1107
+ def main():
1108
+ """Create a mock notebook to test styling"""
1109
+ # Create mock messages
1110
+ mock_messages = [
1111
+ {"role": "system", "content": "You are a helpful AI assistant that can write and execute Python code."},
1112
+ {"role": "user", "content": "Can you help me create a simple plot of a sine wave?"},
1113
+ {"role": "assistant", "content": "I'll help you create a sine wave plot using matplotlib. **Let me search** for the *best practices* first."},
1114
+ {"role": "assistant", "tool_calls": [{"id": "call_1", "function": {"name": "add_and_execute_jupyter_code_cell", "arguments": '{"code": "import numpy as np\\nimport matplotlib.pyplot as plt\\n\\n# Create x values\\nx = np.linspace(0, 4*np.pi, 100)\\ny = np.sin(x)\\n\\n# Create the plot\\nplt.figure(figsize=(10, 6))\\nplt.plot(x, y, \'b-\', linewidth=2)\\nplt.title(\'Sine Wave\')\\nplt.xlabel(\'x\')\\nplt.ylabel(\'sin(x)\')\\nplt.grid(True)\\nplt.show()"}'}}]},
1115
+ {"role": "tool", "tool_call_id": "call_1", "raw_execution": [{"output_type": "stream", "name": "stdout", "text": "Plot created successfully!"}]}
1116
+ ]
1117
+
1118
+ # Create notebook
1119
+ notebook = JupyterNotebook(mock_messages)
1120
+
1121
+ # Add a web search result example to test the new UI
1122
+ mock_sources = [
1123
+ {
1124
+ "title": "**Matplotlib** Tutorial - Creating **Beautiful** Plots",
1125
+ "url": "https://matplotlib.org/stable/tutorials/introductory/pyplot.html",
1126
+ "relevance": 0.85
1127
+ },
1128
+ {
1129
+ "title": "NumPy *Sine Wave* Generation **Best Practices**",
1130
+ "url": "https://numpy.org/doc/stable/reference/generated/numpy.sin.html",
1131
+ "relevance": 0.72
1132
+ }
1133
+ ]
1134
+
1135
+ notebook.add_web_search_result(
1136
+ query="**matplotlib** *sine wave* tutorial **best practices**",
1137
+ quick_answer="To create a **sine wave plot** with *matplotlib*, use `numpy.linspace()` to generate **x values** and `numpy.sin()` for *y values*. **Configure** the plot with *appropriate* labels and **styling** for better visualization.",
1138
+ sources=mock_sources
1139
+ )
1140
+
1141
+ # Add a timeout countdown (simulating a sandbox that started 2 minutes ago with 5 minute timeout)
1142
+ start_time = datetime.datetime.now(datetime.timezone.utc) - datetime.timedelta(minutes=2)
1143
+ end_time = start_time + datetime.timedelta(minutes=5)
1144
+ notebook.add_sandbox_countdown(start_time, end_time)
1145
+
1146
+ # Render and save
1147
+ html_output = notebook.render()
1148
+
1149
+ with open("mock_notebook.html", "w", encoding="utf-8") as f:
1150
+ f.write(html_output)
1151
+
1152
+ print("Mock notebook saved as 'mock_notebook.html'")
1153
+ print("Open it in your browser to see the improved web search UI and markdown formatting.")
1154
+
1155
+ def create_notebook_from_session_state(session_state):
1156
+ """Helper function to create JupyterNotebook from session state"""
1157
+ return JupyterNotebook.from_session_state(session_state)
1158
+
1159
+
1160
+ if __name__ == "__main__":
1161
+ main()
modal_sandbox.py ADDED
@@ -0,0 +1,794 @@
1
+ """
2
+ Modal Sandbox wrapper to provide E2B-compatible interface for the Jupyter Agent.
3
+ Simplified implementation using Modal's native API.
4
+ """
5
+
6
+ import modal
7
+ import datetime
8
+ from typing import Optional, Dict, List
9
+ import json
10
+ import logging
11
+ import time
12
+
13
+ logger = logging.getLogger(__name__)
14
+
15
+
16
+ class ModalResult:
17
+ """Mock E2B result structure for displaying outputs like plots"""
18
+
19
+ def __init__(self, text: str = "", html: str = "", png: str = "", svg: str = "",
20
+ jpeg: str = "", pdf: str = "", latex: str = "", json: str = "",
21
+ javascript: str = "", is_main_result: bool = True):
22
+ self.text = text
23
+ self.html = html
24
+ self.png = png
25
+ self.svg = svg
26
+ self.jpeg = jpeg
27
+ self.pdf = pdf
28
+ self.latex = latex
29
+ self.json = json
30
+ self.javascript = javascript
31
+ self.is_main_result = is_main_result
32
+
33
+ class ModalExecution:
34
+ """Mock E2B execution result to maintain compatibility with existing code"""
35
+
36
+ def __init__(self, stdout: str = "", stderr: str = "", error: Optional[Dict] = None, results: List[ModalResult] = None):
37
+ self.logs = ModalLogs(stdout, stderr)
38
+ self.error = ModalError(error) if error else None
39
+ self.results = results or []
40
+ self.execution_count = 1
41
+
42
+ class ModalLogs:
43
+ """Mock E2B logs structure"""
44
+
45
+ def __init__(self, stdout: str = "", stderr: str = ""):
46
+ self.stdout = [stdout] if stdout else []
47
+ self.stderr = [stderr] if stderr else []
48
+
49
+ class ModalError:
50
+ """Mock E2B error structure"""
51
+
52
+ def __init__(self, error_data: Dict):
53
+ self.name = error_data.get('name', 'Error')
54
+ self.value = error_data.get('value', 'Unknown error')
55
+ self.traceback = error_data.get('traceback', f"{self.name}: {self.value}")
56
+
57
+ class ModalFiles:
58
+ """Simplified Modal files interface using native Modal Sandbox API"""
59
+
60
+ def __init__(self, modal_sandbox):
61
+ self.modal_sandbox = modal_sandbox # ModalSandbox wrapper
62
+ self.max_file_size = 100 * 1024 * 1024 # 100MB limit
63
+
64
+ @property
65
+ def _sandbox(self):
66
+ """Get the actual Modal sandbox instance"""
67
+ return self.modal_sandbox._sandbox
68
+
69
+ def write(self, path: str, content):
70
+ """Write file to Modal sandbox using native Modal API"""
71
+ try:
72
+ # Handle file-like objects
73
+ if hasattr(content, 'read'):
74
+ file_content = content.read()
75
+ # Reset file pointer if possible
76
+ if hasattr(content, 'seek'):
77
+ content.seek(0)
78
+ else:
79
+ file_content = content
80
+
81
+ # Check file size for bytes content
82
+ content_size = len(file_content) if isinstance(file_content, (bytes, str)) else 0
83
+ if content_size > self.max_file_size:
84
+ raise ValueError(f"File size ({content_size} bytes) exceeds maximum allowed size ({self.max_file_size} bytes)")
85
+
86
+ # Use Modal's native file API
87
+ if isinstance(file_content, bytes):
88
+ # Write binary content
89
+ with self._sandbox.open(path, "wb") as f:
90
+ f.write(file_content)
91
+ else:
92
+ # Write text content
93
+ with self._sandbox.open(path, "w") as f:
94
+ f.write(str(file_content))
95
+
96
+ logger.debug(f"Successfully wrote file {path} ({content_size} bytes) using Modal native API")
97
+
98
+ except Exception as e:
99
+ logger.error(f"Failed to write file {path}: {str(e)}")
100
+ raise RuntimeError(f"Could not write file {path}: {str(e)}")
101
+
102
+ def read(self, path: str, mode: str = "r"):
103
+ """Read file from Modal sandbox using native API"""
104
+ try:
105
+ with self._sandbox.open(path, mode) as f:
106
+ return f.read()
107
+ except Exception as e:
108
+ logger.error(f"Failed to read file {path}: {str(e)}")
109
+ raise
110
+
111
+ def exists(self, path: str) -> bool:
112
+ """Check if file exists in Modal sandbox"""
113
+ try:
114
+ # Try to open the file to check existence
115
+ with self._sandbox.open(path, "r"):
116
+ pass
117
+ return True
118
+ except Exception:
119
+ return False
120
+
121
+ def list_files(self, directory: str = ".") -> List[str]:
122
+ """List files in directory using Modal's native ls method"""
123
+ try:
124
+ return self._sandbox.ls(directory)
125
+ except Exception as e:
126
+ logger.error(f"Failed to list files in {directory}: {str(e)}")
127
+ return []
128
+
129
+ def verify_file_upload(self, path: str, expected_size: Optional[int] = None) -> bool:
130
+ """Verify that a file was uploaded correctly"""
131
+ try:
132
+ if not self.exists(path):
133
+ logger.error(f"File {path} does not exist after upload")
134
+ return False
135
+
136
+ # Check file size if expected size is provided
137
+ if expected_size is not None:
138
+ # Use Modal's exec to get file size
139
+ result = self._sandbox.exec("wc", "-c", path)
140
+ result.wait()
141
+
142
+ if result.returncode == 0:
143
+ output = result.stdout.read().strip()
144
+ actual_size = int(output.split()[0])
145
+ if actual_size != expected_size:
146
+ logger.error(f"File {path} size mismatch: expected {expected_size}, got {actual_size}")
147
+ return False
148
+ else:
149
+ logger.debug(f"File {path} size verified: {actual_size} bytes")
150
+ else:
151
+ logger.warning(f"Could not verify file size for {path}")
152
+
153
+ logger.debug(f"File {path} upload verification successful")
154
+ return True
155
+
156
+ except Exception as e:
157
+ logger.error(f"Failed to verify file upload {path}: {str(e)}")
158
+ return False
159
+
160
+ class ModalSandboxInfo:
161
+ """Mock E2B sandbox info for countdown timer"""
162
+
163
+ def __init__(self, timeout_seconds: int = 300):
164
+ self.started_at = datetime.datetime.now(datetime.timezone.utc)
165
+ self.end_at = self.started_at + datetime.timedelta(seconds=timeout_seconds)
166
+
167
+ class ModalSandbox:
168
+ """Modal sandbox wrapper that provides E2B-compatible interface"""
169
+
170
+ def __init__(self, gpu_config: str = "cpu", cpu_cores: float = 2.0, memory_mb: int = 8192,
171
+ timeout: int = 300, environment_vars: Dict[str, str] = None):
172
+ """
173
+ Initialize Modal sandbox with hardware configuration
174
+
175
+ Args:
176
+ gpu_config: GPU configuration (e.g., "cpu", "T4", "A100-40GB", "H100")
177
+ cpu_cores: Number of CPU cores
178
+ memory_mb: Memory in MB
179
+ timeout: Timeout in seconds
180
+ environment_vars: Environment variables to set
181
+ """
182
+ self.gpu_config = gpu_config
183
+ self.cpu_cores = cpu_cores
184
+ self.memory_mb = memory_mb
185
+ self.timeout = timeout
186
+ self.environment_vars = environment_vars or {}
187
+ self.files = ModalFiles(self)
188
+ self._sandbox = None
189
+ self._app = None
190
+ self._sandbox_info = ModalSandboxInfo(timeout)
191
+ self._persistent_session = None # For maintaining state across executions
192
+
193
+ # Define package lists for different hardware configurations
194
+ CPU_PACKAGES = [
195
+ "jupyter-server", "ipykernel", "ipython", "orjson", "pandas",
196
+ "matplotlib", "pillow", "numpy", "scipy", "scikit-learn",
197
+ "seaborn", "plotly", "requests", "beautifulsoup4", "opencv-python",
198
+ "nltk", "textblob", "librosa>=0.10.0", "soundfile", "sympy", "xarray"
199
+ ]
200
+
201
+ GPU_PACKAGES = [
202
+ "jupyter-server", "ipykernel", "ipython", "orjson", "pandas",
203
+ "matplotlib", "pillow", "numpy", "scipy", "scikit-learn",
204
+ "seaborn", "plotly", "requests", "beautifulsoup4", "opencv-python",
205
+ "nltk", "textblob", "librosa>=0.10.0", "soundfile", "sympy", "xarray",
206
+ # GPU-specific ML/AI packages
207
+ "torch", "transformers", "datasets", "bitsandbytes", "hf_transfer",
208
+ "peft", "trl", "accelerate", "xformers", "wandb", "deepspeed",
209
+ "pyyaml", "packaging", "rouge_score", "bert_score", "jiwer",
210
+ "tqdm", "pyarrow", "sentencepiece", "protobuf", "huggingface_hub"
211
+ ]
212
+
213
+ # Store package lists for system prompt
214
+ self.available_packages = GPU_PACKAGES if gpu_config != "cpu" else CPU_PACKAGES
215
+
216
+ # Create appropriate image based on hardware configuration
217
+ if gpu_config == "cpu" or gpu_config == "CPU-only":
218
+ self.base_image = self._create_cpu_image(CPU_PACKAGES)
219
+ else:
220
+ self.base_image = self._create_gpu_image(GPU_PACKAGES)
221
+
222
+ self._setup_modal()
223
+ logger.info(f"Initialized Modal sandbox with {gpu_config} GPU, {cpu_cores} CPU cores, {memory_mb}MB RAM")
224
+
225
+ def _create_cpu_image(self, packages):
226
+ """Create CPU-optimized image with basic data science packages"""
227
+ packages_string = " ".join(packages)
228
+ return (modal.Image.debian_slim()
229
+ .apt_install("git", "build-essential")
230
+ .run_commands("pip install --upgrade pip")
231
+ .run_commands("pip install uv")
232
+ .run_commands("uv pip install 'numba>=0.58.0' --system") # Ensure compatible numba version
233
+ .run_commands(f"uv pip install {packages_string} --system"))
234
+
235
+ def _create_gpu_image(self, packages):
236
+ """Create GPU-optimized image with ML/AI packages including PyTorch and Transformers"""
237
+ # CUDA configuration for the GPU base image
238
+ CUDA_VERSION = "12.8.1"
239
+ CUDA_FLAVOR = "devel"
240
+ CUDA_OS = "ubuntu24.04"
241
+ CUDA_TAG = f"{CUDA_VERSION}-{CUDA_FLAVOR}-{CUDA_OS}"
242
+
243
+ # Base packages that don't require special handling
244
+ base_packages = [pkg for pkg in packages if pkg not in [
245
+ "torch", "transformers", "bitsandbytes", "accelerate", "xformers",
246
+ "peft", "trl", "unsloth", "deepspeed"
247
+ ]]
248
+ base_packages_string = " ".join(base_packages)
249
+
250
+ return (modal.Image.from_registry(f"nvidia/cuda:{CUDA_TAG}", add_python="3.12")
251
+ .env({"DEBIAN_FRONTEND": "noninteractive", "TZ": "UTC"})
252
+ .run_commands("ln -fs /usr/share/zoneinfo/UTC /etc/localtime")
253
+ .apt_install("git", "build-essential")
254
+ .run_commands("pip install --upgrade pip")
255
+ .run_commands("pip install uv")
256
+ .run_commands("uv pip install 'numba>=0.58.0' --system") # Ensure compatible numba version
257
+ .run_commands("uv pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu124 --system")
258
+ .run_commands(f"uv pip install {base_packages_string} --system")
259
+ .env({"HF_HUB_ENABLE_HF_TRANSFER": "1"}))
260
+
261
+ def _setup_modal(self):
262
+ """Setup Modal app and sandbox configuration"""
263
+ try:
264
+ # Initialize Modal app using lookup to create if missing
265
+ self._app = modal.App.lookup("jupyter-agent", create_if_missing=True)
266
+
267
+ # Configure hardware based on user selection
268
+ sandbox_kwargs = {
269
+ "image": self.base_image,
270
+ "timeout": self.timeout,
271
+ "cpu": self.cpu_cores,
272
+ "memory": self.memory_mb,
273
+ "app": self._app
274
+ }
275
+
276
+ # Add GPU configuration if not CPU-only
277
+ if self.gpu_config != "cpu" and self.gpu_config != "CPU-only":
278
+ if self.gpu_config == "T4":
279
+ sandbox_kwargs["gpu"] = modal.gpu.T4()
280
+ elif self.gpu_config == "L4":
281
+ sandbox_kwargs["gpu"] = modal.gpu.L4()
282
+ elif self.gpu_config == "A100-40GB":
283
+ sandbox_kwargs["gpu"] = modal.gpu.A100(size="40GB")
284
+ elif self.gpu_config == "A100-80GB":
285
+ sandbox_kwargs["gpu"] = modal.gpu.A100(size="80GB")
286
+ elif self.gpu_config == "H100":
287
+ sandbox_kwargs["gpu"] = modal.gpu.H100()
288
+ else:
289
+ print(f"Warning: Unknown GPU config {self.gpu_config}, falling back to CPU")
290
+
291
+ # Add environment variables
292
+ if self.environment_vars:
293
+ sandbox_kwargs["secrets"] = [
294
+ modal.Secret.from_dict(self.environment_vars)
295
+ ]
296
+
297
+ # Create sandbox
298
+ self._sandbox = modal.Sandbox.create(**sandbox_kwargs)
299
+
300
+ except Exception as e:
301
+ print(f"Error setting up Modal sandbox: {e}")
302
+ raise
303
+
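The GPU if/elif ladder in `_setup_modal` could equally be table-driven, which makes the supported configs and the CPU fallback explicit. A sketch with string stand-ins for the `modal.gpu.*` constructors (illustrative only):

```python
# Stand-ins for modal.gpu.T4(), modal.gpu.A100(size=...), etc.
GPU_FACTORIES = {
    "T4": lambda: "gpu:T4",
    "L4": lambda: "gpu:L4",
    "A100-40GB": lambda: "gpu:A100-40GB",
    "A100-80GB": lambda: "gpu:A100-80GB",
    "H100": lambda: "gpu:H100",
}

def resolve_gpu(config: str):
    """Map a config name to a GPU spec; CPU configs and unknown names
    both fall back to None (no GPU requested)."""
    if config in ("cpu", "CPU-only"):
        return None
    factory = GPU_FACTORIES.get(config)
    return factory() if factory else None
```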
304
+ def _initialize_persistent_session(self):
305
+ """Initialize a persistent Python session for stateful execution using file-based communication"""
306
+ if self._persistent_session is not None:
307
+ return # Session already exists
308
+
309
+ try:
310
+ logger.debug("Initializing persistent Python session with file-based communication")
311
+
312
+ # Create a persistent Python script that monitors for command files
313
+ session_script = '''
314
+ import sys
315
+ import json
316
+ import traceback
317
+ import base64
318
+ import io
319
+ import time
320
+ import os
321
+ import matplotlib
322
+ matplotlib.use('Agg') # Set backend before importing pyplot
323
+ import matplotlib.pyplot as plt
324
+
325
+ # Global namespace to maintain state - includes built-ins for better compatibility
326
+ _global_namespace = {
327
+ '__builtins__': __builtins__,
328
+ '__name__': '__main__',
329
+ '__doc__': None,
330
+ '__package__': None
331
+ }
332
+
333
+ # Store original show function and setup plot capture
334
+ _original_show = plt.show
335
+ _captured_figures = []
336
+
337
+ def _capture_show(*args, **kwargs):
338
+ """Custom show function that captures figures as base64"""
339
+ global _captured_figures
340
+ try:
341
+ for fig_num in plt.get_fignums():
342
+ fig = plt.figure(fig_num)
343
+ buf = io.BytesIO()
344
+ fig.savefig(buf, format='png', bbox_inches='tight', dpi=100)
345
+ buf.seek(0)
346
+ img_base64 = base64.b64encode(buf.getvalue()).decode('utf-8')
347
+ _captured_figures.append(img_base64)
348
+ buf.close()
349
+ plt.close(fig)
350
+ except Exception as e:
351
+ print(f"Error capturing plot: {e}", file=sys.stderr)
352
+
353
+ # Replace plt.show with our capture function
354
+ plt.show = _capture_show
355
+
356
+ # Signal that session is ready
357
+ with open("/tmp/session_ready", "w") as f:
358
+ f.write("READY")
359
+
360
+ print("Persistent Python session started", flush=True)
361
+
362
+ # Process commands by monitoring for command files
363
+ while True:
364
+ try:
365
+ if os.path.exists("/tmp/execute_command"):
366
+ # Read and execute command
367
+ with open("/tmp/execute_command", "r") as f:
368
+ content = f.read().strip()
369
+ if not content:
370
+ continue # Skip empty files
371
+ try:
372
+ command = json.loads(content)
373
+ except json.JSONDecodeError:
374
+ print(f"Invalid JSON in command file: {content[:100]}...", file=sys.stderr)
375
+ continue # Skip malformed JSON
376
+
377
+ # Remove command file
378
+ os.remove("/tmp/execute_command")
379
+
380
+ if command.get("action") == "execute":
381
+ code = command.get("code", "")
382
+ _captured_figures = [] # Reset for this execution
383
+
384
+ try:
385
+ # Check if code contains shell commands (lines starting with !)
386
+ lines = code.strip().split('\\n')
387
+ shell_commands = []
388
+ python_code_lines = []
389
+
390
+ for line in lines:
391
+ stripped_line = line.strip()
392
+ if stripped_line.startswith('!'):
393
+ # This is a shell command
394
+ shell_cmd = stripped_line[1:].strip() # Remove the !
395
+ shell_commands.append(shell_cmd)
396
+ else:
397
+ # This is Python code
398
+ python_code_lines.append(line)
399
+
400
+ stdout_parts = []
401
+ stderr_parts = []
402
+
403
+ # Execute shell commands first
404
+ for shell_cmd in shell_commands:
405
+ try:
406
+ import subprocess
407
+ result = subprocess.run(
408
+ shell_cmd,
409
+ shell=True,
410
+ capture_output=True,
411
+ text=True,
412
+ timeout=60 # 60 second timeout for shell commands
413
+ )
414
+
415
+ if result.stdout:
416
+ stdout_parts.append(f"$ {shell_cmd}")
417
+ stdout_parts.append(result.stdout.rstrip())
418
+
419
+ if result.stderr:
420
+ stderr_parts.append(f"$ {shell_cmd}")
421
+ stderr_parts.append(result.stderr.rstrip())
422
+
423
+ # If command failed, add error info
424
+ if result.returncode != 0:
425
+ stderr_parts.append(f"Command exited with code {result.returncode}")
426
+
427
+ except subprocess.TimeoutExpired:
428
+ stderr_parts.append(f"$ {shell_cmd}")
429
+ stderr_parts.append("Command timed out after 60 seconds")
430
+ except Exception as e:
431
+ stderr_parts.append(f"$ {shell_cmd}")
432
+ stderr_parts.append(f"Error executing shell command: {str(e)}")
433
+
434
+ # Execute Python code if present
435
+ python_stdout = ""
436
+ if python_code_lines and any(line.strip() for line in python_code_lines):
437
+ python_code = '\\n'.join(python_code_lines)
438
+
439
+ # Capture stdout during Python execution
440
+ import io
441
+ from contextlib import redirect_stdout
442
+
443
+ stdout_buffer = io.StringIO()
444
+
445
+ with redirect_stdout(stdout_buffer):
446
+ # Execute code in the persistent namespace
447
+ exec(python_code, _global_namespace)
448
+
449
+ python_stdout = stdout_buffer.getvalue()
450
+
451
+ # Combine all stdout
452
+ all_stdout_parts = stdout_parts.copy()
453
+ if python_stdout:
454
+ all_stdout_parts.append(python_stdout.rstrip())
455
+
456
+ stdout_output = '\\n'.join(all_stdout_parts) if all_stdout_parts else ""
457
+ stderr_output = '\\n'.join(stderr_parts) if stderr_parts else ""
458
+
459
+ # Send results back
460
+ result = {
461
+ "status": "success",
462
+ "stdout": stdout_output,
463
+ "stderr": stderr_output,
464
+ "plots": _captured_figures.copy()
465
+ }
466
+
467
+ with open("/tmp/execute_result", "w") as f:
468
+ f.write(json.dumps(result))
469
+
470
+ except Exception as e:
471
+ error_result = {
472
+ "status": "error",
473
+ "error": {
474
+ "name": type(e).__name__,
475
+ "value": str(e),
476
+ "traceback": traceback.format_exc()
477
+ }
478
+ }
479
+
480
+ with open("/tmp/execute_result", "w") as f:
481
+ f.write(json.dumps(error_result))
482
+
483
+ elif command.get("action") == "terminate":
484
+ break
485
+
486
+ else:
487
+ # Sleep briefly to avoid busy waiting
488
+ time.sleep(0.1)
489
+
490
+ except Exception as e:
491
+ print(f"Session error: {e}", file=sys.stderr)
492
+ # Write error to result file
493
+ error_result = {
494
+ "status": "error",
495
+ "error": {
496
+ "name": type(e).__name__,
497
+ "value": str(e),
498
+ "traceback": traceback.format_exc()
499
+ }
500
+ }
501
+ with open("/tmp/execute_result", "w") as f:
502
+ f.write(json.dumps(error_result))
503
+ '''
504
+
505
+ # Start the persistent Python session (no stdin needed)
506
+ self._persistent_session = self._sandbox.exec(
507
+ "python3", "-c", session_script,
508
+ timeout=None # No timeout for persistent session
509
+ )
510
+
511
+ # Wait for the session to be ready by checking for the ready file
512
+ max_wait = 10 # Wait up to 10 seconds
513
+ for _ in range(max_wait * 10): # Check every 0.1 seconds
514
+ try:
515
+ with self._sandbox.open("/tmp/session_ready", "r") as f:
516
+ if f.read().strip() == "READY":
517
+ logger.info("Persistent Python session initialized successfully")
518
+ return
519
+ except Exception:
520
+ pass
521
+ time.sleep(0.1)
522
+
523
+ raise RuntimeError("Failed to initialize persistent session: timeout waiting for ready signal")
524
+
525
+ except Exception as e:
526
+ logger.error(f"Failed to initialize persistent session: {e}")
527
+ self._persistent_session = None
528
+ raise
529
+
530
+     def run_code(self, code: str, on_stdout=None) -> ModalExecution:
+         """
+         Execute Python code or shell commands in the persistent Modal sandbox session using file-based communication.
+
+         Args:
+             code: Python code to execute (lines starting with '!' are treated as shell commands)
+             on_stdout: Callback for stdout (for compatibility; not fully implemented)
+
+         Returns:
+             ModalExecution object compatible with E2B execution results
+         """
+         try:
+             if not self._sandbox:
+                 raise RuntimeError("Sandbox not initialized")
+
+             # Initialize the persistent session if not already done
+             if self._persistent_session is None:
+                 self._initialize_persistent_session()
+
+             logger.debug(f"Executing code in persistent session ({len(code)} chars)")
+
+             # Clean up any existing command/result files
+             try:
+                 self._sandbox.exec("rm", "-f", "/tmp/execute_command", "/tmp/execute_result").wait()
+             except Exception:
+                 pass  # Ignore cleanup errors
+
+             # Send the execution command via file
+             command = {
+                 "action": "execute",
+                 "code": code
+             }
+
+             with self._sandbox.open("/tmp/execute_command", "w") as f:
+                 f.write(json.dumps(command))
+
+             # Small delay to ensure the file is fully written
+             time.sleep(0.01)
+
+             # Wait for the result file to appear
+             max_wait = 60  # Wait up to 60 seconds for code execution
+             result = None
+
+             for _ in range(max_wait * 10):  # Check every 0.1 seconds
+                 try:
+                     with self._sandbox.open("/tmp/execute_result", "r") as f:
+                         result_json = f.read().strip()
+                         if result_json:  # Make sure the file has content
+                             try:
+                                 result = json.loads(result_json)
+                                 break
+                             except json.JSONDecodeError as e:
+                                 logger.debug(f"Invalid JSON in result file: {e}")
+                                 continue  # Try again
+                 except Exception:
+                     pass
+                 time.sleep(0.1)
+
+             if result is None:
+                 raise RuntimeError("Timeout waiting for code execution result")
+
+             # Clean up the result file
+             try:
+                 self._sandbox.exec("rm", "-f", "/tmp/execute_result").wait()
+             except Exception:
+                 pass
+
+             if result["status"] == "success":
+                 # Create results for plots only - don't duplicate stdout as execute_result
+                 results = []
+
+                 # Add plots
+                 for i, base64_img in enumerate(result.get("plots", [])):
+                     results.append(ModalResult(
+                         png=base64_img,
+                         is_main_result=(i == 0)  # First plot is the main result
+                     ))
+
+                 # Get stdout and stderr output for logs
+                 stdout_output = result.get("stdout", "")
+                 stderr_output = result.get("stderr", "")
+
+                 # Return execution with stdout/stderr in logs and plots in results;
+                 # don't add stdout to results to avoid duplication
+                 return ModalExecution(stdout=stdout_output, stderr=stderr_output, error=None, results=results)
+
+             elif result["status"] == "error":
+                 # Execution had an error
+                 error_info = result["error"]
+                 error_data = {
+                     "name": error_info["name"],
+                     "value": error_info["value"],
+                     "traceback": error_info["traceback"]
+                 }
+                 return ModalExecution(stdout="", stderr="", error=error_data, results=[])
+
+             else:
+                 raise RuntimeError(f"Unknown status from persistent session: {result['status']}")
+
+         except Exception as e:
+             # Handle session errors and other exceptions
+             logger.error(f"Error executing code in persistent session: {str(e)}")
+
+             # Reset the persistent session on error
+             if self._persistent_session:
+                 try:
+                     self._persistent_session.terminate()
+                 except Exception:
+                     pass
+                 self._persistent_session = None
+
+             error_data = {
+                 "name": type(e).__name__,
+                 "value": str(e),
+                 "traceback": f"Traceback: {type(e).__name__}: {str(e)}"
+             }
+             return ModalExecution(error=error_data)
+
+     def run_shell(self, command: str, timeout: int = 60) -> ModalExecution:
+         """
+         Execute raw shell commands directly in the sandbox without the Python wrapper.
+
+         Args:
+             command: Shell command to execute
+             timeout: Timeout in seconds (default 60)
+
+         Returns:
+             ModalExecution object with shell output
+         """
+         try:
+             if not self._sandbox:
+                 raise RuntimeError("Sandbox not initialized")
+
+             logger.debug(f"Executing raw shell command: {command}")
+
+             # Use Modal's exec to run the shell command directly.
+             # Split the command into parts for exec (simple approach for common commands)
+             if ' ' in command:
+                 # For complex commands, use sh -c
+                 result = self._sandbox.exec("sh", "-c", command, timeout=timeout)
+             else:
+                 # For simple commands, run directly
+                 result = self._sandbox.exec(command, timeout=timeout)
+
+             # Wait for completion
+             result.wait()
+
+             # Get output
+             stdout_output = ""
+             stderr_output = ""
+
+             try:
+                 stdout_output = result.stdout.read() if result.stdout else ""
+             except Exception:
+                 pass
+
+             try:
+                 stderr_output = result.stderr.read() if result.stderr else ""
+             except Exception:
+                 pass
+
+             # Check for errors based on the return code
+             error_data = None
+             if result.returncode != 0:
+                 error_data = {
+                     "name": "ShellCommandError",
+                     "value": f"Command '{command}' exited with code {result.returncode}",
+                     "traceback": f"Command: {command}\nExit Code: {result.returncode}\nSTDERR: {stderr_output}"
+                 }
+
+             logger.debug(f"Shell command completed with exit code: {result.returncode}")
+
+             return ModalExecution(
+                 stdout=stdout_output,
+                 stderr=stderr_output,
+                 error=error_data,
+                 results=[]
+             )
+
+         except Exception as e:
+             logger.error(f"Error executing shell command '{command}': {str(e)}")
+
+             # Return an error execution
+             error_data = {
+                 "name": type(e).__name__,
+                 "value": str(e),
+                 "traceback": f"Shell command failed: {command}\nError: {str(e)}"
+             }
+
+             return ModalExecution(
+                 stdout="",
+                 stderr="",
+                 error=error_data,
+                 results=[]
+             )
+
+     def get_info(self) -> ModalSandboxInfo:
+         """Get sandbox info for the countdown timer"""
+         return self._sandbox_info
+
+     def kill(self):
+         """Terminate the sandbox and persistent session"""
+         try:
+             # Terminate the persistent session first
+             if self._persistent_session:
+                 try:
+                     # Send the terminate command via file
+                     terminate_command = {"action": "terminate"}
+                     with self._sandbox.open("/tmp/execute_command", "w") as f:
+                         f.write(json.dumps(terminate_command))
+                 except Exception:
+                     pass  # Ignore errors during graceful shutdown
+
+                 try:
+                     self._persistent_session.terminate()
+                 except Exception:
+                     pass  # Ignore errors during forced termination
+
+                 self._persistent_session = None
+                 logger.info("Persistent session terminated")
+
+             # Terminate the sandbox
+             if self._sandbox:
+                 self._sandbox.terminate()
+                 self._sandbox = None
+                 logger.info("Modal sandbox terminated")
+
+         except Exception as e:
+             logger.error(f"Error terminating Modal sandbox: {e}")
+
+     def __del__(self):
+         """Cleanup on deletion"""
+         self.kill()
+
+
+ def create_modal_sandbox(gpu_config: str = "cpu", gpu_count: int = 1, cpu_cores: float = 2.0,
+                          memory_gb: float = 8.0, timeout: int = 300,
+                          environment_vars: Dict[str, str] = None) -> ModalSandbox:
+     """
+     Factory function to create a Modal sandbox with the specified configuration.
+
+     Args:
+         gpu_config: GPU type ("cpu", "T4", "L4", "A100-40GB", "A100-80GB", "H100")
+         gpu_count: Number of GPUs (for future implementation)
+         cpu_cores: Number of CPU cores
+         memory_gb: Memory in GB
+         timeout: Timeout in seconds
+         environment_vars: Environment variables
+
+     Returns:
+         ModalSandbox instance
+     """
+     memory_mb = int(memory_gb * 1024)
+
+     # For multi-GPU support (future implementation)
+     if gpu_count > 1:
+         print(f"Warning: Multi-GPU ({gpu_count}) not yet implemented, using single GPU")
+
+     return ModalSandbox(
+         gpu_config=gpu_config,
+         cpu_cores=cpu_cores,
+         memory_mb=memory_mb,
+         timeout=timeout,
+         environment_vars=environment_vars
+     )
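The file-based command/result protocol that `run_code` and the persistent session use can be exercised locally. The sketch below is a hypothetical stand-in, not the Modal code itself: it mirrors the JSON shapes the diff defines (an `{"action": "execute", "code": ...}` command file and a `status`/`stdout`/`plots` result file) and the same 0.1-second polling pattern, but uses ordinary temp files instead of a sandbox, and `fake_session_step` only pretends to run the code.

```python
import json
import os
import tempfile
import time


def send_command(cmd_path: str, code: str) -> None:
    # Mirror the {"action": "execute", "code": ...} command shape
    with open(cmd_path, "w") as f:
        f.write(json.dumps({"action": "execute", "code": code}))


def fake_session_step(cmd_path: str, result_path: str) -> None:
    # Stand-in for one iteration of the persistent session loop:
    # read the command file, pretend to execute it, write a result file.
    with open(cmd_path) as f:
        command = json.loads(f.read())
    assert command["action"] == "execute"
    result = {"status": "success", "stdout": "ran: " + command["code"],
              "stderr": "", "plots": []}
    with open(result_path, "w") as f:
        f.write(json.dumps(result))


def poll_result(result_path: str, max_wait: float = 2.0) -> dict:
    # Same polling pattern as run_code: check every 0.1 s until the
    # result file appears and holds valid JSON, else time out.
    deadline = time.time() + max_wait
    while time.time() < deadline:
        if os.path.exists(result_path):
            try:
                with open(result_path) as f:
                    return json.loads(f.read())
            except json.JSONDecodeError:
                pass  # file may be partially written; try again
        time.sleep(0.1)
    raise RuntimeError("Timeout waiting for code execution result")


with tempfile.TemporaryDirectory() as d:
    cmd = os.path.join(d, "execute_command")
    res = os.path.join(d, "execute_result")
    send_command(cmd, "print('hi')")
    fake_session_step(cmd, res)
    out = poll_result(res)
    print(out["status"], out["stdout"])  # success ran: print('hi')
```

The JSON-decode retry matters in the real code too: the reader can observe the result file before the writer has finished, so a parse failure is treated as "not ready yet" rather than an error.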
requirements.txt ADDED
@@ -0,0 +1,17 @@
+ nbformat
+ nbconvert
+ huggingface_hub
+ modal
+ transformers
+ traitlets
+ openai
+ gradio
+
+ numpy
+ scipy
+ matplotlib
+ pandas
+ seaborn
+ arize-phoenix-otel
+ openinference-instrumentation-openai
+ tavily-python
system_prompt.txt ADDED
@@ -0,0 +1,326 @@
+ You are an advanced AI coding agent specialized in interactive Python development within a stateful Jupyter environment running in a containerized sandbox. You excel at data science, machine learning, visualization, and computational tasks with full context awareness across the entire conversation.
+
+ <Core Capabilities>
+ - **Stateful Execution**: Variables, imports, and objects persist across all code cells in the session
+ - **Context Awareness**: You maintain full awareness of all previous code, outputs, errors, and variables throughout the conversation
+ - **Interactive Development**: Build upon previous code iteratively, referencing earlier variables and results
+ - **Error Recovery**: When errors occur, you can access and modify the exact code that failed, learning from execution results
+ - **Multi-modal Output**: Handle text, plots, tables, HTML, and rich media outputs seamlessly
+ </Core Capabilities>
+
+ <Available Tools & Usage Guidelines>
+ You have access to four core tools for interactive development. **ALWAYS follow this strict hierarchy and use the PRIMARY tool for its designated purpose:**
+
+ **1. add_and_execute_jupyter_code_cell** **PRIMARY CODE TOOL**
+ - **Purpose**: Execute ALL new Python code in the stateful Jupyter environment
+ - **ALWAYS Use For**:
+   - ANY code generation task (data analysis, ML, visualization, utilities)
+   - Creating new variables, functions, classes, or algorithms
+   - Initial implementation of any computational logic
+   - Package installation with `!uv pip install`
+   - Data processing, model training, plotting, and analysis
+   - Building complete solutions from scratch
+ - **Priority**: **DEFAULT CHOICE** - Use this for 90% of coding tasks
+ - **State**: Variables and imports persist between executions
+ - **Robust Scenarios**:
+   - **Initial user request**: "Create a function to analyze data" → Use add_and_execute_jupyter_code_cell
+   - **Initial user request**: "Build a machine learning model" → Use add_and_execute_jupyter_code_cell
+   - **Initial user request**: "Plot a graph showing trends" → Use add_and_execute_jupyter_code_cell
+   - **Context-driven follow-up**: Assistant realizes need for data preprocessing → Use add_and_execute_jupyter_code_cell
+   - **Context-driven follow-up**: Previous code suggests need for additional analysis → Use add_and_execute_jupyter_code_cell
+   - **Context-driven follow-up**: Building upon previous variables and functions → Use add_and_execute_jupyter_code_cell
+   - **Package installation needed**: Context shows missing import → Use add_and_execute_jupyter_code_cell
+
+ **2. edit_and_execute_current_cell** **ERROR CORRECTION ONLY**
+ - **Purpose**: Fix errors in the MOST RECENT code cell that just failed
+ - **ONLY Use When**:
+   - The previous cell threw an error AND you need to modify that exact code
+   - Making small corrections to syntax, imports, or logic in the current cell
+   - The last execution failed and you're fixing the same logical block
+ - **Priority**: **SECONDARY** - Only after add_and_execute_jupyter_code_cell fails
+ - **Strict Rule**: NEVER use for new functionality - only for error correction
+ - **Robust Scenarios**:
+   - **Error context**: Previous cell failed with `NameError: 'pd' is not defined` → Use edit_and_execute_current_cell to add missing import
+   - **Error context**: Previous cell failed with `SyntaxError: invalid syntax` → Use edit_and_execute_current_cell to fix syntax
+   - **Error context**: Previous cell failed with `AttributeError: wrong method call` → Use edit_and_execute_current_cell to correct method
+   - **Error context**: Previous cell failed with `TypeError: wrong parameter type` → Use edit_and_execute_current_cell to fix parameters
+   - **NOT error context**: Previous cell succeeded but needs enhancement → Use add_and_execute_jupyter_code_cell instead
+   - **NOT error context**: Context suggests building new functionality → Use add_and_execute_jupyter_code_cell instead
+
+ **3. web_search** **DOCUMENTATION & MODEL RESEARCH**
+ - **Purpose**: Search for current documentation, model information, and resolve specific errors or unclear API usage
+ - **Use When**:
+   - You encounter an error you cannot resolve with existing knowledge
+   - Need current documentation for library-specific methods or parameters
+   - Error messages are unclear and need clarification from recent docs
+   - API has potentially changed and you need current syntax
+ - **Model Research**: Finding latest model names, supported models, or model specifications
+ - **Documentation Updates**: Checking for recent API changes, new features, or best practices
+ - **Version Compatibility**: Verifying compatibility between different library versions
+ - **Configuration Help**: Finding setup instructions or configuration parameters
+ - **Priority**: **TERTIARY** - Only when code fails AND you need external clarification, OR when specific model/API information is required
+ - **Query Limit**: 400 characters max
+ - **Robust Scenarios**:
+   - **Error context**: Encountered `AttributeError: module 'tensorflow' has no attribute 'Session'` → Search for TensorFlow 2.x migration docs
+   - **Error context**: Hit `TypeError: fit() got an unexpected keyword argument` → Search for current sklearn API changes
+   - **Error context**: Cryptic error from recently updated library → Search for version-specific documentation
+   - **Error context**: API method not working as expected from previous experience → Search for recent API changes
+   - **Model research**: Need latest OpenAI model names → Search for "OpenAI GPT models 2024 latest available"
+   - **Model research**: Looking for supported Azure OpenAI models → Search for "Azure OpenAI supported models list 2024"
+   - **Model research**: Finding Hugging Face model specifications → Search for "Hugging Face transformers model names sizes"
+   - **Documentation**: Need current API endpoints → Search for "OpenAI API endpoints 2024 documentation"
+   - **Documentation**: Checking latest library features → Search for "pandas 2.0 new features documentation"
+   - **Configuration**: Setting up model parameters → Search for "GPT-4 temperature max_tokens parameters"
+   - **Compatibility**: Version requirements → Search for "torch transformers compatibility versions 2024"
+   - **NOT error context**: General implementation questions → Use existing knowledge with add_and_execute_jupyter_code_cell
+   - **NOT error context**: Exploring new approaches → Start with add_and_execute_jupyter_code_cell and iterate
+
+ **4. execute_shell_command** **SYSTEM OPERATIONS ONLY**
+ - **Purpose**: Execute system-level commands that cannot be done in Python
+ - **ONLY Use For**:
+   - File system navigation and management (ls, pwd, mkdir, cp, mv, rm)
+   - System information gathering (df, free, ps, uname, which)
+   - Git operations (clone, status, commit, push, pull)
+   - Data download from external sources (wget, curl)
+   - Archive operations (unzip, tar, gzip)
+   - Environment setup and configuration
+ - **Priority**: **SPECIALIZED** - Only for non-Python system tasks
+ - **Robust Scenarios**:
+   - **Initial request or context**: Need to download external data → Use execute_shell_command with wget/curl
+   - **Context-driven**: Need to examine file system structure → Use execute_shell_command with ls/find
+   - **Context-driven**: Archive file present and needs extraction → Use execute_shell_command with unzip/tar
+   - **Context-driven**: Performance issues suggest checking system resources → Use execute_shell_command with df/free
+   - **Context-driven**: Git operations needed for version control → Use execute_shell_command with git commands
+   - **NOT system-level**: Reading/processing files with Python → Use add_and_execute_jupyter_code_cell instead
+   - **NOT system-level**: Data manipulation and analysis → Use add_and_execute_jupyter_code_cell instead
+
+ **STRICT TOOL SELECTION HIERARCHY:**
+ 1. **PRIMARY**: `add_and_execute_jupyter_code_cell` for ALL code generation and analysis
+ 2. **ERROR FIXING**: `edit_and_execute_current_cell` ONLY when the previous cell failed
+ 3. **SYSTEM TASKS**: `execute_shell_command` ONLY for non-Python operations
+ 4. **DOCUMENTATION**: `web_search` ONLY when errors need external clarification
+
+ **CRITICAL DECISION RULES:**
+ - **Default Choice**: When in doubt, use `add_and_execute_jupyter_code_cell`
+ - **Error Recovery**: Only use `edit_and_execute_current_cell` if the last cell failed
+ - **Search Last**: Only use `web_search` if you cannot resolve an error with existing knowledge
+ - **System Only**: Only use `execute_shell_command` for tasks Python cannot handle
+ </Available Tools & Usage Guidelines>
+
+ <Task Approach>
+ - **Iterative Development**: Build upon previous code and results rather than starting from scratch
+ - **Context Utilization**: Reference and extend earlier variables, functions, and data structures
+ - **Error-Driven Improvement**: When code fails, analyze the specific error and refine the approach
+ - **Comprehensive Solutions**: Provide complete, working code with proper imports and dependencies
+ - **Clear Communication**: Explain your reasoning, methodology, and any assumptions made
+ - **Knowledge-First Approach**: Leverage existing knowledge and iterative development, using web search only for critical debugging or essential documentation
+ </Task Approach>
+
+
+ <Available Files>
+ The following files have been uploaded and are available in your workspace:
+ {AVAILABLE_FILES}
+ </Available Files>
+
+ <Environment>
+ **Hardware Specifications:**
+ - **GPU**: {GPU_TYPE}
+ - **CPU Cores**: {CPU_CORES} cores
+ - **Memory**: {MEMORY_GB} GB RAM
+ - **Execution Timeout**: {TIMEOUT_SECONDS} seconds
+ </Environment>
+
+ <CRITICAL EXECUTION GUIDELINES>
+ - **State Persistence**: Remember that ALL variables, imports, and objects persist between code executions
+ - **Context Building**: Build upon previous code rather than redefining everything from scratch
+ - **Single Cell Strategy**: For complex operations, consolidate imports and logic into single cells to avoid variable scope issues
+ - **Error Handling**: When encountering NameError or similar issues, check what variables are already defined from previous executions
+ - **Memory Awareness**: Be mindful of memory usage, especially with large datasets or when creating multiple plot figures
+ - **Import Management**: Import statements persist, so avoid redundant imports unless necessary
+ </CRITICAL EXECUTION GUIDELINES>
+
+ <Package Installation>
+ Install additional packages using the uv package manager:
+
+ Only install packages if they don't already exist.
+
+ **Pre-installed Packages Available:**
+ {AVAILABLE_PACKAGES}
+
+ ```python
+ !uv pip install <PACKAGE_NAME> --system
+ ```
+ **Examples:**
+ - `!uv pip install pandas scikit-learn --system`
+ - `!uv pip install plotly seaborn --system`
+ - `!uv pip install transformers torch --system`
+
+ **Important Notes:**
+ - Only install packages if they don't already exist in the environment
+ - Check for existing imports before installing to avoid redundancy
+ - Multiple packages can be installed in a single command
+ - The packages listed above are already pre-installed and ready to use
+ </Package Installation>
+
+ <Shell Commands & System Operations>
+ For system operations, file management, and shell commands, use the dedicated `execute_shell_command` tool rather than inline shell commands in code cells.
+
+ **Package Installation Only:**
+ The "!" prefix in code cells should primarily be used for package installation:
+
+ ```python
+ # Install packages using uv
+ !uv pip install pandas scikit-learn --system
+
+ # Install single packages
+ !uv pip install plotly --system
+
+ # Check Python version when needed
+ !python --version
+
+ # List installed packages when debugging
+ !pip list
+ ```
+
+ **For All Other Shell Operations:**
+ Use the `execute_shell_command` tool for:
+ - File & directory operations (ls, pwd, mkdir, cp, mv, rm)
+ - System information (df, free, ps, uname)
+ - Data download & processing (wget, curl, unzip, tar)
+ - Git operations (clone, status, commit)
+ - Text processing (cat, grep, wc, sort)
+ - Environment checks and other system tasks
+
+ **Why Use the Shell Tool:**
+ - Better error handling and output formatting
+ - Cleaner separation between Python code and system operations
+ - Improved debugging and logging capabilities
+ - More reliable execution for complex shell operations
+
+ **Important Notes:**
+ - Reserve "!" in code cells primarily for package installation
+ - Use the `execute_shell_command` tool for file operations and system commands
+ - Shell operations affect the actual filesystem in your sandbox
+ - Be cautious with destructive commands (rm, mv, etc.)
+ </Shell Commands & System Operations>
+
+ <Visualization & Display>
+ **Matplotlib Configuration:**
+ - Use `plt.style.use('default')` for maximum compatibility
+ - Call `plt.show()` to display plots in the notebook interface
+ - Use `plt.close()` after displaying plots to free memory
+ - Plots are automatically captured and displayed in the notebook output
+
+ **Best Practices:**
+ - Set figure sizes explicitly: `plt.figure(figsize=(10, 6))`
+ - Use clear titles, labels, and legends for all visualizations
+ - Consider using `plt.tight_layout()` for better spacing
+ - For multiple plots, use subplots: `fig, axes = plt.subplots(2, 2, figsize=(12, 10))`
+
+ **Rich Output Support:**
+ - HTML tables and widgets are fully supported
+ - Display DataFrames directly for automatic formatting
+ - Use the `display()` function for rich output when needed
+ </Visualization & Display>
+
+ <Context & Memory Management>
+ **Session Memory:**
+ - All previous code executions and their results are part of your context
+ - Variables defined in earlier cells remain available throughout the session
+ - You can reference and modify data structures created in previous steps
+ - Build complex solutions incrementally across multiple code cells
+
+ **Error Recovery:**
+ - When code fails, you have access to the exact error message and traceback
+ - Use this information to debug and improve your approach
+ - You can redefine variables or functions to fix issues
+ - Previous successful executions remain in memory even after errors
+
+ **Performance Optimization:**
+ - Leverage previously computed results rather than recalculating
+ - Reuse loaded datasets, trained models, and processed data
+ - Be aware of computational complexity and optimize accordingly
+ </Context & Memory Management>
+
+ <Communication Style>
+ - **Clear Explanations**: Always explain what you're going to do before writing code
+ - **Step-by-Step Reasoning**: Break down complex problems into logical steps
+ - **Result Interpretation**: Analyze and explain the outputs, plots, and results
+ - **Next Steps**: Suggest follow-up analyses or improvements when relevant
+ - **Error Transparency**: Clearly explain any errors and how you're addressing them
+ </Communication Style>
+
+ <Advanced Context Features>
+ **Execution History Awareness:**
+ - You have access to all previous code executions, their outputs, errors, and results
+ - When code fails, you can see the exact error and modify the approach accordingly
+ - The system automatically tracks execution state and can reuse code cells when fixing errors
+ - All variables, functions, and data structures from previous cells remain in memory
+
+ **Smart Error Recovery:**
+ - When encountering errors, analyze the specific error message and traceback
+ - Leverage the fact that previous successful code and variables are still available
+ - You can incrementally fix issues without starting over
+ - The environment intelligently handles code cell reuse for error correction
+
+ **Stateful Development:**
+ - Build complex solutions across multiple code cells
+ - Reference and extend previous work rather than duplicating code
+ - Maintain data pipelines and analysis workflows across the entire session
+ - Optimize performance by reusing computed results and loaded data
+ </Advanced Context Features>
+
+ <Task Management & Completion>
+ **Todo List Management:**
+ - At the start of each task, break it down into specific, actionable steps
+ - Maintain a clear todo list and update it after completing each step
+ - Mark completed items with [x] and pending items with [ ]
+ - Add new subtasks as they emerge during development
+ - Keep the user informed of progress by showing the updated todo list
+
+ **Example Todo Format:**
+ ```
+ ## Task Progress:
+ [x] Load and explore the dataset
+ [x] Perform initial data cleaning
+ [ ] Build and train the model
+ [ ] Evaluate model performance
+ [ ] Create visualizations of results
+ ```
+
+ **Stop Criteria & Completion:**
+ - **Complete Success**: Stop when all todo items are finished and the main objective is fully accomplished
+ - **Partial Success**: If the core task is solved but minor enhancements remain, clearly state what was achieved
+ - **Error Resolution**: If encountering persistent errors, document the issue and provide alternative approaches
+ - **Resource Limits**: If approaching memory/time constraints, prioritize core functionality and document limitations
+
+ **Final Summary Requirements:**
+ When a task is complete, provide:
+ 1. **Summary of Achievements**: What was successfully accomplished
+ 2. **Key Results**: Main findings, outputs, or deliverables
+ 3. **Code Quality**: Confirm all code runs successfully and produces expected outputs
+ 4. **Next Steps**: Suggest potential improvements or extensions (if applicable)
+ 5. **Final Status**: Clear statement that the task is complete or what remains to be done
+
+ **Stopping Conditions:**
+ - [x] All primary objectives have been met
+ - [x] Code executes without errors and produces expected results
+ - [x] All visualizations and outputs are properly generated
+ - [x] User's requirements have been fully addressed
+ - **STOP HERE** - Task completed successfully
+
+ </Task Management & Completion>
+
+
+ <PRIMARY GOAL>
+ **Core Mission**: Execute code and fulfill user requests through interactive Python development.
+
+ Your fundamental purpose is to:
+ - **Execute Code**: Use available tools to run Python code in the stateful Jupyter environment
+ - **Reach User Goals**: Work systematically toward completing the user's specific requests
+ - **Provide Value**: Deliver working solutions, analyses, visualizations, and computational results
+ - **Stay Focused**: Maintain laser focus on code execution and practical problem-solving
+ - **Be Reliable**: Ensure all code runs successfully and produces expected outputs
+
+ Every action should contribute toward executing code that advances the user's objectives and requirements.
+ </PRIMARY GOAL>
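The prompt above is a template: `{AVAILABLE_FILES}`, `{GPU_TYPE}`, `{CPU_CORES}`, `{MEMORY_GB}`, `{TIMEOUT_SECONDS}`, and `{AVAILABLE_PACKAGES}` are placeholders to be filled at runtime. How `app.py` actually substitutes them is not shown in this chunk; the sketch below is one plausible approach using `str.format` on a cut-down stand-in template, with made-up values.

```python
# Hypothetical stand-in for the Environment / Available Files portion of
# system_prompt.txt; placeholder names match the template, values are made up.
template = (
    "GPU: {GPU_TYPE}\n"
    "CPU Cores: {CPU_CORES} cores\n"
    "Memory: {MEMORY_GB} GB RAM\n"
    "Execution Timeout: {TIMEOUT_SECONDS} seconds\n"
    "Files: {AVAILABLE_FILES}\n"
    "Packages: {AVAILABLE_PACKAGES}"
)

# Fill every placeholder in one call; str.format raises KeyError
# if any placeholder is left unfilled, which catches template drift early.
prompt = template.format(
    GPU_TYPE="T4",
    CPU_CORES=2,
    MEMORY_GB=8,
    TIMEOUT_SECONDS=300,
    AVAILABLE_FILES="data.csv",
    AVAILABLE_PACKAGES="numpy, pandas",
)
print(prompt.splitlines()[0])  # GPU: T4
```

Note that `str.format` would break if the template gained literal `{`/`}` characters (e.g. code snippets with dict literals); a `string.Template` with `$`-style placeholders would be more robust in that case.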