Add comprehensive safety and robustness guidelines to system prompt
- Add data validation checks (empty dataframes, missing values)
- Add error handling with try-except blocks
- Add city/location validation before filtering
- Add proper handling of empty results after filtering
- Add numerical formatting (.round(2)) to avoid long decimals
- Add division by zero protection
- Add date range validation
- Add proper units formatting (μg/m³)
- Add memory management (plt.close())
- Add column name validation
- This should make the generated code much more robust and safe
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <[email protected]>
src.py
CHANGED
@@ -295,12 +295,25 @@ WHEN TO CREATE PLOTS vs TEXT ANSWERS:
 - Questions asking for comparisons of many items → PLOTS
 - Simple direct questions → TEXT ANSWERS
 
-
+SAFETY & ROBUSTNESS RULES:
+- Always check if data exists before processing: if df.empty: answer = "No data available"
+- Handle missing values: use .dropna() or .fillna() appropriately
+- Use try-except blocks for risky operations like indexing
+- Validate city/location names exist in data before filtering
+- Check for empty results after filtering: if filtered_df.empty: answer = "No data found for specified criteria"
+- Use .round(2) for numerical results to avoid long decimals
+- Handle division by zero: check denominators before division
+- Validate date ranges exist in data
+- Use proper string formatting for answers with units (μg/m³)
+
+TECHNICAL REQUIREMENTS:
 - Save final result in variable called 'answer'
 - For TEXT: Store the direct answer as a string in 'answer'
 - For PLOTS: Save with unique filename f"plot_{{uuid.uuid4().hex[:8]}}.png" and store filename in 'answer'
 - Convert numpy types to int when using as indices: int(value)
 - Always use .iloc or .loc properly for pandas indexing
+- Close matplotlib figures with plt.close() to prevent memory leaks
+- Use proper column name checks before accessing columns
 """
 
 query = f"""{system_prompt}
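As a rough illustration of the kind of generated code the rules in this hunk are meant to encourage, the sketch below applies several of them (empty-data check, column check, city validation, missing-value handling, rounding, units) to a small made-up dataframe. The column names `city` and `pm25`, the helper `mean_pm25_for_city`, and the sample values are assumptions for illustration only; none of them come from src.py.

```python
import pandas as pd


def mean_pm25_for_city(df: pd.DataFrame, city: str) -> str:
    """Hypothetical helper: build a text answer while following the safety rules."""
    # Rule: check that data exists before processing.
    if df.empty:
        return "No data available"

    # Rule: check column names before accessing them.
    missing = {"city", "pm25"} - set(df.columns)
    if missing:
        return f"Missing columns: {sorted(missing)}"

    # Rule: validate that the city name exists before filtering.
    if city not in set(df["city"].dropna()):
        return f"City '{city}' not found in data"

    # Rule: handle missing values, then check for an empty result.
    filtered_df = df[df["city"] == city].dropna(subset=["pm25"])
    if filtered_df.empty:
        return "No data found for specified criteria"

    # Rule: round numerical results and include units in the answer.
    mean_value = round(float(filtered_df["pm25"].mean()), 2)
    return f"Average PM2.5 in {city}: {mean_value} μg/m³"


# Made-up sample data, for illustration only.
sample = pd.DataFrame(
    {
        "city": ["Delhi", "Delhi", "Mumbai", "Pune"],
        "pm25": [100.0, 50.5, 40.123, None],
    }
)

# Per the technical requirements, the final result is stored in 'answer'.
answer = mean_pm25_for_city(sample, "Delhi")
```

Returning a plain message instead of raising keeps `answer` a string on every path, which matches the rule that text answers are stored directly in `answer`. Whether the real prompt produces code shaped like this depends on the model and the question asked.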