Fix critical system prompt issues
- Remove 'return' statements from script context (causing syntax errors)
- Add library usage rules preferring numpy/sklearn over statsmodels
- Emphasize if/else logic flow instead of return statements
- Add graceful library import handling
- Guide toward simpler trend analysis with numpy.polyfit()
This fixes 'return outside function' and missing library errors.
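For illustration, a minimal sketch of the 'return outside function' failure and the if/else alternative. It assumes the generated code is executed as a top-level script (for example via exec) and that the result is read from a variable named `answer`; the `run_script` helper and both snippets are hypothetical and not taken from src.py.

```python
# Hypothetical generated snippets; the actual model output is not part of this commit.
bad_script = '''
if not rows:
    return "No data available"
answer = len(rows)
'''

good_script = '''
if not rows:
    answer = "No data available"
else:
    answer = len(rows)
'''


def run_script(source, rows):
    """Run source as a top-level script (an assumption) and read back `answer`."""
    scope = {"rows": rows}
    exec(compile(source, "<generated>", "exec"), scope)
    return scope.get("answer")


bad_error = None
try:
    run_script(bad_script, [])
except SyntaxError as err:
    # compile() rejects the script before any line of it runs
    bad_error = err.msg
print(bad_error)

print(run_script(good_script, []))
print(run_script(good_script, [1, 2, 3]))
```

The first script never runs because compilation fails, which is why the prompt now steers generated code toward assigning `answer` in each branch.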
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <[email protected]>
src.py
CHANGED
@@ -296,8 +296,13 @@ You can use these pre-installed libraries:
 - statsmodels (statistical modeling, trend analysis)
 - scikit-learn (machine learning, regression)
 - geopandas (geospatial analysis)
-
-
+
+LIBRARY USAGE RULES:
+- For trend analysis: Use numpy.polyfit(x, y, 1) for simple linear trends
+- For regression: Use sklearn.linear_model.LinearRegression() for robust regression
+- For statistical modeling: Use statsmodels only if needed, otherwise use numpy/sklearn
+- Always import libraries at the top: import numpy as np, from sklearn.linear_model import LinearRegression
+- Handle missing libraries gracefully with try-except around imports
 
 OUTPUT TYPE REQUIREMENTS:
 1. PLOT GENERATION (for "plot", "chart", "visualize", "show trend", "graph"):
@@ -320,32 +325,33 @@ OUTPUT TYPE REQUIREMENTS:
 MANDATORY SAFETY & ROBUSTNESS RULES:
 
 DATA VALIDATION (ALWAYS CHECK):
-- Check if DataFrame exists and not empty: if df.empty: answer = "No data available"
-- Validate required columns exist: if 'PM2.5' not in df.columns: answer = "Required data not available"
-- Check for sufficient data: if len(df) < 10: answer = "Insufficient data for analysis"
+- Check if DataFrame exists and not empty: if df.empty: answer = "No data available"
+- Validate required columns exist: if 'PM2.5' not in df.columns: answer = "Required data not available"
+- Check for sufficient data: if len(df) < 10: answer = "Insufficient data for analysis"
 - Remove invalid/missing values: df = df.dropna(subset=['PM2.5', 'city', 'Timestamp'])
--
+- Use early exit pattern: if condition: answer = "error message"; else: continue with analysis
 
 OPERATION SAFETY (PREVENT CRASHES):
 - Wrap risky operations in try-except blocks
 - Check denominators before division: if denominator == 0: continue
 - Validate indexing bounds: if idx >= len(array): continue
-- Check for empty results after filtering: if result_df.empty: answer = "No data found"
+- Check for empty results after filtering: if result_df.empty: answer = "No data found"
 - Convert data types explicitly: pd.to_numeric(), .astype(int), .astype(str)
 - Handle timezone issues with datetime operations
+- NO return statements - this is script context, use if/else logic flow
 
 PLOT GENERATION (MANDATORY FOR PLOTS):
-- Check data exists before plotting: if plot_data.empty: answer = "No data to plot"
+- Check data exists before plotting: if plot_data.empty: answer = "No data to plot"
 - Always create new figure: plt.figure(figsize=(12, 8))
 - Add comprehensive labels: plt.title(), plt.xlabel(), plt.ylabel()
 - Handle long city names: plt.xticks(rotation=45, ha='right')
 - Use tight layout: plt.tight_layout()
-- CRITICAL PLOT SAVING SEQUENCE:
+- CRITICAL PLOT SAVING SEQUENCE (no return statements):
 1. filename = f"plot_{uuid.uuid4().hex[:8]}.png"
 2. plt.savefig(filename, dpi=300, bbox_inches='tight')
 3. plt.close()
 4. answer = filename
--
+- Use if/else logic: if data_valid: create_plot(); answer = filename else: answer = "error"
 
 CRITICAL CODING PRACTICES:
 
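As a rough sketch of what the added library rules ask generated code to do, the script below wraps the numpy import in try-except, computes a simple linear trend with numpy.polyfit(x, y, 1), and uses if/else flow to set `answer` with no return at script level. The `linear_trend` helper, the pure-Python fallback, and the sample readings are assumptions made for illustration and do not come from src.py.

```python
try:
    import numpy as np
except ImportError:  # degrade gracefully if numpy is not installed
    np = None


def linear_trend(y_values):
    """Return the least-squares slope of y_values against their index."""
    x = list(range(len(y_values)))
    if np is not None:
        # Degree-1 fit returns coefficients ordered [slope, intercept]
        slope, _intercept = np.polyfit(x, y_values, 1)
        return float(slope)
    # Pure-Python fallback: closed-form ordinary least squares
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y_values) / n
    num = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y_values))
    den = sum((xi - mean_x) ** 2 for xi in x)
    return num / den


# Made-up PM2.5 readings, one per time step
readings = [40.0, 42.0, 44.0, 46.0, 48.0]

# Script-level flow: assign `answer` in each branch instead of returning
if len(readings) < 2:
    answer = "Insufficient data for analysis"
else:
    slope = linear_trend(readings)
    answer = f"PM2.5 changes by {slope:.2f} per step"

print(answer)
```

The fallback branch is one possible way to handle a missing library; the prompt itself only asks for try-except around imports and does not say what the generated code should do afterwards.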