Fix critical system prompt issues
- Remove 'return' statements from script context (causing syntax errors)
- Add library usage rules preferring numpy/sklearn over statsmodels
- Emphasize if/else logic flow instead of return statements
- Add graceful library import handling
- Guide toward simpler trend analysis with numpy.polyfit()
This fixes 'return outside function' and missing library errors.
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <[email protected]>
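
The 'return outside function' error named in the message can be reproduced with a minimal sketch. It assumes the generated code is executed at module level (here via `exec`, which is an assumption about the host application, not something this commit states). The snippet strings and `run_snippet` are hypothetical names used only for illustration.

```python
# Hypothetical stand-ins for model-generated analysis code.
with_return = '''
if not rows:
    answer = "No data available"
    return
answer = sum(rows) / len(rows)
'''

with_if_else = '''
if not rows:
    answer = "No data available"
else:
    answer = sum(rows) / len(rows)
'''


def run_snippet(source, rows):
    """Compile and run a snippet at module level, then report its 'answer'."""
    namespace = {"rows": rows}
    try:
        exec(compile(source, "<generated>", "exec"), namespace)
    except SyntaxError as exc:
        # A top-level 'return' is rejected at compile time, before any line runs.
        return f"SyntaxError: {exc.msg}"
    return namespace["answer"]


print(run_snippet(with_return, [1, 2, 3]))   # SyntaxError: 'return' outside function
print(run_snippet(with_if_else, [1, 2, 3]))  # 2.0
print(run_snippet(with_if_else, []))         # No data available
```

Because the error is raised while compiling, the whole script fails even when the early-exit branch would never have been taken, which is why the prompt now steers toward if/else flow.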
src.py
CHANGED
@@ -296,8 +296,13 @@ You can use these pre-installed libraries:
 - statsmodels (statistical modeling, trend analysis)
 - scikit-learn (machine learning, regression)
 - geopandas (geospatial analysis)
-
-
 
 OUTPUT TYPE REQUIREMENTS:
 1. PLOT GENERATION (for "plot", "chart", "visualize", "show trend", "graph"):
@@ -320,32 +325,33 @@ OUTPUT TYPE REQUIREMENTS:
 MANDATORY SAFETY & ROBUSTNESS RULES:
 
 DATA VALIDATION (ALWAYS CHECK):
-- Check if DataFrame exists and not empty: if df.empty: answer = "No data available"
-- Validate required columns exist: if 'PM2.5' not in df.columns: answer = "Required data not available"
-- Check for sufficient data: if len(df) < 10: answer = "Insufficient data for analysis"
 - Remove invalid/missing values: df = df.dropna(subset=['PM2.5', 'city', 'Timestamp'])
--
 
 OPERATION SAFETY (PREVENT CRASHES):
 - Wrap risky operations in try-except blocks
 - Check denominators before division: if denominator == 0: continue
 - Validate indexing bounds: if idx >= len(array): continue
-- Check for empty results after filtering: if result_df.empty: answer = "No data found"
 - Convert data types explicitly: pd.to_numeric(), .astype(int), .astype(str)
 - Handle timezone issues with datetime operations
 
 PLOT GENERATION (MANDATORY FOR PLOTS):
-- Check data exists before plotting: if plot_data.empty: answer = "No data to plot"
 - Always create new figure: plt.figure(figsize=(12, 8))
 - Add comprehensive labels: plt.title(), plt.xlabel(), plt.ylabel()
 - Handle long city names: plt.xticks(rotation=45, ha='right')
 - Use tight layout: plt.tight_layout()
-- CRITICAL PLOT SAVING SEQUENCE:
 1. filename = f"plot_{uuid.uuid4().hex[:8]}.png"
 2. plt.savefig(filename, dpi=300, bbox_inches='tight')
 3. plt.close()
 4. answer = filename
--
 
 CRITICAL CODING PRACTICES:
 
@@ -296,8 +296,13 @@ You can use these pre-installed libraries:
 - statsmodels (statistical modeling, trend analysis)
 - scikit-learn (machine learning, regression)
 - geopandas (geospatial analysis)
+
+LIBRARY USAGE RULES:
+- For trend analysis: Use numpy.polyfit(x, y, 1) for simple linear trends
+- For regression: Use sklearn.linear_model.LinearRegression() for robust regression
+- For statistical modeling: Use statsmodels only if needed, otherwise use numpy/sklearn
+- Always import libraries at the top: import numpy as np, from sklearn.linear_model import LinearRegression
+- Handle missing libraries gracefully with try-except around imports
 
 OUTPUT TYPE REQUIREMENTS:
 1. PLOT GENERATION (for "plot", "chart", "visualize", "show trend", "graph"):
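
The new library rules (prefer `numpy.polyfit(x, y, 1)` for simple trends, guard imports with try-except) might look like the following in practice. This is only a sketch: the helper name `linear_trend`, the pure-Python fallback, and the sample values are assumptions made for illustration and are not part of the prompt.

```python
# Guarded import, as the rules suggest; the fallback below is an assumption.
try:
    import numpy as np
except ImportError:
    np = None


def linear_trend(x, y):
    """Return (slope, intercept) of the least-squares line through the points."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    if len(set(x)) < 2:
        raise ValueError("x values must not all be identical")

    if np is not None:
        # polyfit returns coefficients from highest degree down: [slope, intercept]
        slope, intercept = np.polyfit(x, y, 1)
        return float(slope), float(intercept)

    # Pure-Python ordinary least squares, used only when numpy is unavailable.
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Made-up monthly values, purely illustrative.
months = [0, 1, 2, 3]
values = [1.0, 3.0, 5.0, 7.0]
slope, intercept = linear_trend(months, values)
print(f"trend: {slope:.2f} per month, starting near {intercept:.2f}")
```

A positive slope indicates a rising trend and a negative slope a falling one; for a simple linear trend this avoids depending on statsmodels at all.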
@@ -320,32 +325,33 @@ OUTPUT TYPE REQUIREMENTS:
 MANDATORY SAFETY & ROBUSTNESS RULES:
 
 DATA VALIDATION (ALWAYS CHECK):
+- Check if DataFrame exists and not empty: if df.empty: answer = "No data available"
+- Validate required columns exist: if 'PM2.5' not in df.columns: answer = "Required data not available"
+- Check for sufficient data: if len(df) < 10: answer = "Insufficient data for analysis"
 - Remove invalid/missing values: df = df.dropna(subset=['PM2.5', 'city', 'Timestamp'])
+- Use early exit pattern: if condition: answer = "error message"; else: continue with analysis
 
 OPERATION SAFETY (PREVENT CRASHES):
 - Wrap risky operations in try-except blocks
 - Check denominators before division: if denominator == 0: continue
 - Validate indexing bounds: if idx >= len(array): continue
+- Check for empty results after filtering: if result_df.empty: answer = "No data found"
 - Convert data types explicitly: pd.to_numeric(), .astype(int), .astype(str)
 - Handle timezone issues with datetime operations
+- NO return statements - this is script context, use if/else logic flow
 
 PLOT GENERATION (MANDATORY FOR PLOTS):
+- Check data exists before plotting: if plot_data.empty: answer = "No data to plot"
 - Always create new figure: plt.figure(figsize=(12, 8))
 - Add comprehensive labels: plt.title(), plt.xlabel(), plt.ylabel()
 - Handle long city names: plt.xticks(rotation=45, ha='right')
 - Use tight layout: plt.tight_layout()
+- CRITICAL PLOT SAVING SEQUENCE (no return statements):
 1. filename = f"plot_{uuid.uuid4().hex[:8]}.png"
 2. plt.savefig(filename, dpi=300, bbox_inches='tight')
 3. plt.close()
 4. answer = filename
+- Use if/else logic: if data_valid: create_plot(); answer = filename else: answer = "error"
 
 CRITICAL CODING PRACTICES:
 
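
Taken together, the validation rules, the if/else flow, and the plot saving sequence in this hunk could produce a script shaped roughly like the one below. It is a sketch under stated assumptions: the sample DataFrame is made up, the `Agg` backend and the "Required libraries not available" message are choices made here for a headless run, and in the real application `df` would be supplied by the host rather than built inline.

```python
import os
import uuid

# Guarded imports; libs_ok is a hypothetical flag used only in this sketch.
try:
    import pandas as pd
    import matplotlib
    matplotlib.use("Agg")  # assumption: no display is available
    import matplotlib.pyplot as plt
    libs_ok = True
except ImportError:
    libs_ok = False

answer = None

if not libs_ok:
    answer = "Required libraries not available"
else:
    # Made-up sample data standing in for the application's DataFrame.
    df = pd.DataFrame({
        "city": ["Delhi", "Mumbai", "Chennai"] * 4,
        "PM2.5": [80.0, 45.0, 30.0, 95.0, None, 28.0,
                  110.0, 50.0, 33.0, 70.0, 42.0, 31.0],
        "Timestamp": pd.date_range("2024-01-01", periods=12, freq="D"),
    })

    # Script context: every exit path assigns 'answer'; no return statements.
    if df.empty:
        answer = "No data available"
    elif "PM2.5" not in df.columns:
        answer = "Required data not available"
    elif len(df) < 10:
        answer = "Insufficient data for analysis"
    else:
        df = df.dropna(subset=["PM2.5", "city", "Timestamp"])
        plot_data = df.groupby("city")["PM2.5"].mean()
        if plot_data.empty:
            answer = "No data to plot"
        else:
            plt.figure(figsize=(12, 8))
            plt.bar(plot_data.index, plot_data.values)
            plt.title("Mean PM2.5 by city")
            plt.xlabel("City")
            plt.ylabel("PM2.5")
            plt.xticks(rotation=45, ha="right")
            plt.tight_layout()
            # Saving sequence from the prompt: name, save, close, assign.
            filename = f"plot_{uuid.uuid4().hex[:8]}.png"
            plt.savefig(filename, dpi=300, bbox_inches="tight")
            plt.close()
            answer = filename

print(answer)
```

Whichever branch runs, `answer` ends up as either a short message or the saved file name, so the caller never has to handle a script that stopped partway.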