Nipun Claude committed on
Commit c372914 · 1 Parent(s): 3645162

Add support for DataFrame outputs for better tabular data display

SYSTEM PROMPT IMPROVEMENTS:
- Add three output types: TEXT (simple answers), PLOTS (visualizations), DATAFRAMES (tabular data)
- Specify when to use each type:
  * TEXT: Simple 1-2 value answers (Which city, What month)
  * PLOTS: Visualization requests (Plot, Show chart, Visualize)
  * DATAFRAMES: Tabular data like city rankings, improvement rates, comparisons
- Add guidance for proper DataFrame formatting and sorting

UI IMPROVEMENTS:
- Add DataFrame detection in response handler
- Display DataFrames with st.dataframe() for better formatting and interactivity
- Replace long text lists with tables for tabular data such as city pollution improvement rates
- Maintain existing text and plot display logic

This should solve the issue of long unreadable text outputs for tabular data like the pollution improvement rates example.
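
As a rough illustration of the three output types described above, the routing can be sketched as a small standalone function. This is a minimal sketch rather than the code in `app.py`: the name `classify_answer`, the constant `IMAGE_EXTENSIONS`, and the sample city labels are hypothetical, and the checks simply mirror the `isinstance` and extension tests shown in the diff.

```python
import pandas as pd

# Extensions treated as plot filenames (same list as in the diff).
IMAGE_EXTENSIONS = (".png", ".jpg", ".jpeg")


def classify_answer(content):
    """Return 'dataframe', 'plot', or 'text' for a generated answer.

    Hypothetical helper: it mirrors the checks in the diff but is not
    part of the actual app.
    """
    if isinstance(content, pd.DataFrame):
        return "dataframe"
    if isinstance(content, str) and any(ext in content for ext in IMAGE_EXTENSIONS):
        return "plot"
    return "text"


# Placeholder data; the column names and values are made up for illustration.
rankings = pd.DataFrame({"City": ["A", "B"], "Improvement (%)": [12.5, 8.1]})

print(classify_answer(rankings))             # dataframe
print(classify_answer("plot_1a2b3c4d.png"))  # plot
print(classify_answer("Delhi"))              # text
```

In a Streamlit handler, the `'dataframe'` branch would then call `st.dataframe(content)`, while the other two branches keep the existing text and plot paths.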

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>

Files changed (2)
  1. app.py +16 -1
  2. src.py +11 -5
app.py CHANGED
@@ -669,8 +669,12 @@ def show_custom_response(response):
     # Check if content is an image filename - don't display the filename text
     is_image_path = isinstance(content, str) and any(ext in content for ext in ['.png', '.jpg', '.jpeg'])
 
+    # Check if content is a pandas DataFrame
+    import pandas as pd
+    is_dataframe = isinstance(content, pd.DataFrame)
+
     # Assistant message with left alignment - reduced margins
-    if not is_image_path:
+    if not is_image_path and not is_dataframe:
         st.markdown(f"""
         <div style='display: flex; justify-content: flex-start; margin: 1rem 0;'>
             <div class='assistant-message'>
@@ -679,6 +683,17 @@ def show_custom_response(response):
             </div>
         </div>
         """, unsafe_allow_html=True)
+    elif is_dataframe:
+        # Display DataFrame with nice formatting
+        st.markdown("""
+        <div style='display: flex; justify-content: flex-start; margin: 1rem 0;'>
+            <div class='assistant-message'>
+                <div class='assistant-info'>VayuChat</div>
+                Here are the results:
+            </div>
+        </div>
+        """, unsafe_allow_html=True)
+        st.dataframe(content, use_container_width=True)
 
     # Show generated code with Streamlit expander
     if response.get("gen_code"):
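
One thing worth noting about the handler above: `ext in content` is a substring test, so a plain text answer that merely mentions ".png" would also be treated as an image path and its text would not be displayed. A stricter variant could look like the sketch below. It is an assumption-laden alternative, not the committed code: `looks_like_image_path` is a hypothetical name, and it assumes the plot answer is exactly a filename with nothing after the extension.

```python
# Extensions treated as plot filenames (same list as in the diff).
IMAGE_EXTENSIONS = (".png", ".jpg", ".jpeg")


def looks_like_image_path(content):
    """Return True only if `content` is a string that ends with an image extension.

    Hypothetical stricter replacement for the substring check in the diff.
    """
    if not isinstance(content, str):
        return False
    # str.endswith accepts a tuple of suffixes; lower() makes the test case-insensitive.
    return content.strip().lower().endswith(IMAGE_EXTENSIONS)


print(looks_like_image_path("plot_1a2b3c4d.png"))                 # True
print(looks_like_image_path("Export as .png is not supported."))  # False
print(looks_like_image_path(None))                                # False
```

The trade-off is that an answer such as "Saved to plot_1a2b3c4d.png." (with trailing punctuation) would no longer be detected, so this only fits if the generated code stores the bare filename in `answer`, as the system prompt requests.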
src.py CHANGED
@@ -289,11 +289,15 @@ df["Timestamp"] = pd.to_datetime(df["Timestamp"])
 
 IMPORTANT: Only generate Python code - no explanations, no thinking, just clean code.
 
-WHEN TO CREATE PLOTS vs TEXT ANSWERS:
-- Questions asking "Which", "What", specific values → TEXT ANSWERS (store text in 'answer')
-- Questions asking "Plot", "Show", "Visualize", "Chart" → PLOTS (store filename in 'answer')
-- Questions asking for comparisons of many items → PLOTS
-- Simple direct questions → TEXT ANSWERS
+WHEN TO USE DIFFERENT OUTPUT TYPES:
+- Simple questions asking "Which city", "What month" (1-2 values) → TEXT ANSWERS (store text in 'answer')
+- Questions asking "Plot", "Show chart", "Visualize" → PLOTS (store filename in 'answer')
+- Questions with tabular data (lists of cities, rates, rankings, comparisons) → DATAFRAMES (store dataframe in 'answer')
+- Examples of DATAFRAME outputs:
+  * Lists of cities with values (pollution levels, improvement rates)
+  * Rankings or comparisons across multiple entities
+  * Any result that would be >5 rows of data
+  * Calculate/List/Compare operations with multiple results
 
 SAFETY & ROBUSTNESS RULES:
 - Always check if data exists before processing: if df.empty: answer = "No data available"
@@ -316,9 +320,11 @@ TECHNICAL REQUIREMENTS:
 - Save final result in variable called 'answer'
 - For TEXT: Store the direct answer as a string in 'answer'
 - For PLOTS: Save with unique filename f"plot_{{uuid.uuid4().hex[:8]}}.png" and store filename in 'answer'
+- For DATAFRAMES: Store the pandas DataFrame directly in 'answer' (e.g., answer = result_df)
 - Always use .iloc or .loc properly for pandas indexing
 - Close matplotlib figures with plt.close() to prevent memory leaks
 - Use proper column name checks before accessing columns
+- For dataframes, ensure proper column names and sorting for readability
 """
 
 query = f"""{system_prompt}
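
To make the DATAFRAMES rules in the prompt more concrete, the following is a sketch of the kind of code the model might generate for a ranking question. Everything data-related here is an assumption for illustration: the column names `city` and `PM2.5`, the sample values, and the output labels are hypothetical and are not taken from the app's dataset.

```python
import pandas as pd

# Hypothetical stand-in for the app's `df`; real column names may differ.
df = pd.DataFrame({
    "city": ["A", "B", "A", "B", "C"],
    "PM2.5": [80.0, 60.0, 70.0, 65.0, 40.0],
})

required = {"city", "PM2.5"}

if df.empty:
    # Safety rule from the prompt: check that data exists first.
    answer = "No data available"
elif not required.issubset(df.columns):
    # Column name check before accessing columns.
    answer = "Required columns are missing"
else:
    result_df = (
        df.groupby("city", as_index=False)["PM2.5"]
        .mean()
        # Readable column names for display.
        .rename(columns={"city": "City", "PM2.5": "Mean PM2.5"})
        # Sort so the highest values come first.
        .sort_values("Mean PM2.5", ascending=False)
        .reset_index(drop=True)
    )
    # DATAFRAMES rule: store the DataFrame itself in 'answer'.
    answer = result_df

print(answer)
```

Because `answer` holds a `pd.DataFrame` in the normal case and a string in the fallback cases, the response handler's `isinstance(content, pd.DataFrame)` check decides which display path is used.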