Spaces: SustainabilityLabIITGN / VayuChat

Nipun Claude committed 6 days ago

Commit c372914

1 Parent(s): 3645162

Add support for DataFrame outputs for better tabular data display

SYSTEM PROMPT IMPROVEMENTS:
- Add three output types: TEXT (simple answers), PLOTS (visualizations), DATAFRAMES (tabular data)
- Specify when to use each type:
  * TEXT: Simple 1-2 value answers (Which city, What month)
  * PLOTS: Visualization requests (Plot, Show chart, Visualize)
  * DATAFRAMES: Tabular data like city rankings, improvement rates, comparisons
- Add guidance for proper DataFrame formatting and sorting

UI IMPROVEMENTS:
- Add DataFrame detection in response handler
- Display DataFrames with st.dataframe() for better formatting and interactivity
- Replace long text lists with tables for tabular data such as city pollution improvement rates
- Keep the existing text and plot display logic unchanged

This should resolve the long, unreadable text output for tabular results, such as the pollution improvement rates example.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>

Files changed (2)

app.py +16 -1
src.py +11 -5

app.py CHANGED

@@ -669,8 +669,12 @@ def show_custom_response(response):
         # Check if content is an image filename - don't display the filename text
         is_image_path = isinstance(content, str) and any(ext in content for ext in ['.png', '.jpg', '.jpeg'])
+        # Check if content is a pandas DataFrame
+        import pandas as pd
+        is_dataframe = isinstance(content, pd.DataFrame)
         # Assistant message with left alignment - reduced margins
-        if not is_image_path:
+        if not is_image_path and not is_dataframe:
             st.markdown(f"""
             <div style='display: flex; justify-content: flex-start; margin: 1rem 0;'>
                 <div class='assistant-message'>
@@ -679,6 +683,17 @@ def show_custom_response(response):
                 </div>
             </div>
             """, unsafe_allow_html=True)
+        elif is_dataframe:
+            # Display DataFrame with nice formatting
+            st.markdown("""
+            <div style='display: flex; justify-content: flex-start; margin: 1rem 0;'>
+                <div class='assistant-message'>
+                    <div class='assistant-info'>VayuChat</div>
+                    Here are the results:
+                </div>
+            </div>
+            """, unsafe_allow_html=True)
+            st.dataframe(content, use_container_width=True)
         # Show generated code with Streamlit expander
         if response.get("gen_code"):
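
Review note: the handler above now distinguishes three kinds of `content`. The branching can be sketched as a small standalone function, which makes it testable without Streamlit. This is only an illustration, not the code in app.py: the name `classify_answer` and the constant `IMAGE_EXTENSIONS` are hypothetical, and the extension check compares the trailing extension with `os.path.splitext` instead of the substring test (`ext in content`) used in the diff.

```python
import os

import pandas as pd

# Extensions treated as plot files, matching the list in the diff.
IMAGE_EXTENSIONS = (".png", ".jpg", ".jpeg")


def classify_answer(content):
    """Return "dataframe", "plot", or "text" for a generated answer.

    Hypothetical helper for illustration; app.py performs these checks inline.
    """
    if isinstance(content, pd.DataFrame):
        return "dataframe"
    if isinstance(content, str):
        # Compare the trailing extension rather than searching the whole
        # string, so text that mentions ".png" mid-sentence is not mistaken
        # for a plot file.
        _, ext = os.path.splitext(content.strip())
        if ext.lower() in IMAGE_EXTENSIONS:
            return "plot"
    return "text"
```

With such a helper, the "dataframe" case would map to `st.dataframe(content, use_container_width=True)` as in the diff, the "plot" case to the existing image display, and everything else to the markdown message. Importing pandas once at module level would also avoid the per-call `import pandas as pd` inside `show_custom_response`.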

src.py CHANGED

@@ -289,11 +289,15 @@ df["Timestamp"] = pd.to_datetime(df["Timestamp"])
 IMPORTANT: Only generate Python code - no explanations, no thinking, just clean code.
-WHEN TO CREATE PLOTS vs TEXT ANSWERS:
-- Questions asking "Which", "What", specific values → TEXT ANSWERS (store text in 'answer')
-- Questions asking "Plot", "Show", "Visualize", "Chart" → PLOTS (store filename in 'answer')
-- Questions asking for comparisons of many items → PLOTS
-- Simple direct questions → TEXT ANSWERS
+WHEN TO USE DIFFERENT OUTPUT TYPES:
+- Simple questions asking "Which city", "What month" (1-2 values) → TEXT ANSWERS (store text in 'answer')
+- Questions asking "Plot", "Show chart", "Visualize" → PLOTS (store filename in 'answer')
+- Questions with tabular data (lists of cities, rates, rankings, comparisons) → DATAFRAMES (store dataframe in 'answer')
+- Examples of DATAFRAME outputs:
+  * Lists of cities with values (pollution levels, improvement rates)
+  * Rankings or comparisons across multiple entities
+  * Any result that would be >5 rows of data
+  * Calculate/List/Compare operations with multiple results
 SAFETY & ROBUSTNESS RULES:
 - Always check if data exists before processing: if df.empty: answer = "No data available"
@@ -316,9 +320,11 @@ TECHNICAL REQUIREMENTS:
 - Save final result in variable called 'answer'
 - For TEXT: Store the direct answer as a string in 'answer'
 - For PLOTS: Save with unique filename f"plot_{{uuid.uuid4().hex[:8]}}.png" and store filename in 'answer'
+- For DATAFRAMES: Store the pandas DataFrame directly in 'answer' (e.g., answer = result_df)
 - Always use .iloc or .loc properly for pandas indexing
 - Close matplotlib figures with plt.close() to prevent memory leaks
 - Use proper column name checks before accessing columns
+- For dataframes, ensure proper column names and sorting for readability
 """
         query = f"""{system_prompt}
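
Review note: to make the new DATAFRAMES rules concrete, here is a sketch of the kind of code the prompt is asking the model to generate for a ranking question. It is an assumption-laden illustration, not output captured from VayuChat: the toy values are invented for the example, and the column names `city` and `PM2.5` are hypothetical (only `Timestamp` appears in the diff).

```python
import pandas as pd

# Hypothetical toy data; "city" and "PM2.5" are assumed column names.
df = pd.DataFrame(
    {
        "Timestamp": pd.to_datetime(
            ["2023-01-01", "2023-01-01", "2023-01-02", "2023-01-02"]
        ),
        "city": ["A", "B", "A", "B"],
        "PM2.5": [100.0, 40.0, 80.0, 60.0],
    }
)

# Safety rules from the prompt: check for data and for the needed columns.
required = {"city", "PM2.5"}
if df.empty or not required.issubset(df.columns):
    answer = "No data available"
else:
    # Tabular result: one row per city, readable column name, sorted
    # from most to least polluted, stored directly in 'answer'.
    answer = (
        df.groupby("city", as_index=False)["PM2.5"]
        .mean()
        .rename(columns={"PM2.5": "Mean PM2.5"})
        .sort_values("Mean PM2.5", ascending=False)
        .reset_index(drop=True)
    )
```

Because `answer` ends up as a `pd.DataFrame` instead of a string, the `isinstance(content, pd.DataFrame)` check added in app.py would route it to `st.dataframe()`. Note that the fallback branch still yields a string, so the UI has to accept either type for the same question.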