driaforall
/

Dria-Agent-a-7B

@@ -168,26 +168,26 @@ Let's break down the code into these steps:
 ```python
 from datetime import datetime, timedelta
-# Get today's date and calculate tomorrow's date
-today = datetime.now()
-tomorrow = today + timedelta(days=1)
-tomorrow_str = tomorrow.strftime('%Y-%m-%d')
 # Define the time slots
-start_time = '10:00'
-end_time = '12:00'
 # Step 1: Check availability
-is_available = check_availability(tomorrow_str, start_time, end_time)
-# Step 2: Make appointment if available
 if is_available:
-    appointment_result = make_appointment(tomorrow_str, start_time, end_time, "Meeting with Thesis Supervisor")
-# Step 3: Add to reminders if appointment is made
-if appointment_result['appointment_made']:
-    reminder_text = f"Appointment made for {appointment_result['day']} from {appointment_result['start_time']} to {appointment_result['end_time']}."
-    add_to_reminders(reminder_text)
 ```
 This code will first determine if the specified time slot is available tomorrow. If it is, it will attempt to make the appointment and then add it to the reminders if successful.
@@ -203,21 +203,23 @@ We evaluate the model on the following benchmarks:
 Below are the BFCL evaluation results for ***Qwen2.5-Coder-3B-Instruct***, ***Dria-Agent-α-3B***, ***Dria-Agent-α-7B***, and ***gpt-4o-2024-11-20***:
-| Metric                                | Qwen/Qwen2.5-3B-Instruct   | Dria-Agent-a-3B   | Dria-Agent-a-7B   | gpt-4o-2024-11-20 (Prompt)   |
-|---------------------------------------|-----------|-----------|-----------|-----------|
-| **Non-Live Simple AST**               | 75.50%    | 75.08%    | 77.83%    | 79.42%    |
-| **Non-Live Multiple AST**             | 90.00%    | 93.00%    | 94.50%    | 95.50%    |
-| **Non-Live Parallel AST**             | 80.00%    | 85.00%    | 87.00%    | 94.00%    |
-| **Non-Live Parallel Multiple AST**    | 78.50%    | 79.00%    | 88.00%    | 83.50%    |
-| **Non-Live Simple Exec**              | 82.07%    | 87.57%    | 80.00%    | 100.00%   |
-| **Non-Live Multiple Exec**            | 86.00%    | 85.14%    | 84.00%    | 94.00%    |
-| **Non-Live Parallel Exec**            | 82.00%    | 90.00%    | 70.00%    | 86.00%    |
-| **Non-Live Parallel Multiple Exec**   | 80.00%    | 88.00%    | 65.00%    | 77.50%    |
-| **Live Simple AST**                   | 68.22%    | 70.16%    | 82.95%    | 83.72%    |
-| **Live Multiple AST**                 | 66.00%    | 67.14%    | 78.25%    | 79.77%    |
-| **Live Parallel AST**                 | 62.50%    | 50.00%    | 81.25%    | 87.50%    |
-| **Live Parallel Multiple AST**        | 66.67%    | 70.83%    | 70.83%    | 70.83%    |
-| **Relevance Detection**               | 88.89%    | 100.00%   | 100.00%   | 83.33%    |
 and the MMLU-Pro and DPAB results:

 ```python
 from datetime import datetime, timedelta
+# Get tomorrow's date
+tomorrow = (datetime.now() + timedelta(days=1)).strftime("%Y-%m-%d")
 # Define the time slots
+start_time = "10:00"
+end_time = "12:00"
 # Step 1: Check availability
+is_available = check_availability(tomorrow, start_time, end_time)
 if is_available:
+    # Step 2: Make the appointment
+    appointment_details = make_appointment(tomorrow, start_time, end_time, "Meeting with thesis supervisor")
+    if appointment_details['appointment_made']:
+        # Step 3: Add to reminders
+        reminder_text = f"Appointment with thesis supervisor scheduled for {tomorrow} from {start_time} to {end_time}."
+        add_to_reminders(reminder_text)
+else:
+    appointment_details = {"day": tomorrow, "start_time": start_time, "end_time": end_time, "appointment_made": False}
 ```
 This code will first determine if the specified time slot is available tomorrow. If it is, it will attempt to make the appointment and then add it to the reminders if successful.
 Below are the BFCL evaluation results for ***Qwen2.5-Coder-3B-Instruct***, ***Dria-Agent-α-3B***, ***Dria-Agent-α-7B***, and ***gpt-4o-2024-11-20***:
+| Metric                                | Qwen/Qwen2.5-3B-Instruct   | Dria-Agent-a-3B   | Dria-Agent-a-7B   | gpt-4o-2024-11-20 (Prompt)   |
+|---------------------------------------|----------------------------|-------------------|-------------------|---------------------------|
+| **Non-Live Simple AST**               | 75.50%                    | 75.08%           | 77.58%           | 79.42%                   |
+| **Non-Live Multiple AST**             | 90.00%                    | 93.00%           | 94.00%           | 95.50%                   |
+| **Non-Live Parallel AST**             | 80.00%                    | 85.00%           | 93.50%           | 94.00%                   |
+| **Non-Live Parallel Multiple AST**    | 78.50%                    | 79.00%           | 89.50%           | 83.50%                   |
+| **Non-Live Simple Exec**              | 82.07%                    | 87.57%           | 93.29%           | 100.00%                  |
+| **Non-Live Multiple Exec**            | 86.00%                    | 85.14%           | 88.00%           | 94.00%                   |
+| **Non-Live Parallel Exec**            | 82.00%                    | 90.00%           | 88.00%           | 86.00%                   |
+| **Non-Live Parallel Multiple Exec**   | 80.00%                    | 88.00%           | 72.50%           | 77.50%                   |
+| **Live Simple AST**                   | 68.22%                    | 70.16%           | 81.40%           | 83.72%                   |
+| **Live Multiple AST**                 | 66.00%                    | 67.14%           | 78.73%           | 79.77%                   |
+| **Live Parallel AST**                 | 62.50%                    | 50.00%           | 75.00%           | 87.50%                   |
+| **Live Parallel Multiple AST**        | 66.67%                    | 70.83%           | 62.50%           | 70.83%                   |
+| **Relevance Detection**               | 88.89%                    | 100.00%          | 100.00%          | 83.33%                   |
 and the MMLU-Pro and DPAB results: