emrecanacikgoz committed · Commit ee09ff5 · verified · 1 Parent(s): a7048c1

Update README.md

Files changed (1): README.md (+14 -14)
README.md CHANGED
@@ -8,14 +8,14 @@ base_model:
 - meta-llama/Llama-3.3-70B-Instruct
 ---

- # CALM-70B: Conversational Agentic Language Model
+ # CoALM-70B: Conversational Agentic Language Model

 [![Made with Oumi](https://badgen.net/badge/Made%20with/Oumi/%23085CFF?icon=https%3A%2F%2Foumi.ai%2Flogo_dark.svg)](https://github.com/oumi-ai/oumi)

 ## Model Description
- **CALM-70B** is our middle scale **Conversational Agentic Language Model**, designed to integrate **Task-Oriented Dialogue (TOD) capabilities** with **Language Agent (LA) functionalities** at a **larger scale** than its predecessor CALM-8B. By leveraging **CALM-IT**, a multi-task dataset interleaving **multi-turn ReAct reasoning** with **complex API usage**, CALM-70B achieves **state-of-the-art performance** across TOD and function-calling benchmarks.
+ **CoALM-70B** is our mid-scale **Conversational Agentic Language Model**, designed to integrate **Task-Oriented Dialogue (TOD) capabilities** with **Language Agent (LA) functionalities** at a **larger scale** than its predecessor CoALM-8B. By leveraging **CoALM-IT**, a multi-task dataset that interleaves **multi-turn ReAct reasoning** with **complex API usage**, CoALM-70B achieves **state-of-the-art performance** across TOD and function-calling benchmarks.

- CALM-70B has been fine-tuned on a **comprehensive multi-tasking** covering dialogue state tracking, function calling, and multi-turn reasoning, surpassing even proprietary models like **GPT-4o** on major conversational evaluation benchmarks: **MultiWOZ 2.4 (TOD), BFCL V3 (LA), and API-Bank (LA).**
+ CoALM-70B has been fine-tuned on a **comprehensive multi-task dataset** covering dialogue state tracking, function calling, and multi-turn reasoning, and it surpasses even proprietary models like **GPT-4o** on major conversational evaluation benchmarks: **MultiWOZ 2.4 (TOD), BFCL V3 (LA), and API-Bank (LA).**
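To make that interleaving concrete, the following hypothetical exchange sketches the shape of such a sample: free-form dialogue turns, an explicit ReAct-style thought-and-action step, a tool observation, and a grounded reply. The function name, its arguments, and all messages are invented for illustration and are not drawn from CoALM-IT.

```python
# Hypothetical multi-turn ReAct-style sample (names and API invented for
# illustration). An assistant turn can carry a reasoning "Thought" plus an API
# "Action"; after the tool result arrives, the next turn answers in natural
# language grounded in that observation.
conversation = [
    {"role": "user", "content": "Book me a table for two in Cambridge tonight."},
    {
        "role": "assistant",
        "content": "Thought: I need availability before booking.\n"
        "Action: find_restaurants(area='Cambridge', party_size=2, day='tonight')",
    },
    {"role": "tool", "content": '[{"name": "Midsummer House", "free_slots": ["19:00", "21:00"]}]'},
    {
        "role": "assistant",
        "content": "Midsummer House has tables at 19:00 and 21:00 tonight. Which time works for you?",
    },
]
```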


 ## Model Sources
@@ -23,20 +23,20 @@ CALM-70B has been fine-tuned on a **comprehensive multi-tasking** covering dialo
 <!-- Provide the basic links for the model. -->

 - 📝 **Paper:** https://arxiv.org/abs/2502.08820
- - 🌐 **Project Page:** https://emrecanacikgoz.github.io/CALM/
- - 💻 **Repository:** https://github.com/oumi-ai/oumi/tree/main/configs/projects/calm
- - 💎 **Dataset:** https://huggingface.co/datasets/uiuc-convai/CALM-IT
+ - 🌐 **Project Page:** https://emrecanacikgoz.github.io/CoALM/
+ - 💻 **Repository:** https://github.com/oumi-ai/oumi/tree/main/configs/projects/CALM
+ - 💎 **Dataset:** https://huggingface.co/datasets/uiuc-convai/CoALM-IT


 ---
 ## Model Details

- - **Model Name:** CALM-70B
+ - **Model Name:** CoALM-70B
 - **Developed by:** A collaboration of the UIUC Conversational AI LAB and Oumi
 - **License:** cc-by-nc-4.0
 - **Architecture:** Fine-tuned **Llama 3.3 70B Instruct**
 - **Parameter Count:** 70B
- - **Training Data:** CALM-IT
+ - **Training Data:** CoALM-IT
 - **Training Type:** Full Fine-tuning (FFT)
 - **Fine-tuning Framework:** [Oumi](https://github.com/oumi-ai/oumi)
 - **Training Hardware:** 8 NVIDIA H100 GPUs
@@ -78,7 +78,7 @@ CALM-70B has been fine-tuned on a **comprehensive multi-tasking** covering dialo
 ---


- ## 💡 CALM-IT Dataset
+ ## 💡 CoALM-IT Dataset
 <img src="table.png" alt="CALM-IT Dataset Statistics" width="800"/>

@@ -93,8 +93,8 @@ CALM-70B has been fine-tuned on a **comprehensive multi-tasking** covering dialo
 ```python
 from transformers import AutoModelForCausalLM, AutoTokenizer

- tokenizer = AutoTokenizer.from_pretrained("uiuc-convai/CALM-70B")
- model = AutoModelForCausalLM.from_pretrained("uiuc-convai/CALM-70B")
+ tokenizer = AutoTokenizer.from_pretrained("uiuc-convai/CoALM-70B")
+ model = AutoModelForCausalLM.from_pretrained("uiuc-convai/CoALM-70B")
 ```
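Loading the checkpoint as above is enough to start generating. The sketch below shows one way to run a single chat turn, assuming the repository ships the Llama 3.3 chat template (the base model does); the prompt, dtype, and device settings are illustrative, not the card's prescribed configuration.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("uiuc-convai/CoALM-70B")
model = AutoModelForCausalLM.from_pretrained(
    "uiuc-convai/CoALM-70B",
    torch_dtype=torch.bfloat16,  # 70B parameters: bf16 halves memory vs. fp32
    device_map="auto",           # shard the weights across all visible GPUs
)

messages = [{"role": "user", "content": "Find a cheap Italian restaurant in the city centre."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=256)
# Decode only the newly generated tokens, not the echoed prompt.
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```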

 ### 🛠 Example Oumi Inference
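Oumi drives both training and inference from YAML configs, in the same style as the `oumi train -c ./oumi_train.yaml` call this card uses for training. A minimal sketch follows; the `oumi_infer.yaml` path is hypothetical, and the flags should be checked against your installed Oumi version.

```bash
# Hypothetical config path; Oumi reads model and generation settings from YAML.
# -i starts an interactive chat loop; omit it to run batch inference instead.
oumi infer -i -c ./oumi_infer.yaml
```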
@@ -115,7 +115,7 @@ oumi train -c ./oumi_train.yaml


 ---
- - **Scalability to CALM-405B:** Next iteration will extend capabilities for even larger-scale conversations.
+ - **Scalability to CoALM-405B:** The next iteration will extend these capabilities to even larger-scale conversations.
 - **Continuous Open-Source Expansion:** Ongoing release of datasets, model weights, and training artifacts to foster community research.

 ---
@@ -127,10 +127,10 @@ This model is licensed under [Creative Commons NonCommercial (CC BY-NC 4.0)](htt

 ---
 ## Citation
- If you use **CALM-70B** in your research, please cite:
+ If you use **CoALM-70B** in your research, please cite:
 ```
 @misc{acikgoz2025singlemodelmastermultiturn,
- title={Can a Single Model Master Both Multi-turn Conversations and Tool Use? CALM: A Unified Conversational Agentic Language Model},
+ title={Can a Single Model Master Both Multi-turn Conversations and Tool Use? CoALM: A Unified Conversational Agentic Language Model},
 author={Emre Can Acikgoz and Jeremiah Greer and Akul Datta and Ze Yang and William Zeng and Oussama Elachqar and Emmanouil Koukoumidis and Dilek Hakkani-Tür and Gokhan Tur},
 year={2025},
 eprint={2502.08820},