---
license: cc-by-nc-4.0
language:
- en
metrics:
- accuracy
base_model:
- meta-llama/Llama-3.1-405B-Instruct
pipeline_tag: text-generation
---
# CoALM-405B: The Largest Open-Source Agentic LLM
[](https://github.com/oumi-ai/oumi)
## Model Overview
**CoALM-405B** is the **largest fully open-source Conversational Agentic Language Model**. This model sets a new standard in **Conversational AI**, seamlessly integrating both **Task-Oriented Dialogue (TOD) capabilities** and **Language Agent (LA) functionalities**.
It is designed to **push the boundaries** of open-source agentic LLMs, excelling at **multi-turn dialogue, tool usage, reasoning, and API execution**. It is the **best-performing fully open-source LLM** on the **Berkeley Function Calling Leaderboard V3 (BFCL V3)**, marking a leap in open-source AI research.
## Model Sources
<!-- Provide the basic links for the model. -->
- **Paper:** https://arxiv.org/abs/2502.08820
- **Project Page:** https://emrecanacikgoz.github.io/CoALM/
- **Repository:** https://github.com/oumi-ai/oumi/tree/main/configs/projects/CALM
- **Dataset:** https://huggingface.co/datasets/uiuc-convai/CoALM-IT
---
## Model Details
- **Model Name:** CoALM-405B
- **Developed by:** Collaboration between the UIUC Conversational AI Lab and Oumi
- **License:** cc-by-nc-4.0
- **Architecture:** Meta-Llama 3.1-405B Instruct
- **Training Data:** CoALM-IT
- **Fine-tuning Framework:** [Oumi](https://github.com/oumi-ai/oumi)
- **Training Hardware:** 8 NVIDIA H100 GPUs
- **Training Duration:** ~6.5 days
- **Evaluation Benchmarks:** MultiWOZ 2.4, BFCL V3, API-Bank
- **Release Date:** February 5, 2025
---
## Why CoALM-405B is a Game-Changer
- **Largest Open-Source Agentic LLM:** A **405B**-parameter model that makes state-of-the-art agentic capabilities openly available.
- **Best Open-Source Performance on BFCL V3:** Outperforms leading proprietary models like **GPT-4o, Gemini, and Claude** in function-calling tasks.
- **True Zero-Shot Function Calling:** Generalizes to unseen API tasks with **unmatched accuracy**.
- **Multi-Turn Dialogue Mastery:** Excels at long conversations, **task tracking, and complex reasoning**.
- **API Tool Use and Reasoning:** Makes precise API calls, interprets responses, and synthesizes **coherent** multi-step solutions.
- **Fully Open-Source & Reproducible:** Released under **cc-by-nc-4.0**, including model weights, training logs, and datasets.
## CoALM-IT Dataset
<img src="table.png" alt="CoALM-IT Dataset Statistics" width="800"/>
---
## Benchmark Performance
<img src="results.png" alt="CoALM-405B Benchmark Results" width="1000"/>
---
## Training Process
### Fine-tuning Stages
1. **TOD Fine-tuning:** Optimized for **dialogue state tracking** (e.g., augmented SNIPS in instruction-tuned format).
2. **Function Calling Fine-tuning:** Trained to generate **highly accurate API calls** from LA datasets.
3. **ReAct-based Fine-tuning:** Enhances multi-turn conversations with structured **thought-action-observation-response reasoning**.
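
The third stage's thought-action-observation-response structure can be pictured with a small sketch. The schema, field names, labels, and the `find_hotel` call below are all hypothetical placeholders chosen for illustration; they are not the tags or tools used in CoALM-IT.

```python
from dataclasses import dataclass


@dataclass
class ReActTurn:
    """One illustrative ReAct-style step (hypothetical schema, not the CoALM-IT format)."""
    thought: str       # the model's private reasoning about what to do next
    action: str        # the tool or API call it decides to make
    observation: str   # what the tool returned
    response: str      # the reply shown to the user


def render(turn: ReActTurn) -> str:
    # The "Thought:/Action:/..." labels are placeholders for illustration only.
    return "\n".join([
        f"Thought: {turn.thought}",
        f"Action: {turn.action}",
        f"Observation: {turn.observation}",
        f"Response: {turn.response}",
    ])


turn = ReActTurn(
    thought="The user wants a hotel in the centre, so I should query a hotel API.",
    action='find_hotel(area="centre")',          # hypothetical tool call
    observation='[{"name": "Example Hotel"}]',   # made-up tool output
    response="I found Example Hotel in the centre. Shall I book it?",
)
print(render(turn))
```

Keeping the four parts as separate fields makes it easy to check that every training turn contains a complete reasoning step before it is serialized.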
### Training Hyperparameters
- **Base Model:** Meta-Llama 3.1-405B Instruct
- **LoRA Config:** Rank = 16, Scaling Factor = 32
- **Batch Size:** 2
- **Learning Rate:** 1e-4
- **Optimizer:** AdamW (betas = 0.9, 0.999, epsilon = 1e-8)
- **Precision:** q4 (4-bit quantization)
- **Warm-up Steps:** 500
- **Gradient Accumulation Steps:** 1
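
As a rough sketch, the values above can be gathered into one object to make their relationships explicit. It assumes that "Scaling Factor" refers to the LoRA alpha parameter, so that the low-rank update is scaled by alpha / rank as in standard LoRA. The class and field names are hypothetical and are not taken from the Oumi config; whether the batch size is per device or global is not stated in this card.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoraHyperparams:
    """Hyperparameters copied from the list above (names are illustrative)."""
    rank: int = 16
    alpha: int = 32            # assumption: "Scaling Factor" means LoRA alpha
    batch_size: int = 2
    grad_accum_steps: int = 1
    learning_rate: float = 1e-4
    warmup_steps: int = 500

    @property
    def update_scale(self) -> float:
        # Standard LoRA multiplies the low-rank update by alpha / rank.
        return self.alpha / self.rank

    @property
    def samples_per_optimizer_step(self) -> int:
        # Samples consumed per optimizer step by one process.
        return self.batch_size * self.grad_accum_steps


hp = LoraHyperparams()
print(hp.update_scale)                # 2.0
print(hp.samples_per_optimizer_step)  # 2
```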
---
## How to Use CoALM-405B
Inference requires 16 NVIDIA H100 GPUs.
### How to Load the Model using HuggingFace
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
# Load the tokenizer and weights for this model card's checkpoint (CoALM-405B).
tokenizer = AutoTokenizer.from_pretrained("uiuc-convai/CoALM-405B")
model = AutoModelForCausalLM.from_pretrained("uiuc-convai/CoALM-405B")
```
### Example Oumi Inference
Oumi multi-node inference support is under development.
CoALM-405B likely requires multi-node inference, since most single nodes provide at most 640 GB of GPU memory (for example, 8 GPUs with 80 GB each).
To run multi-node inference, we recommend [vLLM](https://docs.vllm.ai/en/latest/serving/distributed_serving.html).
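
A back-of-the-envelope estimate shows why a single node is unlikely to suffice. The sketch below assumes 16-bit weights (2 bytes per parameter), 80 GB per GPU, and 8 GPUs per node; it counts weights only and ignores the KV cache and activations, so real requirements are higher.

```python
import math

PARAMS = 405e9            # parameter count of the model
BYTES_PER_PARAM = 2       # assumption: 16-bit (bf16/fp16) weights
GPU_MEMORY_GB = 80        # assumption: 80 GB per H100
GPUS_PER_NODE = 8         # assumption: typical 8-GPU node


def weights_gb(params: float, bytes_per_param: int) -> float:
    # Memory for the weights alone, in decimal gigabytes.
    return params * bytes_per_param / 1e9


def nodes_needed(total_gb: float, gpu_gb: int, gpus_per_node: int) -> int:
    # Smallest number of whole nodes whose combined GPU memory fits total_gb.
    return math.ceil(total_gb / (gpu_gb * gpus_per_node))


total = weights_gb(PARAMS, BYTES_PER_PARAM)
nodes = nodes_needed(total, GPU_MEMORY_GB, GPUS_PER_NODE)
print(total)                  # 810.0
print(nodes)                  # 2
print(nodes * GPUS_PER_NODE)  # 16
```

Under these assumptions the weights alone exceed one node's 640 GB, and two nodes give the 16 GPUs quoted for inference.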
### Example Oumi Fine-Tuning
```bash
pip install oumi
# See oumi_train.yaml in this model's /oumi/ directory.
oumi train -c ./oumi_train.yaml
```
More fine-tuning and **community-driven** optimizations are planned to enhance real-world usability.
## Acknowledgements
We'd like to thank the [Oumi AI Team](https://github.com/oumi-ai/oumi) for collaborating on training the models using the Oumi platform on [Together AI's](https://www.together.ai/) cloud.
## License
This model is licensed under [Creative Commons Attribution-NonCommercial 4.0 (CC BY-NC 4.0)](https://creativecommons.org/licenses/by-nc/4.0/legalcode).
---
## Citation
If you use **CoALM-405B** in your research, please cite:
```
@misc{acikgoz2025singlemodelmastermultiturn,
title={Can a Single Model Master Both Multi-turn Conversations and Tool Use? CoALM: A Unified Conversational Agentic Language Model},
author={Emre Can Acikgoz and Jeremiah Greer and Akul Datta and Ze Yang and William Zeng and Oussama Elachqar and Emmanouil Koukoumidis and Dilek Hakkani-Tür and Gokhan Tur},
year={2025},
eprint={2502.08820},
archivePrefix={arXiv},
primaryClass={cs.AI},
url={https://arxiv.org/abs/2502.08820},
}
```
For more details, visit [Project Repository](https://github.com/oumi-ai/oumi/tree/main/configs/projects/CALM) or contact **[email protected]**.