metadata
license: cc-by-nc-4.0
language:
  - en
metrics:
  - accuracy
base_model:
  - meta-llama/Llama-3.1-405B-Instruct
pipeline_tag: text-generation

CALM-405B: The Largest Open-Source Agentic LLM

🌟 Model Overview

CALM-405B is the largest open-source Conversational Agentic Language Model (CALM) released to date. It integrates Task-Oriented Dialogue (TOD) capabilities with Language Agent (LA) functionality in a single model, and is designed for multi-turn dialogue, tool usage, reasoning, and API execution. It is the best-performing fully open-source LLM on the Berkeley Function Calling Leaderboard V3 (BFCL V3).

Model Sources [TODO]

  • Paper: [More Information Needed]
  • Repository: [More Information Needed]

🚀 Model Details

  • Model Name: CALM-405B
  • Developed by: Collaboration between the UIUC Conversational AI Lab and Oumi
  • License: CC BY-NC 4.0
  • Architecture: Meta-Llama 3.1-405B Instruct
  • Training Data: CALM-IT
  • Fine-tuning Framework: Oumi
  • Training Hardware: 8 NVIDIA H100 GPUs
  • Training Duration: ~6.5 days
  • Evaluation Benchmarks: MultiWOZ 2.4, BFCL V3, API-Bank
  • Release Date: February 5, 2025

πŸ† Why CALM-405B is a Game-Changer

  • 🚨 Largest Open-Source Agentic LLM: A 405B-parameter model that brings state-of-the-art agentic capabilities to the open-source community.
  • 🎯 Best Open-Source Performance on BFCL V3: Outperforms leading proprietary models like GPT-4o, Gemini, and Claude in function-calling tasks.
  • 🔍 True Zero-Shot Function Calling: Generalizes to unseen API tasks with high accuracy.
  • 🤖 Multi-Turn Dialogue Mastery: Excels at long conversations, task tracking, and complex reasoning.
  • 🛠 API Tool Use and Reasoning: Makes precise API calls, interprets responses, and synthesizes coherent multi-step solutions.
  • 📜 Fully Open-Source & Reproducible: Released under CC BY-NC 4.0, including model weights, training logs, and datasets.

📊 Benchmark Performance

TODO: Add BFCL results


🔧 Training Process

Fine-tuning Stages

  1. TOD Fine-tuning: Optimized for dialogue state tracking (e.g., augmented SNIPS in instruction-tuned format).
  2. Function Calling Fine-tuning: Trained to generate highly accurate API calls from LA datasets.
  3. ReAct-based Fine-tuning: Enhances multi-turn conversations with structured thought-action-observation-response reasoning.
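
The thought-action-observation-response structure named in stage 3 can be pictured as one small record per turn. The sketch below is only an illustration and is not the CALM training format: the `ReActTurn` dataclass, the `find_restaurant` tool, and the rendered text layout are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class ReActTurn:
    """One hypothetical ReAct-style turn (field names are illustrative)."""
    thought: str            # reasoning about what to do next
    action: str             # name of the tool/API to call
    action_input: dict      # keyword arguments for that tool
    observation: str = ""   # tool output, filled in after execution
    response: str = ""      # final reply shown to the user


def find_restaurant(area: str, food: str) -> str:
    # Hypothetical tool standing in for a real API call.
    return f"1 match: Golden Wok ({food}, {area})"


# Registry mapping action names to callables (hypothetical).
TOOLS: Dict[str, Callable[..., str]] = {"find_restaurant": find_restaurant}


def run_turn(turn: ReActTurn) -> ReActTurn:
    # Execute the chosen action, record the observation, then respond.
    turn.observation = TOOLS[turn.action](**turn.action_input)
    turn.response = f"I found this for you: {turn.observation}"
    return turn


def render(turn: ReActTurn) -> str:
    # Serialize the turn in thought -> action -> observation -> response order.
    return "\n".join([
        f"Thought: {turn.thought}",
        f"Action: {turn.action}({turn.action_input})",
        f"Observation: {turn.observation}",
        f"Response: {turn.response}",
    ])
```

Calling `render(run_turn(ReActTurn(thought="...", action="find_restaurant", action_input={"area": "centre", "food": "chinese"})))` yields the four labeled lines in order, which is the shape a ReAct-style training sample would follow under these assumptions.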

Training Hyperparameters

  • Base Model: Meta-Llama 3.1-405B Instruct
  • LoRA Config: Rank = 16, Scaling Factor = 32
  • Batch Size: 2
  • Learning Rate: 1e-4
  • Optimizer: AdamW (betas = 0.9, 0.999, epsilon = 1e-8)
  • Precision: q4
  • Warm-up Steps: 500
  • Gradient Accumulation Steps: 1
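
A few quantities follow from the values listed above. The sketch below works them out under stated assumptions: that "Scaling Factor" is the LoRA alpha, that the batch size is per GPU with data parallelism across all 8 training GPUs, and that warm-up is linear. The card does not confirm any of these, so treat the numbers as illustrative.

```python
# Values copied from the hyperparameter list in this card.
lora_rank = 16
lora_alpha = 32             # ASSUMPTION: "Scaling Factor" means LoRA alpha
per_device_batch_size = 2   # ASSUMPTION: listed batch size is per GPU
grad_accum_steps = 1
num_gpus = 8                # from "Training Hardware: 8 NVIDIA H100 GPUs"
peak_lr = 1e-4
warmup_steps = 500

# Standard LoRA multiplies the low-rank update by alpha / r.
lora_scale = lora_alpha / lora_rank

# Sequences contributing to each optimizer step under the assumptions above.
effective_batch_size = per_device_batch_size * grad_accum_steps * num_gpus


def lr_at_step(step: int) -> float:
    """Learning rate during warm-up. ASSUMPTION: linear ramp to peak_lr."""
    return peak_lr * min(1.0, step / warmup_steps)
```

Under these assumptions the LoRA update is scaled by 2, each optimizer step sees 16 sequences, and the learning rate reaches its peak of 1e-4 at step 500.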

💡 How to Use CALM-405B

🚨 Inference requires 16 NVIDIA H100 GPUs.

🏗 How to Load the Model using HuggingFace

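from transformers import AutoModelForCausalLM, AutoTokenizer

# Repo ID assumed to match this card's model name; check the model page for the exact ID.
model_id = "uiuc-convai/CALM-405B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
# device_map="auto" (requires the `accelerate` package) spreads the weights across available GPUs.
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")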

🛠 Example Oumi Inference

CALM-405B likely requires multi-node inference, since most single nodes provide at most 640 GB of GPU VRAM (8 x 80 GB). For multi-node inference, we recommend vLLM.
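
The 640 GB limit can be checked with a rough weight-memory estimate. This is a back-of-the-envelope sketch, assuming 16-bit (bf16/fp16) weights, 80 GB H100 cards, and 8 GPUs per node; it counts weights only and ignores the KV cache, activations, and framework overhead, which all push the real requirement higher.

```python
import math

params = 405e9         # parameter count, from the model name
bytes_per_param = 2    # ASSUMPTION: weights stored in bf16/fp16
gpu_mem_gb = 80        # ASSUMPTION: 80 GB H100
gpus_per_node = 8      # ASSUMPTION: typical 8-GPU node (8 x 80 GB = 640 GB)

# Memory for the weights alone, in decimal gigabytes.
weights_gb = params * bytes_per_param / 1e9

# Total GPU memory on one node.
node_mem_gb = gpu_mem_gb * gpus_per_node

# Minimum GPUs and nodes just to hold the weights.
min_gpus_for_weights = math.ceil(weights_gb / gpu_mem_gb)
nodes_needed = math.ceil(min_gpus_for_weights / gpus_per_node)
```

With these assumptions the weights alone take about 810 GB, more than one 640 GB node, so at least 11 GPUs across 2 nodes are needed. Two full 8-GPU nodes give the 16 H100s quoted above and leave headroom for the KV cache.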

🛠 Example Oumi Fine-Tuning

pip install oumi

# See oumi_train.yaml in this model's /oumi/ directory.
oumi train -c ./oumi_train.yaml

More fine-tuning and community-driven optimizations are planned to enhance real-world usability.

License

This model is licensed under Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0).


📖 Citation

If you use CALM-405B in your research, please cite:

@article{yourpaper2024,
  title={CALM: Conversational Agentic Language Model},
  author={Your Name and Collaborators},
  journal={Your Conference/Journal},
  year={2024}
}

For more details, visit Project Repository or contact [email protected].