Inference Providers documentation

Z.ai

All supported Z.ai models can be found here

Z.ai is an AI platform that provides large language models from the GLM series. Its flagship models use a Mixture-of-Experts (MoE) architecture and offer advanced reasoning, coding, and agentic capabilities.

For the latest pricing, visit the pricing page.

Resources

Supported tasks

Chat Completion (LLM)

Find out more about Chat Completion (LLM) here.

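import os
from huggingface_hub import InferenceClient

# Route requests to Z.ai; the Hugging Face token is read from the environment.
client = InferenceClient(
    provider="zai-org",
    api_key=os.environ["HF_TOKEN"],
)

# Send a single-turn chat request to the GLM-4.6 model.
completion = client.chat.completions.create(
    model="zai-org/GLM-4.6",
    messages=[
        {
            "role": "user",
            "content": "What is the capital of France?"
        }
    ],
)

# Print the assistant's reply message (role and content).
print(completion.choices[0].message)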
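
The request above can also be wrapped in a small helper so that the message list and the reply extraction live in one place. The following is a minimal sketch, not part of the official example: the names `build_messages`, `ask`, and `main` are hypothetical, and the optional system prompt is an assumption about how you might want to steer the model.

```python
import os


def build_messages(user_prompt, system_prompt=None):
    """Build the chat message list; the system prompt is optional."""
    messages = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    messages.append({"role": "user", "content": user_prompt})
    return messages


def ask(client, prompt, model="zai-org/GLM-4.6", system_prompt=None):
    """Send one chat request and return only the reply text.

    `client` is expected to expose `chat.completions.create`, as the
    InferenceClient in the example above does.
    """
    completion = client.chat.completions.create(
        model=model,
        messages=build_messages(prompt, system_prompt),
    )
    return completion.choices[0].message.content


def main():
    # Imported here so the helpers above can be reused without the package.
    from huggingface_hub import InferenceClient

    client = InferenceClient(
        provider="zai-org",
        api_key=os.environ["HF_TOKEN"],
    )
    print(ask(client, "What is the capital of France?"))
```

Calling `main()` with `HF_TOKEN` set in the environment sends the same question as the example above and prints only the text of the reply rather than the full message object.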