huihui-ai/DeepSeek-R1

This model was converted from deepseek-ai/DeepSeek-R1 to BF16.
Here we simply provide the conversion commands and related information about ollama.
If needed, we can upload the BF16 version.

FP8 to BF16

  1. Download the deepseek-ai/DeepSeek-R1 model, which requires approximately 641 GB of space.
cd /home/admin/models
huggingface-cli download deepseek-ai/DeepSeek-R1 --local-dir ./deepseek-ai/DeepSeek-R1
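Before starting the download, it is worth confirming that the target filesystem actually has the roughly 641 GB free. A minimal sketch (the target directory below is a placeholder; point it at your models path, e.g. /home/admin/models):

```shell
# Sketch: check free space before downloading (~641 GB for the FP8 weights).
target=.          # placeholder; set to your models directory
required_gb=641
# df -Pk prints 1024-byte blocks; column 4 is available space
avail_gb=$(df -Pk "$target" | awk 'NR==2 {print int($4/1024/1024)}')
if [ "$avail_gb" -lt "$required_gb" ]; then
  echo "Only ${avail_gb} GB free; need about ${required_gb} GB"
else
  echo "OK: ${avail_gb} GB free"
fi
```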
  2. Create the environment.
conda create -yn DeepSeek-V3 python=3.12
conda activate DeepSeek-V3
pip install -r requirements.txt
  3. Convert to BF16, which requires an additional approximately 1.3 TB of space.
    Here, you need to download the conversion code from the "inference" folder of the deepseek-ai/DeepSeek-V3 repository.
cd deepseek-ai/DeepSeek-V3/inference
python fp8_cast_bf16.py --input-fp8-hf-path /home/admin/models/deepseek-ai/DeepSeek-R1/ --output-bf16-hf-path /home/admin/models/deepseek-ai/DeepSeek-R1-bf16

BF16 to f16.gguf

  1. Use the llama.cpp conversion script to convert DeepSeek-R1-bf16 to GGUF format, which requires an additional approximately 1.3 TB of space.
python convert_hf_to_gguf.py /home/admin/models/deepseek-ai/DeepSeek-R1-bf16 --outfile /home/admin/models/deepseek-ai/DeepSeek-R1-bf16/ggml-model-f16.gguf --outtype f16
  2. Use the llama.cpp quantization program to quantize the model (llama-quantize needs to be compiled first); other quant options are also available. Convert to Q2_K first, which requires an additional approximately 227 GB of space.
llama-quantize /home/admin/models/deepseek-ai/DeepSeek-R1-bf16/ggml-model-f16.gguf  /home/admin/models/deepseek-ai/DeepSeek-R1-bf16/ggml-model-Q2_K.gguf Q2_K
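If you want the other quant options as well, the same command can be repeated per type. A sketch that prints the llama-quantize invocations for a few common types so the paths can be reviewed before committing hundreds of GB (the type names come from llama.cpp; the paths are the examples used above):

```shell
# Sketch: print llama-quantize commands for several common quant types for review.
src=/home/admin/models/deepseek-ai/DeepSeek-R1-bf16/ggml-model-f16.gguf
dst_dir=/home/admin/models/deepseek-ai/DeepSeek-R1-bf16
for q in Q2_K Q3_K_M Q4_K_M Q5_K_M Q8_0; do
  # echo rather than execute, so the commands can be checked first
  echo "llama-quantize $src $dst_dir/ggml-model-$q.gguf $q"
done
```

Remove the echo (or pipe the output to sh) once the paths look right.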
  3. Use llama-cli to test.
llama-cli -m /home/admin/models/deepseek-ai/DeepSeek-R1-bf16/ggml-model-Q2_K.gguf -n 2048
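Adding up the space figures quoted in the steps above gives a rough disk budget for the whole pipeline:

```shell
# Approximate disk usage per stage, in GB (figures quoted in the steps above)
fp8=641        # original FP8 download
bf16=1300      # FP8 -> BF16 conversion output (~1.3 TB)
gguf_f16=1300  # BF16 -> f16 GGUF (~1.3 TB)
q2k=227        # Q2_K quantization
total=$((fp8 + bf16 + gguf_f16 + q2k))
echo "Approximate total: ${total} GB (~3.5 TB)"
```

Intermediate outputs can be deleted after each stage if you need to reclaim space.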

Use with ollama

Note: this model requires Ollama 0.5.5 or later.
A directly runnable version will be uploaded to Ollama soon.
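Until the packaged version is available, a GGUF file produced above can in principle be loaded locally with a minimal Modelfile (a sketch; the FROM path is an example and must point at your quantized file):

```
FROM /home/admin/models/deepseek-ai/DeepSeek-R1-bf16/ggml-model-Q2_K.gguf
```

It can then be registered and run with `ollama create deepseek-r1:Q2_K -f Modelfile` followed by `ollama run deepseek-r1:Q2_K`.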
