This is the DeepSeek-R1-Distill-Qwen-7B model, converted to OpenVINO with symmetric channel-wise INT4 weight compression.
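An export like this can be reproduced with optimum-intel. The exact settings used for this repository are not recorded here, so treat the snippet below as a hedged sketch of symmetric channel-wise (`group_size=-1`) INT4 compression rather than the precise recipe:

```python
# Sketch of an OpenVINO INT4 export with optimum-intel (assumed settings, not necessarily the exact recipe used here)
from optimum.intel import OVModelForCausalLM, OVWeightQuantizationConfig
from transformers import AutoTokenizer

model_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-7B"
output_dir = "DeepSeek-R1-Distill-Qwen-7B-ov-int4"

# sym=True -> symmetric quantization, group_size=-1 -> per-channel (channel-wise) compression
quant_config = OVWeightQuantizationConfig(bits=4, sym=True, group_size=-1)

model = OVModelForCausalLM.from_pretrained(model_id, export=True, quantization_config=quant_config)
model.save_pretrained(output_dir)

# Save the Hugging Face tokenizer alongside the model. Note that openvino-genai also needs the
# OpenVINO tokenizer/detokenizer models, which `optimum-cli export openvino` produces
# automatically when openvino-tokenizers is installed.
AutoTokenizer.from_pretrained(model_id).save_pretrained(output_dir)
```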
To run inference on this model, install openvino-genai (`pip install openvino-genai`) and run [llm_chat_deepseek.py](https://gist.github.com/helena-intel/554fba91f380df590ecc9245abdad33f).
Step-by-step instructions for best results:
```shell
pip install --pre --upgrade openvino openvino-genai openvino-tokenizers --extra-index-url https://storage.openvinotoolkit.org/simple/wheels/nightly
pip install huggingface-hub
huggingface-cli download helenai/DeepSeek-R1-Distill-Qwen-7B-ov-int4 --local-dir DeepSeek-R1-Distill-Qwen-7B-ov-int4
curl -O https://gist.githubusercontent.com/helena-intel/554fba91f380df590ecc9245abdad33f/raw/04f495164482823aa7e6ba1119a5c43e275d08f5/llm_chat_deepseek.py
python llm_chat_deepseek.py DeepSeek-R1-Distill-Qwen-7B-ov-int4 GPU
```
The last argument specifies the device to run inference on. GPU is recommended for recent Intel laptops with integrated graphics, or for Intel discrete graphics. Change it to CPU if you do not have an Intel GPU, or to NPU if your system has an Intel NPU.
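If you prefer calling openvino-genai directly instead of the gist script, a minimal generation loop looks roughly like this (model path, device, and generation settings below are assumptions; adapt them to your setup):

```python
# Minimal openvino-genai sketch (assumed settings; the gist script adds chat handling on top of this)
import openvino_genai

model_dir = "DeepSeek-R1-Distill-Qwen-7B-ov-int4"  # folder downloaded above
device = "GPU"  # or "CPU" / "NPU"

pipe = openvino_genai.LLMPipeline(model_dir, device)

config = openvino_genai.GenerationConfig()
config.max_new_tokens = 512

def streamer(subword: str) -> bool:
    # Print tokens as they are generated; returning False keeps generation going
    print(subword, end="", flush=True)
    return False

pipe.generate("Why is the sky blue?", config, streamer)
```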
Gradio chatbot notebook using this model: https://gist.github.com/helena-intel/69e1c2921a2bcb618fdd7cdfb0bd0202
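The notebook linked above is the full example; as a rough idea of how such a chatbot can be wired up, a minimal non-streaming Gradio sketch might look like this (model path, device, and generation settings are assumptions):

```python
# Minimal Gradio chatbot sketch (assumed settings; the linked notebook is more complete)
import gradio as gr
import openvino_genai

pipe = openvino_genai.LLMPipeline("DeepSeek-R1-Distill-Qwen-7B-ov-int4", "GPU")

def respond(message, history):
    # One-shot reply per user message; chat history/template handling is omitted here
    config = openvino_genai.GenerationConfig()
    config.max_new_tokens = 512
    return str(pipe.generate(message, config))

gr.ChatInterface(respond).launch()
```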