---
pipeline_tag: text-generation
inference: false
tags:
  - facebook
  - meta
  - llama
  - llama-2
  - codellama
  - mlx
---

# CodeLlama

Code Llama is a collection of pretrained and fine-tuned generative text models ranging in scale from 7 billion to 34 billion parameters. This model is designed for general code synthesis and understanding. This is the repository for the 7B base model, in npz format suitable for use with Apple's MLX framework.

The weights have been converted from the original bfloat16 type to float16, because NumPy does not support bfloat16 out of the box.
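
For reference, the conversion amounts to casting each tensor before exporting to npz. Here is a minimal sketch of that step, assuming an original PyTorch checkpoint named `consolidated.00.pth` (the filename and output name are illustrative, not part of this repo):

```python
# Sketch: convert a bfloat16 PyTorch checkpoint to a float16 npz file.
# Filenames are assumptions for illustration.
import numpy as np
import torch

state = torch.load("consolidated.00.pth", map_location="cpu")

# NumPy has no bfloat16 dtype, so cast every tensor to float16 first.
np.savez(
    "weights.npz",
    **{k: v.to(torch.float16).numpy() for k, v in state.items()},
)
```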

## How to use with MLX

```bash
# Install mlx, mlx-examples, huggingface-cli
pip install mlx
pip install huggingface_hub hf_transfer
git clone https://github.com/ml-explore/mlx-examples.git

# Download the model into a CodeLlama-7b-mlx directory
# (matching the path used in the run step below)
export HF_HUB_ENABLE_HF_TRANSFER=1
huggingface-cli download --local-dir CodeLlama-7b-mlx --local-dir-use-symlinks False mlx-llama/CodeLlama-7b-mlx

# Run example
python mlx-examples/llama/llama.py CodeLlama-7b-mlx CodeLlama-7b-mlx/tokenizer.model "My name is "
```
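
To sanity-check the download before running the example, you can load the npz weights directly with MLX. A minimal sketch, assuming the repo's weights file is named `weights.npz`:

```python
# Sketch: verify the converted weights load as MLX arrays.
# The weights.npz filename is an assumption about this repo's layout.
import mlx.core as mx

weights = mx.load("CodeLlama-7b-mlx/weights.npz")  # dict of mx.array
name, arr = next(iter(weights.items()))
print(f"{len(weights)} arrays loaded; e.g. {name}: shape={arr.shape}, dtype={arr.dtype}")
```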

Please refer to the original model card for details on Code Llama.