pcuenq and reach-vb committed
Commit 6325870 · verified · Parent: 123455c

Update README.md (#3)

- Update README.md (7d207e4971d450a04237f3ae7644bb4eb01306bb)


Co-authored-by: Vaibhav Srivastav <[email protected]>

Files changed (1)
  1. README.md (+44 −0)
README.md CHANGED
@@ -234,6 +234,50 @@ The Meta Llama 3.2 collection of multilingual large language models (LLMs) is a
 
  **Out of Scope:** Use in any manner that violates applicable laws or regulations (including trade compliance laws). Use in any other way that is prohibited by the Acceptable Use Policy and Llama 3.2 Community License. Use in languages beyond those explicitly referenced as supported in this model card.
 
+ ## How to use
+
+ This repository contains two versions of Llama-3.2-1B-Instruct, for use with `transformers` and with the original `llama` codebase.
+
+ ### Use with transformers
+
+ Starting with `transformers >= 4.43.0`, you can run conversational inference using the Transformers `pipeline` abstraction or by leveraging the Auto classes with the `generate()` function.
+
+ Make sure to update your transformers installation via `pip install --upgrade transformers`.
+
+ ```python
+ import torch
+ from transformers import pipeline
+
+ model_id = "meta-llama/Llama-3.2-1B-Instruct"
+ pipe = pipeline(
+     "text-generation",
+     model=model_id,
+     torch_dtype=torch.bfloat16,
+     device_map="auto",
+ )
+ messages = [
+     {"role": "system", "content": "You are a pirate chatbot who always responds in pirate speak!"},
+     {"role": "user", "content": "Who are you?"},
+ ]
+ outputs = pipe(
+     messages,
+     max_new_tokens=256,
+ )
+ print(outputs[0]["generated_text"][-1])
+ ```
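+
+ The Auto classes mentioned above can be used directly as well. A minimal sketch with the same prompt, where `apply_chat_template` formats the conversation for the model:
+
+ ```python
+ import torch
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ model_id = "meta-llama/Llama-3.2-1B-Instruct"
+ tokenizer = AutoTokenizer.from_pretrained(model_id)
+ model = AutoModelForCausalLM.from_pretrained(
+     model_id, torch_dtype=torch.bfloat16, device_map="auto"
+ )
+
+ messages = [
+     {"role": "system", "content": "You are a pirate chatbot who always responds in pirate speak!"},
+     {"role": "user", "content": "Who are you?"},
+ ]
+ # Build the chat-formatted prompt and move it to the model's device
+ input_ids = tokenizer.apply_chat_template(
+     messages, add_generation_prompt=True, return_tensors="pt"
+ ).to(model.device)
+ outputs = model.generate(input_ids, max_new_tokens=256)
+ # Decode only the newly generated tokens
+ print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
+ ```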
+
+ Note: You can also find detailed recipes on how to use the model locally, with `torch.compile()`, assisted generation, quantization, and more at [`huggingface-llama-recipes`](https://github.com/huggingface/huggingface-llama-recipes).
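+
+ As one illustration of the quantization recipes, a minimal sketch of 4-bit loading via `BitsAndBytesConfig` (this assumes the `bitsandbytes` package is installed; see the recipes repo for maintained examples):
+
+ ```python
+ from transformers import AutoModelForCausalLM, BitsAndBytesConfig
+
+ # 4-bit quantization config (requires the bitsandbytes package)
+ quant_config = BitsAndBytesConfig(load_in_4bit=True)
+ model = AutoModelForCausalLM.from_pretrained(
+     "meta-llama/Llama-3.2-1B-Instruct",
+     quantization_config=quant_config,
+     device_map="auto",
+ )
+ ```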
+
+ ### Use with `llama`
+
+ Please follow the instructions in the [repository](https://github.com/meta-llama/llama).
+
+ To download the original checkpoints, see the example command below leveraging `huggingface-cli`:
+
+ ```
+ huggingface-cli download meta-llama/Llama-3.2-1B-Instruct --include "original/*" --local-dir Llama-3.2-1B-Instruct
+ ```
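+
+ Equivalently, the same files can be fetched from Python. A minimal sketch using `huggingface_hub.snapshot_download` (the `huggingface_hub` package ships as a dependency of `transformers`):
+
+ ```python
+ from huggingface_hub import snapshot_download
+
+ # Download only the original/ checkpoint files into a local directory
+ snapshot_download(
+     "meta-llama/Llama-3.2-1B-Instruct",
+     allow_patterns=["original/*"],
+     local_dir="Llama-3.2-1B-Instruct",
+ )
+ ```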
+
  ## Hardware and Software
 
  **Training Factors:** We used custom training libraries, Meta's custom-built GPU cluster, and production infrastructure for pretraining. Fine-tuning, annotation, and evaluation were also performed on production infrastructure.