---
language:
- en
pipeline_tag: text-generation
inference: false
tags:
- mistral
- inferentia2
- neuron
- neuronx
---
# Neuronx for [mistralai/Mistral-7B-Instruct-v0.2](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2) - Updated Mistral 7B Model on [AWS Inferentia2](https://aws.amazon.com/ec2/instance-types/inf2/)

This model has been exported to the `neuron` format using the specific `input_shapes` and compiler parameters detailed in the sections below.

Please refer to the 🤗 `optimum-neuron` [documentation](https://huggingface.co/docs/optimum-neuron/main/en/guides/models#configuring-the-export-of-a-generative-model) for an explanation of these parameters.

Note: to compile mistralai/Mistral-7B-Instruct-v0.2 on Inf2, you first need to update `sliding_window` in the model config (either in the `config.json` file or on the loaded config object) from `null` to its default value of 4096.

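As a sketch of that config change, the snippet below patches a plain dict standing in for the relevant part of the model's `config.json`; the `patch_sliding_window` helper name is ours for illustration, not part of any library:

```python
import json

def patch_sliding_window(config: dict, default: int = 4096) -> dict:
    """Fill in sliding_window when the config leaves it null (None in Python)."""
    if config.get("sliding_window") is None:
        config["sliding_window"] = default
    return config

# Minimal stand-in for the relevant part of the model's config.json
raw = '{"model_type": "mistral", "sliding_window": null}'
config = patch_sliding_window(json.loads(raw))
print(config["sliding_window"])  # 4096
```

The same change can be made by editing `config.json` directly, or by setting the attribute on a loaded config object before export.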
## Usage with 🤗 `optimum-neuron`

```python
>>> from optimum.neuron import pipeline

>>> p = pipeline('text-generation', 'davidshtian/Mistral-7B-Instruct-v0.2-neuron-1x2048-2-cores')
>>> p("My favorite place on earth is", max_new_tokens=256, do_sample=True, top_k=50)
[{'generated_text': 'My favorite place on earth is the ocean. It is where I feel most
at peace. I love to travel and see new places. I have a'}]
```

This repository contains tags specific to versions of `neuronx`. When using it with 🤗 `optimum-neuron`, pass the repo revision that matches the `neuronx` version you are running, so that the correct serialized checkpoints are loaded.

## Arguments passed during export

**input_shapes**

```json
{
  "batch_size": 1,
  "sequence_length": 2048
}
```

**compiler_args**

```json
{
  "auto_cast_type": "bf16",
  "num_cores": 2
}
```
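
For reference, an export with these parameters could be reproduced with the `optimum-cli` exporter along these lines. This is a sketch based on the `optimum-neuron` export guide: the output directory name is ours, and exact flags may differ across `optimum-neuron` versions.

```shell
# Compile the model for Inferentia2 with the parameters listed above.
# Requires an inf2 instance with the Neuron SDK and optimum[neuronx] installed,
# and the sliding_window config fix described earlier.
optimum-cli export neuron \
  --model mistralai/Mistral-7B-Instruct-v0.2 \
  --batch_size 1 \
  --sequence_length 2048 \
  --num_cores 2 \
  --auto_cast_type bf16 \
  mistral-7b-instruct-v0.2-neuron/
```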