from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1", device_map="auto", load_in_4bit=True
)

You'll notice two flags in the from_pretrained call:

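device_map="auto" lets the library place the model weights on your available GPU(s), falling back to CPU memory if they do not fit
load_in_4bit=True quantizes the weights to 4 bits as they are loaded (this relies on the bitsandbytes library), which cuts the memory needed for the weights to roughly a quarter of their 16-bit size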
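
To get a feel for why 4-bit loading matters, here is a rough back-of-the-envelope sketch of the memory taken by the weights alone at different precisions. The parameter count used below (about 7.2 billion) is an approximation assumed for illustration, and weight_memory_gb is a hypothetical helper, not part of transformers.

```python
# Back-of-the-envelope memory estimate for the weights of a ~7B-parameter model.
# ASSUMPTION: 7.2e9 is an approximate parameter count, used only for illustration.
APPROX_PARAMS = 7.2e9

# Bits used to store one parameter at each precision.
BITS_PER_PARAM = {"float32": 32, "float16": 16, "int8": 8, "4bit": 4}


def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Return the memory needed for the weights alone, in gigabytes (1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


for name, bits in BITS_PER_PARAM.items():
    print(f"{name:>8}: {weight_memory_gb(APPROX_PARAMS, bits):.1f} GB")
```

Under that assumption the weights shrink from about 14.4 GB in 16-bit to about 3.6 GB in 4-bit. Real usage will be somewhat higher, since quantization metadata, activations, and the generation cache also take memory.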

There are other ways to initialize a model, but this is a good baseline for getting started with an LLM.