Iker committed
Commit ad85b8c · unverified · 1 Parent(s): d54a92e
Files changed (1):
1. README.md +1 -1
README.md CHANGED
@@ -29,7 +29,7 @@ We currently support:
  - BF16 / FP16 / FP32 / 8 Bits / 4 Bits precision.
  - Automatic batch size finder: Forget CUDA OOM errors. Set an initial batch size, if it doesn't fit, we will automatically adjust it.
  - Multiple decoding strategies: Greedy Search, Beam Search, Top-K Sampling, Top-p (nucleus) sampling, etc. See [Decoding Strategies](#decodingsampling-strategies) for more information.
- - :new: Load huge models in a single with GPU 8-bits / 4-bits quantization and support for splitting the model between GPU and CPU. See [Loading Huge Models](#loading-huge-models) for more information.
+ - :new: Load huge models in a single GPU with 8-bits / 4-bits quantization and support for splitting the model between GPU and CPU. See [Loading Huge Models](#loading-huge-models) for more information.
  - :new: LoRA models support
  - :new: Support for any Seq2SeqLM or CausalLM model from HuggingFace's Hub.
  - :new: Prompt support! See [Prompting](#prompting) for more information.
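
For context, the corrected bullet describes loading a large model on a single GPU with 8-bit / 4-bit quantization while splitting layers between GPU and CPU. A minimal sketch of that technique using the standard HuggingFace transformers API (the checkpoint name is an arbitrary example, not taken from this repository; requires `bitsandbytes` and `accelerate` installed):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "facebook/opt-6.7b"  # hypothetical example checkpoint

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",   # let accelerate split layers between GPU and CPU
    load_in_8bit=True,   # 8-bit quantization via bitsandbytes
)
```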