---
language:
- en
library_name: transformers
license: other
license_name: gemma-terms-of-use
license_link: https://ai.google.dev/gemma/terms
tags:
- text-generation-inference
extra_gated_heading: Access Gemma on Hugging Face
extra_gated_prompt: To access Gemma on Hugging Face, you’re required to review and agree to Google’s usage license. To do this, please ensure you’re logged-in to Hugging Face and click below. Requests are processed immediately.
extra_gated_button_content: Acknowledge license
---

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1kFznlPlWYOrcgd7Q1NI2tYMLH_vTRuys?usp=sharing)

# gemma-2b-awq-int4

gemma-2b-awq-int4 is a version of the [2B base model](https://huggingface.co/google/gemma-2b) that was quantized using the AWQ method developed by [Lin et al. (2023)](https://arxiv.org/abs/2306.00978).

Please refer to the [Original Gemma Model Card](https://ai.google.dev/gemma/docs) for details about the model preparation and training processes.

## Dependencies

- [`autoawq==0.2.5`](https://pypi.org/project/autoawq/0.2.5/) – [AutoAWQ](https://github.com/casper-hansen/AutoAWQ) was used to quantize the gemma-2b model.
- [`vllm==0.4.2`](https://pypi.org/project/vllm/0.4.2/) – [vLLM](https://github.com/vllm-project/vllm) was used to host models for benchmarking.
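
The pinned versions above can be checked before running anything. The following is a minimal sketch, not part of the original workflow: `check_pins` and `generate_with_vllm` are hypothetical helper names, and the repository ID passed to vLLM is a placeholder that must be replaced with the actual location of this checkpoint. It assumes vLLM's `LLM` and `SamplingParams` classes with the `quantization="awq"` option, and a GPU environment for the generation step.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed in the Dependencies section of this card.
PINNED = {"autoawq": "0.2.5", "vllm": "0.4.2"}


def check_pins(pins=PINNED):
    """Compare installed package versions against the pinned ones.

    Returns a dict mapping each package name to "ok", "missing",
    or a "mismatch (found X)" message.
    """
    report = {}
    for name, wanted in pins.items():
        try:
            found = version(name)
        except PackageNotFoundError:
            report[name] = "missing"
            continue
        report[name] = "ok" if found == wanted else f"mismatch (found {found})"
    return report


def generate_with_vllm(model_id, prompt, max_tokens=64):
    """Load an AWQ checkpoint with vLLM and complete a single prompt.

    `model_id` is an assumption: pass the Hub repository ID or a local
    path for this model. vLLM is imported lazily so that `check_pins`
    stays usable on machines without vLLM installed.
    """
    from vllm import LLM, SamplingParams

    llm = LLM(model=model_id, quantization="awq")
    params = SamplingParams(temperature=0.0, max_tokens=max_tokens)
    outputs = llm.generate([prompt], params)
    return outputs[0].outputs[0].text
```

A typical session would call `check_pins()` first and, if both entries report `"ok"`, call `generate_with_vllm("<namespace>/gemma-2b-awq-int4", "The capital of France is")`. Because this is a base model rather than an instruction-tuned one, plain text completion prompts are the safer choice.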