---
title: README
emoji: 📚
colorFrom: blue
colorTo: green
sdk: static
pinned: false
---
# together we advance_AI

AI is increasingly pervasive across the modern world. It's driving our smart technology in retail, cities, factories and healthcare, and transforming our digital homes. AMD offers advanced AI acceleration from data center to edge, enabling high performance and high efficiency to make the world smarter.
# Getting Started with Hugging Face Transformers

AMD's Ryzen™ AI family of laptop processors provides users with an integrated Neural Processing Unit (NPU) which offloads AI processing tasks from the host CPU and GPU. Ryzen™ AI software consists of the Vitis™ AI execution provider (EP) for ONNX Runtime, combined with quantization tools and a pre-optimized model zoo. All of this is made possible by Ryzen™ AI technology, built on the AMD XDNA™ architecture, which is purpose-built to run AI workloads efficiently and locally and offers a host of benefits for developers innovating the next groundbreaking AI app. Details on getting started with Hugging Face models are available on the [Optimum page](https://huggingface.co/docs/optimum/main/en/amd/index).

The following sections describe how to use the most common Hugging Face transformer models for inference workloads on select AMD Instinct™ accelerators and AMD Radeon™ GPUs using the AMD ROCm software ecosystem. This base knowledge can be leveraged to start fine-tuning from a base model, or even to start developing your own model. General Linux and machine learning experience is a prerequisite.
## 1. Confirm you have a supported AMD hardware platform

Is my [hardware supported](https://rocm.docs.amd.com/en/latest/release/gpu_os_support.html#gpu-support-table) with ROCm?
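If you are unsure which GPU is in your system, one quick way to identify it on Linux before installing anything (a generic check, not an AMD-specific tool) is to list the display devices and compare the name against the support table:

> `lspci | grep -iE 'vga|display'`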
## 2. Install ROCm driver, libraries and tools

Follow the detailed [installation instructions](https://rocm.docs.amd.com/en/latest/deploy/linux/index.html) for your Linux-based platform.
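Once the driver and tools are installed, you can confirm that ROCm detects your GPU. A minimal sanity check using the utilities that ship with ROCm:

> `rocminfo | grep gfx` (prints the GPU's gfx architecture target)

> `rocm-smi` (reports utilization, temperature and memory per GPU)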
## 3. Install Machine Learning Frameworks

Pip installation is an easy way to acquire all the required packages and is described in more detail below.

>If you prefer to use a container strategy, check out the pre-built images at [ROCm Docker Hub](https://hub.docker.com/u/rocm/) and [AMD Infinity Hub](https://www.amd.com/en/technologies/infinity-hub) after installing the required [dependencies](https://rocm.docs.amd.com/en/latest/deploy/docker.html).
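For example, a pre-built PyTorch image can typically be launched as follows (a sketch; the image tag is a placeholder, and the device flags are the standard ones ROCm containers use to access the GPU):

> `docker run -it --device=/dev/kfd --device=/dev/dri --group-add video rocm/pytorch:latest`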
### PyTorch

AMD ROCm is fully integrated into the mainline PyTorch ecosystem. Pip wheels are built and tested as part of the stable and nightly releases. Go to [pytorch.org](https://pytorch.org) and use the 'Install PyTorch' widget. Select 'Stable + Linux + Pip + Python + ROCm' to get the specific pip installation command. An example command line (note the ROCm version encoded in the wheel index URL):

> `pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm5.4.2`
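Once installed, a quick way to confirm that the ROCm build is active (a minimal sketch; on ROCm, PyTorch exposes the GPU through the familiar `torch.cuda` API and sets `torch.version.hip` instead of `torch.version.cuda`):

```python
import torch

# On a ROCm build, the HIP backend is reported through the torch.cuda API
print(torch.version.hip)          # ROCm/HIP version string; None on CUDA or CPU-only builds
print(torch.cuda.is_available())  # True if an AMD GPU is visible to the runtime
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))  # e.g. your Instinct or Radeon device name
```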
### TensorFlow

AMD ROCm is upstreamed into the TensorFlow GitHub repository. Pre-built wheels are hosted on [PyPI](https://pypi.org/project/tensorflow-rocm/). The latest version can be installed with this command:

> `pip install tensorflow-rocm`
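As with PyTorch, you can verify that TensorFlow sees the GPU using the standard device-listing API:

```python
import tensorflow as tf

# Lists the AMD GPUs discovered by the ROCm build of TensorFlow
print(tf.config.list_physical_devices('GPU'))
```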
## 4. Use a Hugging Face Model

Now that you have the base requirements installed, get the latest transformer models.

> `pip install transformers`

This allows you to easily import any of the base models into your Python application. Here is an example using [GPT2](https://huggingface.co/gpt2) in PyTorch:

```python
from transformers import GPT2Tokenizer, GPT2Model

# Download (or load from the local cache) the pretrained tokenizer and weights
tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
model = GPT2Model.from_pretrained('gpt2')

# Tokenize the input text into PyTorch tensors
text = "Replace me by any text you'd like."
encoded_input = tokenizer(text, return_tensors='pt')

# Run a forward pass; output.last_hidden_state holds the final hidden states
output = model(**encoded_input)
```
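To actually generate text on the GPU, here is a minimal sketch building on the example above (note that on ROCm the device is still addressed as `"cuda"`, and `GPT2LMHeadModel` adds the language-modeling head that `GPT2Model` lacks; the prompt is just an illustration):

```python
import torch
from transformers import GPT2Tokenizer, GPT2LMHeadModel

device = "cuda" if torch.cuda.is_available() else "cpu"
tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
model = GPT2LMHeadModel.from_pretrained('gpt2').to(device)

# Move the tokenized prompt to the GPU and sample a short continuation
inputs = tokenizer("AI is increasingly pervasive", return_tensors='pt').to(device)
outputs = model.generate(**inputs, max_new_tokens=30, do_sample=True)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```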
All of the 200+ standard transformer models are regularly tested with our supported hardware platforms. Note that this also implies that all derivatives of those core models should also function correctly. Let us know if you run into issues at our [ROCm Community page](https://github.com/RadeonOpenCompute/ROCm/discussions).

Here are a few of the more popular ones to get you started:

- [BERT](https://huggingface.co/bert-base-uncased)
- [BLOOM](https://huggingface.co/bigscience/bloom)
- [LLaMA](https://huggingface.co/huggyllama/llama-7b)
- [OPT](https://huggingface.co/facebook/opt-66b)
- [T5](https://huggingface.co/t5-base)

Click on the 'Use in Transformers' button to see the exact code to import a specific model into your Python application.
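The snippet behind that button typically relies on the Auto classes, which resolve the correct architecture from the model name. For example, with BERT from the list above:

```python
from transformers import AutoTokenizer, AutoModel

# The Auto classes pick the right tokenizer and model classes from the model's config
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
```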
## 5. Optimum Support

For a deeper dive into using Hugging Face libraries on AMD GPUs, check out the [Optimum](https://huggingface.co/docs/optimum/main/en/amd/amdgpu/overview) page describing details on Flash Attention 2, GPTQ quantization and ONNX Runtime integration.
# Serving a model with TGI

Text Generation Inference (a.k.a. "TGI") provides an end-to-end solution to deploy large language models for inference at scale. TGI is already usable in production on AMD Instinct™ GPUs through the Docker image `ghcr.io/huggingface/text-generation-inference:1.2-rocm`. Make sure to refer to the [documentation](https://huggingface.co/docs/text-generation-inference/supported_models#supported-hardware) concerning the support and any limitations.
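As an illustration, a typical launch and request might look like the following (a sketch, not the authoritative invocation; the model ID, port and volume path are placeholders, and the device flags are the standard ones for GPU access in ROCm containers):

> `docker run --device=/dev/kfd --device=/dev/dri --group-add video -p 8080:80 -v $PWD/data:/data ghcr.io/huggingface/text-generation-inference:1.2-rocm --model-id meta-llama/Llama-2-7b-chat-hf`

Once the server is up, you can query its `/generate` endpoint:

> `curl http://localhost:8080/generate -X POST -H 'Content-Type: application/json' -d '{"inputs": "What is ROCm?", "parameters": {"max_new_tokens": 50}}'`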
# Benchmarking

[Optimum-Benchmark](https://github.com/huggingface/optimum-benchmark) is available as a utility to easily benchmark the performance of transformers on AMD GPUs, across normal and distributed settings, with various supported optimizations and quantization schemes.
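As a rough illustration of how a run is launched (a sketch; Optimum-Benchmark is configuration-driven and its CLI evolves, so the config directory and name below are placeholders taken from the project's examples; see its README for the current usage):

> `pip install git+https://github.com/huggingface/optimum-benchmark.git`

> `optimum-benchmark --config-dir examples/ --config-name pytorch_bert`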
# Useful Links and Blogs

- Detailed Llama-2 results showcasing the [Optimum benchmark on AMD Instinct MI250](https://huggingface.co/blog/huggingface-and-optimum-amd)
- Check out our blog titled [Run a Chatgpt-like Chatbot on a Single GPU with ROCm](https://huggingface.co/blog/chatbot-amd-gpu)
- Complete ROCm [documentation](https://rocm.docs.amd.com/en/latest/) for installation and usage
- Find extended training content and connect with the development community at the [Developer Hub](https://www.amd.com/en/developer/rocm-hub.html)