{ "cells": [ { "attachments": {}, "cell_type": "markdown", "id": "190e8e4c-461f-4521-ae7f-3491fa827ab7", "metadata": { "tags": [] }, "source": [ "# Automatic Device Selection with OpenVINO™\n", "\n", "The [Auto device](https://docs.openvino.ai/2024/openvino-workflow/running-inference/inference-devices-and-modes/auto-device-selection.html) (or AUTO in short) selects the most suitable device for inference by considering the model precision, power efficiency and processing capability of the available [compute devices](https://docs.openvino.ai/2024/about-openvino/compatibility-and-support/supported-devices.html). The model precision (such as `FP32`, `FP16`, `INT8`, etc.) is the first consideration to filter out the devices that cannot run the network efficiently.\n", "\n", "Next, if dedicated accelerators are available, these devices are preferred (for example, integrated and discrete [GPU](https://docs.openvino.ai/2024/openvino-workflow/running-inference/inference-devices-and-modes/gpu-device.html)). [CPU](https://docs.openvino.ai/2024/openvino-workflow/running-inference/inference-devices-and-modes/cpu-device.html) is used as the default \"fallback device\". Keep in mind that AUTO makes this selection only once, during the loading of a model. \n", "\n", "When using accelerator devices such as GPUs, loading models to these devices may take a long time. To address this challenge for applications that require fast first inference response, AUTO starts inference immediately on the CPU and then transparently shifts inference to the GPU, once it is ready. This dramatically reduces the time to execute first inference.\n", "\n", "\n", "\n", "\n", "\n", "#### Table of contents:\n", "\n", "- [Import modules and create Core](#Import-modules-and-create-Core)\n", "- [Convert the model to OpenVINO IR format](#Convert-the-model-to-OpenVINO-IR-format)\n", "- [(1) Simplify selection logic](#(1)-Simplify-selection-logic)\n", " - [Default behavior of Core::compile_model API without device_name](#Default-behavior-of-Core::compile_model-API-without-device_name)\n", " - [Explicitly pass AUTO as device_name to Core::compile_model API](#Explicitly-pass-AUTO-as-device_name-to-Core::compile_model-API)\n", "- [(2) Improve the first inference latency](#(2)-Improve-the-first-inference-latency)\n", " - [Load an Image](#Load-an-Image)\n", " - [Load the model to GPU device and perform inference](#Load-the-model-to-GPU-device-and-perform-inference)\n", " - [Load the model using AUTO device and do inference](#Load-the-model-using-AUTO-device-and-do-inference)\n", "- [(3) Achieve different performance for different targets](#(3)-Achieve-different-performance-for-different-targets)\n", " - [Class and callback definition](#Class-and-callback-definition)\n", " - [Inference with THROUGHPUT hint](#Inference-with-THROUGHPUT-hint)\n", " - [Inference with LATENCY hint](#Inference-with-LATENCY-hint)\n", " - [Difference in FPS and latency](#Difference-in-FPS-and-latency)\n", "\n" ] }, { "attachments": {}, "cell_type": "markdown", "id": "fcfc461c", "metadata": {}, "source": [ "## Import modules and create Core\n", "[back to top ⬆️](#Table-of-contents:)\n" ] }, { "cell_type": "code", "execution_count": 1, "id": "967c128a", "metadata": { "tags": [] }, "outputs": [], "source": [ "import platform\n", "\n", "# Install required packages\n", "%pip install -q \"openvino>=2023.1.0\" Pillow torch torchvision tqdm --extra-index-url https://download.pytorch.org/whl/cpu\n", "\n", "if platform.system() != \"Windows\":\n", " %pip install -q 
\"matplotlib>=3.4\"\n", "else:\n", " %pip install -q \"matplotlib>=3.4,<3.7\"" ] }, { "cell_type": "code", "execution_count": 2, "id": "6c7f9f06", "metadata": { "tags": [] }, "outputs": [], "source": [ "import time\n", "import sys\n", "\n", "import openvino as ov\n", "\n", "from IPython.display import Markdown, display\n", "\n", "core = ov.Core()\n", "\n", "if not any(\"GPU\" in device for device in core.available_devices):\n", " display(\n", " Markdown(\n", " '