File size: 32,503 Bytes
db5855f
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
{
 "cells": [
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# OpenVINO™ Model conversion\n",
    "\n",
    "This notebook shows how to convert a model from original framework format to OpenVINO Intermediate Representation (IR).\n",
    "\n",
    "\n",
    "#### Table of contents:\n",
    "\n",
    "- [OpenVINO IR format](#OpenVINO-IR-format)\n",
    "- [Fetching example models](#Fetching-example-models)\n",
    "- [Conversion](#Conversion)\n",
    "    - [Setting Input Shapes](#Setting-Input-Shapes)\n",
    "    - [Compressing a Model to FP16](#Compressing-a-Model-to-FP16)\n",
    "    - [Convert Models from memory](#Convert-Models-from-memory)\n",
    "- [Migration from Legacy conversion API](#Migration-from-Legacy-conversion-API)\n",
    "    - [Specifying Layout](#Specifying-Layout)\n",
    "    - [Changing Model Layout](#Changing-Model-Layout)\n",
    "    - [Specifying Mean and Scale Values](#Specifying-Mean-and-Scale-Values)\n",
    "    - [Reversing Input Channels](#Reversing-Input-Channels)\n",
    "    - [Cutting Off Parts of a Model](#Cutting-Off-Parts-of-a-Model)\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Required imports. Please execute this cell first.\n",
    "%pip install --upgrade pip\n",
    "%pip install -q --extra-index-url https://download.pytorch.org/whl/cpu \\\n",
    "\"openvino-dev>=2024.0.0\" \"requests\" \"tqdm\" \"transformers[onnx]>=4.31\" \"torch>=2.1\" \"torchvision\" \"tensorflow_hub\" \"tensorflow\""
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## OpenVINO IR format\n",
    "[back to top ⬆️](#Table-of-contents:)\n",
    "\n",
    "OpenVINO [Intermediate Representation (IR)](https://docs.openvino.ai/2024/documentation/openvino-ir-format.html) is the proprietary model format of OpenVINO. It is produced after converting a model with model conversion API. Model conversion API translates the frequently used deep learning operations to their respective similar representation in OpenVINO and tunes them with the associated weights and biases from the trained model. The resulting IR contains two files: an `.xml` file, containing information about network topology, and a `.bin` file, containing the weights and biases binary data.\n",
    "\n",
    "There are two ways to convert a model from the original framework format to OpenVINO IR: Python conversion API and OVC command-line tool. You can choose one of them based on whichever is most convenient for you.\n",
    "\n",
    "OpenVINO conversion API supports next model formats: `PyTorch`, `TensorFlow`, `TensorFlow Lite`, `ONNX`, and `PaddlePaddle`. These model formats can be read, compiled, and converted to OpenVINO IR, either automatically or explicitly.\n",
    "\n",
    " For more details, refer to [Model Preparation](https://docs.openvino.ai/2024/openvino-workflow/model-preparation.html) documentation."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "usage: ovc INPUT_MODEL... [-h] [--output_model OUTPUT_MODEL]\n",
      "           [--compress_to_fp16 [True | False]] [--version] [--input INPUT]\n",
      "           [--output OUTPUT] [--extension EXTENSION] [--verbose]\n",
      "\n",
      "positional arguments:\n",
      "  INPUT_MODEL           Input model file(s) from TensorFlow, ONNX,\n",
      "                        PaddlePaddle. Use openvino.convert_model in Python to\n",
      "                        convert models from PyTorch.\n",
      "\n",
      "optional arguments:\n",
      "  -h, --help            show this help message and exit\n",
      "  --output_model OUTPUT_MODEL\n",
      "                        This parameter is used to name output .xml/.bin files\n",
      "                        with converted model.\n",
      "  --compress_to_fp16 [True | False]\n",
      "                        Compress weights in output OpenVINO model to FP16. To\n",
      "                        turn off compression use \"--compress_to_fp16=False\"\n",
      "                        command line parameter. Default value is True.\n",
      "  --version             Print ovc version and exit.\n",
      "  --input INPUT         Information of model input required for model\n",
      "                        conversion. This is a comma separated list with\n",
      "                        optional input names and shapes. The order of inputs\n",
      "                        in converted model will match the order of specified\n",
      "                        inputs. The shape is specified as comma-separated\n",
      "                        list. Example, to set `input_1` input with shape\n",
      "                        [1,100] and `sequence_len` input with shape [1,?]:\n",
      "                        \"input_1[1,100],sequence_len[1,?]\", where \"?\" is a\n",
      "                        dynamic dimension, which means that such a dimension\n",
      "                        can be specified later in the runtime. If the\n",
      "                        dimension is set as an integer (like 100 in [1,100]),\n",
      "                        such a dimension is not supposed to be changed later,\n",
      "                        during a model conversion it is treated as a static\n",
      "                        value. Example with unnamed inputs: \"[1,100],[1,?]\".\n",
      "  --output OUTPUT       One or more comma-separated model outputs to be\n",
      "                        preserved in the converted model. Other outputs are\n",
      "                        removed. If `output` parameter is not specified then\n",
      "                        all outputs from the original model are preserved. Do\n",
      "                        not add :0 to the names for TensorFlow. The order of\n",
      "                        outputs in the converted model is the same as the\n",
      "                        order of specified names. Example: ovc model.onnx\n",
      "                        output=out_1,out_2\n",
      "  --extension EXTENSION\n",
      "                        Paths or a comma-separated list of paths to libraries\n",
      "                        (.so or .dll) with extensions. To disable all\n",
      "                        extensions including those that are placed at the\n",
      "                        default location, pass an empty string.\n",
      "  --verbose             Print detailed information about conversion.\n"
     ]
    }
   ],
   "source": [
    "# OVC CLI tool parameters description\n",
    "\n",
    "! ovc --help"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Fetching example models\n",
    "[back to top ⬆️](#Table-of-contents:)\n",
    "\n",
    "This notebook uses two models for conversion examples:\n",
    "\n",
    "* [Distilbert](https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english) NLP model from Hugging Face\n",
    "* [Resnet50](https://pytorch.org/vision/stable/models/generated/torchvision.models.resnet50.html#torchvision.models.ResNet50_Weights) CV classification model from torchvision"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "from pathlib import Path\n",
    "\n",
    "# create a directory for models files\n",
    "MODEL_DIRECTORY_PATH = Path(\"model\")\n",
    "MODEL_DIRECTORY_PATH.mkdir(exist_ok=True)"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Fetch [distilbert](https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english) NLP model from Hugging Face and export it in ONNX format:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from transformers import AutoModelForSequenceClassification, AutoTokenizer\n",
    "from transformers.onnx import export, FeaturesManager\n",
    "\n",
    "ONNX_NLP_MODEL_PATH = MODEL_DIRECTORY_PATH / \"distilbert.onnx\"\n",
    "\n",
    "# download model\n",
    "hf_model = AutoModelForSequenceClassification.from_pretrained(\"distilbert-base-uncased-finetuned-sst-2-english\")\n",
    "# initialize tokenizer\n",
    "tokenizer = AutoTokenizer.from_pretrained(\"distilbert-base-uncased-finetuned-sst-2-english\")\n",
    "\n",
    "# get model onnx config function for output feature format sequence-classification\n",
    "model_kind, model_onnx_config = FeaturesManager.check_supported_model_or_raise(hf_model, feature=\"sequence-classification\")\n",
    "# fill onnx config based on pytorch model config\n",
    "onnx_config = model_onnx_config(hf_model.config)\n",
    "\n",
    "# export to onnx format\n",
    "export(\n",
    "    preprocessor=tokenizer,\n",
    "    model=hf_model,\n",
    "    config=onnx_config,\n",
    "    opset=onnx_config.default_onnx_opset,\n",
    "    output=ONNX_NLP_MODEL_PATH,\n",
    ")"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Fetch [Resnet50](https://pytorch.org/vision/stable/models/generated/torchvision.models.resnet50.html#torchvision.models.ResNet50_Weights) CV classification model from torchvision:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from torchvision.models import resnet50, ResNet50_Weights\n",
    "\n",
    "# create model object\n",
    "pytorch_model = resnet50(weights=ResNet50_Weights.DEFAULT)\n",
    "# switch model from training to inference mode\n",
    "pytorch_model.eval()"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Convert PyTorch model to ONNX format:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "ONNX model exported to model/resnet.onnx\n"
     ]
    }
   ],
   "source": [
    "import torch\n",
    "import warnings\n",
    "\n",
    "ONNX_CV_MODEL_PATH = MODEL_DIRECTORY_PATH / \"resnet.onnx\"\n",
    "\n",
    "if ONNX_CV_MODEL_PATH.exists():\n",
    "    print(f\"ONNX model {ONNX_CV_MODEL_PATH} already exists.\")\n",
    "else:\n",
    "    with warnings.catch_warnings():\n",
    "        warnings.filterwarnings(\"ignore\")\n",
    "        torch.onnx.export(model=pytorch_model, args=torch.randn(1, 3, 224, 224), f=ONNX_CV_MODEL_PATH)\n",
    "    print(f\"ONNX model exported to {ONNX_CV_MODEL_PATH}\")"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Conversion\n",
    "[back to top ⬆️](#Table-of-contents:)\n",
    "\n",
    "To convert a model to OpenVINO IR, use the following API:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "model/distilbert.onnx\n"
     ]
    }
   ],
   "source": [
    "import openvino as ov\n",
    "\n",
    "# ov.convert_model returns an openvino.runtime.Model object\n",
    "print(ONNX_NLP_MODEL_PATH)\n",
    "ov_model = ov.convert_model(ONNX_NLP_MODEL_PATH)\n",
    "\n",
    "# then model can be serialized to *.xml & *.bin files\n",
    "ov.save_model(ov_model, MODEL_DIRECTORY_PATH / \"distilbert.xml\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...\n",
      "To disable this warning, you can either:\n",
      "\t- Avoid using `tokenizers` before the fork if possible\n",
      "\t- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[ INFO ] Generated IR will be compressed to FP16. If you get lower accuracy, please consider disabling compression by removing argument \"compress_to_fp16\" or set it to false \"compress_to_fp16=False\".\n",
      "Find more information about compression to FP16 at https://docs.openvino.ai/2024/openvino-workflow/model-preparation/conversion-parameters.html\n",
      "[ SUCCESS ] XML file: model/distilbert.xml\n",
      "[ SUCCESS ] BIN file: model/distilbert.bin\n"
     ]
    }
   ],
   "source": [
    "! ovc model/distilbert.onnx --output_model model/distilbert.xml"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Setting Input Shapes\n",
    "[back to top ⬆️](#Table-of-contents:)\n",
    "\n",
    "Model conversion is supported for models with dynamic input shapes that contain undefined dimensions. However, if the shape of data is not going to change from one inference request to another, it is recommended to set up static shapes (when all dimensions are fully defined) for the inputs. Doing so at the model preparation stage, not at runtime, can be beneficial in terms of performance and memory consumption.\n",
    "\n",
    "For more information refer to [Setting Input Shapes](https://docs.openvino.ai/2024/openvino-workflow/model-preparation/setting-input-shapes.html) documentation."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [],
   "source": [
    "import openvino as ov\n",
    "\n",
    "ov_model = ov.convert_model(ONNX_NLP_MODEL_PATH, input=[(\"input_ids\", [1, 128]), (\"attention_mask\", [1, 128])])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...\n",
      "To disable this warning, you can either:\n",
      "\t- Avoid using `tokenizers` before the fork if possible\n",
      "\t- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[ INFO ] Generated IR will be compressed to FP16. If you get lower accuracy, please consider disabling compression by removing argument \"compress_to_fp16\" or set it to false \"compress_to_fp16=False\".\n",
      "Find more information about compression to FP16 at https://docs.openvino.ai/2024/openvino-workflow/model-preparation/conversion-parameters.html\n",
      "[ SUCCESS ] XML file: model/distilbert.xml\n",
      "[ SUCCESS ] BIN file: model/distilbert.bin\n"
     ]
    }
   ],
   "source": [
    "! ovc model/distilbert.onnx --input input_ids[1,128],attention_mask[1,128] --output_model model/distilbert.xml"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The `input` parameter allows overriding original input shapes if it is supported by the model topology. Shapes with dynamic dimensions in the original model can be replaced with static shapes for the converted model, and vice versa. The dynamic dimension can be marked in model conversion API parameter as `-1` or `?` when using `ovc`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [],
   "source": [
    "import openvino as ov\n",
    "\n",
    "ov_model = ov.convert_model(ONNX_NLP_MODEL_PATH, input=[(\"input_ids\", [1, -1]), (\"attention_mask\", [1, -1])])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...\n",
      "To disable this warning, you can either:\n",
      "\t- Avoid using `tokenizers` before the fork if possible\n",
      "\t- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[ INFO ] Generated IR will be compressed to FP16. If you get lower accuracy, please consider disabling compression by removing argument \"compress_to_fp16\" or set it to false \"compress_to_fp16=False\".\n",
      "Find more information about compression to FP16 at https://docs.openvino.ai/2024/openvino-workflow/model-preparation/conversion-parameters.html\n",
      "[ SUCCESS ] XML file: model/distilbert.xml\n",
      "[ SUCCESS ] BIN file: model/distilbert.bin\n"
     ]
    }
   ],
   "source": [
    "! ovc model/distilbert.onnx --input \"input_ids[1,?],attention_mask[1,?]\" --output_model model/distilbert.xml"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To optimize memory consumption for models with undefined dimensions in runtime, model conversion API provides the capability to define boundaries of dimensions. The boundaries of undefined dimension can be specified with ellipsis in the command line or with `openvino.Dimension` class in Python. For example, launch model conversion for the ONNX Bert model and specify a boundary for the sequence length dimension:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [],
   "source": [
    "import openvino as ov\n",
    "\n",
    "\n",
    "sequence_length_dim = ov.Dimension(10, 128)\n",
    "\n",
    "ov_model = ov.convert_model(\n",
    "    ONNX_NLP_MODEL_PATH,\n",
    "    input=[\n",
    "        (\"input_ids\", [1, sequence_length_dim]),\n",
    "        (\"attention_mask\", [1, sequence_length_dim]),\n",
    "    ],\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...\n",
      "To disable this warning, you can either:\n",
      "\t- Avoid using `tokenizers` before the fork if possible\n",
      "\t- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[ INFO ] Generated IR will be compressed to FP16. If you get lower accuracy, please consider disabling compression by removing argument \"compress_to_fp16\" or set it to false \"compress_to_fp16=False\".\n",
      "Find more information about compression to FP16 at https://docs.openvino.ai/2024/openvino-workflow/model-preparation/conversion-parameters.html\n",
      "[ SUCCESS ] XML file: model/distilbert.xml\n",
      "[ SUCCESS ] BIN file: model/distilbert.bin\n"
     ]
    }
   ],
   "source": [
    "! ovc model/distilbert.onnx --input input_ids[1,10..128],attention_mask[1,10..128] --output_model model/distilbert.xml"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Compressing a Model to FP16\n",
    "[back to top ⬆️](#Table-of-contents:)\n",
    "\n",
    "By default model weights compressed to FP16 format when saving OpenVINO model to IR. This saves up to 2x storage space for the model file and in most cases doesn't sacrifice model accuracy. Weight compression can be disabled by setting `compress_to_fp16` flag to `False`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [],
   "source": [
    "import openvino as ov\n",
    "\n",
    "ov_model = ov.convert_model(ONNX_NLP_MODEL_PATH)\n",
    "ov.save_model(ov_model, MODEL_DIRECTORY_PATH / \"distilbert.xml\", compress_to_fp16=False)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...\n",
      "To disable this warning, you can either:\n",
      "\t- Avoid using `tokenizers` before the fork if possible\n",
      "\t- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[ SUCCESS ] XML file: model/distilbert.xml\n",
      "[ SUCCESS ] BIN file: model/distilbert.bin\n"
     ]
    }
   ],
   "source": [
    "! ovc model/distilbert.onnx --output_model model/distilbert.xml --compress_to_fp16=False"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Convert Models from memory\n",
    "[back to top ⬆️](#Table-of-contents:)\n",
    "\n",
    "Model conversion API supports passing original framework Python object directly. More details can be found in [PyTorch](https://docs.openvino.ai/2024/openvino-workflow/model-preparation/convert-model-pytorch.html), [TensorFlow](https://docs.openvino.ai/2024/openvino-workflow/model-preparation/convert-model-tensorflow.html), [PaddlePaddle](https://docs.openvino.ai/2024/openvino-workflow/model-preparation/convert-model-paddle.html) frameworks conversion guides."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import openvino as ov\n",
    "import torch\n",
    "\n",
    "example_input = torch.rand(1, 3, 224, 224)\n",
    "\n",
    "ov_model = ov.convert_model(pytorch_model, example_input=example_input, input=example_input.shape)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "\n",
    "import openvino as ov\n",
    "import tensorflow_hub as hub\n",
    "\n",
    "os.environ[\"TFHUB_CACHE_DIR\"] = str(Path(\"./tfhub_modules\").resolve())\n",
    "\n",
    "model = hub.load(\"https://www.kaggle.com/models/google/movenet/frameworks/TensorFlow2/variations/singlepose-lightning/versions/4\")\n",
    "movenet = model.signatures[\"serving_default\"]\n",
    "\n",
    "ov_model = ov.convert_model(movenet)"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Migration from Legacy conversion API\n",
    "[back to top ⬆️](#Table-of-contents:)\n",
    "\n",
    "In the 2023.1 OpenVINO release OpenVINO Model Conversion API was introduced with the corresponding Python API: `openvino.convert_model` method. `ovc` and `openvino.convert_model` represent a lightweight alternative of `mo` and `openvino.tools.mo.convert_model` which are considered legacy API now.\n",
    "`mo.convert_model()` provides a wide range of preprocessing parameters. Most of these parameters have analogs in OVC or can be replaced with functionality from `ov.PrePostProcessor` class. Refer to [Optimize Preprocessing notebook](../optimize-preprocessing/optimize-preprocessing.ipynb) for more information about [Preprocessing API](https://docs.openvino.ai/2024/openvino-workflow/running-inference/optimize-inference/optimize-preprocessing.html). Here is the migration guide from legacy model preprocessing to Preprocessing API."
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Specifying Layout\n",
    "[back to top ⬆️](#Table-of-contents:)\n",
    "\n",
    "Layout defines the meaning of dimensions in a shape and can be specified for both inputs and outputs. Some preprocessing requires to set input layouts, for example, setting a batch, applying mean or scales, and reversing input channels (BGR<->RGB). For the layout syntax, check the [Layout API overview](https://docs.openvino.ai/2024/openvino-workflow/running-inference/optimize-inference/optimize-preprocessing/layout-api-overview.html). To specify the layout, you can use the layout option followed by the layout value.\n",
    "\n",
    "The following example specifies the `NCHW` layout for a Pytorch Resnet50 model that was exported to the ONNX format:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Converter API\n",
    "import openvino as ov\n",
    "\n",
    "ov_model = ov.convert_model(ONNX_CV_MODEL_PATH)\n",
    "\n",
    "prep = ov.preprocess.PrePostProcessor(ov_model)\n",
    "prep.input(\"input.1\").model().set_layout(ov.Layout(\"nchw\"))\n",
    "ov_model = prep.build()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...\n",
      "To disable this warning, you can either:\n",
      "\t- Avoid using `tokenizers` before the fork if possible\n",
      "\t- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)\n"
     ]
    }
   ],
   "source": [
    "# Legacy Model Optimizer API\n",
    "from openvino.tools import mo\n",
    "\n",
    "ov_model = mo.convert_model(ONNX_CV_MODEL_PATH, layout=\"nchw\")"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Changing Model Layout\n",
    "[back to top ⬆️](#Table-of-contents:)\n",
    "\n",
    "Transposing of matrices/tensors is a typical operation in Deep Learning - you may have a BMP image `640x480`, which is an array of `{480, 640, 3}` elements, but Deep Learning model can require input with shape `{1, 3, 480, 640}`.\n",
    "\n",
    "Conversion can be done implicitly, using the layout of a user’s tensor and the layout of an original model:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Converter API\n",
    "import openvino as ov\n",
    "\n",
    "ov_model = ov.convert_model(ONNX_CV_MODEL_PATH)\n",
    "\n",
    "prep = ov.preprocess.PrePostProcessor(ov_model)\n",
    "prep.input(\"input.1\").tensor().set_layout(ov.Layout(\"nhwc\"))\n",
    "prep.input(\"input.1\").model().set_layout(ov.Layout(\"nchw\"))\n",
    "ov_model = prep.build()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Legacy Model Optimizer API\n",
    "from openvino.tools import mo\n",
    "\n",
    "ov_model = mo.convert_model(ONNX_CV_MODEL_PATH, layout=\"nchw->nhwc\")\n",
    "\n",
    "# alternatively use source_layout and target_layout parameters\n",
    "ov_model = mo.convert_model(ONNX_CV_MODEL_PATH, source_layout=\"nchw\", target_layout=\"nhwc\")"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Specifying Mean and Scale Values\n",
    "[back to top ⬆️](#Table-of-contents:)\n",
    "\n",
    "Using Preprocessing API `mean` and `scale` values can be set. Using these API, model embeds the corresponding preprocessing block for mean-value normalization of the input data and optimizes this block. Refer to [Optimize Preprocessing notebook](../optimize-preprocessing/optimize-preprocessing.ipynb) for more examples."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Converter API\n",
    "import openvino as ov\n",
    "\n",
    "ov_model = ov.convert_model(ONNX_CV_MODEL_PATH)\n",
    "\n",
    "prep = ov.preprocess.PrePostProcessor(ov_model)\n",
    "prep.input(\"input.1\").tensor().set_layout(ov.Layout(\"nchw\"))\n",
    "prep.input(\"input.1\").preprocess().mean([255 * x for x in [0.485, 0.456, 0.406]])\n",
    "prep.input(\"input.1\").preprocess().scale([255 * x for x in [0.229, 0.224, 0.225]])\n",
    "\n",
    "ov_model = prep.build()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Legacy Model Optimizer API\n",
    "from openvino.tools import mo\n",
    "\n",
    "\n",
    "ov_model = mo.convert_model(\n",
    "    ONNX_CV_MODEL_PATH,\n",
    "    mean_values=[255 * x for x in [0.485, 0.456, 0.406]],\n",
    "    scale_values=[255 * x for x in [0.229, 0.224, 0.225]],\n",
    ")"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Reversing Input Channels\n",
    "[back to top ⬆️](#Table-of-contents:)\n",
    "\n",
    "Sometimes, input images for your application can be of the `RGB` (or `BGR`) format, and the model is trained on images of the `BGR` (or `RGB`) format, which is in the opposite order of color channels. In this case, it is important to preprocess the input images by reverting the color channels before inference."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Converter API\n",
    "import openvino as ov\n",
    "\n",
    "ov_model = ov.convert_model(ONNX_CV_MODEL_PATH)\n",
    "\n",
    "prep = ov.preprocess.PrePostProcessor(ov_model)\n",
    "prep.input(\"input.1\").tensor().set_layout(ov.Layout(\"nchw\"))\n",
    "prep.input(\"input.1\").preprocess().reverse_channels()\n",
    "ov_model = prep.build()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Legacy Model Optimizer API\n",
    "from openvino.tools import mo\n",
    "\n",
    "ov_model = mo.convert_model(ONNX_CV_MODEL_PATH, reverse_input_channels=True)"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Cutting Off Parts of a Model\n",
    "[back to top ⬆️](#Table-of-contents:)\n",
    "\n",
    "Cutting model inputs and outputs from a model is no longer available in the new conversion API. Instead, we recommend performing the cut in the original framework. Examples of model cutting of TensorFlow protobuf, TensorFlow SavedModel, and ONNX formats with tools provided by the Tensorflow and ONNX frameworks can be found in [documentation guide](https://docs.openvino.ai/2024/documentation/legacy-features/transition-legacy-conversion-api.html#cutting-off-parts-of-a-model). For PyTorch, TensorFlow 2 Keras, and PaddlePaddle, we recommend changing the original model code to perform the model cut."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.10"
  },
  "openvino_notebooks": {
   "imageUrl": "",
   "tags": {
    "categories": [
     "Convert",
     "API Overview"
    ],
    "libraries": [],
    "other": [],
    "tasks": [
     "Image Classification",
     "Text Classification"
    ]
   }
  },
  "widgets": {
   "application/vnd.jupyter.widget-state+json": {
    "state": {},
    "version_major": 2,
    "version_minor": 0
   }
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}