pred = onnx_qa(question, context)

If you have an Intel CPU, take a look at 🤗 Optimum Intel, which supports a variety of compression techniques (quantization, pruning, knowledge distillation) and provides tools for converting models to the OpenVINO format for higher-performance inference.
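
As a rough illustration of the OpenVINO route, the sketch below builds a question-answering pipeline from an OpenVINO-converted model. It assumes Optimum Intel is installed with its OpenVINO extra (for example via `pip install "optimum[openvino]"`). The checkpoint name and the helper `build_openvino_qa` are illustrative choices, not taken from the text above.

```python
# Assumed example checkpoint; any question-answering checkpoint should work.
MODEL_ID = "distilbert-base-cased-distilled-squad"


def build_openvino_qa(model_id: str = MODEL_ID):
    """Return a question-answering pipeline backed by an OpenVINO model.

    Hypothetical helper for illustration only.
    """
    # Imported inside the function so this module can be loaded even when
    # optimum-intel is not installed.
    from optimum.intel import OVModelForQuestionAnswering
    from transformers import AutoTokenizer, pipeline

    # export=True asks Optimum Intel to convert the checkpoint to the
    # OpenVINO format while loading it.
    model = OVModelForQuestionAnswering.from_pretrained(model_id, export=True)
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    return pipeline("question-answering", model=model, tokenizer=tokenizer)


# Usage (downloads the model on first call):
#   ov_qa = build_openvino_qa()
#   pred = ov_qa(question="What's my name?",
#                context="My name is Philipp and I live in Nuremberg.")
```

The resulting pipeline is meant to be called the same way as the ONNX one, so the rest of the inference code should not need to change.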