Are ONNX & PyTorch Outputs Different?
As Machine Learning models advance, we rely on powerful GPU-equipped devices to cut computation time, yet complex models still take a long time to train and run. ONNX offers a way to reduce that computation time by converting PyTorch, TensorFlow, or similar complex models into the .onnx format.
Converting a model graph with ONNX involves approximations, which arise from several computational steps in the conversion stage. As a result, the outputs of the converted ONNX model can differ slightly from those of the original model. For complex models, e.g. Sentence Transformers, which add a pooling layer on top of a Transformer encoder, this output difference can be slightly higher. For details and experiments, read this blog...
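One way to quantify the output difference described above is to compare the two outputs element by element against a tolerance. The snippet below is a minimal sketch using only the Python standard library; the function name `compare_outputs`, the tolerance defaults, and the sample numbers are all assumptions made up for illustration, not measurements from any real model.

```python
import math


def compare_outputs(reference, candidate, rel_tol=1e-3, abs_tol=1e-5):
    """Compare two flat sequences of floats element by element.

    Returns (max_abs_diff, all_close). The tolerance defaults are
    illustrative assumptions, not recommended values.
    """
    if len(reference) != len(candidate):
        raise ValueError("outputs must have the same number of elements")

    pairs = list(zip(reference, candidate))
    # Largest absolute element-wise difference between the two outputs
    max_abs_diff = max((abs(r - c) for r, c in pairs), default=0.0)
    # True only if every pair agrees within the given tolerances
    all_close = all(
        math.isclose(r, c, rel_tol=rel_tol, abs_tol=abs_tol) for r, c in pairs
    )
    return max_abs_diff, all_close


# Hypothetical values standing in for flattened model outputs
# (made up for this example, not produced by a real model).
pytorch_out = [0.12345678, -1.5, 2.0]
onnx_out = [0.12345701, -1.5000004, 2.0000002]

max_diff, ok = compare_outputs(pytorch_out, onnx_out)
print(f"max abs diff: {max_diff:.2e}, within tolerance: {ok}")
```

In practice the two lists would come from flattening the original model's output and the converted model's output for the same input. A suitable tolerance depends on the model and on how the outputs are used downstream.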