I have a mixed-precision ONNX model that relies on OnnxCast nodes here and there. It works fine for GPU inference, but running it on the CPU fails. Specifically, I am getting the following:
RuntimeError: "slow_conv2d_cpu" not implemented for 'Half'
This makes sense: some of the layers use half precision, and those kernels are not implemented for the CPU runtime. So the next logical step is to cast the model parameters to float32 with model.float(). However, that now yields a different error:
RuntimeError: Input type (c10::Half) and bias type (float) should be the same
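This second error can be reproduced in plain torch without onnx2torch; the explicit .to(torch.float16) below is a hand-written stand-in for whatever cast the converted graph still performs (an assumption about its effect, not the actual onnx2torch internals):

```python
import torch

# .float() converts the module's *parameters* to float32...
conv = torch.nn.Conv2d(1, 1, kernel_size=3).half().float()
print(conv.weight.dtype)  # torch.float32

# ...but a cast still sitting in the forward graph re-halves the input,
# and a float32 conv fed a float16 tensor raises the dtype mismatch above.
x = torch.randn(1, 1, 8, 8)
try:
    conv(x.to(torch.float16))
except RuntimeError as err:
    print(type(err).__name__)  # RuntimeError
```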
I didn't dig too deep into this, but my hypothesis is that the OnnxCast nodes are still converting to float16 even though all the model parameters are now float32. I tried modifying the FX graph to turn the OnnxCast nodes into no-ops, but I haven't been able to make that work yet. Maybe adding an argument to onnx2torch.convert would help with this scenario, for example something like cast=False, with cast=True as the default. Just a thought.
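For what it's worth, here is a minimal sketch of the FX-graph rewrite idea using only plain torch.fx. The toy module below stands in for the converted model, and the float16 cast is modeled as a .to() call; the real onnx2torch graph may represent OnnxCast differently (e.g. as call_module nodes), so the matching condition would need adjusting:

```python
import torch
import torch.fx


class Toy(torch.nn.Module):
    """Stand-in for a converted model whose graph still carries a
    float16 cast (playing the role of an OnnxCast node)."""

    def __init__(self):
        super().__init__()
        self.conv = torch.nn.Conv2d(1, 1, kernel_size=3)

    def forward(self, x):
        return self.conv(x.to(torch.float16))


def neutralize_half_casts(module: torch.nn.Module) -> torch.fx.GraphModule:
    """Trace the module and replace .to(torch.float16) calls with no-ops."""
    gm = torch.fx.symbolic_trace(module)
    for node in list(gm.graph.nodes):
        if (node.op == "call_method" and node.target == "to"
                and torch.float16 in node.args):
            # Reroute all consumers to the cast's input, then drop the cast.
            node.replace_all_uses_with(node.args[0])
            gm.graph.erase_node(node)
    gm.graph.lint()
    gm.recompile()
    return gm


model = Toy().float()                       # parameters are float32
patched = neutralize_half_casts(model)
out = patched(torch.randn(1, 1, 8, 8))
print(out.dtype)                            # torch.float32, runs on CPU
```

With the cast nodes removed, the forward pass stays in float32 end to end, which is effectively what a hypothetical cast=False flag would do.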
Thank you for this library, really great tool!