Conv3DTranspose with strides leads to wrong output dimensions if data format is channels_first #1714

Open
fthielke opened this issue Sep 14, 2021 · 2 comments
Assignees
Labels
bug An unexpected problem or unintended behavior contribution welcome Community contribution is welcomed

Comments

@fthielke
Contributor

Describe the bug
When converting a model containing Conv3DTranspose with strides > 1 and data_format='channels_first', the output of the resulting ONNX model has the wrong shape (it seems to be off by one).

Urgency
Not very high.

This can easily be worked around by using data format channels_last and adding transpose operations, which the optimizer removes anyway. Adding the workaround each time is annoying, though.
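For readers unfamiliar with the workaround, the two transposes are axis permutations between the NCDHW and NDHWC layouts. The sketch below only tracks shapes in plain Python; the helper name `permute` and the example shape are illustrative assumptions, and in a real model the same permutations would be passed to an actual transpose op (for example the `perm` argument of `tf.transpose`).

```python
# Axis permutations between the two rank-5 layouts.
TO_CHANNELS_LAST = (0, 2, 3, 4, 1)   # NCDHW -> NDHWC
TO_CHANNELS_FIRST = (0, 4, 1, 2, 3)  # NDHWC -> NCDHW


def permute(shape, perm):
    """Shape of a tensor after transposing with `perm` (hypothetical helper)."""
    return tuple(shape[i] for i in perm)


ncdhw = (1, 4, 8, 8, 8)  # assumed example: batch 1, 4 channels, 8x8x8 volume
ndhwc = permute(ncdhw, TO_CHANNELS_LAST)

print(ndhwc)                              # (1, 8, 8, 8, 4)
print(permute(ndhwc, TO_CHANNELS_FIRST))  # (1, 4, 8, 8, 8)
```

The channels_last Conv3DTranspose would sit between the two permutations, so the model still accepts and returns channels_first tensors.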

System information

  • OS Platform and Distribution: Windows 10
  • Tensorflow Version: 2.6.0
  • Python version: 3.9.6

To Reproduce
The attached Jupyter notebook test_convtranspose.ipynb.gz creates a simple model containing only a Conv3DTranspose with kernel size (3,3,3) and strides (2,2,2), using either data_format='channels_first' or 'channels_last'.

For the model using 'channels_last', the converted ONNX model correctly doubles its input shape. The other model, however, does not: e.g. for an input of size (8,8,8), the output size is (16,16,17).
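The numbers above can be checked with plain arithmetic. The following is a minimal sketch, assuming the model uses padding='same' (the report implies this by saying the shape should double, but does not state it) and no dilation or explicit output padding. The helper `transposed_conv_length` is a hypothetical name written for this illustration, not a tf2onnx or Keras function.

```python
def transposed_conv_length(input_length, kernel_size, stride, padding):
    """Output length of a transposed convolution along one axis.

    Assumes no dilation and no explicit output padding.
    """
    if padding == "same":
        return input_length * stride
    if padding == "valid":
        return input_length * stride + max(kernel_size - stride, 0)
    raise ValueError(f"unsupported padding: {padding!r}")


input_size = (8, 8, 8)
kernel, stride = 3, 2

same_shape = tuple(
    transposed_conv_length(n, kernel, stride, "same") for n in input_size
)
valid_shape = tuple(
    transposed_conv_length(n, kernel, stride, "valid") for n in input_size
)

print(same_shape)   # (16, 16, 16)
print(valid_shape)  # (17, 17, 17)
```

Note that 17 is exactly the uncropped ('valid') length for these parameters, which would be consistent with the last axis missing a final crop. That reading is an inference from the arithmetic, not something stated in the report.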

@TomWildenhain-Microsoft
Contributor

Hi @fthielke,

Our unit tests don't always cover the channels_first case, since I think TensorFlow won't run it on CPU and our CI doesn't have a GPU, so it is quite likely that we have a bug. It would be fantastic if you were able to help track down where in the code the issue occurs (I'd recommend stepping through the conversion with a Python debugger). Hopefully it is a simple fix.

@TomWildenhain-Microsoft TomWildenhain-Microsoft added the contribution welcome Community contribution is welcomed label Sep 27, 2021
fthielke added a commit to fthielke/tensorflow-onnx that referenced this issue Oct 19, 2021
 onnx#1714)

While shape calculations for the input correctly distinguished between channels_first and channels_last, shape calculations for the inputs of the final Slice and Pad nodes always assumed channels_last format.

Signed-off-by: fthielke <[email protected]>
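The failure mode described in the commit message can be illustrated with a toy model. This is a sketch under stated assumptions and not the tf2onnx implementation: `spatial_axes` and `crop_shape` are hypothetical helpers, the channel count of 4 is made up, and the uncropped length of 17 is an assumed intermediate value. The only outside fact relied on is that slice end indices clamp to the axis size, as they do in the ONNX Slice operator.

```python
def spatial_axes(data_format, rank=5):
    """Indices of the spatial axes of a rank-5 tensor (hypothetical helper)."""
    if data_format == "channels_first":   # N, C, D, H, W
        return list(range(2, rank))
    if data_format == "channels_last":    # N, D, H, W, C
        return list(range(1, rank - 1))
    raise ValueError(f"unknown data format: {data_format!r}")


def crop_shape(shape, axes, target):
    """Shape after slicing each axis in `axes` to at most the matching target length."""
    out = list(shape)
    for axis, length in zip(axes, target):
        out[axis] = min(out[axis], length)  # slice ends clamp to the axis size
    return tuple(out)


uncropped = (1, 4, 17, 17, 17)  # N, C, D, H, W before the final crop (assumed)
target = (16, 16, 16)

# Always assuming channels_last crops axes 1..3 and leaves W untouched.
wrong = crop_shape(uncropped, spatial_axes("channels_last"), target)
# Using the layout the tensor really has crops D, H and W.
right = crop_shape(uncropped, spatial_axes("channels_first"), target)

print(wrong)  # (1, 4, 16, 16, 17)
print(right)  # (1, 4, 16, 16, 16)
```

In this toy model the channel axis absorbs one of the three crops without visible effect because it is already shorter than the target, so only the last spatial axis ends up one element too long.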
@fthielke
Contributor Author

fthielke commented Oct 19, 2021

The debugger sadly was not too helpful for finding the bug, but I could easily spot it by comparing the resulting models with and without the workaround in Netron.

The fix is indeed quite simple; I've opened a PR: #1748

fthielke added a commit to fthielke/tensorflow-onnx that referenced this issue Nov 22, 2021
@fatcat-z fatcat-z added the bug An unexpected problem or unintended behavior label Mar 16, 2022