ONNX Tutorial: filter.dim32(i + 2) == kernel_[i] #12
Comments
@Pavel-Akapian thank you for trying out the tutorial. To be able to help further, I need some information on how you are running the model. Which super-resolution model version are you using? The tutorial highlights 1) a small model that is also available in the PyTorch examples and 2) the SRResNet model. What image processing did you use, if any? What is the dimension of the input image to the model? The error seems to indicate that the input data is not what the model is expecting. Have you been able to successfully run the tutorial up to the mobile execution part using pdb? That would be the first step to get right. |
@prigoyal thank you for the quick reply.
I don't quite understand what you mean by two versions (I didn't notice SRResNet).
with no errors, and 'cat_superres.jpg' is successfully created on the server machine.
|
@prigoyal My colleagues and I compared the pb files and found how to resolve this problem.
We need to move the last external_input:"9" before external_input:"1".
In fact, the image is loaded into the first external_input. |
OK, the fact that the PyTorch exporter places actual inputs at the end of the inputs list (rather than the beginning) is a known wart. onnx-caffe2 is able to handle this if you don't use the protobuf manually but we plan on fixing this. EDIT: This doesn't seem to be the actual problem here. |
@Pavel-Akapian the issue is actually quite simple. There is no need to modify the pb here. This version of the super-resolution model requires an input image of dim 1x1x224x224, and the reason for that is mentioned in the tutorial. The error you were getting also indicates that the filter dim is not right. Can you please try it out by passing the correct input without modifying the pb manually? |
@prigoyal this also happens with 1x1x224x224. |
@Pavel-Akapian it's actually slightly weird that you were able to execute the nets until
as you mentioned, and that didn't require any manual tampering with the pb, but executing on iOS does. Can you create a simple repro of the error so we can look into it further? Tampering with the pb is not the right solution; the root cause should be figured out properly. I am not able to repro this issue with the tutorial yet. Also, were you able to deploy on an Android device following the adb instructions in the tutorial? |
Here are the pb files generated by following the tutorial exactly; they run on the server but not on iOS. (The '.txt' extension is fake so that GitHub accepts the upload.) |
Hey @Pavel-Akapian how are you running the network? There are a couple of ways to do it, but I am guessing you are using the predictor API? This requires external_input[0] (the first one) to be the input data. As you correctly determined, this is not what the PyTorch exporter created. To verify this, you can try running the network with workspace.RunNet(predict_net) instead, after populating the blob "9". |
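The positional binding described in the comment above can be illustrated with a toy model in plain Python, with no Caffe2 dependency. This is only a sketch: the function name `bind_positionally` and the blob name "2" are made up for illustration, and the real logic lives in predictor.cc.

```python
def bind_positionally(external_inputs, tensors):
    """Pair the i-th tensor with the i-th external_input name.

    This mimics the behavior described in the thread: the predictor
    assumes external_input[0] is the input data.
    """
    return {name: tensor for name, tensor in zip(external_inputs, tensors)}


# Ordering as reported in this thread: other blobs first, image blob "9" last.
# Blob "2" is a hypothetical extra weight blob.
exported = ["1", "2", "9"]
print(bind_positionally(exported, ["image"]))   # {'1': 'image'}

# Ordering after moving "9" to the front, as done by hand in the pb file.
reordered = ["9", "1", "2"]
print(bind_positionally(reordered, ["image"]))  # {'9': 'image'}
```

With the exported ordering the image overwrites blob "1", which would explain a shape-related failure inside the convolution rather than an error about a missing input.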
@bwasti we should update predictor to be more flexible, similar to this: https://github.com/onnx/onnx-caffe2/blob/master/onnx_caffe2/backend.py#L318-L321 |
@jerryzh168 what do you mean by more flexible? One thing that might be nice is a TensorMap that takes string->Tensor for This is the culprit code btw: https://github.com/caffe2/caffe2/blob/master/caffe2/core/predictor.cc#L48 |
@bwasti Yeah, by flexible I mean the predictor shouldn't depend on the ordering of external_input. We can figure out which blobs are missing by calling workspace.HasBlob. An extension that takes a string->Tensor map as input would be nice to have too. |
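A minimal sketch of the order-independent approach suggested above, again in plain Python. The names `find_missing_inputs` and `bind_inputs` are hypothetical, and `has_blob` is a stand-in for a check such as workspace.HasBlob; a plain set plays the role of the workspace here.

```python
def find_missing_inputs(external_inputs, has_blob):
    """Return the external_input names that are not yet initialized.

    Blobs filled by the init net (weights, biases) already exist, so
    whatever is still missing must be the real data inputs.
    """
    return [name for name in external_inputs if not has_blob(name)]


def bind_inputs(external_inputs, has_blob, tensors):
    """Bind user tensors to the missing blobs, regardless of list order."""
    missing = find_missing_inputs(external_inputs, has_blob)
    if len(missing) != len(tensors):
        raise ValueError(
            "expected %d input(s), got %d" % (len(missing), len(tensors))
        )
    return dict(zip(missing, tensors))


# Stand-in for the blobs created by running the init net; "2" is hypothetical.
initialized = {"1", "2"}
bound = bind_inputs(["1", "2", "9"], initialized.__contains__, ["image"])
print(bound)  # {'9': 'image'}
```

With this scheme the image reaches blob "9" even though it is last in the list, so the pb file would not need to be edited by hand.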
Hello!
We're trying to replicate the PyTorch ONNX Super-Resolution Tutorial. Conversion seems to work OK. But when deploying the model to iOS, an error occurs (on predictor->run):
We can run original caffe2 models on the device. When we compared manually written caffe2 models with the model produced by the conversion tool, we noticed that the conversion tool adds the following (maybe this could help fix the issue?):
This problem also reproduces on simpler examples.