
RuntimeError: The size of tensor a (705) must match the size of tensor b (673) at non-singleton dimension 1 #1

Open
whydna opened this issue Jun 22, 2023 · 2 comments

Comments


whydna commented Jun 22, 2023

Getting the following error:

Running predict()...
Traceback (most recent call last):
  File "/root/.pyenv/versions/3.9.17/lib/python3.9/site-packages/cog/server/worker.py", line 222, in _predict
    for r in result:
  File "/root/.pyenv/versions/3.9.17/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 35, in generator_context
    response = gen.send(None)
  File "predict.py", line 338, in predict
    pipe, kwargs = self.build_pipe(
  File "predict.py", line 188, in build_pipe
    img = getattr(self, "{}_preprocess".format(name))(img)
  File "predict.py", line 133, in depth_preprocess
    return self.midas(img)
  File "/src/midas_hack.py", line 52, in __call__
    depth = self.model(image_depth)[0]
  File "/root/.pyenv/versions/3.9.17/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/root/.pyenv/versions/3.9.17/lib/python3.9/site-packages/controlnet_aux/midas/api.py", line 167, in forward
    prediction = self.model(x)
  File "/root/.pyenv/versions/3.9.17/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/root/.pyenv/versions/3.9.17/lib/python3.9/site-packages/controlnet_aux/midas/midas/dpt_depth.py", line 108, in forward
    return super().forward(x).squeeze(dim=1)
  File "/root/.pyenv/versions/3.9.17/lib/python3.9/site-packages/controlnet_aux/midas/midas/dpt_depth.py", line 71, in forward
    layer_1, layer_2, layer_3, layer_4 = forward_vit(self.pretrained, x)
  File "/root/.pyenv/versions/3.9.17/lib/python3.9/site-packages/controlnet_aux/midas/midas/vit.py", line 59, in forward_vit
    glob = pretrained.model.forward_flex(x)
  File "/root/.pyenv/versions/3.9.17/lib/python3.9/site-packages/controlnet_aux/midas/midas/vit.py", line 145, in forward_flex
    x = x + pos_embed
RuntimeError: The size of tensor a (705) must match the size of tensor b (673) at non-singleton dimension 1

when running with these params:

    prompt
    depth_image          url_to_image
    hough_image          url_to_image
    num_outputs          4
    guidance_scale       9
    negative_prompt
    image_resolution     512
    num_inference_steps  20
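For context on why the sizes disagree: the failure is in the ViT positional-embedding addition (`x = x + pos_embed` in `vit.py`), where the number of patch tokens produced from the input image does not match the number of positions the embedding was built for. The sketch below is an editor's illustration, not from the thread; it assumes a 16×16 patch size plus one class token, and the specific resolutions are only assumptions chosen to reproduce the two token counts from the error message.

```python
# Hypothetical illustration: a ViT with 16x16 patches emits one token per
# patch plus a single class token. If the input image and the positional
# embedding were prepared for different resolutions, the token counts
# differ and the elementwise addition `x + pos_embed` fails.

PATCH = 16  # assumed patch size

def vit_token_count(h: int, w: int, patch: int = PATCH) -> int:
    """Tokens a ViT produces for an h x w input: 1 class token + patches."""
    return 1 + (h // patch) * (w // patch)

# An assumed 512x352 input yields 705 tokens...
print(vit_token_count(512, 352))  # 705
# ...while an embedding laid out for, say, 384x448 holds 673 positions,
print(vit_token_count(384, 448))  # 673
# matching the "705 vs 673 at dimension 1" mismatch in the traceback.
```

This is why snapping the input dimensions to the grid the model expects (see the fix later in the thread) resolves the error.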

whydna commented Jun 22, 2023

I narrowed it down to the depth_image property.

This is the image I'm using - it fails in the demo as well:

[attached image: notwork]


whydna commented Jun 26, 2023

@anotherjesse

Here is a fix (tested on my own fork): the depth preprocessor requires image dimensions divisible by 64 px.

See: huggingface/controlnet_aux#2

Any chance we can get this deployed to Replicate?

    def depth_preprocess(self, img):
        # MiDaS expects input dimensions divisible by 64, so round each
        # side to the nearest multiple of 64 before depth estimation.
        # (Requires `import numpy as np` at the top of predict.py.)
        W, H = img.size
        W_new = int(np.round(W / 64) * 64)
        H_new = int(np.round(H / 64) * 64)
        img = img.resize((W_new, H_new))
        return self.midas(img)
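As a standalone sketch of the rounding step (editor's illustration; `snap64` is a hypothetical helper name, and the 700×467 example size is an assumption):

```python
import numpy as np
from PIL import Image

def snap64(size):
    """Round a (W, H) pair to the nearest multiples of 64."""
    w, h = size
    return int(np.round(w / 64) * 64), int(np.round(h / 64) * 64)

# e.g. a 700x467 image would be resized to 704x448 before depth estimation
img = Image.new("RGB", (700, 467))
img = img.resize(snap64(img.size))
print(img.size)  # (704, 448)
```

Note that sides already divisible by 64 pass through unchanged, so the fix is a no-op for well-sized inputs.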
