SlidingWindowInferer runtime increase if sw_batch_size is too big #6628
Comments
Hi @matt3o, I can't reproduce the issue. Could you please share more information, such as which model you used and whether you ran inference on GPU or CPU?
Hey @KumoLiu, thanks for the quick response again!
Btw I am not sure if 30 seconds for (256,256,256) on sw_batch_size 1 is an expected runtime.
Hi @matt3o, I cannot even run with …
@KumoLiu, then we will have to debug this as soon as I publish my code. I am using exactly the network config you just mentioned. I would guess your problem is related to #6626; in theory the SlidingWindowInferer on DynUNet can run fine on 24 GB, and I got it to run on smaller crops even on 11 GB.
Hi @matt3o, I investigated a little bit more, and I found that L252–L284 (Lines 252 to 284 at commit 2cbed6c) take more time as the batch size increases.
But I didn't see the time-increase issue itself. Could you please try this simple demo locally and see if you get results similar to mine?
Thanks!
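For reference, a minimal version of such a timing demo might look like the sketch below; the UNet configuration, input size, and batch sizes are my assumptions, since the original snippet is not reproduced here.

```python
import time

import torch
from monai.inferers import SlidingWindowInferer
from monai.networks.nets import UNet

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# a small stand-in network just to exercise the inferer (architecture is an assumption)
model = UNet(
    spatial_dims=3,
    in_channels=3,
    out_channels=2,
    channels=(16, 32, 64, 128),
    strides=(2, 2, 2),
).to(device).eval()

inputs = torch.rand(1, 3, 256, 256, 256, device=device)

for sw_batch_size in (1, 200, 500, 1000, 2000):
    inferer = SlidingWindowInferer(
        roi_size=(32, 32, 32), sw_batch_size=sw_batch_size, mode="gaussian"
    )
    if device.type == "cuda":
        torch.cuda.synchronize()  # make the timing reflect actual GPU work
    start = time.perf_counter()
    with torch.no_grad():
        inferer(inputs, model)
    if device.type == "cuda":
        torch.cuda.synchronize()
    print(f"sw_batch_size={sw_batch_size}: {time.perf_counter() - start:.2f}s")
```

With a plain UNet the runtime is expected to flatten out rather than regress at large batch sizes, which matches the next comment.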
@KumoLiu I get similar results when using the UNet, but not when using the DynUNet. I will append the modified code and the runtime results. I also added amp, as in the real code, and ran the examples on the 50 GB GPU server.
Describe the bug
I am currently using the SlidingWindowInferer for some modified DeepEdit code. I discovered that for small sw_roi_size values like (32,32,32) I have to set a higher sw_batch_size to make it run faster; see the data below.
However, when the sw_batch_size becomes too big, performance takes a dramatic hit, which does not make any sense to me. The initial input volume shape is (1,3,344,344,284) and the inferer is created with
eval_inferer = SlidingWindowInferer(roi_size=args.sw_roi_size, sw_batch_size=args.sw_batch_size, mode="gaussian")
Results of my test runs:
138 seconds for (32,32,32) on sw_batch_size 1
13.38 seconds for (32,32,32) on sw_batch_size 200 (12 iterations)
11 seconds for (32,32,32) on sw_batch_size 500 (8 iterations)
11 seconds for (32,32,32) on sw_batch_size 1000 (3 iterations)
93 seconds for (32,32,32) on sw_batch_size 2000 (2 iterations)
191 seconds for (32,32,32) on sw_batch_size 2400 (1 iteration)
I tried to debug this, but I am not sure why this dramatic increase in runtime happens. Of course, I can always compute the best sw_batch_size beforehand (roughly 1/4 of the actual number of windows, judging from the numbers above, although that requires knowing the size of the largest volume in advance; see the sketch below), but an actual fix would be nicer. Or maybe it is an issue in my own code that I am not aware of, which would also be good to know.
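As a sketch of that workaround, the number of windows (and hence a sensible cap for sw_batch_size) can be computed up front with MONAI's dense_patch_slices; the scan-interval formula below mirrors what sliding_window_inference uses for a fractional overlap, and the default overlap of 0.25 is assumed.

```python
from monai.data.utils import dense_patch_slices

image_size = (344, 344, 284)  # spatial shape of the input volume
roi_size = (32, 32, 32)
overlap = 0.25                # SlidingWindowInferer default

# step between window origins: the full roi if it spans the image, else roi * (1 - overlap)
scan_interval = tuple(
    r if r == s else max(int(r * (1.0 - overlap)), 1)
    for r, s in zip(roi_size, image_size)
)
num_windows = len(dense_patch_slices(image_size, roi_size, scan_interval))
print(num_windows)  # 14 * 14 * 12 = 2352 here, consistent with 12 iterations at batch size 200

# cap the requested batch size so it never exceeds the real window count
requested = 2000  # hypothetical command-line value
sw_batch_size = min(requested, num_windows)
```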
To Reproduce
Use the SlidingWindowInferer and set sw_batch_size higher than the actual number of sliding windows; performance will then deteriorate heavily. A self-contained sketch follows below.
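A repro sketch under those conditions might look like this; the DynUNet configuration is a guess at a DeepEdit-style setup and the amp wrapper mirrors the comment above, so neither is confirmed as the exact setup from this issue.

```python
import time

import torch
from monai.inferers import SlidingWindowInferer
from monai.networks.nets import DynUNet

device = torch.device("cuda")  # needs a large GPU; the issue used a ~50 GB card

# DeepEdit-style DynUNet (kernel sizes and strides are assumptions)
model = DynUNet(
    spatial_dims=3,
    in_channels=3,
    out_channels=2,
    kernel_size=[3, 3, 3, 3, 3, 3],
    strides=[1, 2, 2, 2, 2, [2, 2, 1]],
    upsample_kernel_size=[2, 2, 2, 2, [2, 2, 1]],
).to(device).eval()

inputs = torch.rand(1, 3, 344, 344, 284, device=device)

for sw_batch_size in (200, 500, 1000, 2000, 2400):
    inferer = SlidingWindowInferer(
        roi_size=(32, 32, 32), sw_batch_size=sw_batch_size, mode="gaussian"
    )
    torch.cuda.synchronize()
    start = time.perf_counter()
    with torch.no_grad(), torch.cuda.amp.autocast():  # amp, as in the issue's real code
        inferer(inputs, model)
    torch.cuda.synchronize()
    print(f"sw_batch_size={sw_batch_size}: {time.perf_counter() - start:.1f}s")
```

If the report holds, timings should improve up to roughly sw_batch_size 1000 and then degrade sharply once the batch size exceeds the number of windows.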
Environment
Tried it on MONAI 1.1 and also on the nightly build; no change.