Tiling does not work on Windows11 #77

mflamand · 2024-11-08T19:40:38Z

Hi,

First, let me say that I have been quite happy with the performance of deconwolf. I have tested it on sets of RNA FISH images with great results. I think it works better (and faster) than the blind deconvolution algorithms I was using before. Congrats!

I believe I may have found a bug. When using the latest release (0.4.3) on Windows 11, I am unable to use the tiling option to process images that are too large for my GPU memory. For example, if I launch a run (a mock run with 3 iterations and --verbose 2), I get the following:

dw --iter 3 --tilesize 1024 --prefix tiling --gpu --verbose 2 .\CamK2a_AAV15_06_CY3.tif .\PSF.tif
outFile: .\tiling_CamK2a_AAV15_06_CY3.tif, outFolder: .\
Settings:
image: .\CamK2a_AAV15_06_CY3.tif
psf: .\PSF.tif
output: .\tiling_CamK2a_AAV15_06_CY3.tif
log file: .\tiling_CamK2a_AAV15_06_CY3.tif.log.txt
nIter: 3
nThreads for FFT: 16
nThreads for OMP: 16
verbosity: 2
background level: auto
method: Scaled Heavy Ball + OpenCL (SHBCL2)
metric: Idiv
Stopping after 3 iterations
overwrite: NO
tiling, maxSize: 1024
tiling, padding: 20
XY crop factor: 0.001000
Offset: 5.000000
Output Format: 16 bit integer
Scaling: Automatic
Border Quality: 2 Minimal boundary artifacts
FFT lookahead: 0
FFTW3 plan: FFTW_MEASURE
Initial guess: Flat average

deconwolf: '0.4.3'

BUILD_DATE: 'Jun 22 2024'
TIFF Backend: 'LIBTIFF, Version 4.6.0
Copyright (c) 1988-1996 Sam Leffler
Copyright (c) 1991-1996 Silicon Graphics, Inc.'
OpenMP: YES
OpenCL: YES
VkFFT: YES
sizeof(int) = 4
sizeof(float) = 4
sizeof(double) = 8
sizeof(size_t) = 8

Image dimensions: 2048 x 2048 x 39

Reading .\PSF.tif
PSF Z-crop [181 x 181 x 265] -> [181 x 181 x 77]
PSF XY-crop [181 x 181 x 77] -> [161 x 161 x 77]
Output: .\tiling_CamK2a_AAV15_06_CY3.tif(.log.txt)
-> Divided the [2048 x 2048 x 39] image into 4 tiles
Initializing .\tiling_CamK2a_AAV15_06_CY3.tif.raw to 0
Dumping .\CamK2a_AAV15_06_CY3.tif to .\CamK2a_AAV15_06_CY3.tif.raw (for quicker io)

-> Processing tile 1 / 4
PSF X-crop: Not cropping
Deconvolving using shbcl2 (using inplace)
Setting the background level to 0.010000
image: [1044x1044x39], psf: [161x161x77], job: [1204x1204x115]
Found 2 CL platforms
Found 1 CL devices
Will use device 0 (first = 0)
CL device #0
CL_DEVICE_TYPE=CL_DEVICE_TYPE_GPU
CL_DEVICE_GLOBAL_MEM_SIZE = 17175150592 (17175 MiB)
CL_DEVICE_NAME = NVIDIA RTX 2000 Ada Generation
CL_DEVICE_VENDOR = NVIDIA Corporation
CL_DRIVER_VERSION = 553.24
CL_DEVICE_EXTENSIONS = cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_fp64 cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll cl_nv_d3d10_sharing cl_khr_d3d10_sharing cl_nv_d3d11_sharing cl_nv_copy_opts cl_khr_gl_event cl_nv_create_buffer cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_nv_kernel_attribute cl_khr_device_uuid cl_khr_pci_bus_info cl_khr_external_semaphore cl_khr_external_memory cl_khr_external_semaphore_win32 cl_khr_external_memory_win32
CL_DEVICE_MAX_WORK_ITEM_DIMENSIONS=4099
Using VkFFT version 10304
Preparing for convolutions of size 1204 x 1204 x 115
Warning: Will write the VkFFT configuration in the current folder.
Reason: Can not determine a suitable folder under Windows.
vkFFT cache file: VkFFT_kernelCache_1204x1204x115.binary
Initializing VkFFT for size 1204 x 1204 x 115
fimcl_fft_inplace
VkFFTAppend (for in-place forward transform)
.Creating weight map for boundary handling
fimcl_fft_inplace
VkFFTAppend (for in-place forward transform)
fimcl_convolve
fimcl_copy
fimcl_ifft_inplace
Downloading real data 1204 x 1204 x 115 (166705840 floats)
Start guess: FLAT
fimcl_copy
Iterating .fimcl_copy
fimcl_fft_inplace
VkFFTAppend (for in-place forward transform)
fimcl_convolve
fimcl_copy
fimcl_ifft_inplace
...fimcl_fft_inplace
VkFFTAppend (for in-place forward transform)
fimcl_convolve
fimcl_copy
fimcl_ifft_inplace
Iteration 1/ 3, Idiv=0.000e+00 .fimcl_copy
fimcl_fft_inplace
VkFFTAppend (for in-place forward transform)
fimcl_convolve
fimcl_copy
fimcl_ifft_inplace
...fimcl_fft_inplace
VkFFTAppend (for in-place forward transform)
fimcl_convolve
fimcl_copy
fimcl_ifft_inplace
Iteration 2/ 3, Idiv=0.000e+00 .fimcl_copy
fimcl_fft_inplace
VkFFTAppend (for in-place forward transform)
fimcl_convolve
fimcl_copy
fimcl_ifft_inplace
...fimcl_fft_inplace
VkFFTAppend (for in-place forward transform)
fimcl_convolve
fimcl_copy
fimcl_ifft_inplace
Iteration 3/ 3, Idiv=0.000e+00
Downloading real data 1204 x 1204 x 115 (166705840 floats)
Closing the OpenCL environment

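As a sanity check on the sizes in the log above, here is a minimal sketch of the arithmetic. It assumes that the job size is image + PSF - 1 along each axis and that a tile is the tile size plus the padding on one side. Both rules are inferred from the numbers printed in the log, not taken from the deconwolf source, and the helper name `job_shape` is hypothetical.

```python
from math import prod


def job_shape(image, psf):
    """Size of a full linear convolution: image + psf - 1 per axis.

    Assumption: this is the rule behind the "job" size in the log.
    """
    return tuple(i + p - 1 for i, p in zip(image, psf))


# Values copied from the log: --tilesize 1024, padding 20, 39 planes.
tile = (1024 + 20, 1024 + 20, 39)
# PSF size after the Z-crop and XY-crop reported in the log.
psf = (161, 161, 77)

job = job_shape(tile, psf)
n_floats = prod(job)

print("job:", job)
print("floats:", n_floats)
# 4 bytes per value, matching sizeof(float) = 4 in the log.
print("one buffer: %.1f MB" % (n_floats * 4 / 1e6))
```

The result agrees with the "job: [1204x1204x115]" and "166705840 floats" lines, so the tile geometry itself looks consistent and the problem is more likely in the data that goes into the tile.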
The same is happening when processing using the CPU:

dw --iter 3 --tilesize 1024 --prefix tiling_cpu --verbose 2 .\CamK2a_AAV15_06_CY3.tif .\PSF.tif
outFile: .\tiling_cpu_CamK2a_AAV15_06_CY3.tif, outFolder: .\
Settings:
image: .\CamK2a_AAV15_06_CY3.tif
psf: .\PSF.tif
output: .\tiling_cpu_CamK2a_AAV15_06_CY3.tif
log file: .\tiling_cpu_CamK2a_AAV15_06_CY3.tif.log.txt
nIter: 3
nThreads for FFT: 16
nThreads for OMP: 16
verbosity: 2
background level: auto
method: Scaled Heavy Ball (SHB)
metric: Idiv
Stopping after 3 iterations
overwrite: NO
tiling, maxSize: 1024
tiling, padding: 20
XY crop factor: 0.001000
Offset: 5.000000
Output Format: 16 bit integer
Scaling: Automatic
Border Quality: 2 Minimal boundary artifacts
FFT lookahead: 0
FFTW3 plan: FFTW_MEASURE
Initial guess: Flat average

deconwolf: '0.4.3'
BUILD_DATE: 'Jun 22 2024'
TIFF Backend: 'LIBTIFF, Version 4.6.0
Copyright (c) 1988-1996 Sam Leffler
Copyright (c) 1991-1996 Silicon Graphics, Inc.'
OpenMP: YES
OpenCL: YES
VkFFT: YES
sizeof(int) = 4
sizeof(float) = 4
sizeof(double) = 8
sizeof(size_t) = 8

Image dimensions: 2048 x 2048 x 39
Reading .\PSF.tif
PSF Z-crop [181 x 181 x 265] -> [181 x 181 x 77]
PSF XY-crop [181 x 181 x 77] -> [161 x 161 x 77]
Output: .\tiling_cpu_CamK2a_AAV15_06_CY3.tif(.log.txt)
-> Divided the [2048 x 2048 x 39] image into 4 tiles
Initializing .\tiling_cpu_CamK2a_AAV15_06_CY3.tif.raw to 0
Dumping .\CamK2a_AAV15_06_CY3.tif to .\CamK2a_AAV15_06_CY3.tif.raw (for quicker io)

-> Processing tile 1 / 4
PSF X-crop: Not cropping
Deconvolving
Setting the background level to 0.010000
image: [1044x1044x39], psf: [161x161x77], job: [1204x1204x115]
Estimated peak memory usage: 5.8 GB
creating fftw3 plans ...
c2r plan ...
c2r inplace plan ...
r2c plan ...
r2c inplace plan ...
Exported fftw wisdom to fftw_wisdom_float_inplace_threads_16.dat
Iteration 3/ 3, Idiv=0.000e+00

It seems that the program always exits after the first tile is processed. The Idiv value stays at 0.000e+00 (no background signal?), so my guess is that it fails to read in the image properly.
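One way to test this guess would be to look at the .raw dump that the log mentions and check whether it contains anything other than zeros. The sketch below is only a diagnostic idea: the helper name `raw_dump_summary` is hypothetical, it assumes the .raw file is still on disk after dw exits, and the expected size assumes 4-byte float samples, which I have not confirmed.

```python
import os


def raw_dump_summary(path, chunk_size=1 << 24):
    """Return (size_in_bytes, has_nonzero_byte) for a binary file."""
    size = os.path.getsize(path)
    has_nonzero = False
    # Open in binary mode; text mode on Windows would alter the bytes.
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            # A chunk is all zeros only if every byte in it is 0.
            if chunk.count(0) != len(chunk):
                has_nonzero = True
                break
    return size, has_nonzero


# Expected size of the dumped image IF it is stored as 4-byte floats
# (an assumption, not something the log states).
expected_bytes = 2048 * 2048 * 39 * 4

# Example call, using the file name from the log:
# size, has_nonzero = raw_dump_summary(r".\CamK2a_AAV15_06_CY3.tif.raw")
# print(size, expected_bytes, has_nonzero)
```

If the file is much smaller than expected, or contains only zeros, that would point to the dump step rather than to the deconvolution itself.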

I get the same issue on 2 systems (#1: Intel 14900K, RTX 2000 Ada 16 GB, 64 GB RAM; #2: AMD 5900X, RTX 3080 10 GB, 64 GB RAM). I can use tiling on both systems with Ubuntu 24.04 (in CPU or GPU mode), but not with Windows 11. Tiling also works under WSL-Ubuntu and macOS 15.1 (Apple M3 Pro, 18 GB) in CPU mode. GPU mode on macOS does not work for me (it hangs at "fimcl_convolve"), but I wasn't looking to use GPU mode on my MacBook anyway.

By the way, related to issue #75, I am able to use GPU mode under Windows 11 without any problem when the image is cropped.

I have no problem using dw under Ubuntu for now. For convenience (the workstation also runs Windows-exclusive software), it would be great if the issue could be looked at and fixed in the future. I am happy to do some testing if needed.

Best,
Mathieu

elgw · 2024-11-11T15:11:58Z

Hi!

I'm glad that you find the software useful :)

Thank you for taking the time to report these issues and findings.

At the moment I can't say when I will have time to look at the Windows-specific issues, but they won't be forgotten.

Unfortunately, there is less chance that I will get deconwolf to run smoothly on macOS in the near future (I have no access to the hardware, and OpenCL is not the best backend there). Possibly I'll revisit that when/if deconwolf switches to, or adds, a Vulkan backend for the GPU computations.

Cheers,
Erik

elgw added the win11 Windows 11 specific label Nov 11, 2024

elgw self-assigned this Nov 11, 2024
