
Entire chunk that can fit in GPU memory will be split into two blocks if padding method present #453

Open
yousefmoazzam opened this issue Sep 19, 2024 · 2 comments
Labels: bug (Something isn't working), framework (Data-handling framework related)

@yousefmoazzam (Collaborator)

Firstly, it's worth pointing out that being able to fit an entire chunk into GPU memory is not that common, so this issue is not very relevant to processing big data; it was discovered when running on the small test data, which easily fits into GPU memory.

On commit 930e0da in #446, running tomo_standard.nxs with the pipeline_gpu1.yaml pipeline, one can see that the small test data gets split into two blocks (whereas if padding is switched off for remove_outlier in the associated methods database YAML file, there is only one block):

(base) root@492a2a3538d1:/httomo# python -m httomo run tests/test_data/tomo_standard.nxs tests/samples/pipeline_template_examples/pipeline_gpu1.yaml output_dir/
Pipeline has been separated into 2 sections
See the full log file at: output_dir/19-09-2024_09_44_59_output/user.log
Running loader (pattern=projection): standard_tomo...
    Finished loader: standard_tomo (httomo) Took 37.68ms
Section 0 (pattern=projection) with the following methods:
    data_reducer (httomolib)
    find_center_vo (httomolibgpu)
    remove_outlier (httomolibgpu)
    normalize (httomolibgpu)
     0%|          | 0/2 [00:00<?, ?block/s]
    50%|#####     | 1/2 [00:00<00:00,  1.29block/s]
    --->The center of rotation is 79.5
    Finished processing last block
Section 1 (pattern=sinogram) with the following methods:
    remove_stripe_based_sorting (httomolibgpu)
    FBP (httomolibgpu)
    save_intermediate_data (httomo)
    save_to_images (httomolib)
     0%|          | 0/1 [00:00<?, ?block/s]
    Finished processing last block
Pipeline finished. Took 1.937s

This is due to how max slices is calculated. The absolute maximum that max slices can be (at the start of the function, before potentially being whittled down by the different methods in a section) is based on the chunk_shape of the data source:

# the ceiling for max slices is the unpadded chunk length in the slicing dim
data_shape = self.source.chunk_shape
max_slices = data_shape[slicing_dim]
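
Concretely, for the projection-pattern section on the test data this evaluates to 180. A minimal illustration (the detector dimensions below are placeholders; only the 180 projections matter here):

slicing_dim = 0                       # projection pattern slices along the angle dim
data_shape = (180, 128, 160)          # placeholder chunk shape for tomo_standard.nxs
max_slices = data_shape[slicing_dim]  # 180: the unpadded ceiling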

The chunk_shape property on any implementor of DataSetSource does not include padding, so the absolute max slices that the calculation starts with is the unpadded length of the chunk shape's slicing dim. This means that even if the GPU could fit:

  • all slices
  • plus, the necessary padding slices

the determine_max_slices() method can only report that max slices is "all slices without padding slices".

In the context of the test data, the calculated max slices is 180 (all projections). But executing the first section needs 2 padding slices added, so there are 182 slices to process. Because max slices is only 180 and not 182 (even though the GPU can fit all 182 slices in memory), the chunk is forced to be split into 2 blocks.
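
To make the arithmetic concrete, here is a minimal sketch; the slice counts come from the run above, while the per-slice byte cost and GPU capacity are made-up values purely for illustration:

chunk_slices = 180                     # all projections in the unpadded chunk
padding = (1, 1)                       # the section's 2 padding slices (assumed 1 each side)
slice_nbytes = 1_000_000               # hypothetical bytes per slice
gpu_capacity = 182 * slice_nbytes      # assume the GPU can hold all padded slices

max_slices = chunk_slices              # current ceiling: padding is ignored (180)
needed = chunk_slices + sum(padding)   # slices the section actually processes (182)

# 182 > 180 forces the split into 2 blocks, even though all 182
# padded slices would have fit in GPU memory:
assert needed > max_slices
assert needed * slice_nbytes <= gpu_capacity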

In order to fix this, I think the determine_max_slices() logic needs to account for the required padding slices, to handle the case where max slices plus the padding slices could also fit into GPU memory.
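
As a rough sketch of what that could look like (illustrative only; section_padding stands for whatever per-section before/after padding information the framework exposes, and is not necessarily httomo's actual name for it):

# start the max slices ceiling from the padded chunk length, so that
# "all core slices plus their padding" is a candidate value
data_shape = self.source.chunk_shape
padding_before, padding_after = section_padding   # assumed to be available here
max_slices = data_shape[slicing_dim] + padding_before + padding_after

# the per-method memory estimation that whittles max_slices down would
# also need to count the padding slices occupied by each block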

@yousefmoazzam added the bug and framework labels on Sep 19, 2024