Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Integrate padding into runner #446

Merged
merged 8 commits into from
Sep 19, 2024
Merged

Conversation

yousefmoazzam
Copy link
Collaborator

@yousefmoazzam yousefmoazzam commented Sep 12, 2024

This enables the task runner to load and process padded blocks.

Checklist

  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • I have made corresponding changes to the documentation

@yousefmoazzam
Copy link
Collaborator Author

@dkazanc Only one method, remove_outlier, that can make use of the padding feature has been enabled so far, purely as a means of testing. Feel free to add support for any other methods as a means of testing if the padding feature works or not 🙂

@yousefmoazzam yousefmoazzam marked this pull request as ready for review September 12, 2024 16:06
@dkazanc
Copy link
Collaborator

dkazanc commented Sep 13, 2024

Thanks Yousef, can you please check if you get an assertion error by running the following pipeline. Note the saving is enabled after the median filter, if the saving is disabled the pipeline works. I'll add the median enabled padding bit in a second. cheers

- method: standard_tomo
  module_path: httomo.data.hdf.loaders
  parameters:
    name: tomo
    data_path: entry1/tomo_entry/data/data
    image_key_path: entry1/tomo_entry/instrument/detector/image_key
    rotation_angles:
      data_path: /entry1/tomo_entry/data/rotation_angle
    preview:
      detector_y:
        start: 900
        stop: 1120
- method: find_center_vo
  module_path: httomolibgpu.recon.rotation
  parameters:
    ind: mid
    smin: -50
    smax: 50
    srad: 6
    step: 0.25
    ratio: 0.5
    drop: 20
  id: centering
  side_outputs:
    cor: centre_of_rotation
- method: normalize
  module_path: httomolibgpu.prep.normalize
  parameters:
    cutoff: 10.0
    minus_log: true
    nonnegativity: false
    remove_nans: true
- method: median_filter
  module_path: httomolibgpu.misc.corr
  parameters:
    kernel_size: 5
    dif: 0.0
  save_result: true
- method: FBP
  module_path: httomolibgpu.recon.algorithm
  parameters:
    center: ${{centering.side_outputs.centre_of_rotation}}
    filter_freq_cutoff: 1.1
    recon_size: null
    recon_mask_radius: null
  save_result: true

@dkazanc
Copy link
Collaborator

dkazanc commented Sep 13, 2024

to the error above, it actually fails on the saving the 2nd intermediate data from FBP, not the one that is after median I believe.

  File "/home/algol/Documents/DEV/httomo/httomo/methods.py", line 69, in save_intermediate_data
    _save_dataset_data(dataset, data, global_shape, global_index)
  File "/home/algol/Documents/DEV/httomo/httomo/methods.py", line 139, in _save_dataset_data
    assert stop[0] <= dataset.shape[0]

@yousefmoazzam
Copy link
Collaborator Author

Yep, I can reproduce the error mentioned. Looks like maybe the intermediate data wrapper doesn't deal with padded blocks correctly or something, I'll take a look, thanks for testing this out!

FYI, for me, I can get rid of FBP and the error still appears. So on my end it looks like it happens on saving the last block that is outputted by the median filter:

(base) root@492a2a3538d1:/httomo# python -m httomo run tests/test_data/tomo_standard.nxs median-filter-padding-pipeline.yaml output_dir/
____! CCPi-regularisation package (CuPy part needed only) is missing, please install !____
____! CCPi-regularisation package is missing, please install to support regularisation !____
Pipeline has been separated into 2 sections
See the full log file at: output_dir/13-09-2024_10_13_56_output/user.log
Running loader (pattern=projection): standard_tomo...
    Finished loader: standard_tomo (httomo) Took 37.62ms
Section 0 (pattern=projection) with the following methods:
    data_reducer (httomolib)
    find_center_vo (httomolibgpu)
    normalize (httomolibgpu)
    median_filter (httomolibgpu)
    save_intermediate_data (httomo)
     0%|          | 0/2 [00:00<?, ?block/s]
    50%|#####     | 1/2 [00:02<00:02,  2.18s/block]
    --->The center of rotation is 79.5
Traceback (most recent call last):
  File "/opt/conda/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/opt/conda/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/httomo/httomo/__main__.py", line 4, in <module>
    main()
  File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/httomo/httomo/cli.py", line 219, in run
    runner.execute()
  File "/httomo/httomo/runner/task_runner.py", line 66, in execute
    self._execute_section(section, i)
  File "/httomo/httomo/runner/task_runner.py", line 138, in _execute_section
    block = self._execute_section_block(section, block)
  File "/httomo/httomo/runner/task_runner.py", line 206, in _execute_section_block
    block = self._execute_method(method, block)
  File "/httomo/httomo/runner/task_runner.py", line 239, in _execute_method
    block = method.execute(block)
  File "/httomo/httomo/method_wrappers/save_intermediate.py", line 76, in execute
    self._method(
  File "/httomo/httomo/methods.py", line 69, in save_intermediate_data
    _save_dataset_data(dataset, data, global_shape, global_index)
  File "/httomo/httomo/methods.py", line 139, in _save_dataset_data
    assert stop[0] <= dataset.shape[0]
AssertionError

The pipeline I used was adapted from your example to run on the test data, and removes the FBP method from the picture:

- method: standard_tomo
  module_path: httomo.data.hdf.loaders
  parameters:
    name: tomo
    data_path: entry1/tomo_entry/data/data
    image_key_path: entry1/tomo_entry/instrument/detector/image_key
    rotation_angles:
      data_path: /entry1/tomo_entry/data/rotation_angle
- method: find_center_vo
  module_path: httomolibgpu.recon.rotation
  parameters:
    ind: mid
    smin: -50
    smax: 50
    srad: 6
    step: 0.25
    ratio: 0.5
    drop: 20
  id: centering
  side_outputs:
    cor: centre_of_rotation
- method: normalize
  module_path: httomolibgpu.prep.normalize
  parameters:
    cutoff: 10.0
    minus_log: true
    nonnegativity: false
    remove_nans: true
- method: median_filter
  module_path: httomolibgpu.misc.corr
  parameters:
    kernel_size: 5
    dif: 0.0
  save_result: true

@dkazanc
Copy link
Collaborator

dkazanc commented Sep 13, 2024

OK, I've looked at the result of the padding. Here is the obvious poorly stitched region of unpadded data after median filter with kernel 7.
no_padding_2
And here is the padded one (for now I ignored the first block and the last block not saved due to the error above). Interestingly the sharp line has reduced its intensity but not fully removed, something remains there. I guess the reason that the line is in the different place is that the block size has changed. Still I wonder why there is a line left... I'll check the padding calculation, may be we padded slightly less than needed.
padding_2

@yousefmoazzam
Copy link
Collaborator Author

Oh yeah I see what you mean, I have to squint a bit at the second image, but I do see a fainter line still!

I'll have a think/look and see if I can come up with anything to explain this unexpected behaviour - looks like the padding is sort of half working only... 😅

@yousefmoazzam
Copy link
Collaborator Author

Also, the latest commit I pushed should have fixed the error when saving the last block.

@dkazanc
Copy link
Collaborator

dkazanc commented Sep 13, 2024

It could be an issue of an index, i.e., the padded region is slightly smaller (1 pixel smaller) than it should be. So I went to _calc_padding_median_filter and increased the padding region to 1 pixel more like this (kernel_size // 2 + 1, kernel_size // 2 + 1). It seems that it solved the problem, I see not boundaries here. So may be worth rechecking the indices quickly? And yes the data is saved in full now, thanks.
padded2

@dkazanc
Copy link
Collaborator

dkazanc commented Sep 13, 2024

It would be nice to have a test that runs a method with padding on the whole region as a full 3D method and then also imitates the blocking with more processed padded blocks than needed. The the framework does it thing to unpad, etc., but the results of both approaches should be identical.
E.g., the result of full-data 3D median = the result of padded N-number of blocks 3D median.
For some methods, e.g., iterative ones, this won't hold, but for median it should.

@yousefmoazzam
Copy link
Collaborator Author

It would be nice to have a test that runs a method with padding on the whole region as a full 3D method and then also imitates the blocking with more processed padded blocks than needed. The the framework does it thing to unpad, etc., but the results of both approaches should be identical. E.g., the result of full-data 3D median = the result of padded N-number of blocks 3D median. For some methods, e.g., iterative ones, this won't hold, but for median it should.

Mmm, I had a similar thought when you shared the first pair of screenshots. It feels like, to have absolute confidence that the both the following produce the same result:

  • without padding, processing an entire 3D chunk in one go
  • with padding, processing two 3D blocks from the chunk sequentially

a test making assertions on arrays would be best, so then we don't have to rely on visual inspection.

Idea for a test

I was having a think about at what "level" would be the best place to test this. The objects that are involved in padded blocks have tests, and we need to be careful to not have this new test testing stuff that is already tested.

My idea after an initial thought was that this could be a test for the task runner:

  • give it a section that contains a single method (to keep it as simple as possible), one that needs padding
  • execute that single section, and force it to have multiple blocks, so we have multiple padded blocks being processed
    • the _execute_section() method is private, which isn't the nicest to run manually, but it seems like the easiest way currently, and other task runner tests do this too so maybe it's good enough for the moment? 🤷
  • run the method separately on the same input, but as one full array, not split into multiple blocks
  • compare the padded output execution saved in the sink to what running the method separately does
    • the data in the sink has only the core part of the processed blocks are written to the sink, the padded slices are thrown away, so it in theory should be the same as when running the method on the entire chunk
    • the .sink attribute on the task runner is public so it's reasonable to access it

It would then be considered an integration test of the runner for padding I guess, which doesn't yet exist I think. And it avoids testing padding for the lower-level objects (like DataSetBlock or the wrappers), which already have padding tests.

3D method requiring padding that is used in the test

Another thing to mention is that, if possible, I'd prefer to not have to rely on a 3D method in some external package for testing padding. This is so then the test of the padding framework in httomo is independent of any methods backend.

I'm not 100% sure if it'd work nicely, but if it did, would you be okay with the test defining a small 3D GPU method local to the test, that can be used? The wrapper in the section probably can be patched to use this dummy 3D method, and I reckon all the other stuff relating to the methods database can be fiddled with/mocked in order to get the wrapper in the test to execute this dummy 3D method?

@dkazanc
Copy link
Collaborator

dkazanc commented Sep 13, 2024

Yes, that sounds like a good solution. We don't need a 3D method for that indeed. We can simply randomly generate a 3D data first and a method that modifies this data, for instance, scaling or adding a number. If things will go wrong with the padding, I believe, we should be able to trace it only with that. But the input data must be random or the standard data we use.

@yousefmoazzam
Copy link
Collaborator Author

yousefmoazzam commented Sep 13, 2024

Ok cool, I'll get started on that test 🙂 (And it should fail at first, given the horizontal lines in the data shown earlier).

One thing I'd like to clarify since I'm not sure now: I thought the dummy method in the test does have to be 3D, in the sense that it has to use the neighbourhood/padded slices in some way? I thought that if it doesn't use the padded slices, then there won't be any difference between:

  • running the dummy method with correctly padded blocks
  • running the dummy method without padding at all or incorrectly padded blocks

The goal of the test is to be able to tell if the blocks have been padded correctly by the task runner, so I thought we need to be able to see the difference between those two cases? And there's only a difference between the two cases when the method is 3D? Or have I gotten things mixed up (which is quite possible, apologies)?... 😅

@dkazanc
Copy link
Collaborator

dkazanc commented Sep 13, 2024

thanks @yousefmoazzam . Yes, I think by simply changing a value per pixel won't change anything as I think the stitching of the boundaries back to the original position is happening correctly at the moment (we do not see jumps in the image). So we need a 3D change indeed. The simplest could be a sum over 6+1(central) pixels if we consider 3 x 3 x 3 neighbourhood or 24+1 pixels for 5 x 5 x 5 (padding 2). Could you do that? That should be enough I believe?

@yousefmoazzam
Copy link
Collaborator Author

Yeah I agree about the emphasis on having a simple 3D method in the test; while the method does need to be 3D, it doesn't need to be complex, just enough to ensure that a change of neighbourhood has a detectable effect on its output. What you suggested sounds like a good candidate, I'll give it a go 🙂

@dkazanc
Copy link
Collaborator

dkazanc commented Sep 14, 2024

I can confirm that after the fix 0a3cd1a the lines disappeared. So visually all seem to work correctly and I hope that the test should pass.

Note that this test currently makes some assumptions about the internals
of the task runner in order to verify that the result of processing the
padded blocks is as expected. See the code comments marked `NOTE` for
more details.
@yousefmoazzam
Copy link
Collaborator Author

Ok, apologies for being quiet after the integration test was added, I was investigating the two issues that are linked above:

  1. The median filter issue appears to possibly not be related to the padding, but something specific to the median filter method.
  2. The chunk being split into two blocks even if the chunk can fit into GPU memory doesn't feel like it's too urgent, since full chunks often don't fit into GPU memory when running on full data, but obviously is nice to fix for the sake of the number of blocks being produced is as expected.

I think the integration test is mostly ok, but I have added two notes marked NOTE that flag the bad things being done (making assumptions about the internals of the task runner).

If we feel that it's a bad way to test things we can always remove the test and try to think of better ways to do it (which probably requires some refactoring to get rid of the need to inspect internals of the task runner), since the horizontal lines were gone even before I wrote the test.

Let me know what you think @dkazanc 🙂

@dkazanc
Copy link
Collaborator

dkazanc commented Sep 19, 2024

Many thanks @yousefmoazzam. I think we can live with #453 in place for know as it mainly the issue of the small data? I've run tests, they all pass for me. Feel free to wrap up this PR.

@yousefmoazzam
Copy link
Collaborator Author

Yeah, I think #453 affects small data mostly. I agree that it can be left for later, so I'll merge this to wrap up the main integration of padding into the runner.

@yousefmoazzam yousefmoazzam merged commit 351e028 into main Sep 19, 2024
6 of 7 checks passed
@yousefmoazzam yousefmoazzam deleted the integrate-padding-into-runner branch September 19, 2024 13:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants