Update AddChannel, AsChannelFirst with EnsureChannelFirst #509

Merged
merged 27 commits into from
Oct 13, 2023

@KumoLiu KumoLiu commented Sep 25, 2023

Fixes Project-MONAI/MONAI#7036.
Fixes #517.

Description

  • Update AddChannel, AsChannelFirst with EnsureChannelFirst.
  • Update workflow to workflow_type
  • Review deprecated _meta_dict API usage
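
The first item can be illustrated on a bundle config. This is a minimal sketch, not the code in this PR: it assumes the config is parsed JSON in which each transform is a dict with a `_target_` key, and the helper name `migrate_transforms` is hypothetical. It maps `AddChannel` to `EnsureChannelFirst` with `channel_dim="no_channel"`, and keeps the `channel_dim` of `AsChannelFirst` (falling back to its default of -1).

```python
# Deprecated transform name -> channel_dim to pass to EnsureChannelFirst.
# "no_channel" asks EnsureChannelFirst to add a new leading channel axis;
# -1 was the default channel_dim of AsChannelFirst.
_DEPRECATED = {
    "AddChannel": "no_channel",
    "AddChanneld": "no_channel",
    "AsChannelFirst": -1,
    "AsChannelFirstd": -1,
}


def migrate_transforms(node):
    """Return a copy of a parsed config with deprecated channel transforms replaced.

    Hypothetical helper for illustration; it is not part of MONAI.
    """
    if isinstance(node, list):
        return [migrate_transforms(item) for item in node]
    if not isinstance(node, dict):
        return node
    out = {key: migrate_transforms(value) for key, value in node.items()}
    target = out.get("_target_")
    if target in _DEPRECATED:
        # Keep the dictionary-transform suffix "d" if the original had it.
        out["_target_"] = "EnsureChannelFirstd" if target.endswith("d") else "EnsureChannelFirst"
        # Preserve an explicit channel_dim if one was given, else use the default.
        out.setdefault("channel_dim", _DEPRECATED[target])
    return out


# Made-up config fragment, used only to exercise the helper.
config = {
    "preprocessing": {
        "_target_": "Compose",
        "transforms": [
            {"_target_": "LoadImaged", "keys": "image"},
            {"_target_": "AddChanneld", "keys": "image"},
            {"_target_": "AsChannelFirstd", "keys": "label", "channel_dim": 2},
        ],
    }
}
migrated = migrate_transforms(config)
```

The input config is left untouched, so the result can be compared against the original before writing it back.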

Status

Work in progress

Please ensure all the checkboxes:

  • Codeformat tests passed locally by running ./runtests.sh --codeformat.
  • In-line docstrings updated.
  • Update version and changelog in metadata.json if changing an existing bundle.
  • Please ensure the naming rules in config files meet our requirements (please refer to: CONTRIBUTING.md).
  • Ensure versions of packages such as monai, pytorch and numpy are correct in metadata.json.
  • Descriptions should be consistent with the content, such as eval_metrics of the provided weights and TorchScript modules.
  • Files larger than 25MB are excluded and replaced by providing download links in large_file.yml.
  • Avoid using path that contains personal information within config files (such as use /home/your_name/ for "bundle_root").
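
The version and changelog item can be sketched as follows. This assumes `metadata.json` stores `version` as three dot-separated integers and `changelog` as an object keyed by version string; the helper name `bump_patch` and the sample values are made up for illustration.

```python
import json


def bump_patch(metadata, note):
    """Return a copy of a metadata dict with the patch version incremented
    and a matching changelog entry added (hypothetical helper)."""
    major, minor, patch = (int(part) for part in metadata["version"].split("."))
    new_version = f"{major}.{minor}.{patch + 1}"
    updated = dict(metadata)
    updated["version"] = new_version
    # Put the newest entry first, followed by the existing entries.
    updated["changelog"] = {new_version: note, **metadata.get("changelog", {})}
    return updated


# Made-up metadata content, standing in for a bundle's metadata.json.
metadata = json.loads('{"version": "0.1.2", "changelog": {"0.1.2": "previous entry"}}')
updated = bump_patch(metadata, "replace AddChannel with EnsureChannelFirst")
print(json.dumps(updated, indent=4))
```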

Signed-off-by: KumoLiu <[email protected]>
@KumoLiu
Collaborator Author

KumoLiu commented Sep 25, 2023

Updated all related bundles that use AddChannel and AsChannelFirst and verified them, except for "ventricular_short_axis_3label", since I didn't have suitable data to test it. cc @ericspod, could you please help update this one and verify it? Thanks!

@ericspod
Member

Hi @KumoLiu how much data did you need? I can save the output for the example (256, 256) 2D image if that's enough.

@KumoLiu
Collaborator Author

KumoLiu commented Sep 26, 2023

> Hi @KumoLiu how much data did you need? I can save the output for the example (256, 256) 2D image if that's enough.

Hi @ericspod, yeah, that would be great. Just one test sample in the same format used in the bundle is enough. Thanks in advance!

@ericspod
Member

SC-N-2-3-0_seg.zip
This is the output of the network converted to uint8 labels. This follows the notebook in the docs directory and should be enough to test the output of the network.

SC-N-2-3-0_pred.zip
This is the raw output tensor from the network if you want to use this instead.

@KumoLiu
Collaborator Author

KumoLiu commented Oct 11, 2023

I may leave the "ventricular_short_axis_3label" bundle in this PR.
It is still using the API from before v0.6.

This one is not easy to update: when I remove as_tensor_output and AddChannel and specify data_type in EnsureType, it throws the error below.

Update: fixed

error message

opt/pytorch/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:365: operator(): block: [64,0,0], thread: [63,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
2023-10-11 03:38:01,405 - ignite.engine.engine.SupervisedTrainer - ERROR - Current run is terminating due to exception: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.

2023-10-11 03:38:01,430 - ignite.engine.engine.SupervisedTrainer - ERROR - Engine run is terminating due to exception: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.

2023-10-11 03:38:01,431 - ignite.engine.engine.SupervisedTrainer - INFO - Deleted previous saved final checkpoint: model_final_iteration=1.pt
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/ignite/engine/engine.py", line 1068, in _run_once_on_dataset_as_gen
    self.state.output = self._process_function(self, self.state.batch)
  File "/workspace/Code/MONAI/monai/engines/trainer.py", line 230, in _iteration
    _compute_pred_loss()
  File "/workspace/Code/MONAI/monai/engines/trainer.py", line 216, in _compute_pred_loss
    engine.state.output[Keys.LOSS] = engine.loss_function(engine.state.output[Keys.PRED], targets).mean()
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/workspace/Code/MONAI/monai/losses/dice.py", line 176, in forward
    intersection = torch.sum(target * input, dim=reduce_axis)
  File "/workspace/Code/MONAI/monai/data/meta_tensor.py", line 282, in __torch_function__
    ret = super().__torch_function__(func, types, args, kwargs)
  File "/usr/local/lib/python3.8/dist-packages/torch/_tensor.py", line 1296, in __torch_function__
    ret = func(*args, **kwargs)
RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.
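
The `index out of bounds` assertion in `ScatterGatherKernel.cu` is what a scatter-based one-hot conversion reports when an index falls outside the target dimension. One common cause is a label value outside `[0, num_classes)`; whether that was the cause here is not stated in this thread. Because CUDA errors are reported asynchronously, the line in the traceback may not be where the failure happened, so a range check on the labels before the loss can help narrow it down. A minimal sketch, in which plain nested lists stand in for the label tensor and `find_out_of_range_labels` is a hypothetical helper:

```python
def find_out_of_range_labels(labels, num_classes):
    """Return the sorted label values that fall outside [0, num_classes)."""
    bad = set()
    stack = [labels]
    while stack:
        item = stack.pop()
        if isinstance(item, (list, tuple)):
            # Walk nested lists of any depth (e.g. batch x height x width).
            stack.extend(item)
        elif not 0 <= int(item) < num_classes:
            bad.add(int(item))
    return sorted(bad)


# Made-up 2x3 label map for a 4-class problem; 255 is an invalid value.
label_map = [[0, 1, 3], [2, 255, 1]]
print(find_out_of_range_labels(label_map, num_classes=4))  # prints [255]
```

Rerunning with `CUDA_LAUNCH_BLOCKING=1`, as the message itself suggests, or on CPU usually yields a traceback that points at the failing call.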

@KumoLiu KumoLiu marked this pull request as ready for review October 11, 2023 03:52
@ericspod
Member

> I may leave the "ventricular_short_axis_3label" bundle in this PR. It is still using the API from before v0.6.

Is this totally fixed now? I think the error you had comes from some issue on the platform side, or from the versions of PyTorch and/or CUDA. I don't remember what the as_tensor_output argument was meant to fix; it might have been MetaTensor related.

@KumoLiu
Collaborator Author

KumoLiu commented Oct 11, 2023

> > I may leave the "ventricular_short_axis_3label" bundle in this PR. It is still using the API from before v0.6.
>
> Is this totally fixed now? I think the error you had comes from some issue on the platform side, or from the versions of PyTorch and/or CUDA. I don't remember what the as_tensor_output argument was meant to fix; it might have been MetaTensor related.

Yes, it works now. Thanks!

@wyli
Collaborator

wyli commented Oct 11, 2023

/build

@wyli
Collaborator

wyli commented Oct 12, 2023

/build

KumoLiu and others added 6 commits October 12, 2023 18:18
Signed-off-by: KumoLiu <[email protected]>
Signed-off-by: KumoLiu <[email protected]>
Signed-off-by: KumoLiu <[email protected]>
Signed-off-by: KumoLiu <[email protected]>
Signed-off-by: KumoLiu <[email protected]>
Signed-off-by: KumoLiu <[email protected]>
@wyli
Collaborator

wyli commented Oct 12, 2023

/build

Signed-off-by: KumoLiu <[email protected]>
@wyli
Collaborator

wyli commented Oct 12, 2023

/build

@wyli
Collaborator

wyli commented Oct 12, 2023

/build

@wyli
Collaborator

wyli commented Oct 12, 2023

/build

@yiheng-wang-nv
Collaborator

/build

@yiheng-wang-nv
Collaborator

/build

Collaborator

@wyli wyli left a comment

Thanks, it looks good to me.

@KumoLiu KumoLiu mentioned this pull request Oct 13, 2023
8 tasks
@yiheng-wang-nv yiheng-wang-nv merged commit 340e65a into Project-MONAI:dev Oct 13, 2023
4 checks passed
@KumoLiu KumoLiu deleted the ensurechannelfirst branch October 13, 2023 13:26
yiheng-wang-nv pushed a commit that referenced this pull request Oct 16, 2023
Part of #509.

### Status
**Ready**

### Please ensure all the checkboxes:
<!--- Put an `x` in all the boxes that apply, and remove the not
applicable items -->
- [x] Codeformat tests passed locally by running `./runtests.sh
--codeformat`.
- [ ] In-line docstrings updated.
- [x] Update `version` and `changelog` in `metadata.json` if changing an
existing bundle.
- [ ] Please ensure the naming rules in config files meet our
requirements (please refer to: `CONTRIBUTING.md`).
- [ ] Ensure versions of packages such as `monai`, `pytorch` and `numpy`
are correct in `metadata.json`.
- [ ] Descriptions should be consistent with the content, such as
`eval_metrics` of the provided weights and TorchScript modules.
- [ ] Files larger than 25MB are excluded and replaced by providing
download links in `large_file.yml`.
- [ ] Avoid using path that contains personal information within config
files (such as use `/home/your_name/` for `"bundle_root"`).

---------

Signed-off-by: KumoLiu <[email protected]>
Co-authored-by: Wenqi Li <[email protected]>
yiheng-wang-nv added a commit to yiheng-wang-nv/model-zoo that referenced this pull request Jul 29, 2024
…ect-MONAI#509)

Fixes Project-MONAI/MONAI#7036.
Fixes Project-MONAI#517.

### Description
- Update `AddChannel`, `AsChannelFirst` with `EnsureChannelFirst`.
- Update `workflow` to `workflow_type`
- Review deprecated _meta_dict API usage

### Status
**Work in progress**

### Please ensure all the checkboxes:
<!--- Put an `x` in all the boxes that apply, and remove the not
applicable items -->
- [x] Codeformat tests passed locally by running `./runtests.sh
--codeformat`.
- [ ] In-line docstrings updated.
- [ ] Update `version` and `changelog` in `metadata.json` if changing an
existing bundle.
- [ ] Please ensure the naming rules in config files meet our
requirements (please refer to: `CONTRIBUTING.md`).
- [ ] Ensure versions of packages such as `monai`, `pytorch` and `numpy`
are correct in `metadata.json`.
- [ ] Descriptions should be consistent with the content, such as
`eval_metrics` of the provided weights and TorchScript modules.
- [ ] Files larger than 25MB are excluded and replaced by providing
download links in `large_file.yml`.
- [ ] Avoid using path that contains personal information within config
files (such as use `/home/your_name/` for `"bundle_root"`).

---------

Signed-off-by: KumoLiu <[email protected]>
Co-authored-by: Wenqi Li <[email protected]>
Co-authored-by: Yiheng Wang <[email protected]>
yiheng-wang-nv pushed a commit to yiheng-wang-nv/model-zoo that referenced this pull request Jul 29, 2024
Part of Project-MONAI#509.

Signed-off-by: KumoLiu <[email protected]>
Co-authored-by: Wenqi Li <[email protected]>
Development

Successfully merging this pull request may close these issues.

review deprecated _meta_dict api usage test_dataset_tracking
4 participants