ThreadDataLoader with multiple thread worker ruins RandCropByPosNegLabeld #8080

Open
eclipse0922 opened this issue Sep 11, 2024 · 0 comments

Describe the bug

Using ThreadDataLoader with multiple thread workers breaks RandCropByPosNegLabeld.
RandCropByPosNegLabeld should produce data patches of the same size, but its output contains differently sized images, as shown in the error message below.

I checked my input image and label data; all of them are larger than the crop size (160, 160, 160).

[2024-09-11 23:33:50,010][dev_collate][CRITICAL] - >>> collate dict key "image" out of 2 keys
[2024-09-11 23:33:50,071][dev_collate][CRITICAL] - >>>> collate/stack a list of tensors
[2024-09-11 23:33:50,071][dev_collate][CRITICAL] - >>>> E: stack expects each tensor to be equal size, but got [1, 160, 160, 160] at entry 0 and [1, 160, 160, 110] at entry 2, shape [torch.Size([1, 160, 160, 160]), torch.Size([1, 160, 160, 160]), torch.Size([1, 160, 160, 110]), torch.Size([1, 160, 160, 160])] in collate([metatensor([[[[0.0112, 0.0116, 0.0093, ..., 0.0084, 0.0121, 0.0074],

To Reproduce

Use ThreadDataLoader with multiple thread workers
Use the RandCropByPosNegLabeld transform
Load the transformed data with ThreadDataLoader
Boom!

from monai.data import (
    CacheDataset,
    ThreadDataLoader,
    SmartCacheDataset,
    Dataset,
    DataLoader,
    list_data_collate,
)

from monai.transforms import (
    Compose,
    LoadImaged,
    EnsureChannelFirstd,
    RandCropByPosNegLabeld,
    CropForegroundd,
    RandSpatialCropd,
    EnsureTyped,
    ToTensord,
)

def image_loader_transforms(cfg):
    return Compose(
        [
            LoadImaged(keys=["image", "label"]),
            EnsureChannelFirstd(keys=["image", "label"]),
            #RandSpatialCropd(keys=["image", "label"], roi_size=cfg.trainer.image_size, random_size=False),
            RandCropByPosNegLabeld(
                keys=["image", "label"],
                image_key="image",
                label_key="label",
                spatial_size=cfg.trainer.image_size,
                pos=1,
                neg=1,
                num_samples=cfg.trainer.num_random_crops,
                allow_smaller=False
            ),
            ToTensord(keys=["image", "label"]),
        ]
    )


dataset = Dataset(
    data=full_dataset,
    transform=image_loader_transforms(cfg),
)

batch_s = 4

dataloader = ThreadDataLoader(
    dataset,
    batch_size=batch_s,
    num_workers=16,
    shuffle=True,
    use_thread_workers=True,
    collate_fn=list_data_collate,
)

I set:
num_random_crops = 4
image_size = (160, 160, 160)
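
For convenience, here is a minimal, self-contained sketch of the same setup that avoids file loading by using synthetic in-memory volumes. The make_sample helper, the (1, 192, 192, 192) volume size, the synthetic label cube, and the 8-sample dataset are illustrative assumptions, not part of my actual pipeline; everything else mirrors the configuration above.

import numpy as np
from monai.data import Dataset, ThreadDataLoader, list_data_collate
from monai.transforms import Compose, RandCropByPosNegLabeld, ToTensord

spatial_size = (160, 160, 160)

def make_sample(seed):
    # Channel-first synthetic volume larger than the crop size, with a non-empty label region.
    rng = np.random.default_rng(seed)
    image = rng.random((1, 192, 192, 192), dtype=np.float32)
    label = np.zeros((1, 192, 192, 192), dtype=np.float32)
    label[:, 80:120, 80:120, 80:120] = 1.0
    return {"image": image, "label": label}

data = [make_sample(i) for i in range(8)]

transforms = Compose(
    [
        RandCropByPosNegLabeld(
            keys=["image", "label"],
            image_key="image",
            label_key="label",
            spatial_size=spatial_size,
            pos=1,
            neg=1,
            num_samples=4,
            allow_smaller=False,
        ),
        ToTensord(keys=["image", "label"]),
    ]
)

dataset = Dataset(data=data, transform=transforms)
dataloader = ThreadDataLoader(
    dataset,
    batch_size=4,
    num_workers=16,
    shuffle=True,
    use_thread_workers=True,
    collate_fn=list_data_collate,
)

for batch in dataloader:
    # With thread workers, the crop sizes occasionally disagree and collate fails here.
    print(batch["image"].shape)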

Expected behavior

RandCropByPosNegLabeld should produce cropped images of the same size regardless of the data loader type.
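
As a sanity check, here is a short sketch of the same pipeline run single-threaded (reusing dataset and list_data_collate from the sketch above); in this case I would expect every batch to collate cleanly to [batch_size * num_samples, 1, 160, 160, 160].

# Same pipeline, but single-threaded (num_workers=0); reuses `dataset` from the sketch above.
single_thread_loader = ThreadDataLoader(
    dataset,
    batch_size=4,
    num_workers=0,
    shuffle=True,
    collate_fn=list_data_collate,
)

for batch in single_thread_loader:
    # Expected: torch.Size([16, 1, 160, 160, 160]) for every batch (4 items x 4 crops each).
    print(batch["image"].shape)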


Environment

Output of python -c "import monai; monai.config.print_debug_info()":

================================
Printing MONAI config...
================================
MONAI version: 1.3.2
Numpy version: 1.26.4
Pytorch version: 2.4.0+cu121
MONAI flags: HAS_EXT = False, USE_COMPILED = False, USE_META_DICT = False
MONAI rev id: 59a7211070538586369afd4a01eca0a7fe2e742e
MONAI __file__: /home/<username>/Dev/cc_ai_sandbox/slimunetr_mtl_test/.conda/lib/python3.11/site-packages/monai/__init__.py

Optional dependencies:
Pytorch Ignite version: 0.4.11
ITK version: 5.4.0
Nibabel version: 5.2.1
scikit-image version: 0.23.2
scipy version: 1.14.1
Pillow version: 10.4.0
Tensorboard version: 2.17.1
gdown version: 5.2.0
TorchVision version: 0.19.0+cu121
tqdm version: 4.66.5
lmdb version: 1.5.1
psutil version: 6.0.0
pandas version: 2.2.2
einops version: 0.8.0
transformers version: NOT INSTALLED or UNKNOWN VERSION.
mlflow version: 2.16.0
pynrrd version: 1.0.0
clearml version: 1.16.4

For details about installing the optional dependencies, please visit:
    https://docs.monai.io/en/latest/installation.html#installing-the-recommended-dependencies


================================
Printing system config...
================================
System: Linux
Linux version: Ubuntu 22.04.4 LTS
Platform: Linux-6.8.0-40-generic-x86_64-with-glibc2.35
Processor: x86_64
Machine: x86_64
Python version: 3.11.9
Process name: pt_main_thread
Command: ['python', '-c', 'import monai; monai.config.print_debug_info()']
Open files: [popenfile(path='/home/sewon/.vscode-server/data/logs/20240911T230538/network.log', fd=19, position=0, mode='a', flags=33793), popenfile(path='/home/sewon/.vscode-server/data/logs/20240911T230538/ptyhost.log', fd=20, position=2515, mode='a', flags=33793), popenfile(path='/home/sewon/.vscode-server/data/logs/20240911T230538/remoteagent.log', fd=24, position=489, mode='a', flags=33793)]
Num physical CPUs: 48
Num logical CPUs: 96
Num usable CPUs: 96
CPU usage (%): [5.4, 5.0, 5.0, 5.0, 5.3, 5.3, 5.0, 8.4, 5.0, 5.3, 5.0, 5.3, 5.7, 5.3, 5.0, 5.3, 5.3, 0.0, 0.0, 0.0, 5.3, 5.3, 5.0, 0.0, 5.4, 5.0, 5.0, 5.3, 5.3, 5.0, 5.0, 5.3, 5.3, 5.3, 0.0, 5.0, 5.3, 0.0, 0.0, 5.0, 5.0, 5.0, 0.0, 0.0, 0.0, 0.0, 5.3, 5.0, 5.0, 5.0, 0.0, 0.0, 0.0, 0.0, 5.3, 5.3, 5.3, 0.4, 0.0, 0.0, 5.0, 5.0, 0.0, 0.0, 5.3, 5.3, 5.3, 5.3, 0.0, 0.0, 5.3, 5.0, 5.0, 0.0, 5.0, 5.0, 5.0, 0.0, 5.0, 5.0, 5.0, 0.0, 5.3, 0.0, 0.0, 5.3, 5.3, 0.0, 0.0, 1.1, 5.0, 5.0, 5.3, 5.0, 0.0, 96.9]
CPU freq. (MHz): 872
Load avg. in last 1, 5, 15 mins (%): [1.9, 2.8, 2.3]
Disk usage (%): 92.2
Avg. sensor temp. (Celsius): UNKNOWN for given OS
Total physical memory (GB): 125.5
Available memory (GB): 119.3
Used memory (GB): 5.0

================================
Printing GPU config...
================================
Num GPUs: 2
Has CUDA: True
CUDA version: 12.1
cuDNN enabled: True
NVIDIA_TF32_OVERRIDE: None
TORCH_ALLOW_TF32_CUBLAS_OVERRIDE: None
cuDNN version: 90100
Current device: 0
Library compiled for CUDA architectures: ['sm_50', 'sm_60', 'sm_70', 'sm_75', 'sm_80', 'sm_86', 'sm_90']
GPU 0 Name: NVIDIA H100 PCIe
GPU 0 Is integrated: False
GPU 0 Is multi GPU board: False
GPU 0 Multi processor count: 114
GPU 0 Total memory (GB): 79.2
GPU 0 CUDA capability (maj.min): 9.0
GPU 1 Name: NVIDIA H100 PCIe
GPU 1 Is integrated: False
GPU 1 Is multi GPU board: False
GPU 1 Multi processor count: 114
GPU 1 Total memory (GB): 79.2
GPU 1 CUDA capability (maj.min): 9.0

