
About meanshift time consumption #6

Open
cama36 opened this issue Sep 25, 2023 · 4 comments

@cama36 commented Sep 25, 2023

Hello, I noticed that the embedding-features branch uses the meanshift clustering algorithm, but in my testing this clustering step usually accounts for more than 80% of the entire runtime. I also noticed that you provide a GPU version of meanshift, but it does not seem to work properly. Do you have a better way to reduce this part of the time consumption? Thanks!

@bxiang233 (Collaborator)

Hi! I think I did use the GPU version of meanshift; what do you mean by "not work properly"? Yes, the clustering step is time-consuming... Are you using your own dataset or ours? If it is your own dataset, reducing the input radius of the cylinders or using a smaller voxel size would reduce the time.
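
To make the point concrete, here is an illustrative timing sketch (not code from this repository; the feature dimension and bandwidth are made-up placeholders) showing how MeanShift's runtime grows with the number of input points, which is why shrinking the cylinder radius or coarsening the voxel size helps:

```python
# Illustrative only: time sklearn's MeanShift on progressively larger
# random "embeddings" to see how point count drives clustering cost.
import time

import numpy as np
from sklearn.cluster import MeanShift

rng = np.random.default_rng(0)

for n_points in (1_000, 4_000):
    # Stand-in for per-point embedding features (5-D is a placeholder).
    feats = rng.normal(size=(n_points, 5)).astype(np.float32)
    t0 = time.perf_counter()
    MeanShift(bandwidth=1.0).fit(feats)
    print(f"{n_points:>6} points: {time.perf_counter() - t0:.1f} s")
```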

@kasparas-k

I'm having the same issue: MeanShift slows the training loop down to an impractical degree. The first 30 epochs took around 12 minutes each; I've now been running the 31st epoch for over 6 hours and it's still on batch 41/750.

Hardware:

RAM: 64 GB
CPU: AMD Ryzen 5 5600X 6-Core Processor
GPU: NVIDIA GeForce RTX 3060 (12 GB VRAM)

I'm running the treeins training task with the default settings from the repository:

python train.py task=panoptic data=panoptic/treeins_rad8 models=panoptic/area4_ablation_3heads_5 \
    model_name=PointGroup-PAPER training=treeins job_name=test_run_3heads

In your reply you mention that the GPU version of meanshift is used, but in meanshift_cluster.cluster_single, from line 79 onward, the tensors are explicitly sent to the CPU. The MeanShift being used comes from sklearn, which has no GPU support as far as I'm aware. Is this GPU version of meanshift defined somewhere else in the repository? I'd appreciate any guidance you can provide.
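
For reference, the pattern I'm describing looks roughly like this (my paraphrase, not the repository's exact code; the bandwidth value is a placeholder):

```python
# Paraphrase of the pattern around meanshift_cluster.cluster_single:
# embeddings leave the GPU here, and sklearn's CPU-only MeanShift
# does the actual clustering.
import torch
from sklearn.cluster import MeanShift

def cluster_single(embeddings: torch.Tensor, bandwidth: float = 0.6) -> torch.Tensor:
    feats = embeddings.detach().cpu().numpy()  # explicit GPU -> CPU transfer
    ms = MeanShift(bandwidth=bandwidth).fit(feats)
    return torch.from_numpy(ms.labels_)  # one cluster label per point
```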

@bxiang233 (Collaborator)

Hi! Thank you for your interest! Let me try to recall from distant memory... Basically, I set a maximum number of training epochs, and if training runs to full completion it takes about a week. You can also check the training curves generated by wandb.

Regarding the GPU version, I remember trying this repo: https://github.com/masqm/Faster-Mean-Shift-Euc. Basically, you just need to call their clustering function.

However, I later switched to parallel CPU computation for acceleration because I found it to be faster; if you're interested, you can make the comparison yourself. I didn't pursue further acceleration and optimization strategies because this is no longer the focus of my current research. I also struggled with the long training times caused by the lengthy clustering step. I believe there is still room for improvement; if you find better methods, feel free to share them! Thank you.
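
As a rough sketch of what I mean by parallel CPU computation (assumed structure, not the exact code I used): cluster each sample independently and let joblib spread the work across cores:

```python
# Sketch: run one MeanShift per sample in parallel on the CPU with joblib.
from joblib import Parallel, delayed
from sklearn.cluster import MeanShift

def cluster_one(feats, bandwidth=0.6):
    # feats: (n_points, n_dims) numpy array for a single sample/cylinder.
    return MeanShift(bandwidth=bandwidth).fit(feats).labels_

def cluster_batch(batch_feats, bandwidth=0.6, n_jobs=-1):
    # batch_feats: list of per-sample feature arrays; n_jobs=-1 uses all cores.
    return Parallel(n_jobs=n_jobs)(
        delayed(cluster_one)(f, bandwidth) for f in batch_feats
    )
```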

Best,
Binbin

@kasparas-k

Thank you so much for the quick response. I'll try the code you linked; maybe it will be faster in my case.
