About training at the 31st epoch #4

BLUE-hub · 2023-08-23T06:54:32Z

Hello, and thank you for your work. Have you encountered the following problem: training on the NPM3D dataset gets stuck at the 31st epoch and does not continue? I have tried reducing batch_size and setting num_workers to 0, but the problem persists. Do you have a solution?

bxiang233 · 2023-08-23T10:46:57Z

Hi, thanks a lot for your interest in our work! Epoch 31 is the first epoch that includes the ScoreNet. I also ran into this problem when training on a GPU with little memory. Using a smaller batch_size or a smaller radius for the sampling cylinders could solve it. If not, could you please provide more information about the problem? For example, which GPU are you using, and which command and dataset did you use for training? Thanks.

Best,
Binbin

BLUE-hub · 2023-08-23T15:42:37Z

Thanks for your reply. I tried again and it produces the following error. My training data is NPM3D and my GPU is an RTX 3090. Is there a solution? Thanks.
  File "D:/PanopticSegForLargeScalePointCloud-main/train.py", line 17, in main
    trainer.train()
  File "D:\PanopticSegForLargeScalePointCloud-main\torch_points3d\trainer.py", line 152, in train
    self._train_epoch(epoch)
  File "D:\PanopticSegForLargeScalePointCloud-main\torch_points3d\trainer.py", line 207, in _train_epoch
    self._model.optimize_parameters2(epoch, i, self._dataset.batch_size)
  File "D:\PanopticSegForLargeScalePointCloud-main\torch_points3d\models\base_model.py", line 274, in optimize_parameters2
    self._grad_scale.step(self._optimizer) # update parameters
AttributeError: 'NoneType' object has no attribute 'step'

bxiang233 · 2023-08-25T09:08:26Z

Hi, maybe this issue offers a solution: torch-points3d/torch-points3d#676 (comment)
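
For anyone hitting the same traceback: the error means `self._grad_scale` was still `None` when `optimize_parameters2` called `.step()` on it. The snippet below is only a minimal sketch of a defensive guard, not the fix from the linked issue and not code from this repository. `safe_step`, `DummyOptimizer` and `DummyScaler` are hypothetical names used for illustration; the dummy scaler merely mimics the `step(optimizer)` / `update()` interface of `torch.cuda.amp.GradScaler` so the example runs without PyTorch.

```python
class DummyOptimizer:
    """Stand-in for a torch optimizer; counts calls to step()."""

    def __init__(self):
        self.steps = 0

    def step(self):
        self.steps += 1


class DummyScaler:
    """Stand-in mimicking the step()/update() interface of a gradient scaler."""

    def __init__(self):
        self.updates = 0

    def step(self, optimizer):
        # A real scaler would unscale gradients first; here we just delegate.
        optimizer.step()

    def update(self):
        self.updates += 1


def safe_step(optimizer, grad_scale=None):
    """Hypothetical helper: update parameters even when no scaler was created.

    Falls back to a plain optimizer.step() if grad_scale is None, which
    avoids "'NoneType' object has no attribute 'step'".
    """
    if grad_scale is None:
        optimizer.step()
        return "plain"
    grad_scale.step(optimizer)
    grad_scale.update()
    return "scaled"


if __name__ == "__main__":
    opt = DummyOptimizer()
    print(safe_step(opt, None))           # scaler missing: plain step
    print(safe_step(opt, DummyScaler()))  # scaler present: scaled step
```

A guard like this only hides the symptom. If the scaler is expected to exist, the cleaner route is to find out why it was never instantiated before epoch 31, which may be what the linked comment addresses.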
