crack detection: runtime errors with train_model() #104

Open
DogmaF opened this issue Apr 1, 2020 · 2 comments

Comments

@DogmaF

DogmaF commented Apr 1, 2020

Re: the Jupyter notebook for the crack detection project. I'm getting runtime errors at cell [34] when I try to train the model. It seems to have something to do with signal handling. The last line of the traceback is:
RuntimeError: DataLoader worker (pid 83316) is killed by signal: Unknown signal: 0.

To simplify debugging, I tried running it with zero epochs. Here are the error statements generated when I do that.

(I also found that I needed to add a line for import torchsummary, and move %matplotlib inline to the top of the import list to overcome other errors.)
This is on a Mac (macOS 10.15.4) with Python 3.7.6 and PyTorch 1.4.0.

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-34-51af14cba900> in <module>
      1 base_model = train_model(resnet50, criterion, optimizer, exp_lr_scheduler, num_epochs=0)
----> 2 visualize_model(base_model)
      3 plt.show()

<ipython-input-25-8be992550be9> in visualize_model(model, num_images)
      6 
      7     with torch.no_grad():
----> 8         for i, (inputs, labels) in enumerate(dataloaders['val']):
      9             inputs = inputs.to(device)
     10             labels = labels.to(device)

~/opt/anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py in __iter__(self)
    277             return _SingleProcessDataLoaderIter(self)
    278         else:
--> 279             return _MultiProcessingDataLoaderIter(self)
    280 
    281     @property

~/opt/anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py in __init__(self, loader)
    744         # prime the prefetch loop
    745         for _ in range(2 * self._num_workers):
--> 746             self._try_put_index()
    747 
    748     def _try_get_data(self, timeout=_utils.MP_STATUS_CHECK_INTERVAL):

~/opt/anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py in _try_put_index(self)
    870             return
    871 
--> 872         self._index_queues[worker_queue_idx].put((self._send_idx, index))
    873         self._task_info[self._send_idx] = (worker_queue_idx,)
    874         self._tasks_outstanding += 1

~/opt/anaconda3/lib/python3.7/multiprocessing/queues.py in put(self, obj, block, timeout)
     85         with self._notempty:
     86             if self._thread is None:
---> 87                 self._start_thread()
     88             self._buffer.append(obj)
     89             self._notempty.notify()

~/opt/anaconda3/lib/python3.7/multiprocessing/queues.py in _start_thread(self)
    157 
    158         # Start thread which transfers data from buffer to pipe
--> 159         self._buffer.clear()
    160         self._thread = threading.Thread(
    161             target=Queue._feed,

~/opt/anaconda3/lib/python3.7/site-packages/torch/utils/data/_utils/signal_handling.py in handler(signum, frame)
     64         # This following call uses `waitid` with WNOHANG from C side. Therefore,
     65         # Python can still get and update the process status successfully.
---> 66         _error_if_any_worker_fails()
     67         if previous_handler is not None:
     68             previous_handler(signum, frame)

RuntimeError: DataLoader worker (pid 83316) is killed by signal: Unknown signal: 0. 
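
The traceback above fails inside `_MultiProcessingDataLoaderIter`, which PyTorch only uses when the `DataLoader` was created with `num_workers` greater than 0. One common way to narrow this down is to load data in the main process by setting `num_workers=0`. The sketch below is not taken from the notebook: `dataloader_kwargs` is a hypothetical helper, the random tensors stand in for the crack images, and whether this resolves the error on this particular machine is an assumption, not something confirmed in this thread.

```python
import sys


def dataloader_kwargs(batch_size=4, shuffle=True, num_workers=4,
                      force_single_process=None):
    """Build keyword arguments for torch.utils.data.DataLoader.

    Hypothetical helper, not part of the notebook. When
    force_single_process is None it defaults to True on macOS
    (an assumption made for this sketch), so num_workers becomes 0
    and no worker subprocesses are started.
    """
    if force_single_process is None:
        force_single_process = sys.platform == "darwin"
    workers = 0 if force_single_process else num_workers
    return {"batch_size": batch_size, "shuffle": shuffle, "num_workers": workers}


try:
    import torch
    from torch.utils.data import DataLoader, TensorDataset
except ImportError:  # keep the helper usable even without PyTorch installed
    torch = None

if torch is not None:
    # Stand-in data: 8 fake RGB images of size 4x4 with integer labels.
    dataset = TensorDataset(torch.randn(8, 3, 4, 4),
                            torch.zeros(8, dtype=torch.long))
    # num_workers=0 means batches are loaded in the main process.
    loader = DataLoader(dataset, **dataloader_kwargs(force_single_process=True))
    for inputs, labels in loader:
        print(inputs.shape, labels.shape)
```

In the notebook itself, the equivalent change would be passing `num_workers=0` wherever the `dataloaders` dict is built. If the error goes away, the problem is in the worker processes rather than in the dataset or the model.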
@priya-dwivedi
Owner

priya-dwivedi commented Apr 1, 2020 via email

@DogmaF
Author

DogmaF commented Apr 2, 2020

Okay, thanks for the quick response! I will give it a try.
