This is more of a feature request, I guess.
Is it possible to use multiple GPUs for the transfer code? I tried to implement this myself with
model = nn.DataParallel(model)
but that did not work, because the code then needs to call model.module.rnn:
Traceback (most recent call last):
File "transfer2.py", line 185, in <module>
trXt, trY = transform(model, train_data)
File "transfer2.py", line 143, in transform
model.rnn.reset_hidden(batch_size)
File "/home/imsm/.conda/envs/jupyterlab/lib/python3.6/site-packages/torch/nn/modules/module.py", line 518, in __getattr__
type(self).__name__, name))
AttributeError: 'DataParallel' object has no attribute 'rnn'
But if I instead write
modelpar = nn.DataParallel(model)
model = modelpar.module
I'm back to a single GPU. Do I have to call model.module.rnn everywhere, or does this not work at all?
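For reference, the attribute error can be reproduced with a toy model (the `rnn` submodule here is a hypothetical stand-in for the real one; only the attribute lookup matters):

```python
import torch.nn as nn

# Toy module standing in for the real model.
class Model(nn.Module):
    def __init__(self):
        super().__init__()
        self.rnn = nn.RNN(4, 4)

model = nn.DataParallel(Model())

try:
    model.rnn  # DataParallel exposes only its own attributes
except AttributeError:
    pass       # 'DataParallel' object has no attribute 'rnn'

rnn = model.module.rnn  # the wrapped model is reachable under .module
```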
The way DataParallel works is that its .forward method takes CPU data (note that the tensors have to be on the CPU) and broadcasts it to all the available GPUs, where it passes the per-GPU data to the wrapped model's forward method.
What you tried didn't work because you only passed the data to the model's forward method, not to DataParallel's forward method. So you need to:
1. Use the full DataParallel module for the forward pass.
2. Any time you need to access a model attribute, access it on the original module via modelpar.module.
3. Feed CPU tensors so that DataParallel can automatically send the data to the right GPU.
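Put together, a minimal sketch of those three steps might look like this. RNNFeaturizer, reset_hidden, and the tensor shapes are hypothetical stand-ins for the code in transfer2.py; with no GPUs visible, DataParallel simply falls back to the wrapped module, so the sketch also runs on CPU:

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for the real featurizer: an RNN cell whose hidden
# state persists across forward calls and is cleared by reset_hidden().
class RNNFeaturizer(nn.Module):
    def __init__(self, input_size=8, hidden_size=8):
        super().__init__()
        self.rnn = nn.RNNCell(input_size, hidden_size)
        self.hidden = None

    def reset_hidden(self, batch_size):
        # batch_size kept to mirror the transfer2.py call signature;
        # the next forward pass starts from a zero state.
        self.hidden = None

    def forward(self, x):
        h = self.hidden
        if h is None:
            h = torch.zeros(x.size(0), self.rnn.hidden_size, device=x.device)
        self.hidden = self.rnn(x, h)
        return self.hidden

model = RNNFeaturizer()
if torch.cuda.is_available():
    model = model.cuda()                     # DataParallel expects params on GPU 0
modelpar = nn.DataParallel(model)            # keep both names around

batch = torch.randn(4, 8)                    # 3. CPU tensors as input
modelpar.module.reset_hidden(batch.size(0))  # 2. attributes via .module
features = modelpar(batch)                   # 1. forward through DataParallel
```

One caveat this sketch glosses over: during a multi-GPU forward, state assigned to `self` lives on the per-GPU replicas, not on the original module, so genuinely stateful models need more care than shown here.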
We had a working implementation of this in the original release of the codebase, if you'd like a reference: https://github.com/NVIDIA/sentiment-discovery/releases/tag/v0.1. It was too difficult to maintain while we were adding new features, so we deprecated it.