
Transfer learning tutorial doesn't work with pytorch backend #20287

Closed
off6atomic opened this issue Sep 25, 2024 · 12 comments
off6atomic commented Sep 25, 2024

If you run the code from this guide with the backend set to "torch":
https://keras.io/guides/transfer_learning/

import os
os.environ["KERAS_BACKEND"] = "torch"

you will get an error in the following cell:

for images, labels in train_ds.take(1):
    plt.figure(figsize=(10, 10))
    first_image = images[0]
    for i in range(9):
        ax = plt.subplot(3, 3, i + 1)
        augmented_image = data_augmentation(np.expand_dims(first_image, 0))
        plt.imshow(np.array(augmented_image[0]).astype("int32")) # error occurs here
        plt.title(int(labels[0]))
        plt.axis("off")
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
[<ipython-input-12-3402074d1a3a>](https://localhost:8080/#) in <cell line: 1>()
      5         ax = plt.subplot(3, 3, i + 1)
      6         augmented_image = data_augmentation(np.expand_dims(first_image, 0))
----> 7         plt.imshow(np.array(augmented_image[0]).astype("int32"))
      8         plt.title(int(labels[0]))
      9         plt.axis("off")

[/usr/local/lib/python3.10/dist-packages/torch/_tensor.py](https://localhost:8080/#) in __array__(self, dtype)
   1081             return handle_torch_function(Tensor.__array__, (self,), self, dtype=dtype)
   1082         if dtype is None:
-> 1083             return self.numpy()
   1084         else:
   1085             return self.numpy().astype(dtype, copy=False)

TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.

How do we fix this? This error happens on Colab and also on local machine.

ghsanti (Contributor) commented Sep 26, 2024

Is the augmented image (or the first image) a tensor on the GPU? @off6atomic

If it is, can you call .cpu() on it?

mehtamansi29 (Collaborator) commented Sep 27, 2024

Hi @off6atomic -

Can you please let me know which Keras version you are getting the error on? I ran the same code with the backend set to "torch" and it runs fine on Keras 3.5.0.
Also, are you running this on GPU or CPU?

Gist attached for reference.

off6atomic (Author) commented Sep 30, 2024

With GPU, Keras 3.4.1.

But there is a problem with your gist: you set the backend variable after importing Keras. You need to set the backend variable before importing Keras.

After you do that, you will hit the same error even with Keras 3.5.0.
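The ordering matters because Keras reads KERAS_BACKEND once, when the keras module is first imported, and ignores later changes. A minimal sketch of the correct order (the keras import is shown as a comment so the snippet stands alone):

```python
import os

# Set the backend BEFORE importing keras; Keras reads this environment
# variable once, at import time, and ignores later changes.
os.environ["KERAS_BACKEND"] = "torch"

# import keras               # must come after the line above
# keras.backend.backend()    # would then report "torch"

# Setting os.environ["KERAS_BACKEND"] after `import keras` is too late:
# the backend has already been selected for the rest of the process.
backend = os.environ.get("KERAS_BACKEND")
```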


mehtamansi29 (Collaborator) commented

Hi @off6atomic -

As mentioned in your code snippet, with the PyTorch backend you get the error TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.

As the error says, the PyTorch tensor lives on cuda:0 while NumPy arrays reside on the CPU.
So, as mentioned above, you can use augmented_image[0].cpu() to move the PyTorch tensor from the GPU to the CPU.

for images, labels in train_ds.take(1):
    plt.figure(figsize=(10, 10))
    first_image = images[0]
    for i in range(9):
        ax = plt.subplot(3, 3, i + 1)
        augmented_image = data_augmentation(np.expand_dims(first_image, 0))
        plt.imshow(np.array(augmented_image[0].cpu()).astype("int32"))
        plt.title(int(labels[0]))
        plt.axis("off")

Gist running the entire code is attached for reference.
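A backend-agnostic alternative is to normalize the conversion in one small helper, so the same plotting code works under both backends. A minimal sketch under the assumption that only PyTorch tensors expose a .cpu() method (the to_numpy name is hypothetical; Keras 3 also provides keras.ops.convert_to_numpy, which handles the device move for you):

```python
import numpy as np

def to_numpy(x):
    """Convert a backend tensor to a NumPy array on host memory.

    PyTorch tensors on a CUDA device must be moved to the CPU first;
    TensorFlow tensors and NumPy arrays convert directly via np.asarray.
    """
    if hasattr(x, "cpu"):  # PyTorch tensors expose .cpu()
        x = x.cpu()
    return np.asarray(x)

# Usage in the tutorial's plotting loop:
#   plt.imshow(to_numpy(augmented_image[0]).astype("int32"))
```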

off6atomic (Author) commented

@mehtamansi29 It works!
But as a user, I expect the same code to work with all backends. Is this expectation invalid here?

mehtamansi29 (Collaborator) commented
Hi @off6atomic-

But as a user I expect that the same code would work with all the backends. Is this expectation invalid here?

The code and description at https://keras.io/guides/transfer_learning/ use the Keras API, which supports all backends.

off6atomic (Author) commented

OK. I mean that if I change the backend to torch, I expect that I don't need to change anything else in the code for it to work.

In this case, we have to move the tensor to the CPU before it works. It's not a big hurdle, though. I just wanted to know whether users are expected to know that they have to convert the tensor to CPU with the PyTorch backend but not with the TensorFlow backend.

mehtamansi29 (Collaborator) commented

Hi @off6atomic -

Just wanted to know whether users are expected to know that they have to convert tensor to CPU in pytorch case and doesn't have to do it in tensorflow case.

Yes. You have to convert the tensor to CPU with the PyTorch backend, and you don't have to with the TensorFlow backend.

off6atomic (Author) commented

Is this difference in behavior documented somewhere?

mehtamansi29 (Collaborator) commented

Hi @off6atomic -

It is not documented explicitly, but from here you can see that if CUDA is available, the torch tensor lives on the GPU. And the error TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first also tells you that a cuda:0 tensor must be moved to the CPU before it can be converted to NumPy.

off6atomic (Author) commented

@mehtamansi29 Then there's probably no issue with the tutorial I guess. We can close it. Thank you!
