
03_05_vae_faces_train: Error during training #83

Open · takeofuture opened this issue Dec 4, 2020 · 8 comments

takeofuture commented Dec 4, 2020

I tried to run this on Google Colab (TF 2.3, branch tensorflow_2).
I downloaded the images from https://www.kaggle.com/jessicali9530/celeba-dataset,
placed only about 1000 jpg files, and changed the epochs from 200 to 10.
Each image is 178 × 218 pixels.

I got the following error:

Epoch 1/10
16/31 [==============>...............] - ETA: 1:49 - loss: 938.0780 - reconstruction_loss: 937.9746 - kl_loss: 0.1033

---------------------------------------------------------------------------

ValueError                                Traceback (most recent call last)

<ipython-input-12-6b00e1cc66a8> in <module>()
      5     , run_folder = RUN_FOLDER
      6     , print_every_n_batches = PRINT_EVERY_N_BATCHES
----> 7     , initial_epoch = INITIAL_EPOCH
      8 )

(9 intermediate frames hidden)

/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/utils/generic_utils.py in update(self, current, values, finalize)
    556           self._values[k] = [v * value_base, value_base]
    557         else:
--> 558           self._values[k][0] += v * value_base
    559           self._values[k][1] += value_base
    560       else:

ValueError: operands could not be broadcast together with shapes (32,) (10,) (32,) 

I am not sure if this is a bug in the utility code. Has anybody run this successfully?
(I am wondering whether the number of images is too small, or if it is something else.)

@Zindyrella

As far as I found out, this is connected to the fact that the number of input images is not an integer multiple of the batch size.

It works if the number of input images is, for example, 32, ..., 192, ...

Counts that are not integer multiples behave as follows:

200 input images:

ValueError: operands could not be broadcast together with shapes (32,) (8,) (32,)

600 input images:

ValueError: operands could not be broadcast together with shapes (32,) (24,) (32,)

Please let me know if you manage to make it work for input counts that are non-integer multiples of the batch size! :)
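If that diagnosis is right, one quick way around it is to truncate the file list to the largest whole number of batches before building the generator. A minimal sketch, assuming a `filenames` list like the one the notebook builds with `glob` (the names and the 1000-file count here are illustrative):

```python
BATCH_SIZE = 32

# stand-in for the notebook's glob() result: any list of image paths
filenames = [f"img_{i:06d}.jpg" for i in range(1000)]

# keep only the largest whole number of batches: 1000 -> 992 files
usable = (len(filenames) // BATCH_SIZE) * BATCH_SIZE
filenames = filenames[:usable]

print(len(filenames), len(filenames) % BATCH_SIZE)  # 992 0
```

With the ragged final batch gone, the progress-bar averaging that raises the broadcast error never sees a short batch.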

daiyl commented Dec 19, 2020

I got another problem:
"ValueError: Layer decoder expects 1 input(s), but it received 3 input tensors. Inputs received: [<tf.Tensor: shape=(32, 200), dtype=float32, numpy=...."

What's wrong? Thanks!

daiyl commented Dec 19, 2020

Revise the code: the encoder returns three tensors (z_mean, z_log_var, and the sampled latent), so call() must unpack them and pass only the latent to the decoder:

class VAEModel(Model):
    ...
    def call(self, inputs):
        z_mean, z_log_var, latent = self.encoder(inputs)
        return self.decoder(latent)
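A framework-free toy (plain Python functions standing in for the Keras encoder and decoder; every name and value here is hypothetical) shows why the unpacking matters: the encoder hands back three values, and forwarding all of them to a decoder that expects a single input reproduces the "expects 1 input(s), but it received 3" mismatch.

```python
def encoder(x):
    # stand-in for the Keras encoder: returns THREE values
    z_mean, z_log_var, z = x * 0.0, x * 0.0 + 1.0, x * 2.0
    return z_mean, z_log_var, z

def decoder(latent):
    # stand-in decoder expecting exactly ONE input
    return latent / 2.0

x = 3.0

# Buggy pattern: self.decoder(self.encoder(inputs)) forwards a 3-tuple,
# which Keras reports as "expects 1 input(s), but it received 3 input tensors".
outputs = encoder(x)
assert isinstance(outputs, tuple) and len(outputs) == 3

# Fixed pattern: unpack and forward only the sampled latent.
z_mean, z_log_var, latent = encoder(x)
result = decoder(latent)
print(result)  # 3.0
```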

daiyl commented Dec 19, 2020

@Zindyrella What's my case?

ValueError: operands could not be broadcast together with shapes (32,) (7,) (32,)

Zindyrella commented Dec 19, 2020

@daiyl consider decreasing your number of input images by 7, or increasing it by 25, so that the count becomes an integer multiple of your batch size of 32.

daiyl commented Dec 19, 2020

@Zindyrella Thanks for your reply. However, where can I control the number of input images? I cannot find that in the code, and the ValueError seems to happen at random.

satya400 commented Dec 26, 2020

@daiyl - Just in case you are still having this issue, here are the steps of the workaround I created:

import os
from glob import glob

import numpy as np

DATA_FOLDER = '/content/data_faces/img_align_celeba/' # assuming that you have your images in this folder
BATCH_SIZE = 32

filenames = np.array(glob(os.path.join(DATA_FOLDER, '*.jpg')))
NUM_IMAGES = len(filenames)

print(' -- NUM_IMAGES :', NUM_IMAGES)
print(' ---- NUM_IMAGES / BATCH_SIZE :', NUM_IMAGES / BATCH_SIZE)

Now, assuming that the sample output is as below:

 -- NUM_IMAGES : 202599
 ---- NUM_IMAGES / BATCH_SIZE : 6331.21875

So we have 0.21875 * 32 = 7 excess images, or equivalently we are (1 - 0.21875) * 32 = 25 images short of a whole number of batches.
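The arithmetic above can be checked directly with Python's built-in divmod (the numbers are taken from the sample output):

```python
NUM_IMAGES = 202599
BATCH_SIZE = 32

full_batches, excess = divmod(NUM_IMAGES, BATCH_SIZE)
deficit = BATCH_SIZE - excess  # images needed to complete the final batch

print(full_batches, excess, deficit)  # 6331 7 25
```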

So please use Jupyter magic / shell commands (%cd, !cp, etc.) to either create 25 dummy images as below, or delete 7 images. Since the dataset is huge, adding 25 duplicate images should not noticeably skew the results, and neither should removing 7.

!cp 000001.jpg D1.jpg
!cp 000001.jpg D2.jpg
!cp 000001.jpg D3.jpg
!cp 000001.jpg D4.jpg
!cp 000001.jpg D5.jpg
!cp 000001.jpg D6.jpg
!cp 000001.jpg D7.jpg
!cp 000001.jpg D8.jpg
!cp 000001.jpg D9.jpg
!cp 000001.jpg D10.jpg
!cp 000001.jpg D11.jpg
!cp 000001.jpg D12.jpg
!cp 000001.jpg D13.jpg
!cp 000001.jpg D14.jpg
!cp 000001.jpg D15.jpg
!cp 000001.jpg D16.jpg
!cp 000001.jpg D17.jpg
!cp 000001.jpg D18.jpg
!cp 000001.jpg D19.jpg
!cp 000001.jpg D20.jpg
!cp 000001.jpg D21.jpg
!cp 000001.jpg D22.jpg
!cp 000001.jpg D23.jpg
!cp 000001.jpg D24.jpg
!cp 000001.jpg D25.jpg

This may not be a good solution, but it is a quick fix to move ahead.

Thanks
Satya
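An alternative to copying dummy files is to cap the number of batches drawn per epoch so that the ragged final batch is never used. steps_per_epoch is a standard Keras fit/fit_generator argument; whether the book's train_with_generator wrapper forwards it is an assumption you would need to check in your branch.

```python
NUM_IMAGES = 202599
BATCH_SIZE = 32

# Number of complete batches; the 7 leftover images are simply never drawn.
steps_per_epoch = NUM_IMAGES // BATCH_SIZE
print(steps_per_epoch)  # 6331

# Hypothetical call shape -- adapt to the actual training API in the repo:
# vae.train_with_generator(data_flow, epochs=EPOCHS,
#                          steps_per_epoch=steps_per_epoch, ...)
```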

@zm274310577

I have the same "Layer decoder expects 1 input(s), but it received 3 input tensors" error, and the suggested call() fix does not work for me.
