Implement RandomResizedCrop layer #131

Closed
sayakpaul opened this issue Feb 11, 2022 · 19 comments · Fixed by #499

Comments

@sayakpaul
Contributor

Randomly resized cropping is pretty much a standard augmentation transformation that is used to train vision models. Recently, a team at Facebook also showed why it's often necessary to include this transformation for better generalization [1].

An implementation of this layer in PyTorch is available as RandomResizedCrop.

References:

[1] https://arxiv.org/abs/2106.05121

@LukeWood
Contributor

Yeah, I agree RandomResizedCrop should be included.

@innat
Contributor

innat commented Feb 12, 2022

@sayakpaul
How about this one? https://keras.io/api/layers/preprocessing_layers/image_augmentation/random_crop/

@LukeWood If the above layer is the desired layer for the above request, then you may want to take a look at this.

@LukeWood
Contributor

That layer doesn't fill the need because the resize portion is also important. RandomResizedCrop basically stretches and crops images randomly.

@innat
Contributor

innat commented Feb 12, 2022

something like this?

tf.keras.layers.RandomCrop
tf.keras.layers.Resizing

@LukeWood
Contributor

So: passing Resizing followed by RandomCrop does work, but you temporarily have a tensor with dimensions None, None for width and height after the resizing. This may be wasteful or undesirable to work with, so I think it may be useful to add this as a single KPL (Keras preprocessing layer) operation. It may also be more efficient.

@innat
Contributor

innat commented Feb 17, 2022

Sorry, didn't catch your concern. Could you elaborate? Like, are you saying that after passing the input to a resize layer, the shape of the output tensor goes from (100, 100, 3) to (None, None, 3)?

@sayakpaul
Contributor Author

We do also need to remember that this layer needs to handle more than one sample, i.e., an arbitrary number of images having a uniform shape.

@LukeWood
Contributor

> Sorry, didn't catch your concern. Could you elaborate? Like, are you saying that after passing the input to a resize layer, the shape of the output tensor goes from (100, 100, 3) to (None, None, 3)?

Precisely, because each individual image should be augmented independently. So they will have random sizes.

@LukeWood
Contributor

They’d be a RaggedTensor really, which is a bad UX
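
To make the ragged-shape concern concrete, here is a minimal pure-Python sketch of what happens when every image in a batch draws its own crop size. The helper name `sample_crop_size` and the area range (0.08 to 1.0) are assumptions chosen for illustration, not part of any library discussed in this thread, and the aspect ratio is kept fixed for brevity.

```python
import random


def sample_crop_size(height, width, min_area=0.08, max_area=1.0, rng=random):
    """Pick a crop size covering a random fraction of the image area.

    Hypothetical helper: the aspect ratio of the source image is preserved,
    only the covered area varies.
    """
    area_fraction = rng.uniform(min_area, max_area)
    side_scale = area_fraction ** 0.5  # scale each side by sqrt(area fraction)
    crop_h = max(1, round(height * side_scale))
    crop_w = max(1, round(width * side_scale))
    return crop_h, crop_w


# Five 100x100 images, each augmented independently.
rng = random.Random(0)
batch_crop_sizes = [sample_crop_size(100, 100, rng=rng) for _ in range(5)]
distinct_sizes = set(batch_crop_sizes)

print(batch_crop_sizes)
# More than one distinct size means the crops cannot be stacked into one
# dense (batch, height, width, channels) tensor before the final resize.
print(len(distinct_sizes) > 1)
```

This is why fusing the crop and the resize into one operation is attractive: the variable-size intermediate never has to be materialized as a batch.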

@sayakpaul
Contributor Author

With random resized crops, I think we are able to cover many different transformations composed into one: random zooming, random translation, resizing, etc.

@innat
Contributor

innat commented Feb 18, 2022

> They’d be a RaggedTensor really, which is a bad UX

@LukeWood I passed a tensor to the Resizing layer, but it didn't give ragged output. What have I missed?

import tensorflow as tf

# Resizing a batch of five 100x100 images yields a dense tensor.
resize = tf.keras.layers.Resizing(224, 224)
sample = tf.ones(shape=(5, 100, 100, 3))
resize(sample).shape
# TensorShape([5, 224, 224, 3])

# RandomCrop with a fixed crop size also yields a dense tensor.
sample = tf.ones(shape=(5, 100, 100, 3))
rand_crop = tf.keras.layers.RandomCrop(64, 64)
rand_crop(sample).shape
# TensorShape([5, 64, 64, 3])

@innat
Contributor

innat commented Feb 18, 2022

@sayakpaul @LukeWood
I am also on board with having this layer; I have seen many use cases too. But when I saw layers.RandomCrop, I thought it might be a possible alternative. I tested the following setup; isn't the output equivalent to RandomResizedCrop?

simple_aug = tf.keras.Sequential(
    [
        tf.keras.layers.RandomCrop(224, 224),
        tf.keras.layers.Resizing(512, 512)
    ]
)


@MrinalTyagi
Contributor

Hi team. If no one is working on this, I would like to work on the same.

@AdityaKane2001
Contributor

> We do also need to remember that this layer needs to handle more than one sample, i.e., an arbitrary number of images having a uniform shape.

I agree. The most difficult part is to come up with a batched implementation. Correct me if I'm wrong, but I believe this is the procedure:

  1. Get random boxes, in which each box needs to have random aspect ratios.
  2. Get the respective crops of the image.
  3. Resize the images.

These tasks are straightforward for a single image, but are difficult for a batch of images.

The torch implementation takes a slight shortcut: it uses the same aspect ratio and cropping vertices for all images in the batch. See here. If that approach seems acceptable, a TF implementation is achievable as well.
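
A minimal sketch of that shortcut in TensorFlow, assuming eager execution: one box is sampled per call and shared by every image, so the crop reduces to plain slicing. The function name `shared_random_resized_crop` and the default `scale` and `ratio` ranges are assumptions for illustration, not an existing Keras or KerasCV API.

```python
import math
import random

import tensorflow as tf


def shared_random_resized_crop(images, target_size, scale=(0.08, 1.0),
                               ratio=(3 / 4, 4 / 3), rng=random):
    """Apply ONE random box to every image in an NHWC batch, then resize.

    Hypothetical helper. Randomness comes from Python, so this is meant for
    eager use; inside tf.function the box would be frozen at trace time.
    """
    height, width = images.shape[1], images.shape[2]

    # 1. One random box: area fraction plus a log-uniform aspect ratio (w / h).
    area = rng.uniform(*scale) * height * width
    aspect = math.exp(rng.uniform(math.log(ratio[0]), math.log(ratio[1])))
    crop_h = min(height, max(1, round(math.sqrt(area / aspect))))
    crop_w = min(width, max(1, round(math.sqrt(area * aspect))))
    top = rng.randint(0, height - crop_h)   # inclusive upper bound
    left = rng.randint(0, width - crop_w)

    # 2. Crop: slicing works because all images share the same box.
    crops = images[:, top:top + crop_h, left:left + crop_w, :]

    # 3. Resize every crop to the common target size.
    return tf.image.resize(crops, target_size)


batch = tf.random.uniform((4, 100, 100, 3))
out = shared_random_resized_crop(batch, (64, 64))
print(out.shape)
```

The trade-off is reduced diversity within a batch, since every image sees the same zoom and offset.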

@bhack
Contributor

bhack commented Mar 14, 2022

@AdityaKane2001 we had a discussion about randomization within the batch at #146

@beresandras

Hi! I would love to work on this. I already have a batched, frame-wise random implementation for personal use here.

My steps were the following:

  • sample random aspect ratios from log-uniform(1/max_aspect_ratio, max_aspect_ratio) (so that it is centered around 1)
  • sample random relative areas from uniform(min_area, max_area)
  • calculate corresponding crop heights and widths
    • when relative area ~= 1 and aspect ratio <<1 or >>1 you will get invalid values. In this case the torchvision implementation retries up to 10 times; in my implementation, I just clip the height and width.
  • sample random positions for the crops
  • tf.image.crop_and_resize()

This implementation however should subclass BaseImageAugmentationLayer and should only implement it framewise + handle bboxes and segmentation masks, am I right?
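
The steps listed above can be sketched roughly as follows. This is an illustrative reconstruction, not the linked implementation: the function name `batched_random_resized_crop` and all default values are assumptions, and sizes are expressed relative to the image so that they can be passed directly to `tf.image.crop_and_resize` as normalized boxes.

```python
import tensorflow as tf


def batched_random_resized_crop(images, target_size, min_area=0.25,
                                max_area=1.0, max_aspect_ratio=4 / 3):
    """Per-image random resized crop for an NHWC float batch (sketch)."""
    batch_size = tf.shape(images)[0]

    # Aspect ratios: log-uniform over (1 / max, max), so centered around 1.
    # They are relative to the input image's own aspect ratio here.
    log_max = tf.math.log(max_aspect_ratio)
    aspect = tf.exp(tf.random.uniform([batch_size], -log_max, log_max))

    # Relative areas: uniform(min_area, max_area).
    area = tf.random.uniform([batch_size], min_area, max_area)

    # Relative crop heights and widths, clipped instead of retried.
    rel_h = tf.clip_by_value(tf.sqrt(area / aspect), 0.0, 1.0)
    rel_w = tf.clip_by_value(tf.sqrt(area * aspect), 0.0, 1.0)

    # Random top-left corners that keep each box inside its image.
    top = tf.random.uniform([batch_size]) * (1.0 - rel_h)
    left = tf.random.uniform([batch_size]) * (1.0 - rel_w)

    # Normalized boxes in [y1, x1, y2, x2] order, one per image.
    boxes = tf.stack([top, left, top + rel_h, left + rel_w], axis=1)

    return tf.image.crop_and_resize(
        images, boxes, box_indices=tf.range(batch_size), crop_size=target_size
    )


images = tf.random.uniform((8, 100, 100, 3))
crops = batched_random_resized_crop(images, (64, 64))
print(crops.shape)
```

Because cropping and resizing happen in a single op, the variable-size intermediate crops never need to be stacked into a batch.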

@beresandras

> @sayakpaul @LukeWood I am also on board with having this layer; I have seen many use cases too. But when I saw layers.RandomCrop, I thought it might be a possible alternative. I tested the following setup; isn't the output equivalent to RandomResizedCrop?
>
> simple_aug = tf.keras.Sequential(
>     [
>         tf.keras.layers.RandomCrop(224, 224),
>         tf.keras.layers.Resizing(512, 512)
>     ]
> )


The issue with this implementation is that it only takes crops with the same resolution. If your input images are 448x448 for example, you will always see 25% of their area, resized to a higher resolution. We want to sample crops with different resolutions, and resize them to the same resolution.
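
The 25% figure is simple arithmetic, shown here as a tiny sketch; the helper name `visible_area_fraction` is hypothetical.

```python
def visible_area_fraction(crop_h, crop_w, image_h, image_w):
    """Fraction of the source image covered by a crop of the given size."""
    return (crop_h * crop_w) / (image_h * image_w)


# RandomCrop(224, 224) on a 448x448 input always covers the same fraction.
fixed = visible_area_fraction(224, 224, 448, 448)
print(fixed)  # 0.25
```

A random resized crop would instead draw this fraction anew for every image before resizing to the common output size.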

@LukeWood
Contributor

> Hi! I would love to work on this. I already have a batched, frame-wise random implementation for personal use here.
>
> My steps were the following:
>
>   • sample random aspect ratios from log-uniform(1/max_aspect_ratio, max_aspect_ratio) (so that it is centered around 1)
>   • sample random relative areas from uniform(min_area, max_area)
>   • calculate corresponding crop heights and widths
>     • when relative area ~= 1 and aspect ratio <<1 or >>1 you will get invalid values. In this case the torchvision implementation retries up to 10 times; in my implementation, I just clip the height and width.
>   • sample random positions for the crops
>   • tf.image.crop_and_resize()
>
> This implementation however should subclass BaseImageAugmentationLayer and should only implement it framewise + handle bboxes and segmentation masks, am I right?

Hey there! Your starting implementation looks great. Feel free to open a PR. You don’t need to handle any inputs outside of images and labels to get merged. We can add boxes in a follow up.

Feel free to open a PR!

@LukeWood
Contributor

> > @sayakpaul @LukeWood I am also on board with having this layer; I have seen many use cases too. But when I saw layers.RandomCrop, I thought it might be a possible alternative. I tested the following setup; isn't the output equivalent to RandomResizedCrop?
> >
> > simple_aug = tf.keras.Sequential(
> >     [
> >         tf.keras.layers.RandomCrop(224, 224),
> >         tf.keras.layers.Resizing(512, 512)
> >     ]
> > )
>
> The issue with this implementation is that it only takes crops with the same resolution. If your input images are 448x448 for example, you will always see 25% of their area, resized to a higher resolution. We want to sample crops with different resolutions, and resize them to the same resolution.

The rescales should also be random, but feel free to push a starter implementation to a PR. I am also happy to contribute some commits to this effort once the PR is open! @beresandras

freedomtan pushed a commit to freedomtan/keras-cv that referenced this issue Jul 20, 2023
* add transposed convolution layer

* fix comments