[GroundingDino] Fix grounding dino loss 🚨 #31828
base: main
Conversation
@amyeroberts FYI for some reason, when testing locally, …
c.c. @amyeroberts
Thanks for fixing!
Overall looks good. This is a breaking change, so we should append 🚨 to the PR title, but I think this is acceptable as it aligns us with the recommended loss calculation.
input_ids = torch.tensor([101, 3869, 1012, 11420, 1012, 1012, 102])
input_ids = input_ids.unsqueeze(0).expand(self.batch_size, -1)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why switch to hard-coded input ids?
Otherwise, tests that use labels (and therefore compute a loss) would complain, because build_label_maps (here) would return None.
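To make that concrete, here is a minimal, hedged sketch (not the actual `build_label_maps` implementation) of how phrase positions can be derived from the hard-coded ids used in the test. It assumes the BERT uncased vocabulary, where 101/102 are [CLS]/[SEP] and 1012 is the "." separator; ids with no "." token would yield no phrases to map labels onto.

```python
import torch

def phrase_token_positions(input_ids, dot_id=1012, special_ids=(101, 102)):
    """Group token positions into "."-delimited phrases (one phrase per class label)."""
    phrases, current = [], []
    for pos, token_id in enumerate(input_ids.tolist()):
        if token_id == dot_id or token_id in special_ids:
            if current:
                phrases.append(current)
            current = []
        else:
            current.append(pos)
    if current:
        phrases.append(current)
    return phrases

ids = torch.tensor([101, 3869, 1012, 11420, 1012, 1012, 102])  # the hard-coded test ids
print(phrase_token_positions(ids))  # [[1], [3]] -> two phrases, so label maps can be built
```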
Perhaps you can add a comment to explain this
@EduardoPach Thanks for working on this loss. Just sharing some more well-developed code for fine-tuning GroundingDINO: https://github.com/open-mmlab/mmdetection/blob/main/configs/mm_grounding_dino/grounding_dino_swin-t_pretrain_obj365.py
c.c. @amyeroberts
Maybe @NielsRogge could have a look?
Cough cough, c.c. @amyeroberts
Thanks for the continued work on this!
Main comments are about the precision and clarity of the language in the docstrings and comments. It's important to write messages such that someone new to the code can understand them.
Overall it looks good. Let me also check PR #32483 for stable training convergence.
-def sigmoid_focal_loss(inputs, targets, num_boxes, alpha: float = 0.25, gamma: float = 2):
+# Similar to the one used in `DeformableDetr` but we pass `num_queries`, as `logits` are flattened
+# due to masked selection, and support different `reduction` modes.
+def sigmoid_focal_loss(
https://github.com/longzw1997/Open-GroundingDino/blob/main/models/GroundingDINO/utils.py#L138
Even though there are customizations in that code, I like the current version of sigmoid_focal_loss 👍🏼
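For readers following along, here is a hedged sketch of the kind of sigmoid focal loss with a selectable reduction that the thread is discussing; it is not the exact code added in the PR, and the final normalization by `num_boxes` is an assumption.

```python
import torch
import torch.nn.functional as F

def focal_loss_sketch(logits, targets, num_boxes, alpha: float = 0.25,
                      gamma: float = 2.0, reduction: str = "sum"):
    """Binary sigmoid focal loss with a configurable reduction mode."""
    prob = logits.sigmoid()
    ce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    p_t = prob * targets + (1 - prob) * (1 - targets)
    loss = ce * (1 - p_t) ** gamma                      # down-weight easy examples
    if alpha >= 0:
        loss = (alpha * targets + (1 - alpha) * (1 - targets)) * loss
    loss = loss.mean() if reduction == "mean" else loss.sum()
    return loss / num_boxes                             # per-box normalization (assumption)
```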
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
c.c. @amyeroberts
@@ -741,3 +780,53 @@ def test_cross_attention_mask(self):
        self.assertTrue(torch.allclose(outputs1.logits, outputs_batched.logits[:1], atol=1e-3))
        # For some reason 12 elements are > 1e-3, but the rest are fine
        self.assertTrue(torch.allclose(outputs2.logits, outputs_batched.logits[1:], atol=1.8e-3))

    def test_grounding_dino_loss(self):
        ds = load_dataset("EduardoPacheco/aquarium-sample", split="train")
Maybe we should move this to huggingface?
Maybe, wdyt @amyeroberts
Thanks for working on this, it is great to have it tested!
Do we actually need to load the whole dataset here? Can't we copy annotations and upload one image somewhere?
@qubvel by "upload one image somewhere" do you mean to an HF dataset or to the test fixtures?
yes, no, maybe?
@ydshieh can you please help? Do we have any place for test assets?
@EduardoPach @qubvel I think we can do it like this:
# We will verify our results on an image of cute cats
Hi. IIRC, that dataset contains only 2 examples, right? The easiest way is to have it as hf-internal-testing/aquarium-sample.
@qubvel I added you to hf-internal-testing (you have to accept the invitation), then you can create a copy of the above dataset.
Thanks for fixing and all the work iterating on this!
Just two things before we're ready to merge:
- This convo
- Slow model tests: could you push an empty commit with the message [run_slow] grounding_dino?
Slow tests are failing 🥲 I suppose most of them are in the same state on main, but the loss test is also failing, can you have a look? At the very least we need a better message in the assert so we can see the diff.
@qubvel Sorry, the problem was not synchronizing the device type of … The strange thing is that the local CLI returns the following, but the CI passes on this:
Can you also confirm this? -> turns out that CI was having some delay (local was correct).
I have these ones also failing locally; making a fix for cross_attention + batch equivalence.
Two tests failed in CI.
For the cpu-gpu test I have pretty large diffs locally, and even:
{
    "logits": tensor(0.0),
    "pred_boxes": tensor(0.8244),
    "last_hidden_state": tensor(0.9422),
    "init_reference_points": tensor(0.7221),
    "intermediate_hidden_states": tensor(2.5513),
    "intermediate_reference_points": tensor(0.8300),
    "encoder_last_hidden_state_vision": tensor(0.0001),
    "encoder_last_hidden_state_text": tensor(9.5665e-06),
    "enc_outputs_class": tensor(1.8090e-05),
    "enc_outputs_coord_logits": tensor(nan),
}
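For context, a hedged sketch of how per-output max-abs CPU/GPU differences like the ones above can be collected; `model` and `inputs` are placeholders (not the actual test code) and a dict-like model output is assumed.

```python
import torch

@torch.no_grad()
def max_abs_diffs(model, inputs, device="cuda"):
    """Run the same inputs on CPU and GPU and report max |cpu - gpu| per output tensor."""
    model.eval()
    cpu_out = model.to("cpu")(**{k: v.to("cpu") for k, v in inputs.items()})
    gpu_out = model.to(device)(**{k: v.to(device) for k, v in inputs.items()})
    diffs = {}
    for name, cpu_tensor in cpu_out.items():
        gpu_tensor = gpu_out[name]
        if isinstance(cpu_tensor, torch.Tensor) and isinstance(gpu_tensor, torch.Tensor):
            diffs[name] = (cpu_tensor - gpu_tensor.to("cpu")).abs().max()
    return diffs
```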
… is failing on our CI (which can be found in …)
… has been failing for quite a long time 😅 but if you are willing to check it, we'd appreciate it a lot; it's not necessary for merging this PR, though. (but …
@ydshieh
@SangbumChoi do you mean the loss test is random-seed dependent? In that case, we can either slightly increase the tolerance or mark it as …
@qubvel Still not sure about it. I will debug deeper tomorrow, but the fact is that sometimes the CI passes and sometimes it fails 😭 (Or there might be a computation issue with torch.backends.cudnn.deterministic = True)
Ok, sure!
Hi @SangbumChoi, from the … Or does the …
Yeah, otherwise can you explain in more detail the reason why …
with the current … Maybe a better question is why we are not using …
@stevenwudi Yeah, good point. Even though the loss scale is large, it eventually trains well in my experiment, but as you suggested it could be better if we set …
(Actually, I don't have any experiment stats to support this, and I am happy to know that it trained well in your experiment.)
@SangbumChoi I did this before but switched back to the default.
@stevenwudi first of all, thanks for spending your time looking into the code! I had the same thought, but then after some experiments (as mentioned above) I set …
This is not used in their code. See here. Thus the expected value, since we're basing ourselves on … Btw, if you actually look at the …
@EduardoPach thanks for the detailed explanation, code link, and the training loss, this is super helpful. Nit: curious about the extremely large value of …
Hence, I cannot help but wonder: does the high …
Hmm, good to know. It seems the Open-GroundingDino code is really quite messy; glad we will have this HF implementation soon 👍
@ydshieh @qubvel @EduardoPach The reason the CI test was failing was the difference between CPU and GPU. There is a slight difference right from the beginning (e.g. …
@EduardoPach I think test_grounding_dino_loss has the same issue as I stated above. So as a solution we can recalculate the expected values on the GPU, enlarge the tolerance, or remove this test function.
Additional analysis for the cpu/gpu difference:
Difference: 0 … The backbone also has some large differences at the second and last layers. Also, SwinEmbedding has a 1e-6 difference even though it is at the very beginning of the architecture (a per-module comparison sketch follows).
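A hedged sketch of how such a layer-by-layer comparison can be done with forward hooks; the helper and the idea of comparing captured outputs across devices are illustrative, not the exact analysis code used above.

```python
import torch

def capture_module_outputs(model, inputs, module_names):
    """Capture the outputs of selected submodules during one forward pass."""
    captured, handles = {}, []
    for name, module in model.named_modules():
        if name in module_names:
            handles.append(module.register_forward_hook(
                lambda mod, inp, out, name=name: captured.__setitem__(name, out)
            ))
    with torch.no_grad():
        model(**inputs)
    for handle in handles:
        handle.remove()
    return captured

# Usage idea: capture once on CPU and once on GPU, then inspect
# (cpu_captured[name] - gpu_captured[name].cpu()).abs().max() per module.
```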
@qubvel requesting a review
We are running CI on a T4 GPU: we can update the expected values accordingly on our side.
updated the values for …
Hi! I suppose we can merge this, as the test failures are unrelated! One thing it would be nice to do before merging: I see @ArthurZucker started the initiative of moving losses to a separate module; let's update this branch and move this loss there too. Thanks!
Basically, just moving the loss implementation to a new file in the …
It would be great if you could handle it this weekend. @EduardoPach
What does this PR do?
Fixes #31434
As the original repo doesn't provide the loss implementation, I'm using the one implemented here as a baseline, since it was mentioned by the original repo (in this issue: IDEA-Research/GroundingDINO#241) as a reliable source if one wants to train a GroundingDino model.

TODO:
- Make sure GroundingDinoMatcher and GroundingDinoLoss are working properly

Explanation of the Issue and Solution
So the issue was that GroundingDinoLoss and GroundingDinoHungarianMatcher were just a copy from DeformableDetr, which is used for closed-set object detection (i.e. a fixed set of categories), whereas in GroundingDino there's no limited set of categories and the output logits are d_model-dimensional, where the first seq_len elements have a specified value and the subsequent ones are nan. The main differences are:
- class_labels are associated with the text prompt used

For instance, for an image with bounding boxes for fishes and jellyfishes and the prompt "fish. jellyfish.", fish should have class_label 0 assigned to it and jellyfish should have 1. If the positions of jellyfish and fish in the prompt were swapped, the class_labels would swap as well. Moreover, jellyfish is represented by two tokens ([20919, 7529]) and fish by one token ([3869]), therefore we need to select the appropriate logits for each class, as illustrated in the sketch below.
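A hedged worked example of the mapping just described (an illustration, not the PR's build_label_maps code), using the token ids quoted in the paragraph above:

```python
import torch

# Prompt "fish. jellyfish." tokenized (per the description): [CLS]=101, fish=3869,
# "."=1012, jellyfish=[20919, 7529], "."=1012, [SEP]=102
input_ids = torch.tensor([101, 3869, 1012, 20919, 7529, 1012, 102])

# One row per class label, one column per token position:
# class_label 0 ("fish") -> position 1; class_label 1 ("jellyfish") -> positions 3 and 4
label_map = torch.zeros(2, input_ids.numel())
label_map[0, 1] = 1.0
label_map[1, 3] = 1.0
label_map[1, 4] = 1.0

# Swapping the phrases in the prompt would swap the rows, which is why class_labels
# depend on the prompt rather than on a fixed category vocabulary.
```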
As the original implementation doesn't provide the training loop or the loss implementation, but does recommend other implementations for training GroundingDino (in this issue: IDEA-Research/GroundingDINO#241), I took as a baseline the implementation from Open-GroundingDino, as it supports both visual grounding and object detection and they've trained their own GroundingDino using their code base, achieving good performance.

Things added in this PR are:
- build_label_maps, which generates a list of torch.Tensor with length batch_size, mapping each category to its corresponding tokens based on the input_ids
- build_text_mask, which just expands the attention_mask to select the appropriate tokens when computing GroundingDino.loss_labels (see the sketch after this list)
- Added enc_topk_proposals, encoder_logits and encoder_pred_boxes to GroundingDinoModelOutput and GroundingDinoObjectDetectionOutput to compute the first-stage loss
- Added class_loss_coefficient (with a correct default value) and class_loss_reduction to GroundingDinoConfig. class_loss_reduction was added because in sigmoid_focal_loss from the baseline implementation they reduce loss_ce with a simple sum, but that makes the losses imbalanced most of the time; the original implementation does have a sigmoid_focal_loss implemented, but using mean reduction, therefore I decided to make it configurable and use the sum one for testing reasons
- Updated GroundingDinoLoss and GroundingDinoHungarianMatcher
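As referenced in the build_text_mask item above, here is a hedged sketch of that kind of mask expansion (not the exact helper from the PR):

```python
import torch

def expand_text_mask(attention_mask: torch.Tensor, num_queries: int) -> torch.Tensor:
    """Expand a (batch, seq_len) attention mask to (batch, num_queries, seq_len)
    so the valid text logits can be selected for every query."""
    return attention_mask.bool().unsqueeze(1).expand(-1, num_queries, -1)

mask = torch.tensor([[1, 1, 1, 0, 0]])
print(expand_text_mask(mask, num_queries=3).shape)  # torch.Size([1, 3, 5])
```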
Also added a new integration test called test_grounding_dino_loss, where I compare the loss obtained from 2 sample images with the baseline implementation from Open-GroundingDino.
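A hedged sketch of the general shape such a loss-comparison test can take; the processor/label format follows the DETR-style convention in transformers, and the expected value below is a placeholder, not the number used in the PR.

```python
import torch

def check_loss_against_baseline(model, processor, images, annotations, text="fish. jellyfish."):
    inputs = processor(images=images, text=[text] * len(images), return_tensors="pt")
    labels = [
        {"class_labels": torch.tensor(ann["class_labels"]), "boxes": torch.tensor(ann["boxes"])}
        for ann in annotations
    ]
    with torch.no_grad():
        outputs = model(**inputs, labels=labels)
    expected_loss = torch.tensor(0.0)  # placeholder for the value from Open-GroundingDino
    assert torch.allclose(outputs.loss, expected_loss, atol=1e-3), f"loss was {outputs.loss}"
```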
c.c. @amyeroberts