Question about two attention modules #26
Comments
Hi! Thanks for the message, I will try to answer the three questions. The first thing I want to clarify is that the Attention Module is one of the contributions of the paper, but ChAM is not a contribution of ours, it is just applied in the method. Because of this, I suggest you take a look at the original ChAM paper, which is really nicely explained.

Q1: The aim of ChAM, as explained in the original paper, is to compute self-attention over the channel dimension. We use it in order to attend more to specific channels from the Semantic Branch. Since the features from the Semantic Branch depend on the semantic segmentation input tensor, our idea is that ChAM will help us attend more to specific objects (channels).

Q2: You can use as many ChAM modules as you want. The whole design of the proposed architecture is based on the Residual Network construction, so we use the space between ResNet Basic Blocks to introduce them. If I remember correctly, the original authors did the same thing, but again, it is a matter of design and you can use them wherever you want.

Q3: The "Attention Module" and ChAM are both attention mechanisms, but their aims are totally different. As explained before, ChAM aims to enhance the focus on specific channels (objects, in our case) in the Semantic Branch. The Attention Module, however, aims to force the RGB Branch network to focus on the specific areas indicated by the final Semantic Branch feature tensor. With this process, we try to focus the RGB Branch's attention on specific objects from the image, the ones learned by the Semantic Branch.
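To make the two mechanisms concrete, here is a minimal PyTorch sketch of the ideas described above. It assumes a CBAM-style channel attention for ChAM and a simple sigmoid gating for the Attention Module; the class names, the 1x1 projection, and the reduction ratio are illustrative assumptions, not the repository's actual code.

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """CBAM-style channel attention (ChAM sketch): squeeze the spatial
    dimensions with average- and max-pooling, run both through a shared
    MLP, and rescale every channel with a sigmoid gate. The output keeps
    the exact shape of the input."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )

    def forward(self, x):                        # x: (B, C, H, W)
        b, c, _, _ = x.shape
        avg = self.mlp(x.mean(dim=(2, 3)))       # squeeze via average pooling
        mx = self.mlp(x.amax(dim=(2, 3)))        # squeeze via max pooling
        gate = torch.sigmoid(avg + mx).view(b, c, 1, 1)
        return x * gate                          # channels re-weighted, same shape


class RGBSemanticAttention(nn.Module):
    """Illustrative 'Attention Module': turn the final Semantic Branch
    feature tensor into a gate that modulates the RGB Branch features, so
    the RGB Branch focuses on the areas/objects highlighted by the
    Semantic Branch. The exact formulation in the paper may differ."""
    def __init__(self, channels=512):
        super().__init__()
        self.proj = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, rgb_feat, sem_feat):       # both (B, 512, 7, 7)
        attn = torch.sigmoid(self.proj(sem_feat))
        return rgb_feat * attn                   # attended RGB features


# quick shape check
rgb = torch.randn(2, 512, 7, 7)
sem = torch.randn(2, 512, 7, 7)
print(ChannelAttention(512)(sem).shape)           # torch.Size([2, 512, 7, 7])
print(RGBSemanticAttention(512)(rgb, sem).shape)  # torch.Size([2, 512, 7, 7])
```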
Thank you for your quick and detailed reply! About Q2, I notice that the output of the RGB Branch is 512x7x7 and that the 3 ChAM modules change the input of the Semantic Branch from 128x28x28 to 256x14x14 to 512x7x7. Looking forward to your reply.
Actually, it is the other way around. We started with the RGB Branch as a common ResNet-18 architecture. The Semantic Segmentation Branch is built to match the exact same feature sizes as the ones obtained in the RGB Branch. In fact, the Semantic Segmentation Branch is a ResNet-like architecture that only includes the layers that perform the downsampling in size. The ChAM module does not change the size of any tensor; it just computes an attention feature tensor and applies it. The layers reducing the size are the convolutional layers of the Semantic Branch.
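As a sanity check of that size bookkeeping, here is a small hedged sketch: a compact shape-preserving channel-attention stand-in for ChAM, plus stride-2 convolutional stages that do the actual downsampling from 128x28x28 to 256x14x14 to 512x7x7, so the Semantic Branch ends at the same 512x7x7 size as the RGB ResNet-18 branch. The layer choices are illustrative, not the repository's.

```python
import torch
import torch.nn as nn

class ChAM(nn.Module):
    """Compact channel-attention stand-in; it never changes tensor shape."""
    def __init__(self, c, r=16):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(c, c // r), nn.ReLU(), nn.Linear(c // r, c))

    def forward(self, x):
        gate = torch.sigmoid(self.mlp(x.mean(dim=(2, 3)))).view(x.size(0), x.size(1), 1, 1)
        return x * gate

def downsample_stage(in_ch, out_ch):
    """Stride-2 conv: halves H and W, doubles the channels. This is what
    actually changes the sizes, not ChAM."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, stride=2, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )

x = torch.randn(1, 128, 28, 28)          # Semantic Branch feature map
x = ChAM(128)(x)                         # still (1, 128, 28, 28)
x = downsample_stage(128, 256)(x)        # -> (1, 256, 14, 14)
x = ChAM(256)(x)                         # still (1, 256, 14, 14)
x = downsample_stage(256, 512)(x)        # -> (1, 512, 7, 7)
x = ChAM(512)(x)                         # still (1, 512, 7, 7)
print(x.shape)                           # matches the RGB ResNet-18 output size
```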
Thank you so much!
There are two attention modules used in SASceneNet: one is the chain-connected 3xChAM and the other is the "Attention Module".
Q1: What happens when features pass through the 3xChAM (do they concentrate on several specific channels that are strongly related to the scene)?
Q2: Why do we need 3 ChAM modules and not fewer or more (is it because 3 modules make the features concentrate more on the decisive features that help determine the scene)?
Q3: Why do we need the "Attention Module", and what is the functional difference between it and ChAM (is it like one judges "what" and the other judges "where", as in CBAM)?
I very much look forward to your reply.