Pose estimation slightly off #1795
Hi @justinecorsilia,

Thanks for the question! Looking at the user-labeled Instance/annotation and the PredictedInstance, I definitely see that quite a few body part locations were predicted a few pixels off.

This could be due to using too low of an input scaling. Input scaling is nice because it allows us to save on computational cost, but when predicted locations are scaled back to full resolution on the original image, we lose information about precisely which pixel the body part is located in. Also, since the legs of your fly are only about 1 pixel wide, any decrease in resolution might accidentally wipe out the legs entirely!

My suggestion would be to use a top-down pipeline, which actually uses two models: centroid and centered instance. Both models should have the same augmentation settings of ±180° rotation, just to provide more examples at different orientations.

Centroid
The centroid model will find just a single point and take a centered crop around that point, passing the cropped image to the centered instance model.

Anchor Point
For the centroid model, we will need to specify an anchor point. The anchor point should be a body part location that is consistently visible. We will be training the centroid model to find and predict just the single anchor-point body part, so it is important that the model is able to find it. Usually the anchor point is also somewhere near the center of our animal, but this is not imperative. I would suggest using the thorax (which might cause trouble for us if the animal flips over often, but we can always try different anchor points later).

Input Scaling
The input scaling should always be determined by how large the smallest body part is in terms of pixels. After multiplying by the input scaling, we still want the smallest body part to be represented by at least 2 pixels (we only need 1 pixel; 2 is for safety). Since the centroid model only needs to find a single body part, this calculation should be simple. If we are using the thorax, which is a relatively large body part compared to your 1-pixel legs, then we can safely set the input scaling to 0.5 (or even to 0.25 if you're feeling frisky). A small worked example of this calculation follows below.

Centered Instance
The centered instance model will take in the cropped image and find all body parts at full resolution. The input scaling here is fixed at 1.0 (the model does not support any other input scaling). We already save on memory with the top-down pipeline by finding the centered crops at low resolution and then using just the crops for the rest of the pipeline.

If you are still having trouble, would you mind also attaching a screenshot of your Training Pipeline GUI with all the settings filled in (for both models, please, if you decide to use top-down)? Even if you aren't having trouble, we'd be happy to hear if things went smoothly.

Thanks,
Liezl
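[Editor's note] A minimal Python sketch of the input-scaling rule of thumb above (the smallest body part should stay at least 2 px wide after downscaling). The helper name and the example part sizes (thorax roughly 6 px, legs roughly 1 px) are illustrative assumptions, not measurements from this dataset:

```python
# Rule of thumb from the comment above: after multiplying by the input
# scaling, the smallest body part the model must resolve should still be
# at least 2 px wide (1 px is the bare minimum; 2 px adds a safety margin).

def lowest_safe_input_scaling(smallest_part_px: float, min_px: float = 2.0) -> float:
    """Smallest input scaling that keeps `smallest_part_px` >= `min_px` after scaling."""
    return min(1.0, min_px / smallest_part_px)

# Centroid model: it only has to resolve the anchor part. Assuming a thorax
# about 6 px across (a made-up example value), anything down to ~0.33 is safe,
# so 0.5 is comfortable and 0.25 is already past the safety margin.
print(lowest_safe_input_scaling(smallest_part_px=6.0))  # ~0.33

# Centered-instance model: it must resolve the ~1 px legs, so no downscaling
# is safe and the input scaling stays at 1.0.
print(lowest_safe_input_scaling(smallest_part_px=1.0))  # 1.0
```

The same check can be repeated for whichever anchor part is ultimately chosen.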
Hi Liezl,
Thank you so much for your response, I really appreciate it. I believe my current input scaling settings are as you suggested. Would you be willing to take a look at my current Training Pipeline and let me know if there is anything obvious that I might want to change? Thank you again for your help!
Sincerely,
Justine
[Attached screenshots: Image 6-6-24 at 4.43 PM.jpeg, Image 6-6-24 at 4.44 PM.jpeg, Image 6-6-24 at 4.44 PM (1).jpeg]
@justinecorsilia you mentioned tracks in your first message. Just to clarify, you will get the tracks after you get pose predictions on each frame.
Please try adjusting the Rotation Min and Max Angle to -180 and +180, respectively, for both models.
Also, was there a reason your Batch Size is 1 for the centered instance model?
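[Editor's note] For reference, a rough sketch of where these settings would land in a SLEAP training profile. The field names follow my reading of the baseline .json profiles that ship with SLEAP and may differ in your version, so compare against a baseline profile before using; "thorax", the 0.5 scaling, and the batch size of 4 are just the example choices discussed in this thread, not values read from this project:

```python
import json

# Sketch of the relevant fields in a centroid training profile, based on my
# reading of SLEAP's baseline .json profiles; verify the exact field names
# against the profiles bundled with your SLEAP version.
centroid_profile_overrides = {
    "data": {
        "preprocessing": {"input_scaling": 0.5},            # coarse is fine for the anchor part
        "instance_cropping": {"center_on_part": "thorax"},  # example anchor part from the thread
    },
    "model": {
        "heads": {"centroid": {"anchor_part": "thorax"}},
    },
    "optimization": {
        "batch_size": 4,  # arbitrary example; the point is simply > 1
        "augmentation_config": {
            "rotate": True,
            "rotation_min_angle": -180.0,
            "rotation_max_angle": 180.0,
        },
    },
}

# The centered-instance profile would use the same rotation settings and batch
# size, but keep input_scaling at 1.0 so the ~1 px legs survive downscaling.
print(json.dumps(centroid_profile_overrides, indent=2))
```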
Lastly, you should go through your labels and make sure they are correct. I see that your Proboscis is marked as invisible. Is that intentional?
Thanks,
Elizabeth