Decoder #8

Open
y1131388949 opened this issue Jun 5, 2024 · 4 comments

Comments

@y1131388949

I noticed that during both training and validation, the decoder input is the ground-truth sequence to be predicted. What should the decoder input be when using the trained STPosetransformer for prediction? Your paper states that the last frame of the input sequence is copied as the decoder input; what exactly does this look like? If I want to use my own detected 3D human-body keypoints as input for prediction, what format should the decoder input have?

@mmahdavian
Owner

mmahdavian commented Jun 5, 2024

Hi @y1131388949. Each skeleton frame has 17 joints, and each joint has 3 values: x, y, z. So the encoder input needs to be 5x17x3. Copy the fifth frame 20 times to make a 20x17x3 array and feed that to the decoder.
Depending on how you train the model, you can subtract the first joint's value from the rest of the joints in every frame, so that all joint values are relative to the hip joint. You can also normalize the input and denormalize the output for better performance. These are the settings used to train the provided pre-trained model.
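The shapes described above can be sketched in NumPy as follows. This is only an illustration, not the repository's code: `build_inputs`, `denormalize`, `mean`, and `std` are hypothetical names, the normalization statistics are assumed to come from the training data, and the hip is assumed to be joint 0. Here the hip is subtracted from every joint, so the hip itself becomes zero; the repository may treat the hip differently.

```python
import numpy as np

ENC_LEN, DEC_LEN, N_JOINTS = 5, 20, 17  # sizes quoted in the reply above


def build_inputs(history, mean=0.0, std=1.0, root_relative=True):
    """Build encoder and decoder inputs from 5 observed frames.

    history: array of shape (5, 17, 3) holding x, y, z per joint.
    mean, std: hypothetical normalization statistics (assumed to be
    computed on the training set); the defaults leave the data unchanged.
    """
    history = np.asarray(history, dtype=np.float32)
    if history.shape != (ENC_LEN, N_JOINTS, 3):
        raise ValueError(f"expected {(ENC_LEN, N_JOINTS, 3)}, got {history.shape}")
    if root_relative:
        # Assumption: joint 0 is the hip. Subtract it from all joints per frame.
        history = history - history[:, :1, :]
    enc_in = (history - mean) / std
    # Copy the last (fifth) observed frame 20 times for the decoder.
    dec_in = np.repeat(enc_in[-1:], DEC_LEN, axis=0)
    return enc_in, dec_in


def denormalize(pred, mean=0.0, std=1.0):
    """Undo the normalization on the model output."""
    return np.asarray(pred) * std + mean


history = np.random.rand(ENC_LEN, N_JOINTS, 3)
enc_in, dec_in = build_inputs(history)
print(enc_in.shape, dec_in.shape)  # (5, 17, 3) (20, 17, 3)
```

A batch dimension would normally be added in front of both arrays before passing them to the model; whether the model expects flattened joints (for example 17x3 merged into 51) is not stated in this thread.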

@y1131388949
Author

Thank you very much for your answer; it was very useful. I noticed that in the H36MDataset_v3 file, the selected human-body keypoints are _MAJOR_JOINTS = [0, 1, 2, 5, 6, 7, 11, 12, 13, 14, 16, 17, 18, 24, 25, 26], which is 16 keypoints in total, not the 17 you mentioned. Which parts of the human body do these points correspond to? The keypoint diagrams I've found online don't seem to match.

@y1131388949
Author

[Image: H36M]

@mmahdavian
Owner

@y1131388949 As far as I remember, this is the skeleton structure:
[Image: fig]

If the data-loading code uses 16 joints, I guess it is dropping the hip joint.
