Explanation of output shape [1, 117] (World landmarks for pose) or [1, 195] (Pose landmarks) of pose_landmarks_detector.tflite in Mediapipe #5622
Labels
platform:python
MediaPipe Python issues
task:pose landmarker
Issues related to Pose Landmarker: Find people and body positions
type:support
General questions
I downloaded the `pose_landmarker_lite.task` file from the official MediaPipe guide for Pose Landmark Detection. In order to access its `.tflite` models, I unzipped it using `unzip pose_landmarker_lite.task` and got 2 files: `pose_detector.tflite` and `pose_landmarks_detector.tflite`.
Question 1: How do we interpret these models, and how are they used for tasks?
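The unzip step above can also be done from Python, since a `.task` bundle that `unzip` can open is an ordinary zip archive. This is a minimal sketch using only the standard library; the helper name `extract_task_bundle` and the output directory are illustrative and not part of MediaPipe.

```python
import zipfile
from pathlib import Path


def extract_task_bundle(task_path, out_dir):
    """Extract every file from a .task bundle and return the .tflite names found."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(task_path) as bundle:
        bundle.extractall(out)
        names = bundle.namelist()
    # Keep only the model files, sorted for a stable listing.
    return sorted(name for name in names if name.endswith(".tflite"))


# Usage (assumes the downloaded bundle sits in the current directory):
# extract_task_bundle("pose_landmarker_lite.task", "pose_landmarker_lite")
```

For the bundle described above, the returned list should contain the two model files named in the post.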
`pose_landmarks_detector.tflite` appears to be the one for pose detection: visualizing the structure and outputs of both models in the Netron app shows that this model has the pose detection outputs. However, I have difficulty understanding the shapes and meaning of both the "Pose landmarks" output (shape `[1, 195]`) and the "World landmarks for pose" output (shape `[1, 117]`).
Question 2: How do we interpret the shapes `[1, 195]` and `[1, 117]`?
And finally,
Question 3: How do we interpret the structure of the model, in particular how it relates to BlazePose and MobileNetV2? Also, is there any support for fine-tuning, that is, reusing the trained backbone of this model and writing a custom head?
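On Question 2, one reading that at least makes the arithmetic work is that both outputs are flat per-landmark records: 195 = 39 × 5 and 117 = 39 × 3. The sketch below only demonstrates that regrouping. The layout it assumes (39 landmarks, meaning the 33 documented pose landmarks plus 6 auxiliary ones, with `x, y, z, visibility, presence` per landmark in the first output and `x, y, z` in the second) is an assumption that should be checked against the MediaPipe graph configuration and is not confirmed in this thread. The names `split_landmarks`, `POSE_FIELDS`, and `WORLD_FIELDS` are hypothetical.

```python
# Assumed per-landmark layouts; not confirmed by the model metadata.
POSE_FIELDS = ("x", "y", "z", "visibility", "presence")
WORLD_FIELDS = ("x", "y", "z")


def split_landmarks(flat, fields):
    """Group a flat list of numbers into one dict per landmark."""
    step = len(fields)
    if len(flat) % step != 0:
        raise ValueError(f"{len(flat)} values do not divide into groups of {step}")
    return [dict(zip(fields, flat[i:i + step])) for i in range(0, len(flat), step)]


# Dummy data standing in for the real tensors, with the batch dimension dropped.
pose_flat = [0.0] * 195
world_flat = [0.0] * 117

pose = split_landmarks(pose_flat, POSE_FIELDS)
world = split_landmarks(world_flat, WORLD_FIELDS)
print(len(pose), len(world))  # 39 39
```

If the real outputs come back as NumPy arrays, `output[0].tolist()` would give the flat list this helper expects.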