PoseFix

[Paper] | [Code]
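- PoseFix is not a pose estimator; it is a model-agnostic pose refinement model.
- Earlier pose refinement models were two-stage, so they depended on the pose estimator, and successful refinement required careful design.
- The authors' goal is a model that refines poses successfully regardless of the pose estimator's architecture.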

Synthesized training data was generated from pose estimation error statistics, described in terms of OKS (Object Keypoint Similarity), KS (Keypoint Similarity), and the error types Jitter, Inversion, Swap, and Miss. The pose refinement model was trained on this data.
OKS
- Object Keypoint Similarity, a metric defined for the COCO dataset
- Measures the similarity between the predicted keypoints and the ground-truth (GT) keypoints
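A minimal sketch of how OKS can be computed, assuming the COCO convention where the per-keypoint similarity is `exp(-d^2 / (2 * s^2 * k^2))`, with `s^2` the object area and `k = 2 * sigma`. The function names `keypoint_similarity` and `oks` are illustrative and do not come from the PoseFix code.

```python
import math

# Per-keypoint sigmas for the 17 COCO keypoints (COCO keypoint order).
COCO_SIGMAS = [0.026, 0.025, 0.025, 0.035, 0.035, 0.079, 0.079, 0.072, 0.072,
               0.062, 0.062, 0.107, 0.107, 0.087, 0.087, 0.089, 0.089]


def keypoint_similarity(pred, gt, area, sigma):
    """KS of a single keypoint: exp(-d^2 / (2 * area * (2 * sigma)^2))."""
    dx, dy = pred[0] - gt[0], pred[1] - gt[1]
    k = 2.0 * sigma
    return math.exp(-(dx * dx + dy * dy) / (2.0 * area * k * k))


def oks(pred, gt, visibility, area, sigmas=COCO_SIGMAS):
    """Average KS over the keypoints that are labeled (visibility > 0)."""
    scores = [
        keypoint_similarity(p, g, area, s)
        for p, g, v, s in zip(pred, gt, visibility, sigmas)
        if v > 0
    ]
    # With no labeled keypoints there is nothing to compare; return 0.0 here.
    return sum(scores) / len(scores) if scores else 0.0
```

A perfect prediction gives an OKS of 1, and the score decays toward 0 as the predicted keypoints move away from the GT, faster for small objects and for keypoints with a small sigma.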
Keypoint Similarity
Jitter: the estimate is within the proximity of the GT keypoint but outside the human error margin
Inversion: the estimate lies on the wrong body part
Swap: the estimate lies on a different person
Miss: the estimate is not within the proximity of the GT keypoint
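The four error types can be sketched as a small decision rule over KS values. This is an illustration under stated assumptions: the thresholds `low=0.5` (proximity) and `high=0.85` (human error margin) are assumed values, the check order between inversion and swap is a choice made here, and `classify_error` is a hypothetical helper, not part of PoseFix.

```python
def classify_error(ks_target, ks_same_person, ks_other_person,
                   low=0.5, high=0.85):
    """Label one predicted keypoint.

    ks_target:       KS against the correct GT joint of the correct person
    ks_same_person:  best KS against the other joints of the same person
    ks_other_person: best KS against joints of other people
    low, high:       assumed thresholds (proximity / human error margin)
    """
    if ks_target >= high:
        return "good"       # inside the human error margin
    if ks_target >= low:
        return "jitter"     # near the GT joint, but not near enough
    if ks_same_person >= low:
        return "inversion"  # near a wrong body part of the same person
    if ks_other_person >= low:
        return "swap"       # near a joint of a different person
    return "miss"           # not near any joint
```

For example, `classify_error(0.6, 0.0, 0.0)` is labeled as jitter, while `classify_error(0.1, 0.1, 0.1)` is labeled as a miss.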
[Ref] The synthesized data was generated by following the distributions reported in that paper:
- frequency of each pose error (Jitter, Inversion, Swap, Miss) according to the joint type
- the number of visible keypoints
- overlap in the input image
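A rough sketch of how one synthesized keypoint could be drawn from such error frequencies. Everything numeric here is a placeholder: `ERROR_FREQ`, the pixel radii `jitter_r` and `miss_r`, and the `mirror` mapping (joint index to its left/right counterpart) are hypothetical, and the real values would come from the referenced statistics.

```python
import math
import random

# Hypothetical error frequencies; NOT the values from the referenced paper.
ERROR_FREQ = {"good": 0.4, "jitter": 0.3, "inversion": 0.1,
              "swap": 0.1, "miss": 0.1}


def offset(point, r_min, r_max, rng):
    """Move a point by a random distance in [r_min, r_max], random direction."""
    angle = rng.uniform(0.0, 2.0 * math.pi)
    r = rng.uniform(r_min, r_max)
    return (point[0] + r * math.cos(angle), point[1] + r * math.sin(angle))


def synthesize_keypoint(j, gt_pose, other_pose, mirror, rng,
                        jitter_r=(2.0, 8.0), miss_r=(30.0, 60.0)):
    """Return (error_type, point) for joint j of a GT pose."""
    kinds = list(ERROR_FREQ)
    kind = rng.choices(kinds, weights=[ERROR_FREQ[k] for k in kinds])[0]
    if kind == "jitter":
        return kind, offset(gt_pose[j], jitter_r[0], jitter_r[1], rng)
    if kind == "inversion":
        return kind, gt_pose[mirror[j]]   # wrong body part, same person
    if kind == "swap":
        return kind, other_pose[j]        # same joint, different person
    if kind == "miss":
        return kind, offset(gt_pose[j], miss_r[0], miss_r[1], rng)
    return kind, gt_pose[j]               # "good": keep the GT location
```

In a fuller version the sampling weights would depend on the joint type, the number of visible keypoints, and the overlap with other people, and the displacement radii would scale with the person size rather than being fixed in pixels.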

FPS → minimum 27 samples/s, maximum 30 samples/s
- Test environment
  - CPU: Intel(R) Core(TM) i7-10700 CPU @ 2.90GHz
  - GPU: GeForce RTX 2070 SUPER
  - Memory: 16GB
  - Image size: 384x288x3
  - Batch: 16
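This inference speed is reached only when a sufficiently large batch is available.
- PoseFix takes the original model's output coordinates as input, and these must first be mapped to heatmaps; this mapping step is the bottleneck.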
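The coordinate-to-heatmap mapping can be sketched as follows. This is a pure-Python illustration, not the PoseFix implementation: the Gaussian form, the default `sigma`, and the function names are assumptions. It uses the fact that a 2-D Gaussian is separable, so each keypoint needs only `height + width` exponentials instead of `height * width`.

```python
import math


def render_heatmap(x, y, height, width, sigma=2.0):
    """Return a height x width Gaussian heatmap (list of rows) centered on (x, y)."""
    two_var = 2.0 * sigma * sigma
    # exp(-(dx^2 + dy^2) / (2 s^2)) factors into a column profile times a row profile.
    gx = [math.exp(-((c - x) ** 2) / two_var) for c in range(width)]
    gy = [math.exp(-((r - y) ** 2) / two_var) for r in range(height)]
    return [[row_val * col_val for col_val in gx] for row_val in gy]


def keypoints_to_heatmaps(keypoints, height, width, sigma=2.0):
    """Map a list of (x, y) coordinates to one heatmap per keypoint."""
    return [render_heatmap(x, y, height, width, sigma) for x, y in keypoints]
```

For a 384x288 input with a batch of 16, this step runs once per keypoint per person, so in practice it would likely be vectorized with an array library or moved to the GPU instead of looping in Python.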

    python3 train.py --train_batch --test_batch --flip_test

    python3 test.py --checkpoint --test_batch --flip_test --video_path --detection_json
