Form Trainer: A Computer Vision Assistant for Physical Therapy

Anthony Tafoya, Derek Xu, Leon Jia

UC Berkeley, Presented to the Berkeley EECS & AI Research Symposium

ExerciseOutput.mp4

[paper work is based on]

Abstract

Our project tackles the question of extending previous work done in the UC Berkeley paper Everybody Dance Now to poses such as yoga and exercise regimes. The paper Everybody Dance Now presents a simple method for "do as I do" motion transfer, transferring a source dancer’s movements onto an amateur target. This is accomplished via a video-to-video translation by using pose as an intermediate representation. We identified a specific niche in the space of pose correction in yoga poses and weight training. This is done through two steps: video-to-pose and pose-to-video. Video-to-pose utilizes computer vision and the pose detection library OpenPose. Pose-to-video utilizes GANs to generate a realistic-looking model of our target subject. The bulk of the project was ideating and finding new ways to extend this technology to be more applicable for industry and consumer-facing applications. The broader ambition was to develop a model that can be used within a fitness app, where users can upload a video of themselves performing certain movements, and generalize it to yoga poses and weight training exercises. One possible extension is to add in some sort of corrective feature, taking into account how the user is currently performing their action and showing how it compares relative to how the action is ideally performed, and giving specific feedback to improve upon.

Prerequisites

PyTorch
Python Library Dominate

pip install dominate

Clone the repository

git clone https://github.com/carolineec/EverybodyDanceNow

We ran our code on a 12GB NVIDIA GPU (hosted by Machine Learning @ Berkeley)

Training

Global Stage

The training stage for the project is similar to pix2pixHD. We first train a "global" stage model at 512x256 resolution

# train a model at 512x256 resolution
python train_fullts.py \
--name MY_MODEL_NAME_global \
--dataroot MY_TRAINING_DATASET \
--checkpoints_dir WHERE_TO_SAVE_CHECKPOINTS \
--loadSize 512 \
--no_instance \
--no_flip \
--tf_log \
--label_nc 6

Local Stage

Followed by a "local" stage model with 1024x512 resolution.

# train a model at 1024x512 resolution
python train_fullts.py \
--name MY_MODEL_NAME_local \
--dataroot MY_TRAINING_DATASET \
--checkpoints_dir WHERE_TO_SAVE_CHECKPOINTS \
--load_pretrain MY_MODEL_NAME_global \
--netG local \
--ngf 32 \
--num_D 3 \
--resize_or_crop none \
--no_instance \
--no_flip \
--tf_log \
--label_nc 6

Face GAN stage

We then can apply another stage with a separate GAN focused on the face region.

# train a model specialized to the face region
python train_fullts.py \
--name MY_MODEL_NAME_face \
--dataroot MY_TRAINING_DATASET \
--load_pretrain MY_MODEL_NAME_local \
--checkpoints_dir WHERE_TO_SAVE_CHECKPOINTS \
--face_discrim \
--face_generator \
--faceGtype global \
--niter_fix_main 10 \
--netG local \
--ngf 32 \
--num_D 3 \
--resize_or_crop none \
--no_instance \
--no_flip \
--tf_log \
--label_nc 6

Testing

The full checkpoint will be loaded from --checkpoints_dir/--name (i.e. if flags: "--name foo \ ... --checkpoints_dir bar "" are included, checkpoints will be loaded from foo/bar) Replace --howmany flag with an upper bound on how many test examples you have

Global Stage

# test model at 512x256 resolution
python test_fullts.py \
--name MY_MODEL_NAME_global \
--dataroot MY_TEST_DATASET \
--checkpoints_dir CHECKPOINT_FILE_LOCATION \
--results_dir WHERE_TO_SAVE_RESULTS \
--loadSize 512 \
--no_instance \
--how_many 10000 \
--label_nc 6

Local Stage

# test model at 1024x512 resolution
python test_fullts.py \
--name MY_MODEL_NAME_local \
--dataroot MY_TEST_DATASET \
--checkpoints_dir CHECKPOINT_FILE_LOCATION \
--results_dir WHERE_TO_SAVE_RESULTS \
--netG local \
--ngf 32 \
--resize_or_crop none \
--no_instance \
--how_many 10000 \
--label_nc 6

Face GAN stage

# test model at 1024x512 resolution with face GAN
python test_fullts.py \
--name MY_MODEL_NAME_face \
--dataroot MY_TEST_DATASET \
--checkpoints_dir CHECKPOINT_FILE_LOCATION \
--results_dir WHERE_TO_SAVE_RESULTS \
--face_generator \
--faceGtype global \
--netG local \
--ngf 32 \
--resize_or_crop none \
--no_instance \
--how_many 10000 \
--label_nc 6

Dataset preparation

We also provide code for creating both training and testing datasets (including global pose normalization) in the data_prep folder. See the sample_data folder for examples on how to prepare the code for training. Note the original_img is not necessary at test time and is provided only for reference.

Our dataset preparation code is based on output formats from OpenPose and currently supports the COCO, BODY_23, and BODY_25 pose output format as well as hand and face keypoints. To install and run OpenPose please follow the directions at the OpenPose repository.

graph_train.py

will prepare a train dataset with subfolders

train_label (contains 1024x512 inputs)
train_img (contains 1024x512 targets)
train_facetexts128 (contains face 128x128 bounding box coordinates in .txt files) No smoothing

python graph_train.py \
--keypoints_dir /data/scratch/caroline/keypoints/jason_keys \
--frames_dir /data/scratch/caroline/frames/jason_frames \
--save_dir /data/scratch/caroline/savefolder \
--spread 4000 25631 1 \
--facetexts

graph_avesmooth.py

will prepare a dataset with averaged smoothed keypoints with subfolders (usually for validation)

test_label (contains 1024x512 inputs)
test_img (contains 1024x512 targets)
test_factexts128 (contains face 128x128 bounding box coordinates in .txt files)

python graph_avesmooth.py \
--keypoints_dir /data/scratch/caroline/keypoints/wholedance_keys \
--frames_dir /data/scratch/caroline/frames/wholedance \
--save_dir /data/scratch/caroline/savefolder \
--spread 500 29999 4 \
--facetexts

graph_posenorm.py

will prepare a dataset with global pose normalization + median smoothing

test_label (contains 1024x512 inputs)
test_img (contains 1024x512 targets)
test_factexts128 (contains face 128x128 bounding box coordinates in .txt files)

python graph_posenorm.py \
--target_keypoints /data/scratch/caroline/keypoints/wholedance_keys \
--source_keypoints /data/scratch/caroline/keypoints/dubstep_keypointsFOOT \
--target_shape 1080 1920 3 \
--source_shape 1080 1920 3 \
--source_frames /data/scratch/caroline/frames/dubstep_frames \
--results /data/scratch/caroline/savefolder \
--target_spread 30003 178780 \
--source_spread 200 4800 \
--calculate_scale_translation
--facetexts

Acknowledgements

Model code adapted from pix2pixHD and pytorch-CycleGAN-and-pix2pix

Data Preparation code adapted from Realtime_Multi-Person_Pose_Estimation

Data Preparation code based on outputs from OpenPose

Name		Name	Last commit message	Last commit date
Latest commit History 4 Commits
data		data
data_prep		data_prep
models		models
options		options
sample_data		sample_data
util		util
LICENSE.txt		LICENSE.txt
README.md		README.md
_config.yml		_config.yml
eMERGE_MIMICIV.ipynb		eMERGE_MIMICIV.ipynb
test_fullts.py		test_fullts.py
train_fullts.py		train_fullts.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Form Trainer: A Computer Vision Assistant for Physical Therapy

Anthony Tafoya, Derek Xu, Leon Jia

UC Berkeley, Presented to the Berkeley EECS & AI Research Symposium

[paper work is based on]

Abstract

Prerequisites

Training

Global Stage

Local Stage

Face GAN stage

Testing

Global Stage

Local Stage

Face GAN stage

Dataset preparation

graph_train.py

graph_avesmooth.py

graph_posenorm.py

Acknowledgements

About

Releases

Packages

Languages

License

Anthony-Tafoya/Form_Trainer

Folders and files

Latest commit

History

Repository files navigation

Form Trainer: A Computer Vision Assistant for Physical Therapy

Anthony Tafoya, Derek Xu, Leon Jia

UC Berkeley, Presented to the Berkeley EECS & AI Research Symposium

[paper work is based on]

Abstract

Prerequisites

Training

Global Stage

Local Stage

Face GAN stage

Testing

Global Stage

Local Stage

Face GAN stage

Dataset preparation

graph_train.py

graph_avesmooth.py

graph_posenorm.py

Acknowledgements

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages