Hand Gesture Recognition and Application

National Action Council for Minorities in Engineering (NACME) Google Applied Machine Learning Intensive (AMLI) at the University of Arkansas

Developed by:

N'kira Brooks - New York University
Lizbet Rivera - University of Arkansas
Steve Liang - University of Arkansas

Description

This project identifies and labels 27 human hand gestures and connects them to keys on a keyboard. Originally, this model was designed to use simple hand gestures to control videos and presentations; however, it can be modified to apply the gestures to different actions.
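The link between gestures and keys can be pictured as a lookup table. The sketch below is only an illustration, not the project's actual code: the `GESTURE_TO_KEY` table, the key names, and the `key_for_gesture` helper are hypothetical, and only the gesture labels are taken from the supported-gesture list in this README.

```python
# Hypothetical mapping from recognized gesture labels to keyboard keys.
# The labels come from the "Supported Gestures" list; the key choices are
# assumptions for a video or presentation controller.
GESTURE_TO_KEY = {
    "Swiping Left": "left",
    "Swiping Right": "right",
    "Thumb Up": "volumeup",
    "Thumb Down": "volumedown",
    "Stop Sign": "space",
}


def key_for_gesture(label):
    """Return the key bound to a gesture label, or None if it is unbound."""
    return GESTURE_TO_KEY.get(label)


if __name__ == "__main__":
    for label in ("Swiping Left", "Stop Sign", "No gesture"):
        print(label, "->", key_for_gesture(label))
```

To apply the gestures to different actions, change the entries in the table; the returned key name could then be handed to whatever key-press library the application uses.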

Usage instructions

Make sure your computer has a GPU; otherwise, the project will not run successfully.
You need a webcam to record your gestures to use this model.
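A quick way to check both requirements before running the demo might look like the sketch below. It assumes PyTorch and OpenCV are the libraries in use, that the GPU is CUDA-capable, and that the webcam is device index 0; `check_hardware` is a hypothetical helper, not part of this repository.

```python
def check_hardware(camera_index=0):
    """Report whether a CUDA GPU and a webcam appear to be available.

    A value of None means the library needed for that check is not installed.
    """
    status = {"gpu": None, "webcam": None}
    try:
        import torch
        status["gpu"] = bool(torch.cuda.is_available())
    except ImportError:
        pass
    try:
        import cv2
        capture = cv2.VideoCapture(camera_index)  # assumed device index
        status["webcam"] = bool(capture.isOpened())
        capture.release()
    except ImportError:
        pass
    return status


if __name__ == "__main__":
    print(check_hardware())
```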

TSM Online Hand Gesture Recognition Demo

@inproceedings{lin2019tsm,
  title={TSM: Temporal Shift Module for Efficient Video Understanding},
  author={Lin, Ji and Gan, Chuang and Han, Song},
  booktitle={Proceedings of the IEEE International Conference on Computer Vision},
  year={2019}
}

See the [full video] of our demo on NVIDIA Jetson Nano.

[NEW!] We have updated the environment setup to use onnx-simplifier, which makes deployment easy. Thanks for the advice from @poincarelee!

Overview

We show how to deploy an online hand gesture recognition system on NVIDIA Jetson Nano. The model is based on a MobileNetV2 backbone with the Temporal Shift Module (TSM) to model temporal relationships. It is compiled with TVM [1] for acceleration.

The model can achieve real-time recognition. Without considering the data IO time, it can achieve >70 FPS on Nano GPU.
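A throughput figure that excludes data IO can be obtained by timing only the inference call on a frame that is already in memory. The following is a minimal sketch under that assumption; `measure_fps` and the dummy inference function are hypothetical stand-ins, and the numbers it prints say nothing about the Nano.

```python
import time


def measure_fps(infer, frame, warmup=5, runs=50):
    """Time only the calls to infer(frame), excluding any frame capture."""
    for _ in range(warmup):
        infer(frame)  # warm-up calls are not timed
    start = time.perf_counter()
    for _ in range(runs):
        infer(frame)
    elapsed = time.perf_counter() - start
    return runs / elapsed if elapsed > 0 else float("inf")


if __name__ == "__main__":
    dummy_frame = [0.0] * 1024

    def dummy_infer(frame):
        return sum(frame)  # placeholder for the compiled model

    print(f"{measure_fps(dummy_infer, dummy_frame):.1f} FPS")
```

On a GPU, the timed call would also need to wait for the device to finish its work before the timer stops; otherwise the measurement only covers the launch.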

[1] Tianqi Chen et al., TVM: An automated end-to-end optimizing compiler for deep learning, in OSDI 2018

Model

We used an online version of Temporal Shift Module in this demo. The model design is shown below:

After being compiled with TVM, our model can run efficiently on low-power devices.
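As a rough illustration of the online shift idea, the sketch below keeps a small buffer of channels from the previous frame and swaps it into the current frame's features, so each frame sees some information from the one before it. It is a simplified, pure-Python stand-in rather than the model's implementation: `online_shift` is a hypothetical helper, features are flattened to one value per channel, and shifting one eighth of the channels is an assumption not stated in this README.

```python
def online_shift(features, shift_buffer):
    """Replace the leading channels with those cached from the previous frame.

    features: per-channel values for the current frame.
    shift_buffer: leading channels saved from the previous frame.
    Returns the shifted features and the buffer to use for the next frame.
    """
    fold = len(shift_buffer)
    shifted = list(shift_buffer) + list(features[fold:])
    next_buffer = list(features[:fold])
    return shifted, next_buffer


if __name__ == "__main__":
    channels = 8
    fold = channels // 8      # assumed shift proportion: 1/8 of the channels
    buffer = [0.0] * fold     # no history exists before the first frame
    frames = [[float(t * 10 + c) for c in range(channels)] for t in range(1, 4)]
    for frame in frames:
        shifted, buffer = online_shift(frame, buffer)
        print(shifted)
```

Because only the buffer is carried between frames, each new frame can be processed as it arrives, which is what makes the approach suitable for a camera stream.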

Step-by-step Tutorial

We show how to set up the environment on Jetson Nano, compile the PyTorch model with TVM, and perform the online demo from camera streaming.

Get an NVIDIA Jetson Nano board (it is only $99!).
Get a micro SD card and burn the Nano system image into it following here. Insert the card and boot the Nano. Note: you may want to get a power adaptor for a stable power supply.
Check if OpenCV 4.X is installed (it is now included in the SD card image from r32.3.1):

 $ python3
 >>> import cv2
 >>> cv2.__version__

It should show 4.X. If not, build OpenCV 4.0.0 using this script so that camera access is enabled (it may take a while due to the weak CPU). You also need to add the cv2 package to the Python import search path:

export PYTHONPATH=/usr/local/python
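The version check can also be scripted instead of being read off the interpreter by eye. This is a small sketch; `is_opencv_4` is a hypothetical helper that only inspects the version string.

```python
def is_opencv_4(version):
    """Return True if a version string such as '4.1.1' has major version 4."""
    major = version.split(".")[0]
    return major.isdigit() and int(major) == 4


if __name__ == "__main__":
    try:
        import cv2
        print("OpenCV", cv2.__version__, "OK" if is_opencv_4(cv2.__version__) else "too old")
    except ImportError:
        print("cv2 is not importable; check PYTHONPATH")
```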

Follow here to install PyTorch and torchvision.
Build TVM with the following commands:

sudo apt install llvm # install llvm which is required by tvm
git clone -b v0.6 https://github.com/apache/incubator-tvm.git
cd incubator-tvm
git submodule update --init
mkdir build
cp cmake/config.cmake build/
cd build
#[
#edit config.cmake to change
# 32 line: USE_CUDA OFF -> USE_CUDA ON
#104 line: USE_LLVM OFF -> USE_LLVM ON
#]
cmake ..
make -j4
cd ..
cd python; sudo python3 setup.py install; cd ..
cd topi/python; sudo python3 setup.py install; cd ../..
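The manual edit of config.cmake can be scripted if the line numbers differ in your checkout. The sketch below assumes the options appear in the form `set(USE_CUDA OFF)` and `set(USE_LLVM OFF)`; `enable_options` is a hypothetical helper, and it works on the file's text rather than on fixed line numbers.

```python
import re


def enable_options(config_text, options=("USE_CUDA", "USE_LLVM")):
    """Turn `set(<OPTION> OFF)` into `set(<OPTION> ON)` for the given options."""
    for name in options:
        pattern = r"set\(\s*" + re.escape(name) + r"\s+OFF\s*\)"
        config_text = re.sub(pattern, f"set({name} ON)", config_text)
    return config_text


if __name__ == "__main__":
    sample = "set(USE_CUDA OFF)\nset(USE_OPENCL OFF)\nset(USE_LLVM OFF)\n"
    print(enable_options(sample))
```

To apply it, read `build/config.cmake` into a string, pass it through `enable_options`, and write the result back before running `cmake ..`.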

Install ONNX

# install onnx
sudo apt-get install protobuf-compiler libprotoc-dev
pip3 install onnx

Install onnx-simplifier

git clone https://github.com/daquexian/onnx-simplifier
cd onnx-simplifier
# remove requirement 'onnxruntime >= 1.2.0' in setup.py, as it is not actually used
pip install .
cd ..

Export the CUDA toolkit binaries to PATH

export PATH=$PATH:/usr/local/cuda/bin
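To confirm that the export took effect, one option is to look up the CUDA compiler on the current PATH. This sketch assumes the toolkit directory contains the `nvcc` executable; `cuda_compiler_on_path` is a hypothetical helper.

```python
import shutil


def cuda_compiler_on_path(executable="nvcc"):
    """Return the full path of the executable if PATH contains it, else None."""
    return shutil.which(executable)


if __name__ == "__main__":
    location = cuda_compiler_on_path()
    if location:
        print("Found CUDA compiler at", location)
    else:
        print("nvcc not found; check that /usr/local/cuda/bin is on PATH")
```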

Finally, run the demo. The first run compiles the PyTorch TSM model into a TVM binary and then runs it. Later runs directly execute the compiled TVM model.

python3 main.py
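The compile-once behavior follows a common caching pattern: build the artifact only if no saved copy exists, then reuse the saved copy on later runs. The sketch below illustrates that pattern in general terms and is not taken from main.py; `load_or_compile`, the cache file name, and the fake compile step are all hypothetical.

```python
import os
import tempfile


def load_or_compile(cache_path, compile_fn):
    """Return cached bytes if present; otherwise compile, save, and return them."""
    if os.path.exists(cache_path):
        with open(cache_path, "rb") as f:
            return f.read()
    artifact = compile_fn()
    with open(cache_path, "wb") as f:
        f.write(artifact)
    return artifact


if __name__ == "__main__":
    calls = []

    def fake_compile():
        calls.append(1)  # stands in for the slow TVM compilation
        return b"compiled-model"

    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "model.bin")  # hypothetical cache file name
        load_or_compile(path, fake_compile)    # first run: compiles and saves
        load_or_compile(path, fake_compile)    # later run: reads the cache
        print("compile calls:", len(calls))
```

If the model or the TVM version changes, deleting the cached file forces a fresh compilation on the next run.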

Press Q or Esc to quit. Press F to enter/exit full-screen.
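The key bindings can be expressed as a small dispatch function. This is a sketch under the assumption that key codes arrive as integers, for example from OpenCV's `cv2.waitKey(1) & 0xFF`; `action_for_key` and the action names are hypothetical and not taken from main.py.

```python
KEY_ESC = 27  # ASCII code for the Escape key


def action_for_key(code):
    """Map a key code to a demo action name, or None if the key is unbound."""
    if code in (KEY_ESC, ord("q"), ord("Q")):
        return "quit"
    if code in (ord("f"), ord("F")):
        return "toggle_fullscreen"
    return None


if __name__ == "__main__":
    for code in (27, ord("q"), ord("F"), ord("x")):
        print(code, "->", action_for_key(code))
```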

Supported Gestures

No gesture
Stop Sign
Drumming Fingers
Thumb Up
Thumb Down
Zooming In With Full Hand
Zooming In With Two Fingers
Zooming Out With Full Hand
Zooming Out With Two Fingers
Swiping Down
Swiping Left
Swiping Right
Swiping Up
Sliding Two Fingers Down
Sliding Two Fingers Left
Sliding Two Fingers Right
Sliding Two Fingers Up
Pulling Hand In
Pulling Two Fingers In

Contact

Because this project builds on the repository from an MIT lab, please direct any questions to the following people:

Ji Lin, [email protected]

Yaoyao Ding, [email protected]
