Most of the time while working we dont have space on screen for app to control the background activities like music. This project is an example of use of handsigns as input for web-automation.
-
Python 3.8 or above
Python libraries:
- Pytorch
- CV2
pip install opencv-python
- Selenium
pip install selenium
-
Geckodriver (present in the repository)
-
Webcam
-
Python 3.8 or above
Python libraries:
- Pytorch
- CV2
pip install opencv-python
- Matplotlib
pip install matplotlib
Run the code using command line
A webcam window will open.
Make these handsigns in the visible region of output of webcam.
Until the open browser sign is made all instructions will be ignored.
Input image size=28x28x1
DataSet:
Train Set = 9000 Images
Dev Set = 1800 Images
Architecture used:
(FC-Fully Connected Dense Layer,F-Features,S-Strides,C-Channels)
Conv Layer - F(3x3) S(1x1) C(1,64)
BatchNorm
Max Pool - F(2x2) S(2x2)
Conv Layer - F(3x3) S(1x1) C(64,128)
BatchNorm
Max Pool - F(2x2) S(2x2)
FC(5*5*128,128)
BatchNorm
FC(128,9)
Optimizer Used: ADAM
Learning Rate(alpha)=0.01
Decay rate=(0.95^i) {i is epoch number}
Number of Epochs = 10
Mini Batch Size = 1000