Metadata-Version: 2.1
Name: hand_gesture_recognizer_2DCNN
Version: 1.4.2
Summary: A library for recognizing hand gestures using 2D CNN
Author: Umesh Singh Verma
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Requires-Python: >=3.6
Description-Content-Type: text/markdown
Requires-Dist: opencv-python
Requires-Dist: mediapipe
Requires-Dist: numpy

# hand-gesture-recognition 2D CNN

This is a sample program that recognizes hand signs and finger gestures with a simple 2D CNN using the detected key points.

![mqlrf-s6x16](https://user-images.githubusercontent.com/37477845/102222442-c452cd00-3f26-11eb-93ec-c387c98231be.gif)

This repository contains the following:

- Sample program
- Hand sign recognition model (TFLite)
- Finger gesture recognition model (TFLite)
- Training data and training notebook for hand sign recognition
- Training data and training notebook for finger gesture recognition

# Requirements

- mediapipe 0.8.1
- OpenCV 3.4.2 or later
- TensorFlow 2.3.0 or later<br>tf-nightly 2.5.0.dev or later (only when creating a TFLite file for an LSTM model)
- scikit-learn 0.23.2 or later (only if you want to display the confusion matrix)
- matplotlib 3.3.2 or later (only if you want to display the confusion matrix)

# Demo

Here's how to install the project.

```bash
pip install hand-gesture-recognizer
```

The following options can be specified when running the demo.

- \--device<br>Camera device number (default: 0)
- \--width<br>Width of the camera capture (default: 960)
- \--height<br>Height of the camera capture (default: 540)
- \--use_static_image_mode<br>Whether to use the static_image_mode option for MediaPipe inference (default: unspecified)
- \--min_detection_confidence<br>Detection confidence threshold (default: 0.5)
- \--min_tracking_confidence<br>Tracking confidence threshold (default: 0.5)
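
The options above could be parsed with `argparse` roughly as follows. This is a minimal sketch, not the package's actual code: the helper name `get_args` and the choice to treat `--use_static_image_mode` as an on/off flag are assumptions; only the option names and default values come from the list above.

```python
import argparse


def get_args(argv=None):
    """Parse the demo options (hypothetical helper; defaults taken from the list above)."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--device", type=int, default=0)
    parser.add_argument("--width", type=int, default=960)
    parser.add_argument("--height", type=int, default=540)
    # Assumption: a plain on/off flag that is False when not specified.
    parser.add_argument("--use_static_image_mode", action="store_true")
    parser.add_argument("--min_detection_confidence", type=float, default=0.5)
    parser.add_argument("--min_tracking_confidence", type=float, default=0.5)
    return parser.parse_args(argv)


# Passing an explicit list keeps the example independent of the real command line.
args = get_args(["--width", "1280"])
print(args.device, args.width, args.height)  # 0 1280 540
```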

### Code

```python
import keyboard
from hand_gesture_recognizer import GestureRecognizer

# Map each gesture to a key name (held down while the gesture is visible)
# or to a callable (run once when the gesture appears)
gesture_actions = {
    "fist": "down",
    "open_palm": "up",
    "two_fingers": "left",  # Left arrow key
    "three_fingers": "right",  # Right arrow key
}

# Track the current state of each gesture (active or not)
active_gestures = set()


def handle_gesture_state(gesture_name, state):
    """
    Handle the state of gestures and map to keyboard or mouse actions.
    :param gesture_name: Name of the gesture (e.g., 'fist', 'open_palm', 'two_fingers').
    :param state: State of the gesture ('appear', 'disappear').
    """
    action = gesture_actions.get(gesture_name)
    if not action:
        return

    if state == "appear" and gesture_name not in active_gestures:
        # Gesture appeared, perform the action
        if callable(action):
            action()  # Execute the action if it's a callable (e.g., mouse click)
        else:
            keyboard.press(action)
        active_gestures.add(gesture_name)
    elif state == "disappear" and gesture_name in active_gestures:
        # Gesture disappeared, release the key if it's a keyboard action
        if isinstance(action, str):  # Only release if it's a keyboard key
            keyboard.release(action)
        active_gestures.remove(gesture_name)


# Gesture state handlers
def handle_fist(state):
    handle_gesture_state("fist", state)


def handle_open_palm(state):
    handle_gesture_state("open_palm", state)


def handle_two_fingers(state):
    handle_gesture_state("two_fingers", state)


def handle_three_fingers(state):
    handle_gesture_state("three_fingers", state)


# Initialize the recognizer
recognizer = GestureRecognizer()

# Register gestures and their handlers
recognizer.register_gesture("fist", handle_fist)
recognizer.register_gesture("open_palm", handle_open_palm)
recognizer.register_gesture("two_fingers", handle_two_fingers)
recognizer.register_gesture("three_fingers", handle_three_fingers)

# Run the recognizer
recognizer.run()
```
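
The appear/disappear handling in the example above can be tried without a camera or the `keyboard` package by swapping in a recording stand-in. The sketch below is only an illustration of that state logic: `FakeKeyboard` and `make_handler` are hypothetical names and are not part of the library.

```python
class FakeKeyboard:
    """Hypothetical stand-in for the `keyboard` module that records calls."""

    def __init__(self):
        self.events = []

    def press(self, key):
        self.events.append(("press", key))

    def release(self, key):
        self.events.append(("release", key))


def make_handler(gesture_name, key, backend, active_gestures):
    """Build a handler that presses `key` on 'appear' and releases it on 'disappear'."""

    def handler(state):
        if state == "appear" and gesture_name not in active_gestures:
            backend.press(key)
            active_gestures.add(gesture_name)
        elif state == "disappear" and gesture_name in active_gestures:
            backend.release(key)
            active_gestures.remove(gesture_name)

    return handler


backend = FakeKeyboard()
active = set()
handle_fist = make_handler("fist", "down", backend, active)

# Repeated states are ignored, so the key is pressed and released only once.
for state in ["appear", "appear", "disappear", "disappear"]:
    handle_fist(state)

print(backend.events)  # [('press', 'down'), ('release', 'down')]
```

Tracking active gestures in a set is what prevents a key from being pressed again on every frame in which the gesture stays visible.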

# Directory

<pre>
│  app.py
│  keypoint_classification.ipynb
│  point_history_classification.ipynb
│
├─model
│  ├─keypoint_classifier
│  │  │  keypoint.csv
│  │  │  keypoint_classifier.hdf5
│  │  │  keypoint_classifier.py
│  │  │  keypoint_classifier.tflite
│  │  └─ keypoint_classifier_label.csv
│  │
│  └─point_history_classifier
│      │  point_history.csv
│      │  point_history_classifier.hdf5
│      │  point_history_classifier.py
│      │  point_history_classifier.tflite
│      └─ point_history_classifier_label.csv
│
└─utils
    └─cvfpscalc.py
</pre>

### keypoint_classification.ipynb

This is a model training script for hand sign recognition.

### point_history_classification.ipynb

This is a model training script for finger gesture recognition.

### model/keypoint_classifier

This directory stores files related to hand sign recognition.<br>
The following files are stored.

- Training data (keypoint.csv)
- Trained model (keypoint_classifier.tflite)
- Label data (keypoint_classifier_label.csv)
- Inference module (keypoint_classifier.py)
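
As a rough illustration of how the label data relates to a model's output, the sketch below maps the highest-scoring class to its label. It assumes the label CSV holds one label per row and that the row index equals the class ID; `load_labels` and `top_label` are hypothetical helpers, and the label texts are illustrative rather than copied from the repository.

```python
import csv
import io


def load_labels(csv_file):
    """Read one label per row; the row index is assumed to be the class ID."""
    return [row[0] for row in csv.reader(csv_file) if row]


def top_label(scores, labels):
    """Return (class_id, label) for the highest score."""
    class_id = max(range(len(scores)), key=scores.__getitem__)
    return class_id, labels[class_id]


# In-memory stand-in for keypoint_classifier_label.csv (contents are illustrative).
label_file = io.StringIO("Open\nClose\nPointer\n")
labels = load_labels(label_file)

print(top_label([0.1, 0.2, 0.7], labels))  # (2, 'Pointer')
```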

### model/point_history_classifier

This directory stores files related to finger gesture recognition.<br>
The following files are stored.

- Training data (point_history.csv)
- Trained model (point_history_classifier.tflite)
- Label data (point_history_classifier_label.csv)
- Inference module (point_history_classifier.py)

### utils/cvfpscalc.py

This is a module for FPS measurement.
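
One common way to measure FPS is to average the last few frame intervals. The following is a minimal sketch of that idea and not necessarily what `cvfpscalc.py` does: the class name `FpsCounter`, the buffer length, and the use of `time.perf_counter` are all assumptions.

```python
import time
from collections import deque


class FpsCounter:
    """Hypothetical FPS meter: averages the last `buffer_len` frame intervals."""

    def __init__(self, buffer_len=10, clock=time.perf_counter):
        self._clock = clock
        self._last = clock()
        self._intervals = deque(maxlen=buffer_len)

    def tick(self):
        """Call once per frame; returns the smoothed frames-per-second value."""
        now = self._clock()
        self._intervals.append(now - self._last)
        self._last = now
        mean_interval = sum(self._intervals) / len(self._intervals)
        return 1.0 / mean_interval if mean_interval > 0 else 0.0


# Deterministic demo with a fake clock that advances 0.05 s per call.
fake_time = iter(i * 0.05 for i in range(100))
fps = FpsCounter(clock=lambda: next(fake_time))
for _ in range(5):
    value = fps.tick()
print(round(value, 1))  # 20.0
```

In a real capture loop, `tick()` would be called once per processed frame with the default clock.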

# Training

For both hand sign recognition and finger gesture recognition, you can add or change training data and retrain the model.

### Hand sign recognition training

#### 1.Learning data collection

Press "k" to enter the mode for saving key points (displayed as "MODE:Logging Key Point").

<img src="https://user-images.githubusercontent.com/37477845/102235423-aa6cb680-3f35-11eb-8ebd-5d823e211447.jpg" width="60%">

If you press "0" to "9", the key points are appended to "model/keypoint_classifier/keypoint.csv" as shown below.<br>
1st column: pressed number (used as class ID); 2nd and subsequent columns: key point coordinates

<img src="https://user-images.githubusercontent.com/37477845/102345725-28d26280-3fe1-11eb-9eeb-8c938e3f625b.png" width="80%">

The key point coordinates are stored after the following preprocessing, up to step ④.

<img src="https://user-images.githubusercontent.com/37477845/102242918-ed328c80-3f3d-11eb-907c-61ba05678d54.png" width="80%">
<img src="https://user-images.githubusercontent.com/37477845/102244114-418a3c00-3f3f-11eb-8eef-f658e5aa2d0d.png" width="80%">

In the initial state, three types of training data are included: open hand (class ID: 0), closed hand (class ID: 1), and pointing (class ID: 2).<br>
If necessary, add data for class ID 3 or later, or delete existing rows from the CSV, to prepare the training data.

<img src="https://user-images.githubusercontent.com/37477845/102348846-d0519400-3fe5-11eb-8789-2e7daec65751.jpg" width="25%"> <img src="https://user-images.githubusercontent.com/37477845/102348855-d2b3ee00-3fe5-11eb-9c6d-b8924092a6d8.jpg" width="25%"> <img src="https://user-images.githubusercontent.com/37477845/102348861-d3e51b00-3fe5-11eb-8b07-adc08a48a760.jpg" width="25%">
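
The row layout described above (class ID first, then the key point coordinates) can be sketched as follows. The preprocessing steps are shown only in the images, so the normalization used here (coordinates relative to the first key point, then divided by the largest absolute value) is an assumption, and `preprocess_landmarks` and `append_row` are hypothetical helper names.

```python
import csv
import io


def preprocess_landmarks(points):
    """Assumed preprocessing: make (x, y) points relative to the first one,
    flatten them, and scale so the largest absolute value is 1."""
    base_x, base_y = points[0]
    flat = []
    for x, y in points:
        flat.extend([x - base_x, y - base_y])
    max_abs = max(abs(v) for v in flat) or 1  # avoid dividing by zero
    return [v / max_abs for v in flat]


def append_row(csv_file, class_id, features):
    """Write one training row: class ID first, then the feature values."""
    if not 0 <= class_id <= 9:
        raise ValueError("class ID must match a key from 0 to 9")
    csv.writer(csv_file).writerow([class_id, *features])


# Three illustrative pixel coordinates; a real hand has more key points.
points = [(100, 200), (120, 180), (140, 160)]
out = io.StringIO()  # stand-in for model/keypoint_classifier/keypoint.csv
append_row(out, 2, preprocess_landmarks(points))
print(out.getvalue().strip())  # 2,0.0,0.0,0.5,-0.5,1.0,-1.0
```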

#### 2.Model training

Open "[keypoint_classification.ipynb](keypoint_classification.ipynb)" in Jupyter Notebook and execute it from top to bottom.<br>
To change the number of training data classes, change the value of "NUM_CLASSES = 3" and update the labels in "model/keypoint_classifier/keypoint_classifier_label.csv" to match.
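
Before retraining, it can help to check that `NUM_CLASSES`, the label file, and the class IDs in the training data agree. The sketch below is one way to do that, assuming one label per row and class IDs running from 0 to `NUM_CLASSES - 1`; `check_dataset` is a hypothetical helper, and the in-memory CSV contents are illustrative.

```python
import csv
import io

NUM_CLASSES = 3


def check_dataset(training_csv, label_csv, num_classes):
    """Return a list of problems found; an empty list means the files agree."""
    problems = []
    labels = [row for row in csv.reader(label_csv) if row]
    if len(labels) != num_classes:
        problems.append(f"expected {num_classes} labels, found {len(labels)}")
    class_ids = {int(row[0]) for row in csv.reader(training_csv) if row}
    out_of_range = sorted(i for i in class_ids if not 0 <= i < num_classes)
    if out_of_range:
        problems.append(f"class IDs out of range: {out_of_range}")
    return problems


# In-memory stand-ins for keypoint.csv and keypoint_classifier_label.csv.
training = io.StringIO("0,0.0,0.1\n1,0.2,0.3\n3,0.4,0.5\n")
labels = io.StringIO("Open\nClose\nPointer\n")
print(check_dataset(training, labels, NUM_CLASSES))
# ['class IDs out of range: [3]']
```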

#### X.Model structure

The model prepared in "[keypoint_classification.ipynb](keypoint_classification.ipynb)" has the following structure.

<img src="https://user-images.githubusercontent.com/37477845/102246723-69c76a00-3f42-11eb-8a4b-7c6b032b7e71.png" width="50%">

### Finger gesture recognition training

#### 1.Learning data collection

Press "h" to enter the mode for saving the history of fingertip coordinates (displayed as "MODE:Logging Point History").

<img src="https://user-images.githubusercontent.com/37477845/102249074-4d78fc80-3f45-11eb-9c1b-3eb975798871.jpg" width="60%">

If you press "0" to "9", the coordinate history is appended to "model/point_history_classifier/point_history.csv" as shown below.<br>
1st column: pressed number (used as class ID); 2nd and subsequent columns: coordinate history

<img src="https://user-images.githubusercontent.com/37477845/102345850-54ede380-3fe1-11eb-8d04-88e351445898.png" width="80%">

The coordinates are stored after the following preprocessing, up to step ④.

<img src="https://user-images.githubusercontent.com/37477845/102244148-49e27700-3f3f-11eb-82e2-fc7de42b30fc.png" width="80%">

In the initial state, four types of training data are included: stationary (class ID: 0), clockwise (class ID: 1), counterclockwise (class ID: 2), and moving (class ID: 3).<br>
If necessary, add data for class ID 4 or later, or delete existing rows from the CSV, to prepare the training data.

<img src="https://user-images.githubusercontent.com/37477845/102350939-02b0c080-3fe9-11eb-94d8-54a3decdeebc.jpg" width="20%"> <img src="https://user-images.githubusercontent.com/37477845/102350945-05131a80-3fe9-11eb-904c-a1ec573a5c7d.jpg" width="20%"> <img src="https://user-images.githubusercontent.com/37477845/102350951-06444780-3fe9-11eb-98cc-91e352edc23c.jpg" width="20%"> <img src="https://user-images.githubusercontent.com/37477845/102350942-047a8400-3fe9-11eb-9103-dbf383e67bf5.jpg" width="20%">
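
A fixed-length history of fingertip coordinates can be kept in a `deque` and flattened into one training row, with the class ID in the first column as described above. This is only a sketch: the history length of 16, the scaling by image size, and the helper name `history_to_features` are assumptions and are not taken from this document.

```python
from collections import deque

HISTORY_LENGTH = 16  # assumed buffer size, not taken from this document


def history_to_features(history, image_width, image_height):
    """Assumed preprocessing: offsets from the oldest point, scaled by image size."""
    base_x, base_y = history[0]
    features = []
    for x, y in history:
        features.append((x - base_x) / image_width)
        features.append((y - base_y) / image_height)
    return features


history = deque(maxlen=HISTORY_LENGTH)
for frame in range(20):  # simulate a fingertip moving right 10 px per frame
    history.append((100 + 10 * frame, 270))

row = [1, *history_to_features(history, 960, 540)]  # class ID first
print(len(row))  # 33
print(history[0])  # (140, 270)
```

Because the `deque` has a maximum length, the oldest point is dropped automatically once the buffer is full.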

#### 2.Model training

Open "[point_history_classification.ipynb](point_history_classification.ipynb)" in Jupyter Notebook and execute it from top to bottom.<br>
To change the number of training data classes, change the value of "NUM_CLASSES = 4" and update the labels in "model/point_history_classifier/point_history_classifier_label.csv" to match.

#### X.Model structure

The model prepared in "[point_history_classification.ipynb](point_history_classification.ipynb)" has the following structure.

<img src="https://user-images.githubusercontent.com/37477845/102246771-7481ff00-3f42-11eb-8ddf-9e3cc30c5816.png" width="50%">

The model using "LSTM" is as follows.<br>
To use it, change "use_lstm = False" to "True" (tf-nightly required as of 2020/12/16).

<img src="https://user-images.githubusercontent.com/37477845/102246817-8368b180-3f42-11eb-9851-23a7b12467aa.png" width="60%">

# Reference

- Dynamic gesture recognition based on 2D convolutional neural network and feature fusion
- Fine-Grained Gesture Control for Mobile Devices in Driving Environments

# Contributors

- Umesh Singh Verma
- Ankit Yadav
- Manan Patel
- Sukrit Malpani
- Siddhant Mukund
