---
license: mit
pipeline_tag: object-detection
---

# Eye and Eyebrow Movement Recognition Model

![License](https://img.shields.io/badge/license-MIT-blue.svg)
![Python](https://img.shields.io/badge/python-3.9%2B-blue.svg)
![TensorFlow](https://img.shields.io/badge/tensorflow-2.8.0%2B-brightgreen.svg)

## 📖 Table of Contents

- [📚 Description](#-description)
- [🔍 Features](#-features)
- [🎯 Intended Use](#-intended-use)
- [🧠 Model Architecture](#-model-architecture)
- [📋 Training Data](#-training-data)
- [📈 Evaluation](#-evaluation)
- [💻 Usage](#-usage)
  - [Prerequisites](#prerequisites)
  - [Installation](#installation)
  - [Loading the Model](#loading-the-model)
  - [Making Predictions](#making-predictions)
- [🔧 Limitations](#-limitations)
- [⚖️ Ethical Considerations](#-ethical-considerations)
- [📜 License](#-license)
- [🙏 Acknowledgements](#-acknowledgements)

## 📚 Description

The **Eye and Eyebrow Movement Recognition** model is a real-time system for detecting and classifying subtle facial movements around the eyes and eyebrows. It is currently trained to recognize three movements:

- **Yes:** Characterized by raising the eyebrows.
- **No:** Indicated by lowering the eyebrows.
- **Normal:** A neutral facial expression without significant eye or eyebrow movement.

The model uses a **CNN-LSTM** (Convolutional Neural Network - Long Short-Term Memory) architecture, which captures both spatial features from individual frames and temporal dynamics across sequences of frames, giving it robust performance in real-world scenarios.

## 🔍 Features

- **Real-Time Detection:** Continuously processes a live webcam feed and detects eye and eyebrow movements without noticeable lag.
- **GPU Acceleration:** Optimized for GPU usage via TensorFlow-Metal on macOS for efficient computation.
- **Extensible Design:** Currently supports the "Yes," "No," and "Normal" movements, and is designed to be extended with additional facial gestures.
- **User-Friendly Interface:** Overlays predictions directly onto the live video feed for immediate visual feedback.
- **High Accuracy:** Reliably distinguishes between the supported movements (see [Evaluation](#-evaluation)).

## 🎯 Intended Use

This model is ideal for a variety of applications, including but not limited to:

- **Human-Computer Interaction (HCI):** Enhancing user interfaces with gesture-based controls.
- **Assistive Technologies:** Providing non-verbal communication tools for individuals with speech impairments.
- **Behavioral Analysis:** Monitoring and analyzing facial expressions for psychological or market research.
- **Gaming:** Creating more immersive and responsive gaming experiences through facial gesture controls.

**Note:** The model is intended for research and educational purposes. Ensure compliance with privacy and ethical guidelines when deploying it in real-world applications.

## 🧠 Model Architecture

The model employs a **CNN-LSTM** architecture to capture both spatial and temporal features (a minimal sketch follows this list):

1. **TimeDistributed CNN Layers:**
   - **Conv2D:** Extracts spatial features from each frame independently.
   - **MaxPooling2D:** Reduces spatial dimensions.
   - **BatchNormalization:** Stabilizes and accelerates training.

2. **Flatten Layer:**
   - Flattens the output of the CNN layers to prepare it for LSTM processing.

3. **LSTM Layer:**
   - Captures temporal dependencies across the sequence of frames.

4. **Dense Layers:**
   - Fully connected layers that perform the final classification based on the combined spatial-temporal features.

5. **Output Layer:**
   - **Softmax Activation:** Provides a probability distribution over the three classes ("Yes," "No," "Normal").

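Below is a minimal Keras sketch of this kind of architecture. It is illustrative only: the filter counts, LSTM width, 30-frame sequence length, and 64×256×1 input shape are assumptions taken from the usage example later in this README, not the exact published configuration.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

# Assumed input: sequences of 30 grayscale eye/eyebrow crops of size 64x256x1
max_seq_length, height, width, channels = 30, 64, 256, 1

model = models.Sequential([
    # Spatial feature extraction, applied to every frame in the sequence
    layers.TimeDistributed(
        layers.Conv2D(32, (3, 3), activation='relu', padding='same'),
        input_shape=(max_seq_length, height, width, channels)),
    layers.TimeDistributed(layers.MaxPooling2D((2, 2))),
    layers.TimeDistributed(layers.BatchNormalization()),
    # Flatten each frame's feature map into a vector for the LSTM
    layers.TimeDistributed(layers.Flatten()),
    # Temporal modelling across the frame sequence
    layers.LSTM(64),
    # Classification head
    layers.Dense(64, activation='relu'),
    layers.Dense(3, activation='softmax'),  # "Yes", "No", "Normal"
])

model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
```
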
## 📋 Training Data

The model was trained on a curated dataset of short video clips (1-2 seconds) capturing the three target movements:

- **Yes:** 50 samples
- **No:** 50 samples
- **Normal:** 50 samples

Each video was recorded using a standard webcam under varied lighting conditions and backgrounds to ensure robustness. The videos were manually labeled and organized into respective directories for preprocessing.

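As an illustration of how such a dataset could be organized and loaded, the sketch below assumes one directory per class containing the raw clips (`data/yes`, `data/no`, `data/normal`); the directory names and frame-reading logic are assumptions, not the exact preprocessing pipeline used for training.

```python
import os
import cv2

DATA_DIR = 'data'                 # assumed layout: data/yes, data/no, data/normal
LABELS = ['yes', 'no', 'normal']
MAX_SEQ_LENGTH = 30               # frames kept per clip

def load_clip(path, max_frames=MAX_SEQ_LENGTH):
    """Read up to max_frames grayscale frames from a video file."""
    cap = cv2.VideoCapture(path)
    frames = []
    while len(frames) < max_frames:
        ret, frame = cap.read()
        if not ret:
            break
        frames.append(cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY))
    cap.release()
    return frames

dataset = [(load_clip(os.path.join(DATA_DIR, label, name)), label)
           for label in LABELS
           for name in sorted(os.listdir(os.path.join(DATA_DIR, label)))]
```
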
## 📈 Evaluation

The model was evaluated on a separate test set of 60 samples per class. The evaluation metrics are as follows:

- **Accuracy:** 85%
- **Precision:** 84%
- **Recall:** 86%
- **F1-Score:** 85%

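To reproduce these metrics on your own test split, a typical approach looks like the sketch below (scikit-learn based; `X_test`, `y_true`, and the label order are placeholders you must adapt to your own data):

```python
import numpy as np
from sklearn.metrics import accuracy_score, classification_report

# X_test: preprocessed test sequences, y_true: integer class labels
y_pred = np.argmax(model.predict(X_test), axis=1)

print("Accuracy:", accuracy_score(y_true, y_pred))
# target_names must match the label encoding used during training
print(classification_report(y_true, y_pred, target_names=["Yes", "No", "Normal"]))
```
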
## 💻 Usage

### Prerequisites

- **Hardware:** Mac with Apple Silicon (M1, M1 Pro, M1 Max, M2, etc.) for Metal GPU support.
- **Operating System:** macOS 12.3 (Monterey) or newer.
- **Python:** Version 3.9 or higher.

### Installation

1. **Clone the Repository**

   ```bash
   git clone https://huggingface.co/your-username/eye-eyebrow-movement-recognition
   cd eye-eyebrow-movement-recognition
   ```

2. **Install Homebrew (if not already installed)**

   Homebrew is a package manager for macOS that simplifies the installation of software.

   ```bash
   /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
   ```

3. **Install Micromamba**

   Micromamba is a lightweight package manager compatible with Conda environments.

   ```bash
   brew install micromamba
   ```

4. **Create and Activate a Virtual Environment**

   We'll use Micromamba to create an isolated environment for the project.

   ```bash
   # Create a new environment named 'eye_movement' with Python 3.9
   micromamba create -n eye_movement python=3.9

   # Activate the environment
   micromamba activate eye_movement
   ```

5. **Install Required Libraries**

   We'll install TensorFlow with Metal support (`tensorflow-macos` and `tensorflow-metal`) along with the other required libraries. (A quick GPU check follows these installation steps.)

   ```bash
   # Install TensorFlow for macOS
   pip install tensorflow-macos

   # Install the TensorFlow Metal plugin for GPU acceleration
   pip install tensorflow-metal

   # Install other dependencies
   pip install opencv-python dlib imutils tqdm scikit-learn matplotlib seaborn h5py
   ```

   > **Note:** Installing `dlib` can sometimes be challenging on macOS. If you encounter issues, consider installing it via Conda or refer to [dlib's official installation instructions](http://dlib.net/compile.html).

6. **Download Dlib's Pre-trained Shape Predictor**

   This model is essential for facial landmark detection.

   ```bash
   # Navigate to your project directory
   cd /path/to/your/project/eye-eyebrow-movement-recognition/

   # Download the shape predictor
   curl -LO http://dlib.net/files/shape_predictor_68_face_landmarks.dat.bz2

   # Decompress the file
   bunzip2 shape_predictor_68_face_landmarks.dat.bz2
   ```

   Ensure that the `shape_predictor_68_face_landmarks.dat` file is in the same directory as your scripts.

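After completing these steps, you can optionally verify that the `tensorflow-metal` plugin is active and that TensorFlow sees the GPU (a quick sanity check, not part of the original setup):

```python
import tensorflow as tf

# Should list at least one GPU device when tensorflow-metal is installed correctly
print(tf.config.list_physical_devices('GPU'))
```
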
### Loading the Model

```python
import tensorflow as tf

# Load the trained model
model = tf.keras.models.load_model('final_model_sequences.keras')
```

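Once the model is loaded, it can help to confirm its expected input shape before wiring up the webcam loop (the prediction example below assumes 30-frame sequences of 64×256×1 crops):

```python
# Inspect the architecture and the expected input shape
model.summary()
print(model.input_shape)  # e.g. (None, 30, 64, 256, 1) for 30-frame sequences
```
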
### Making Predictions

```python
import cv2
import numpy as np
import dlib
from imutils import face_utils
from collections import deque
import queue
import threading

# Initialize dlib's face detector and landmark predictor
detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor('shape_predictor_68_face_landmarks.dat')

# Initialize queues for threading
input_queue = queue.Queue()
output_queue = queue.Queue()

# Define sequence length
max_seq_length = 30

# Map class indices to labels (assumed order -- adjust to match your training labels)
index_to_text = {0: "Normal", 1: "No", 2: "Yes"}

def prediction_worker(model, input_q, output_q):
    while True:
        sequence = input_q.get()
        if sequence is None:
            break
        # Preprocess and predict
        # [Add your prediction logic here]
        # Example:
        prediction = model.predict(sequence)
        class_idx = np.argmax(prediction)
        confidence = np.max(prediction)
        output_q.put((class_idx, confidence))

# Start the prediction thread (assumes `model` was loaded as shown above)
thread = threading.Thread(target=prediction_worker, args=(model, input_queue, output_queue))
thread.start()

# Start video capture
cap = cv2.VideoCapture(0)
frame_buffer = deque(maxlen=max_seq_length)

while True:
    ret, frame = cap.read()
    if not ret:
        break

    # Preprocess frame
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    rects = detector(gray, 1)
    if len(rects) > 0:
        rect = rects[0]
        shape = predictor(gray, rect)
        shape = face_utils.shape_to_np(shape)
        # Extract ROIs and preprocess
        # [Add your ROI extraction and preprocessing here]
        # Example:
        preprocessed_frame = preprocess_frame(frame, detector, predictor)
        frame_buffer.append(preprocessed_frame)
    else:
        # No face detected: append a blank frame of the expected shape
        frame_buffer.append(np.zeros((64, 256, 1), dtype='float32'))

    # If the buffer is full, send the sequence for prediction
    if len(frame_buffer) == max_seq_length:
        sequence = np.array(frame_buffer)
        input_queue.put(np.expand_dims(sequence, axis=0))
        frame_buffer.clear()

    # Check for prediction results
    try:
        while True:
            class_idx, confidence = output_queue.get_nowait()
            movement = index_to_text.get(class_idx, "Unknown")
            text = f"{movement} ({confidence*100:.2f}%)"
            cv2.putText(frame, text, (30, 30), cv2.FONT_HERSHEY_SIMPLEX,
                        0.8, (0, 255, 0), 2, cv2.LINE_AA)
    except queue.Empty:
        pass

    # Display the frame
    cv2.imshow('Real-time Movement Prediction', frame)

    # Exit on 'q' key
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# Cleanup
cap.release()
cv2.destroyAllWindows()
input_queue.put(None)
thread.join()
```

**Note:** Replace the placeholder comments (and the assumed `index_to_text` ordering) with your actual preprocessing and prediction logic as implemented in your scripts. A hypothetical `preprocess_frame` sketch is shown below for reference.

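The following is a minimal sketch of what a `preprocess_frame` helper could look like. It is an illustration only, assuming the model expects 64×256 single-channel crops of the eye/eyebrow region (consistent with the zero-filled placeholder frames above); landmarks 17-26 and 36-47 cover the eyebrows and eyes in dlib's 68-point model.

```python
def preprocess_frame(frame, detector, predictor, size=(256, 64)):
    """Hypothetical helper: crop the eye/eyebrow region and normalize it."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    rects = detector(gray, 1)
    if len(rects) == 0:
        # No face found: return a blank frame of the expected shape
        return np.zeros((size[1], size[0], 1), dtype='float32')

    shape = face_utils.shape_to_np(predictor(gray, rects[0]))
    # Eyebrows are landmarks 17-26, eyes are 36-47 in the 68-point model
    points = np.vstack([shape[17:27], shape[36:48]]).astype(np.int32)
    x, y, w, h = cv2.boundingRect(points)

    # Add a small margin around the region and clip to the frame bounds
    pad = 10
    x0, y0 = max(x - pad, 0), max(y - pad, 0)
    x1, y1 = min(x + w + pad, gray.shape[1]), min(y + h + pad, gray.shape[0])

    roi = cv2.resize(gray[y0:y1, x0:x1], size)   # cv2.resize takes (width, height)
    roi = roi.astype('float32') / 255.0          # scale pixel values to [0, 1]
    return np.expand_dims(roi, axis=-1)          # final shape: (64, 256, 1)
```
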
## 🔧 Limitations

- **Movement Scope:** Currently, the model is limited to recognizing the "Yes," "No," and "Normal" movements. Extending it to additional movements would require further data collection and training.
- **Environmental Constraints:** The model performs best under good lighting conditions and with a clear, frontal view of the face. Variations in lighting, occlusions, or extreme angles may affect accuracy.
- **Single Face Assumption:** The system is designed to handle a single face in the frame. Multiple faces may lead to unpredictable behavior.

## ⚖️ Ethical Considerations

- **Privacy:** Ensure that users are aware of and consent to the use of their facial data. Handle all captured data responsibly and in compliance with relevant privacy laws and regulations.
- **Bias:** The model's performance may vary across demographics. It's essential to train the model on a diverse dataset to minimize biases related to age, gender, ethnicity, and other factors.
- **Misuse:** Like all facial recognition technologies, there's potential for misuse. Implement safeguards to prevent unauthorized or unethical applications of the model.

## 📜 License

This project is licensed under the [MIT License](LICENSE).

## 🙏 Acknowledgements

- [TensorFlow](https://www.tensorflow.org/)
- [OpenCV](https://opencv.org/)
- [dlib](http://dlib.net/)
- [imutils](https://github.com/jrosebr1/imutils)
- [Hugging Face](https://huggingface.co/)
- [Metal Performance Shaders (MPS)](https://developer.apple.com/documentation/metalperformanceshaders)
- [Micromamba](https://mamba.readthedocs.io/en/latest/micromamba.html)

---

**Feel free to reach out or contribute to enhance the capabilities of this model!**