Update README.md
README.md
CHANGED
@@ -10,4 +10,147 @@ pinned: false
license: apache-2.0
---

<img src="appendix/icon.jpeg" width="100" alt="Project icon">

# Facial-Expression-Anomaly-Detection

This repository contains an algorithm that detects anomalies in facial expressions over the course of a video. It treats the per-frame facial data as a time series and models it with an LSTM autoencoder.

The tool extracts faces from video frames, computes facial feature embeddings, and analyzes emotional facial expressions to identify anomalies. This is particularly useful for forensic analysis and human intelligence (HUMINT) operations.

## Practical Applications
### Forensic Analysis
- Identify suspicious behavior in surveillance footage.
- Detect stress or duress in interrogation videos.

### Human Intelligence (HUMINT)
- Analyze micro-expressions.
- Monitor and assess emotional states in communications.

## Key Features
- **Face Extraction**: Extracts faces from video frames.
- **Face Alignment**: Aligns and normalizes faces.
- **Feature Embeddings**: Extracts facial feature embeddings using the InceptionResnetV1 model pre-trained on VGGFace2.
- **Emotion Detection**: Identifies facial expressions and categorizes emotions.
- **Anomaly Detection**: Uses an LSTM autoencoder to detect anomalies in facial expressions.

<img src="appendix/diagram.svg" width="1050" alt="Pipeline diagram">

## Micro-Expressions
Paul Ekman’s work on facial expressions of emotion describes micro-expressions: brief, involuntary expressions that can reveal emotions a person is trying to conceal. They last only a fraction of a second, which makes them very difficult for humans to detect, but they can be captured and analyzed with computer vision algorithms.

## InceptionResnetV1
InceptionResnetV1 is a deep convolutional neural network widely used for face recognition and facial feature extraction.
- **Accuracy and Reliability**: The model is pre-trained on the VGGFace2 dataset, which contains millions of facial images, and achieves high accuracy in distinguishing between faces.
- **Feature Richness**: Its embeddings capture fine facial detail, which is essential for recognizing subtle expressions and variations.
- **Global Recognition**: The model is widely adopted in facial recognition applications, which supports its reliability across different scenarios.

## FER
The Facial Expression Recognition (FER) model used in our pipeline is a pre-trained neural network that identifies emotional states from facial expressions. Key details about the FER model:

- **Accuracy and Reliability**: The model is pre-trained on a large dataset of facial images labeled with emotional states and identifies seven basic emotions with high accuracy: Anger, Disgust, Fear, Happiness, Sadness, Surprise, and Neutral.
- **Robustness**: The model can recognize emotions under varying lighting conditions, facial orientations, and partial occlusions, which makes it suitable for practical applications.

## LSTM Autoencoder

An LSTM (Long Short-Term Memory) autoencoder is a neural network designed for sequential data. It consists of an encoder that compresses an input sequence into a fixed-length representation and a decoder that reconstructs the sequence from that representation.

In our facial-expression anomaly detection:

1. **Input Preparation**: Facial embeddings are extracted from video frames.
2. **Sequence Creation**: These embeddings form a chronological sequence.
3. **Training**: The LSTM autoencoder learns typical patterns in these sequences.
4. **Anomaly Detection**: Frames with high reconstruction error are flagged as unusual facial expressions, indicating potential anomalies.

This approach captures temporal dependencies and subtle changes in facial expressions, providing robust anomaly detection.
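
The encoder/decoder structure and the reconstruction-error step described above can be sketched in PyTorch as follows. This is a minimal illustration, not the repository's actual model: the class name `LSTMAutoencoder`, the helper `frame_errors`, the layer sizes, the training settings, and the random toy data are all assumptions made for the example.

```python
import torch
from torch import nn


class LSTMAutoencoder(nn.Module):
    """Hypothetical minimal LSTM autoencoder (names and sizes are assumptions)."""

    def __init__(self, n_features: int, hidden_size: int = 32):
        super().__init__()
        self.encoder = nn.LSTM(n_features, hidden_size, batch_first=True)
        self.decoder = nn.LSTM(hidden_size, hidden_size, batch_first=True)
        self.output = nn.Linear(hidden_size, n_features)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x has shape (batch, seq_len, n_features).
        _, (h_n, _) = self.encoder(x)
        # h_n[-1] is the fixed-length summary, shape (batch, hidden_size).
        # Repeat it once per time step so the decoder can rebuild the sequence.
        summary = h_n[-1].unsqueeze(1).repeat(1, x.size(1), 1)
        decoded, _ = self.decoder(summary)
        return self.output(decoded)


def frame_errors(model: nn.Module, x: torch.Tensor) -> torch.Tensor:
    """Mean squared reconstruction error per frame, shape (batch, seq_len)."""
    model.eval()
    with torch.no_grad():
        return ((model(x) - x) ** 2).mean(dim=-1)


# Toy data standing in for real embeddings: 4 sequences, 20 frames, 8 features.
torch.manual_seed(0)
sequences = torch.randn(4, 20, 8)

model = LSTMAutoencoder(n_features=8)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.MSELoss()

# A few training steps so the model learns to reconstruct the sequences.
model.train()
for _ in range(30):
    optimizer.zero_grad()
    loss = loss_fn(model(sequences), sequences)
    loss.backward()
    optimizer.step()

# Frames with the largest errors are the anomaly candidates.
errors = frame_errors(model, sequences)
print(errors.shape)
```

Repeating the encoder's final hidden state at every time step is one common way to feed a fixed-length summary to the decoder; other decoder designs are equally valid.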

We apply the LSTM autoencoder in three different ways:

1. **Using All Features**:
   - The facial feature components and the emotion scores together form the input features.
   - The LSTM autoencoder is trained to detect anomalies based on the full set of features.

2. **Using Reduced Components**:
   - We use UMAP (Uniform Manifold Approximation and Projection) to reduce the dimensionality of the facial embeddings to N components.
   - These reduced components are then used as input to the LSTM autoencoder, which detects anomalies based on the compressed feature set.

3. **Using Full-Dimensional Embeddings**:
   - The raw facial embeddings, without any dimensionality reduction, are used directly.
   - The LSTM autoencoder is trained on these high-dimensional embeddings to identify anomalies.

Each method provides a different perspective on the data, which improves our ability to detect subtle and varied anomalies in facial expressions.
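
To make the three input variants concrete, here is a small standard-library sketch of how per-frame records might be turned into three feature matrices. The record layout (`embedding`, `components`, `emotions`), the function name `build_inputs`, and the emotion key names are hypothetical. Reading "all features" as reduced components plus emotion scores is also an assumption about the description above, not something the repository confirms.

```python
from typing import Dict, List, Sequence

# Lower-cased labels for the seven emotions; the exact key names are an assumption.
EMOTIONS = ("angry", "disgust", "fear", "happy", "sad", "surprise", "neutral")


def build_inputs(frames: Sequence[dict]) -> Dict[str, List[List[float]]]:
    """Return one (frames x features) matrix per input variant."""
    all_features, components, embeddings = [], [], []
    for frame in frames:
        scores = [frame["emotions"][name] for name in EMOTIONS]
        reduced = list(frame["components"])
        all_features.append(reduced + scores)        # variant 1: components + emotions
        components.append(reduced)                   # variant 2: reduced components only
        embeddings.append(list(frame["embedding"]))  # variant 3: raw embedding
    return {
        "all_features": all_features,
        "components": components,
        "embeddings": embeddings,
    }


# Two toy frames; real embeddings would be much longer than 4 values.
frames = [
    {
        "embedding": [0.1, 0.2, 0.3, 0.4],
        "components": [0.5, -0.5],
        "emotions": {name: 0.0 for name in EMOTIONS},
    },
    {
        "embedding": [0.2, 0.1, 0.4, 0.3],
        "components": [0.4, -0.6],
        "emotions": {name: 0.1 for name in EMOTIONS},
    },
]

inputs = build_inputs(frames)
for name, matrix in inputs.items():
    print(name, len(matrix), "frames x", len(matrix[0]), "features")
```

Each matrix can then be fed to its own autoencoder, so the three anomaly scores stay comparable frame by frame.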

## An Example from a Death Sentence Verdict

<img src="appendix/wade_wilson_2.jpg" width="250" alt="Wade Wilson">

Wade Wilson, a 30-year-old from Fort Myers, Florida, was convicted in June 2024 of the October 2019 murders of Kristine Melton and Diane Ruiz in Cape Coral. During the trial, Wilson appeared cold and calm, and his "smug, soulless" demeanor drew significant attention. He showed little emotion throughout the proceedings, which many found unsettling. The jury recommended the death penalty, with final sentencing set for July 23, 2024.

<p align="left">
<img src="appendix/1.jpg" width="50" alt="alt text">
<img src="appendix/2.jpg" width="50" alt="alt text">
<img src="appendix/3.jpg" width="50" alt="alt text">
<img src="appendix/4.jpg" width="50" alt="alt text">
<img src="appendix/5.jpg" width="50" alt="alt text">
<img src="appendix/6.jpg" width="50" alt="alt text">
</p>

Sources:
[1] https://www.foxnews.com/us/florida-double-murderer-viral-smug-soulless-courtroom-demeanor
[2] https://winknews.com/2024/06/13/wade-wilsons-lack-emotion-double-murder-trial/
[3] https://www.youtube.com/watch?v=8j8psgKXmRg

### Detected Anomalies (Facial Features)
<p align="left">
<img src="appendix/anomaly_scores_all_features_plot.png" width="250" alt="Anomaly scores: all features">
<img src="appendix/anomaly_scores_components_plot.png" width="250" alt="Anomaly scores: reduced components">
<img src="appendix/anomaly_scores_embeddings_plot.png" width="250" alt="Anomaly scores: embeddings">
</p>

### Detected Anomalies (Emotions)
<p align="left">
<img src="appendix/angry_scores_plot.png" width="250" alt="Angry scores">
<img src="appendix/fear_scores_plot.png" width="250" alt="Fear scores">
<img src="appendix/sad_scores_plot.png" width="250" alt="Sad scores">
</p>

### Results and Observations
The anomaly detection flagged significant anomalies primarily at the points in Wade Wilson's trial where the death penalty was discussed (we set the number of anomalies to 10). Although Wilson appeared cold and detached to human observers, the LSTM autoencoder detected subtle emotional leaks in his facial expressions. These results suggest that critical moments, such as mentions of the death penalty, had a marked impact on Wilson, reflected in anomalous changes in his facial expressions.

## Setup Parameters
- `NUM_ANOMALIES`
- `DESIRED_FPS`
- `NUM_COMPONENTS`
- `batch_size`
- `VIDEO_FILE_PATH`
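
The README lists these parameters without describing them, so the sketch below only illustrates one plausible reading: `DESIRED_FPS` controls how densely frames are sampled, and `NUM_ANOMALIES` is the number of highest-scoring frames reported. The `Settings` class, the helpers `frame_step` and `top_anomalies`, the file path, and every default value are hypothetical, except `NUM_ANOMALIES = 10`, which matches the example above.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Settings:
    """Hypothetical container for the setup parameters; defaults are placeholders."""

    VIDEO_FILE_PATH: str
    NUM_ANOMALIES: int = 10   # value used in the example above
    DESIRED_FPS: int = 20     # assumed: target sampling rate in frames per second
    NUM_COMPONENTS: int = 16  # assumed: number of UMAP components
    batch_size: int = 16      # assumed: training batch size


def frame_step(native_fps: float, desired_fps: float) -> int:
    """Keep every n-th frame so the sampled rate approximates desired_fps."""
    return max(1, round(native_fps / desired_fps))


def top_anomalies(scores: Sequence[float], n: int) -> List[int]:
    """Indices of the n highest scores, returned in chronological order."""
    ranked = sorted(range(len(scores)), key=scores.__getitem__, reverse=True)
    return sorted(ranked[:n])


settings = Settings(VIDEO_FILE_PATH="input.mp4")  # placeholder path
print(frame_step(30.0, settings.DESIRED_FPS))
print(top_anomalies([0.1, 0.9, 0.3, 0.8], 2))
```

Returning the selected indices in chronological order keeps the flagged frames easy to map back onto the video timeline.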

## Output
- Faces organized by detected person in the `organized_faces` folder.
- Anomaly detection results as a CSV file in the project directory.

### Face Extraction and Alignment
The algorithm extracts faces from the video, then aligns and normalizes them using MediaPipe and MTCNN.

### Feature Embedding Extraction
Uses the InceptionResnetV1 model to extract facial features and FER to detect emotions.

### Clustering and Outlier Detection
Clusters faces to organize them by person.

### LSTM Autoencoder for Anomaly Detection
Trains an LSTM autoencoder to identify anomalies in facial expressions over time. The model captures temporal dependencies and irregularities in the sequence of facial expressions and feature embeddings.

### Dependencies
- `torch`
- `facenet-pytorch`
- `mediapipe`
- `fer`
- `scikit-learn`
- `umap-learn`
- `tqdm`
- `opencv-python`
- `scipy`
- `pandas`

## Conclusion
This tool detects emotional anomalies in video-based facial expressions, which is useful for both forensic analysis and HUMINT operations. By combining computer vision techniques with LSTM autoencoders, it offers insight into human behavior over the course of a video.