Update README.md
README.md
CHANGED
@@ -1,50 +1,54 @@
|
|
1 |
-
---
|
2 |
-
license: mit
|
3 |
-
---
|
4 |
<html>
|
5 |
|
6 |
# Gender Prediction from Names using Neural Network
|
7 |
|
8 |
### **Project Overview**
|
|
|
9 |
This project uses a **Neural Network** model trained on **TF-IDF vectors** of names to predict the gender (Male/Female). The model is deployed using **Streamlit**, making it easy to interact and predict the gender from a user-inputted name.
|
10 |
|
11 |
---
|
12 |
|
13 |
## **Application Screenshot**
|
14 |
-
|
15 |
-
|
16 |
|
17 |
## **How It Works (End-to-End)**
|
18 |
|
19 |
### 1. **Data Preparation**
|
20 |
-
|
21 |
-
|
22 |
-
|
23 |
-
|
|
|
24 |
|
25 |
### 2. **Feature Extraction (TF-IDF Vectorization)**
|
26 |
-
|
27 |
-
|
|
|
28 |
|
29 |
### 3. **Model Training**
|
30 |
-
|
31 |
-
|
32 |
-
|
33 |
-
|
34 |
-
|
|
|
35 |
|
36 |
### 4. **Saving the Model and Vectorizer**
|
37 |
-
|
38 |
-
|
|
|
39 |
|
40 |
### 5. **Streamlit Application**
|
41 |
-
|
42 |
-
|
43 |
-
|
|
|
44 |
|
45 |
---
|
46 |
|
47 |
## **Project File Structure**
|
|
|
48 |
```
|
49 |
.
|
50 |
├── TrainImprove.py # Training script for the model
|
@@ -60,28 +64,34 @@ This project uses a **Neural Network** model trained on **TF-IDF vectors** of na
|
|
60 |
## **How to Run the Project**
|
61 |
|
62 |
### 1. **Clone the Repository**
|
|
|
63 |
```bash
|
64 |
$ git clone <repository-url>
|
65 |
$ cd <repository-folder>
|
66 |
```
|
67 |
|
68 |
### 2. **Install Dependencies**
|
|
|
69 |
```bash
|
70 |
$ pip install -r requirements.txt
|
71 |
```
|
72 |
|
73 |
### 3. **Train the Model (Optional)**
|
|
|
74 |
If you want to retrain the model, run the training script:
|
|
|
75 |
```bash
|
76 |
$ python TrainImprove.py
|
77 |
```
|
78 |
|
79 |
### 4. **Run the Streamlit Application**
|
|
|
80 |
```bash
|
81 |
$ streamlit run final.py
|
82 |
```
|
83 |
|
84 |
### 5. **Access the App**
|
|
|
85 |
Open your browser and go to: [http://localhost:8501](http://localhost:8501)
|
86 |
|
87 |
---
|
@@ -89,6 +99,7 @@ Open your browser and go to: [http://localhost:8501](http://localhost:8501)
|
|
89 |
## **How the Code Works**
|
90 |
|
91 |
### **Training (TrainImprove.py)**
|
|
|
92 |
1. **Data Loading:** Reads the dataset from `gender.xlsx`.
|
93 |
2. **Preprocessing:** Converts names to TF-IDF vectors.
|
94 |
3. **Model Building:** Defines a neural network with regularization.
|
@@ -96,6 +107,7 @@ Open your browser and go to: [http://localhost:8501](http://localhost:8501)
|
|
96 |
5. **Saving Artifacts:** Stores the trained model (`.h5`) and vectorizer (`.joblib`).
|
97 |
|
98 |
### **Application (final.py)**
|
|
|
99 |
1. **Load Model and Vectorizer:** Loads the pre-trained model and TF-IDF vectorizer.
|
100 |
2. **User Input:** Accepts a name input from the user.
|
101 |
3. **Prediction:** Transforms the name using TF-IDF and makes a prediction.
|
@@ -104,6 +116,7 @@ Open your browser and go to: [http://localhost:8501](http://localhost:8501)
|
|
104 |
---
|
105 |
|
106 |
## **Dependencies**
|
|
|
107 |
- Python 3.x
|
108 |
- TensorFlow
|
109 |
- Scikit-learn
|
@@ -111,6 +124,7 @@ Open your browser and go to: [http://localhost:8501](http://localhost:8501)
|
|
111 |
- Streamlit
|
112 |
|
113 |
Install them using:
|
|
|
114 |
```bash
|
115 |
$ pip install tensorflow scikit-learn pandas streamlit joblib
|
116 |
```
|
@@ -118,6 +132,7 @@ $ pip install tensorflow scikit-learn pandas streamlit joblib
|
|
118 |
---
|
119 |
|
120 |
## **Future Enhancements**
|
|
|
121 |
- Improve the UI design.
|
122 |
- Include more diverse datasets for better generalization.
|
123 |
- Add confidence scores for predictions.
|
@@ -126,14 +141,13 @@ $ pip install tensorflow scikit-learn pandas streamlit joblib
|
|
126 |
---
|
127 |
|
128 |
## **Contributing**
|
|
|
129 |
Feel free to fork the project and submit a pull request for improvements.
|
130 |
|
131 |
---
|
132 |
|
133 |
## **License**
|
134 |
-
This project is licensed under the MIT License.
|
135 |
-
|
136 |
-
|
137 |
|
|
|
138 |
|
139 |
</html>
|
|
|
|
|
|
|
|
|
1 |
<html>
|
2 |
|
3 |
# Gender Prediction from Names using Neural Network
|
4 |
|
5 |
### **Project Overview**
|
6 |
+
|
7 |
This project uses a **Neural Network** model trained on **TF-IDF vectors** of names to predict the gender (Male/Female). The model is deployed using **Streamlit**, making it easy to interact and predict the gender from a user-inputted name.
|
8 |
|
9 |
---
|
10 |
|
11 |
## **Application Screenshot**
|
12 |
+
|
13 |
+
## <a href="https://ibb.co/JjhspDnr"><img src="https://ibb.co/JjhspDnr" alt="Screenshot-2025-02-11-222451" border="0" /></a>
|
14 |
|
15 |
## **How It Works (End-to-End)**
|
16 |
|
17 |
### 1. **Data Preparation**
|
18 |
+
|
19 |
+
- The dataset `gender.xlsx` contains names and their corresponding genders (Male/Female).
|
20 |
+
- The `Gender` column is mapped to numerical values:
|
21 |
+
- **Male (M)** is mapped to `1`
|
22 |
+
- **Female (F)** is mapped to `0`
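
The label mapping above can be sketched with pandas. This is a minimal illustration rather than the project's actual code: the in-memory frame stands in for `pd.read_excel("gender.xlsx")`, and the `Name` column and the sample rows are assumptions.

```python
import pandas as pd

# Stand-in for pd.read_excel("gender.xlsx"); the "Name" column and
# the three sample rows are assumptions made for this sketch.
df = pd.DataFrame({"Name": ["John", "Mary", "Alex"], "Gender": ["M", "F", "M"]})

# Map the labels as described above: Male (M) -> 1, Female (F) -> 0.
df["Gender"] = df["Gender"].map({"M": 1, "F": 0})

# Any label outside the mapping becomes NaN, so check before training.
assert df["Gender"].notna().all()
labels = df["Gender"].astype(int).tolist()
```

Checking for NaN after `map` is a cheap way to catch unexpected label spellings (for example `Male` instead of `M`) before they reach the model.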
|
23 |
|
24 |
### 2. **Feature Extraction (TF-IDF Vectorization)**
|
25 |
+
|
26 |
+
- The names are converted to **TF-IDF vectors** using character n-grams (1 to 3 characters).
|
27 |
+
- This helps the model learn important patterns in names.
|
28 |
|
29 |
### 3. **Model Training**
|
30 |
+
|
31 |
+
- A **Neural Network** is built using **Keras Sequential API**:
|
32 |
+
- Dense layers with **ReLU activation**
|
33 |
+
- **Batch Normalization** and **Dropout layers** to prevent overfitting
|
34 |
+
- Output layer with **Sigmoid activation** for binary classification
|
35 |
+
- The model is trained with **callbacks** like early stopping and learning rate reduction.
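
The architecture listed above could be sketched as follows. The layer width, dropout rate, optimizer, and callback settings are assumptions chosen for illustration; the README does not state the values used in `TrainImprove.py`.

```python
import tensorflow as tf
from tensorflow.keras import callbacks, layers


def build_model(n_features: int) -> tf.keras.Model:
    """Small binary classifier over TF-IDF features (hypothetical sizes)."""
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(n_features,)),
        layers.Dense(64, activation="relu"),    # width is an assumption
        layers.BatchNormalization(),
        layers.Dropout(0.3),                    # rate is an assumption
        layers.Dense(1, activation="sigmoid"),  # probability of class 1 (Male)
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    return model


# Early stopping and learning-rate reduction, both watching validation loss.
training_callbacks = [
    callbacks.EarlyStopping(monitor="val_loss", patience=5, restore_best_weights=True),
    callbacks.ReduceLROnPlateau(monitor="val_loss", factor=0.5, patience=2),
]

model = build_model(n_features=100)
```

Training would then pass `training_callbacks` to `model.fit(...)` together with a validation split. A sparse TF-IDF matrix generally needs `.toarray()` before it is fed to the dense layers.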
|
36 |
|
37 |
### 4. **Saving the Model and Vectorizer**
|
38 |
+
|
39 |
+
- The trained model is saved as `gender_prediction_model_Improve.h5`
|
40 |
+
- The TF-IDF vectorizer is saved as `tfidf_vectorizer_Improve.joblib`
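
Persisting the vectorizer can be sketched with `joblib`, as below. The temporary directory and the two sample names are assumptions so the sketch does not touch real project files; the Keras model would be written separately with `model.save("gender_prediction_model_Improve.h5")`.

```python
import os
import tempfile

import joblib
from sklearn.feature_extraction.text import TfidfVectorizer

# Tiny fitted vectorizer standing in for the one produced during training.
vectorizer = TfidfVectorizer(analyzer="char", ngram_range=(1, 3)).fit(["anna", "john"])

# A temporary directory keeps the sketch from overwriting project artifacts.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "tfidf_vectorizer_Improve.joblib")
    joblib.dump(vectorizer, path)
    restored = joblib.load(path)

# The restored object should carry the same learned vocabulary.
same_vocabulary = restored.vocabulary_ == vectorizer.vocabulary_
```

Saving the fitted vectorizer alongside the model matters because the app must transform new names with exactly the vocabulary the model was trained on.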
|
41 |
|
42 |
### 5. **Streamlit Application**
|
43 |
+
|
44 |
+
- Loads the pre-trained model and vectorizer.
|
45 |
+
- Accepts user input (name) and predicts gender.
|
46 |
+
- Displays the predicted gender in a clean UI.
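
The last step of the app, turning the model's sigmoid output into a displayed label, might look like this sketch. The function name and the 0.5 threshold are assumptions; the label encoding (Male = 1, Female = 0) follows the Data Preparation step.

```python
def label_from_probability(p_male: float, threshold: float = 0.5) -> str:
    """Convert a sigmoid output into a label, using Male = 1 and Female = 0.

    The 0.5 threshold is an assumption, not a value stated in the README.
    """
    if not 0.0 <= p_male <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    return "Male" if p_male >= threshold else "Female"
```

In the Streamlit app, `p_male` would presumably come from something like `model.predict(vectorizer.transform([name]).toarray())[0][0]`, with the resulting label shown to the user.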
|
47 |
|
48 |
---
|
49 |
|
50 |
## **Project File Structure**
|
51 |
+
|
52 |
```
|
53 |
.
|
54 |
├── TrainImprove.py # Training script for the model
|
|
|
64 |
## **How to Run the Project**
|
65 |
|
66 |
### 1. **Clone the Repository**
|
67 |
+
|
68 |
```bash
|
69 |
$ git clone <repository-url>
|
70 |
$ cd <repository-folder>
|
71 |
```
|
72 |
|
73 |
### 2. **Install Dependencies**
|
74 |
+
|
75 |
```bash
|
76 |
$ pip install -r requirements.txt
|
77 |
```
|
78 |
|
79 |
### 3. **Train the Model (Optional)**
|
80 |
+
|
81 |
If you want to retrain the model, run the training script:
|
82 |
+
|
83 |
```bash
|
84 |
$ python TrainImprove.py
|
85 |
```
|
86 |
|
87 |
### 4. **Run the Streamlit Application**
|
88 |
+
|
89 |
```bash
|
90 |
$ streamlit run final.py
|
91 |
```
|
92 |
|
93 |
### 5. **Access the App**
|
94 |
+
|
95 |
Open your browser and go to: [http://localhost:8501](http://localhost:8501)
|
96 |
|
97 |
---
|
|
|
99 |
## **How the Code Works**
|
100 |
|
101 |
### **Training (TrainImprove.py)**
|
102 |
+
|
103 |
1. **Data Loading:** Reads the dataset from `gender.xlsx`.
|
104 |
2. **Preprocessing:** Converts names to TF-IDF vectors.
|
105 |
3. **Model Building:** Defines a neural network with regularization.
|
|
|
107 |
5. **Saving Artifacts:** Stores the trained model (`.h5`) and vectorizer (`.joblib`).
|
108 |
|
109 |
### **Application (final.py)**
|
110 |
+
|
111 |
1. **Load Model and Vectorizer:** Loads the pre-trained model and TF-IDF vectorizer.
|
112 |
2. **User Input:** Accepts a name input from the user.
|
113 |
3. **Prediction:** Transforms the name using TF-IDF and makes a prediction.
|
|
|
116 |
---
|
117 |
|
118 |
## **Dependencies**
|
119 |
+
|
120 |
- Python 3.x
|
121 |
- TensorFlow
|
122 |
- Scikit-learn
|
|
|
124 |
- Streamlit
|
125 |
|
126 |
Install them using:
|
127 |
+
|
128 |
```bash
|
129 |
$ pip install tensorflow scikit-learn pandas streamlit joblib
|
130 |
```
|
|
|
132 |
---
|
133 |
|
134 |
## **Future Enhancements**
|
135 |
+
|
136 |
- Improve the UI design.
|
137 |
- Include more diverse datasets for better generalization.
|
138 |
- Add confidence scores for predictions.
|
|
|
141 |
---
|
142 |
|
143 |
## **Contributing**
|
144 |
+
|
145 |
Feel free to fork the project and submit a pull request for improvements.
|
146 |
|
147 |
---
|
148 |
|
149 |
## **License**
|
|
|
|
|
|
|
150 |
|
151 |
+
This project is licensed under the MIT License.
|
152 |
|
153 |
</html>
|