---
license: mit
---
## **Application Screenshot**
<a href="https://ibb.co/JjhspDnr"><img src="https://i.ibb.co/ZzZMgqdc/Screenshot-2025-02-11-231313.png" alt="Screenshot-2025-02-11-231313" border="0" /></a>
## **How It Works (End-to-End)**
### 1. **Data Preparation**
- The dataset `gender.xlsx` contains names and their corresponding genders (Male/Female).
- The `Gender` column is mapped to numerical values:
  - **Male (M)** is mapped to `1`
  - **Female (F)** is mapped to `0`
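
The label mapping described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the `Name` column, the `Label` column, and the in-memory rows are assumptions standing in for `gender.xlsx`, which the real script would presumably read with `pd.read_excel("gender.xlsx")`.

```python
import pandas as pd

# Tiny in-memory stand-in for gender.xlsx (hypothetical rows and "Name" column).
df = pd.DataFrame(
    {
        "Name": ["Alice", "Bob", "Carla", "David"],
        "Gender": ["F", "M", "F", "M"],
    }
)

# Map the Gender column to numeric labels: Male -> 1, Female -> 0.
label_map = {"M": 1, "F": 0}
df["Label"] = df["Gender"].map(label_map)

# Values missing from the mapping become NaN; drop those rows before training.
df = df.dropna(subset=["Label"])
df["Label"] = df["Label"].astype(int)

print(df)
```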
### 2. **Feature Extraction (TF-IDF Vectorization)**
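- The names are converted to **TF-IDF vectors** using character n-grams (1 to 3 characters).
- Character n-grams let the model pick up sub-word patterns in names, such as common prefixes and endings, instead of treating each name as a single opaque token.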
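
As a rough sketch of this step, scikit-learn's `TfidfVectorizer` can be configured for character n-grams of length 1 to 3. The sample names are made up, and settings such as `analyzer="char"` (rather than `"char_wb"`) and the default lowercasing are assumptions about how the training script is configured.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical sample names; the real script would use the names from gender.xlsx.
names = ["alice", "bob", "carla", "david"]

# Character-level TF-IDF over 1-, 2-, and 3-character n-grams.
vectorizer = TfidfVectorizer(analyzer="char", ngram_range=(1, 3))
X = vectorizer.fit_transform(names)  # sparse matrix: one row per name

print(X.shape)
print(sorted(vectorizer.vocabulary_)[:10])  # a few of the learned n-grams
```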
### 3. **Model Training**
- A **Neural Network** is built using **Keras Sequential API**:
  - Dense layers with **ReLU activation**
  - **Batch Normalization** and **Dropout layers** to prevent overfitting
  - Output layer with **Sigmoid activation** for binary classification
- The model is trained with **callbacks** like early stopping and learning rate reduction.
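
A network of this shape might look like the sketch below. The layer sizes, dropout rates, optimizer, and callback settings are illustrative assumptions, not values taken from `TrainImprove.py`, and `build_model` is a hypothetical helper name.

```python
from tensorflow import keras
from tensorflow.keras import layers


def build_model(n_features: int) -> keras.Model:
    """Small binary classifier over TF-IDF features (all sizes are illustrative)."""
    model = keras.Sequential(
        [
            keras.Input(shape=(n_features,)),
            layers.Dense(128, activation="relu"),
            layers.BatchNormalization(),
            layers.Dropout(0.3),
            layers.Dense(64, activation="relu"),
            layers.BatchNormalization(),
            layers.Dropout(0.3),
            layers.Dense(1, activation="sigmoid"),  # probability of label 1 (Male)
        ]
    )
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    return model


# Early stopping and learning-rate reduction, both watching validation loss.
callbacks = [
    keras.callbacks.EarlyStopping(monitor="val_loss", patience=5, restore_best_weights=True),
    keras.callbacks.ReduceLROnPlateau(monitor="val_loss", factor=0.5, patience=2),
]

model = build_model(n_features=500)
model.summary()

# Training would then look roughly like this; TF-IDF output is sparse, so it is
# converted to a dense array first (X_train and y_train are placeholders):
# model.fit(X_train.toarray(), y_train, validation_split=0.2, epochs=50, callbacks=callbacks)
```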
### 4. **Saving the Model and Vectorizer**
- The trained model is saved as `gender_prediction_model_Improve.h5`
- The TF-IDF vectorizer is saved as `tfidf_vectorizer_Improve.joblib`
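
A minimal sketch of persisting and restoring the vectorizer with `joblib` is shown below. It writes to a temporary directory so that it does not overwrite real artifacts, and the fitted vectorizer is a toy stand-in. The Keras model would presumably be saved alongside it with `model.save(...)`, shown here only as a comment.

```python
import os
import tempfile

import joblib
from sklearn.feature_extraction.text import TfidfVectorizer

# Toy fitted vectorizer standing in for the one produced during training.
vectorizer = TfidfVectorizer(analyzer="char", ngram_range=(1, 3)).fit(["alice", "bob"])

# Use a temporary directory for this demonstration.
out_dir = tempfile.mkdtemp()
vec_path = os.path.join(out_dir, "tfidf_vectorizer_Improve.joblib")

joblib.dump(vectorizer, vec_path)  # save
restored = joblib.load(vec_path)   # load, as the app would at startup

# The restored vectorizer carries the same learned vocabulary.
print(restored.vocabulary_ == vectorizer.vocabulary_)

# For the Keras model (not run here):
# model.save(os.path.join(out_dir, "gender_prediction_model_Improve.h5"))
```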
### 5. **Streamlit Application**
- Loads the pre-trained model and vectorizer.
- Accepts user input (name) and predicts gender.
- Displays the predicted gender in a clean UI.
---
## **Project File Structure**
```
.
├── TrainImprove.py                      # Training script for the model
├── ml-st1.py                            # Streamlit app for gender prediction
├── gender.xlsx                          # Dataset with names and gender
├── gender_prediction_model_Improve.h5   # Saved Keras model
├── tfidf_vectorizer_Improve.joblib      # Saved TF-IDF vectorizer
└── screenshot.png                       # Screenshot of the app UI
```
---
## **How to Run the Project**
### 1. **Clone the Repository**
```bash
$ git clone <repository-url>
$ cd <repository-folder>
```
### 2. **Install Dependencies**
```bash
$ pip install -r requirements.txt
```
### 3. **Train the Model (Optional)**
If you want to retrain the model, run the training script:
```bash
$ python TrainImprove.py
```
### 4. **Run the Streamlit Application**
```bash
$ python -m streamlit run ml-st1.py
```
### 5. **Access the App**
Open your browser and go to: [http://localhost:8501](http://localhost:8501)

---
## **How the Code Works**
### **Training (TrainImprove.py)**
1. **Data Loading:** Reads the dataset from `gender.xlsx`.
2. **Preprocessing:** Converts names to TF-IDF vectors.
3. **Model Building:** Defines a neural network with regularization.
4. **Model Training:** Trains the model with early stopping.
5. **Saving Artifacts:** Stores the trained model (`.h5`) and vectorizer (`.joblib`).
### **Application (ml-st1.py)**
1. **Load Model and Vectorizer:** Loads the pre-trained model and TF-IDF vectorizer.
2. **User Input:** Accepts a name input from the user.
3. **Prediction:** Transforms the name using TF-IDF and makes a prediction.
4. **Output:** Displays the predicted gender (Male/Female) in the app.
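
The prediction step can be sketched as a small function. This is an assumption-laden illustration: `predict_gender`, the `0.5` threshold, the lowercasing of the input, and `StubModel` are all hypothetical. In the real app, the model and vectorizer would presumably come from `keras.models.load_model(...)` and `joblib.load(...)`, with the name read via `st.text_input` and the result shown via `st.write`.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer


def predict_gender(name, vectorizer, model, threshold=0.5):
    """Return "Male" or "Female" for a name, given a fitted vectorizer and model."""
    # Transform the single name into a dense TF-IDF row, as a Keras model expects.
    features = vectorizer.transform([name.strip().lower()]).toarray()
    # The sigmoid output is read as the probability of label 1 (Male).
    prob_male = float(np.asarray(model.predict(features)).ravel()[0])
    return "Male" if prob_male >= threshold else "Female"


class StubModel:
    """Hypothetical stand-in for the loaded Keras model, for demonstration only."""

    def __init__(self, prob):
        self.prob = prob

    def predict(self, features):
        # Mimic the (n_samples, 1) output shape of a sigmoid output layer.
        return np.full((features.shape[0], 1), self.prob)


# Toy vectorizer standing in for tfidf_vectorizer_Improve.joblib.
vectorizer = TfidfVectorizer(analyzer="char", ngram_range=(1, 3)).fit(["alice", "bob"])

print(predict_gender("Bob", vectorizer, StubModel(0.9)))
print(predict_gender("Alice", vectorizer, StubModel(0.1)))
```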
---
## **Dependencies**
- Python 3.x
- TensorFlow
- Scikit-learn
- Pandas
- Streamlit
- Joblib
- openpyxl (used by Pandas to read `.xlsx` files)

Install them using:
```bash
$ pip install tensorflow scikit-learn pandas streamlit joblib openpyxl
```
---
## **Future Enhancements**
- Improve the UI design.
- Include more diverse datasets for better generalization.
- Add confidence scores for predictions.
- Deploy the app online for public access.
---
## **Contributing**
Feel free to fork the project and submit a pull request for improvements.

---
## **License**
This project is licensed under the MIT License.