Application Screenshot
How It Works (End-to-End)
1. Data Preparation
- The dataset `gender.xlsx` contains names and their corresponding genders (Male/Female).
- The `Gender` column is mapped to numerical values:
  - Male (M) is mapped to `1`
  - Female (F) is mapped to `0`
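The label mapping described above can be sketched with pandas as follows. This is a minimal illustration rather than the project's actual code: the `Name` column and the in-memory sample rows are assumptions standing in for the contents of `gender.xlsx`.

```python
import pandas as pd

# Hypothetical sample rows standing in for gender.xlsx; the real file would
# typically be loaded with pd.read_excel("gender.xlsx") instead.
df = pd.DataFrame({
    "Name": ["James", "Mary", "Robert"],  # "Name" is an assumed column name
    "Gender": ["M", "F", "M"],
})

# Map the text labels to the numeric targets: Male -> 1, Female -> 0.
label_map = {"M": 1, "F": 0}
df["Gender"] = df["Gender"].map(label_map)

# Any label missing from label_map becomes NaN after map(); drop those rows
# so the target column holds only 0 and 1.
df = df.dropna(subset=["Gender"])
df["Gender"] = df["Gender"].astype(int)

print(df["Gender"].tolist())
```

Dropping unmapped rows is one possible choice; raising an error on unexpected labels would be a stricter alternative.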
2. Feature Extraction (TF-IDF Vectorization)
- The names are converted to TF-IDF vectors using character n-grams (1 to 3 characters).
- Character n-grams let the model pick up sub-word patterns in names, such as common endings, that correlate with gender.
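A minimal sketch of this vectorization step with scikit-learn is shown below. The sample names are hypothetical, and `analyzer="char"` is an assumption; the training script may use a different analyzer setting (for example `"char_wb"`) or additional options.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

names = ["anna", "john", "maria"]  # hypothetical sample names

# analyzer="char" splits each name into character n-grams;
# ngram_range=(1, 3) keeps n-grams of length 1, 2, and 3.
vectorizer = TfidfVectorizer(analyzer="char", ngram_range=(1, 3))
X = vectorizer.fit_transform(names)  # sparse matrix, one row per name

# vocabulary_ maps each learned n-gram to its column index in X.
vocabulary = sorted(vectorizer.vocabulary_)
print(X.shape[0], "names,", X.shape[1], "n-gram features")
print([gram for gram in vocabulary if gram.startswith("an")])
```

Each column of `X` corresponds to one n-gram, so a name ending such as `na` gets its own weighted feature.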
3. Model Training
- A neural network is built using the Keras Sequential API:
  - Dense layers with ReLU activation
  - Batch Normalization and Dropout layers to prevent overfitting
  - Output layer with Sigmoid activation for binary classification
- The model is trained with callbacks such as early stopping and learning rate reduction.
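The architecture and callbacks listed above might look roughly like the following sketch. The layer sizes, dropout rates, optimizer, and callback settings are all assumptions chosen for illustration; the values used in `TrainImprove.py` are not stated here.

```python
from tensorflow import keras
from tensorflow.keras import layers


def build_model(input_dim: int) -> keras.Model:
    """Build a small binary classifier (layer sizes are assumed values)."""
    model = keras.Sequential([
        keras.Input(shape=(input_dim,)),
        layers.Dense(128, activation="relu"),
        layers.BatchNormalization(),
        layers.Dropout(0.3),
        layers.Dense(64, activation="relu"),
        layers.BatchNormalization(),
        layers.Dropout(0.3),
        # A single sigmoid unit outputs a value in (0, 1) for the class encoded as 1.
        layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    return model


callbacks = [
    # Stop when validation loss stops improving and keep the best weights.
    keras.callbacks.EarlyStopping(monitor="val_loss", patience=5, restore_best_weights=True),
    # Halve the learning rate when validation loss plateaus.
    keras.callbacks.ReduceLROnPlateau(monitor="val_loss", factor=0.5, patience=2),
]

# input_dim would be the number of TF-IDF features; 500 is a placeholder.
model = build_model(input_dim=500)

# Training would then look something like this, with X_train as the TF-IDF
# features (for example after converting to a dense array with .toarray()):
# model.fit(X_train, y_train, validation_split=0.2, epochs=50, callbacks=callbacks)
```

With a sigmoid output and labels encoded as 0 and 1, binary cross-entropy is the usual loss choice.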
4. Saving the Model and Vectorizer
- The trained model is saved as `gender_prediction_model_Improve.h5`.
- The TF-IDF vectorizer is saved as `tfidf_vectorizer_Improve.joblib`.
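As a rough sketch of the vectorizer half of this step, the round trip below saves a fitted vectorizer with `joblib` and loads it back. The sample names and the temporary directory are assumptions for illustration. The Keras model would typically be saved separately with `model.save("gender_prediction_model_Improve.h5")`.

```python
import os
import tempfile

import joblib
from sklearn.feature_extraction.text import TfidfVectorizer

# Fit a small vectorizer on hypothetical sample names.
vectorizer = TfidfVectorizer(analyzer="char", ngram_range=(1, 3))
vectorizer.fit(["anna", "john", "maria"])

# Save to a temporary directory here; the project writes the file
# next to the training script instead.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "tfidf_vectorizer_Improve.joblib")
    joblib.dump(vectorizer, path)
    restored = joblib.load(path)

# The restored vectorizer should produce the same features as the original.
before = vectorizer.transform(["joan"]).toarray().tolist()
after = restored.transform(["joan"]).toarray().tolist()
print(before == after)
```

Saving the fitted vectorizer matters because the app must reuse the exact vocabulary and IDF weights seen during training; refitting at inference time would produce features the model was not trained on.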
5. Streamlit Application
- Loads the pre-trained model and vectorizer.
- Accepts user input (name) and predicts gender.
- Displays the predicted gender in a clean UI.
Project File Structure
.
├── TrainImprove.py                        # Training script for the model
├── ml-st1.py                              # Streamlit app for gender prediction
├── gender.xlsx                            # Dataset with names and gender
├── gender_prediction_model_Improve.h5     # Saved Keras model
├── tfidf_vectorizer_Improve.joblib        # Saved TF-IDF vectorizer
└── screenshot.png                         # Screenshot of the app UI
How to Run the Project
1. Clone the Repository
$ git clone <repository-url>
$ cd <repository-folder>
2. Install Dependencies
$ pip install -r requirements.txt
3. Train the Model (Optional)
If you want to retrain the model, run the training script:
$ python TrainImprove.py
4. Run the Streamlit Application
$ python -m streamlit run ml-st1.py
5. Access the App
Open your browser and go to: http://localhost:8501
How the Code Works
Training (TrainImprove.py)
- Data Loading: Reads the dataset from `gender.xlsx`.
- Preprocessing: Converts names to TF-IDF vectors.
- Model Building: Defines a neural network with regularization.
- Model Training: Trains the model with early stopping.
- Saving Artifacts: Stores the trained model (`.h5`) and vectorizer (`.joblib`).
Application (ml-st1.py)
- Load Model and Vectorizer: Loads the pre-trained model and TF-IDF vectorizer.
- User Input: Accepts a name input from the user.
- Prediction: Transforms the name using TF-IDF and makes a prediction.
- Output: Displays the predicted gender (Male/Female) in the app.
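The prediction flow above can be sketched as a single function. This is an illustration under stated assumptions: `predict_gender`, the 0.5 threshold, the lowercasing of input, and the `StubModel` stand-in are hypothetical. In the real app the model would be the loaded Keras model, whose `predict` method returns one sigmoid probability per input row.

```python
from sklearn.feature_extraction.text import TfidfVectorizer


def predict_gender(name, vectorizer, model, threshold=0.5):
    """Return (label, probability) for a single name.

    The probability is the sigmoid output for the class encoded as 1 (Male).
    """
    features = vectorizer.transform([name.strip().lower()]).toarray()
    probability = float(model.predict(features)[0][0])
    label = "Male" if probability >= threshold else "Female"
    return label, probability


class StubModel:
    """Hypothetical stand-in for the Keras model; returns a fixed probability."""

    def __init__(self, probability):
        self.probability = probability

    def predict(self, features):
        # Mimic the Keras output shape: one [probability] row per input row.
        return [[self.probability] for _ in range(len(features))]


# Small vectorizer fitted on hypothetical names, standing in for the saved one.
vectorizer = TfidfVectorizer(analyzer="char", ngram_range=(1, 3)).fit(["anna", "john"])

print(predict_gender("John", vectorizer, StubModel(0.91)))
print(predict_gender("Anna", vectorizer, StubModel(0.12)))
```

Returning the probability alongside the label keeps the function usable if the app later displays a confidence value.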
Dependencies
- Python 3.x
- TensorFlow
- Scikit-learn
- Pandas
- Streamlit
- Joblib
Install them using:
$ pip install tensorflow scikit-learn pandas streamlit joblib
Future Enhancements
- Improve the UI design.
- Include more diverse datasets for better generalization.
- Add confidence scores for predictions.
- Deploy the app online for public access.
Contributing
Feel free to fork the project and submit a pull request for improvements.
License
This project is licensed under the MIT License.