# Overview:
This project is a resume parsing tool built with Python and Flask.
It uses the Mistral-Nemo-Instruct-2407 model for primary parsing, with a spaCy NER model available as a fallback.
# Installation Guide:
1. Create and Activate a Virtual Environment
python -m venv venv
- For Linux/Mac:
source venv/bin/activate
- For Windows:
venv\Scripts\activate

NOTE: If the virtual environment (venv) already exists, skip the creation step and only run the activation command for your platform.
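
To confirm that the environment is actually active before installing anything, a quick check from Python can help. The sketch below uses only the standard library; the function name `in_virtualenv` is illustrative and not part of this project.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter is running inside a venv."""
    # Inside a venv, sys.prefix points at the venv directory while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("venv active" if in_virtualenv() else "venv not active")
    print("interpreter:", sys.executable)
```

If the script reports that the venv is not active, run the activation command for your platform again in the same terminal.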
2. Install Required Libraries
pip install -r requirements.txt
Ensure that requirements.txt includes the following dependencies:
- Flask
- spaCy
- huggingface_hub
- PyMuPDF
- python-docx
- Tesseract-OCR (for image-based parsing; the OCR engine itself is a system program installed separately from the pip packages)
NOTE: If any model or library is missing, install it with:
pip install PACKAGE_NAME
_Replace PACKAGE_NAME with the specific model or library you need to install._
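
To find out which of the listed libraries are missing, something like the following sketch may be useful. It assumes the usual import names for each package (for example `fitz` for PyMuPDF and `docx` for python-docx) and that the Tesseract executable is called `tesseract` on your PATH. The helper names are hypothetical and not part of this project.

```python
import importlib.util
import shutil

# pip package name -> import module name (assumed from each library's docs).
IMPORT_NAMES = {
    "Flask": "flask",
    "spaCy": "spacy",
    "huggingface_hub": "huggingface_hub",
    "PyMuPDF": "fitz",
    "python-docx": "docx",
}


def missing_packages(import_names: dict[str, str]) -> list[str]:
    """Return the package names whose import module cannot be found."""
    return [
        package
        for package, module in import_names.items()
        if importlib.util.find_spec(module) is None
    ]


def tesseract_available() -> bool:
    """Check whether a `tesseract` executable is reachable on PATH."""
    return shutil.which("tesseract") is not None


if __name__ == "__main__":
    missing = missing_packages(IMPORT_NAMES)
    print("Missing Python packages:", ", ".join(missing) or "none")
    print("Tesseract on PATH:", tesseract_available())
```

Run it inside the activated virtual environment so that it inspects the same interpreter the app will use.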
3. Set up Hugging Face Token
- Add your Hugging Face token to the .env file as:
HF_TOKEN=
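
How the app reads the token is not shown here, so the following is only a minimal sketch of one way to load `HF_TOKEN`, using the standard library alone. The helpers `read_env_file` and `get_hf_token` are hypothetical names, and the parser handles only simple `KEY=VALUE` lines.

```python
import os
from pathlib import Path


def read_env_file(path: str = ".env") -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values: dict[str, str] = {}
    env_path = Path(path)
    if not env_path.is_file():
        return values
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def get_hf_token(path: str = ".env") -> str:
    """Prefer an exported HF_TOKEN, then fall back to the .env file."""
    token = os.environ.get("HF_TOKEN") or read_env_file(path).get("HF_TOKEN", "")
    if not token:
        raise RuntimeError("HF_TOKEN is not set; add it to the .env file")
    return token
```

The returned value can then be passed as the `token` argument when constructing a `huggingface_hub.InferenceClient`. Keep the .env file out of version control so the token is not published.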
# File Structure Overview:
Spacy_Model_creator/
│
├── Models/
│ └── ner_model_05_3 # Pretrained spaCy model directory for resume parsing
│
├── data/
│ ├── Json_data.json
│ ├── resume_text.txt
│ └── Spacy_data.spacy
│
├── templates/
│ ├── anoter.html
│ ├── result.html
│ ├── guide.html
│ ├── savejson.html
│ ├── savespacy.html
│ ├── text.html
│ ├── upload.html
│ └── data_files.html
│
├── JSON/
│ └── Json_data.json
│
├── utils/
│ ├── model.py # Code for calling Mistral API and handling responses
│ ├── json_to_spacy.py # spaCy fallback model for parsing resumes
│ ├── anoter_to_json.py # Error handling utilities
│ └── file_To_text.py # Functions to extract text from different file formats (PDF, DOCX, etc.)
│
├── venv/ # Virtual environment
│
├── .env # Environment variables file (contains Hugging Face token)
│
├── app.py # Flask app handling API routes for uploading and processing resumes
│
└── requirements.txt # Dependencies required for the project
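
The tree above lists a text-extraction module for PDF, DOCX, and other formats. Its actual contents are not shown, so the following is only a sketch of how such a dispatcher could look with PyMuPDF and python-docx. All function names here are hypothetical, and the set of supported extensions is an assumption.

```python
from pathlib import Path


def extract_pdf_text(path: str) -> str:
    """Extract text from every page of a PDF using PyMuPDF."""
    import fitz  # PyMuPDF; imported lazily so other formats work without it

    with fitz.open(path) as doc:
        return "\n".join(page.get_text() for page in doc)


def extract_docx_text(path: str) -> str:
    """Join the text of all paragraphs in a DOCX file."""
    from docx import Document  # python-docx, also imported lazily

    document = Document(path)
    return "\n".join(paragraph.text for paragraph in document.paragraphs)


def extract_plain_text(path: str) -> str:
    """Read a plain-text file, replacing undecodable bytes."""
    return Path(path).read_text(encoding="utf-8", errors="replace")


# File extension -> extractor function (assumed set of formats).
EXTRACTORS = {
    ".pdf": extract_pdf_text,
    ".docx": extract_docx_text,
    ".txt": extract_plain_text,
}


def file_to_text(path: str) -> str:
    """Pick an extractor by file extension and return the file's text."""
    suffix = Path(path).suffix.lower()
    try:
        extractor = EXTRACTORS[suffix]
    except KeyError:
        raise ValueError(f"Unsupported file type: {suffix or '(none)'}") from None
    return extractor(path)
```

Dispatching on the extension keeps each format's dependency isolated, so a missing library only affects the file type that needs it. Image-based resumes would need an OCR step on top of this, which is not covered in the sketch.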
# References:
- [Flask Documentation](https://flask.palletsprojects.com/)
- [spaCy Documentation](https://spacy.io/usage)
- [Hugging Face Hub API](https://huggingface.co/docs/huggingface_hub/index)
- [PyMuPDF (MuPDF) Documentation](https://pymupdf.readthedocs.io/en/latest/)
- [python-docx Documentation](https://python-docx.readthedocs.io/en/latest/)
- [Tesseract OCR Documentation](https://github.com/UB-Mannheim/tesseract/wiki)
- [Virtual Environments in Python](https://docs.python.org/3/tutorial/venv.html)