Spacy Model Creator

# Overview:

This project is a comprehensive Resume Parsing tool built using Python, integrating the Mistral-Nemo-Instruct-2407 model for primary parsing.

# Installation Guide:

1. Create and Activate a Virtual Environment

       python -m venv venv
       source venv/bin/activate   # For Linux/Mac
       # or
       venv\Scripts\activate      # For Windows

   NOTE: If the virtual environment (venv) already exists, skip the creation step and just activate it.
   - For Linux/Mac: source venv/bin/activate
   - For Windows: venv\Scripts\activate

2. Install Required Libraries

       pip install -r requirements.txt

   Ensure the following dependencies are included:
   - Flask
   - spaCy
   - huggingface_hub
   - PyMuPDF
   - python-docx
   - Tesseract-OCR (for image-based parsing; the OCR engine is a separate system install, not a pip package)

   NOTE: If any model or library is not installed, you can install it using the command below, replacing the placeholder with the specific model or library you need:

       pip install <package-name>

3. Set up Hugging Face Token
   - Add your Hugging Face token to the .env file as:

         HF_TOKEN=<your_token>

# File Structure Overview:

    Spacy_Model_creator/
    │
    ├── Models/
    │   └── ner_model_05_3        # Pretrained spaCy model directory for resume parsing
    │
    ├── data/
    │   ├── Json_data.json
    │   ├── resume_text.txt
    │   └── Spacy_data.spacy
    │
    ├── templates/
    │   ├── anoter.html
    │   ├── result.html
    │   ├── guide.html
    │   ├── savejson.html
    │   ├── savespacy.html
    │   ├── text.html
    │   ├── upload.html
    │   └── data_files.html
    │
    ├── JSON/
    │   └── Json_data.json
    │
    ├── utils/
    │   ├── model.py              # Code for calling Mistral API and handling responses
    │   ├── json_to_spacy.py      # spaCy fallback model for parsing resumes
    │   ├── anoter_to_json.py     # Error handling utilities
    │   └── file_To_text.py       # Functions to extract text from different file formats (PDF, DOCX, etc.)
    │
    ├── venv/                     # Virtual environment
    │
    ├── .env                      # Environment variables file (contains Hugging Face token)
    │
    ├── app.py                    # Flask app handling API routes for uploading and processing resumes
    │
    └── requirements.txt          # Dependencies required for the project

# References:

- [Flask Documentation](https://flask.palletsprojects.com/)
- [spaCy Documentation](https://spacy.io/usage)
- [Hugging Face Hub API](https://huggingface.co/docs/huggingface_hub/index)
- [PyMuPDF (MuPDF) Documentation](https://pymupdf.readthedocs.io/en/latest/)
- [python-docx Documentation](https://python-docx.readthedocs.io/en/latest/)
- [Tesseract OCR Documentation](https://github.com/UB-Mannheim/tesseract/wiki)
- [Virtual Environments in Python](https://docs.python.org/3/tutorial/venv.html)
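
# Example: Reading the Hugging Face Token

The Installation Guide stores the token in the .env file as HF_TOKEN. How the project actually loads it is not shown here, so the following is only a minimal sketch using the standard library. The function names `read_env_file` and `get_hf_token` are hypothetical, and the parser assumes simple KEY=VALUE lines with optional quotes.

```python
import os
from pathlib import Path


def read_env_file(path=".env"):
    """Parse simple KEY=VALUE lines from a .env file into a dict (hypothetical helper)."""
    values = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_hf_token(path=".env"):
    """Return HF_TOKEN from the process environment, falling back to the .env file."""
    token = os.environ.get("HF_TOKEN") or read_env_file(path).get("HF_TOKEN", "")
    if not token:
        raise RuntimeError("HF_TOKEN is not set; add it to the .env file")
    return token
```

Calling `get_hf_token()` from the project root would return the stored token, or raise a `RuntimeError` if the value is missing, which surfaces a forgotten setup step early instead of failing later during a model call.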
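
# Example: Extracting Text by File Type

The file structure describes utils/file_To_text.py as holding functions that extract text from different file formats. The actual contents of that module are not shown, so this is a hedged sketch of how such a dispatcher could look with the listed PyMuPDF and python-docx dependencies. All function names and the `EXTRACTORS` table are assumptions made for illustration; image-based (OCR) parsing is left out.

```python
from pathlib import Path


def pdf_to_text(path):
    """Extract text from every page of a PDF using PyMuPDF."""
    import fitz  # PyMuPDF; imported lazily so other formats work without it

    with fitz.open(path) as doc:
        return "\n".join(page.get_text() for page in doc)


def docx_to_text(path):
    """Extract paragraph text from a DOCX file using python-docx."""
    import docx  # python-docx; imported lazily for the same reason

    document = docx.Document(path)
    return "\n".join(paragraph.text for paragraph in document.paragraphs)


def txt_to_text(path):
    """Read a plain-text file, replacing undecodable bytes."""
    return Path(path).read_text(encoding="utf-8", errors="replace")


# Hypothetical mapping from file extension to extractor function.
EXTRACTORS = {
    ".pdf": pdf_to_text,
    ".docx": docx_to_text,
    ".txt": txt_to_text,
}


def file_to_text(path):
    """Pick an extractor based on the file extension and return the text."""
    suffix = Path(path).suffix.lower()
    try:
        extractor = EXTRACTORS[suffix]
    except KeyError:
        raise ValueError(f"Unsupported file type: {suffix!r}") from None
    return extractor(str(path))
```

Keeping the mapping in one table means a new format only needs one extra function and one extra entry, and unsupported uploads fail with a clear `ValueError` that a Flask route could turn into an error response.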