<div style="display: flex; align-items: center; justify-content: center;">
<div style="margin-right: 20px;">
<img src="https://cdn-lfs-us-1.hf.co/repos/de/fb/defb007867acd8852f4a283e9b06a933778826b18ed58ade01da945f5903795d/8b7831230df7d554c74f5e249e23be57165d143fea0ea7b5dde56dde5c13c95b?response-content-disposition=inline%3B+filename*%3DUTF-8%27%27turing-test.gif%3B+filename%3D%22turing-test.gif%22%3B&response-content-type=image%2Fgif&Expires=1730008247&Policy=eyJTdGF0ZW1lbnQiOlt7IkNvbmRpdGlvbiI6eyJEYXRlTGVzc1RoYW4iOnsiQVdTOkVwb2NoVGltZSI6MTczMDAwODI0N319LCJSZXNvdXJjZSI6Imh0dHBzOi8vY2RuLWxmcy11cy0xLmhmLmNvL3JlcG9zL2RlL2ZiL2RlZmIwMDc4NjdhY2Q4ODUyZjRhMjgzZTliMDZhOTMzNzc4ODI2YjE4ZWQ1OGFkZTAxZGE5NDVmNTkwMzc5NWQvOGI3ODMxMjMwZGY3ZDU1NGM3NGY1ZTI0OWUyM2JlNTcxNjVkMTQzZmVhMGVhN2I1ZGRlNTZkZGU1YzEzYzk1Yj9yZXNwb25zZS1jb250ZW50LWRpc3Bvc2l0aW9uPSomcmVzcG9uc2UtY29udGVudC10eXBlPSoifV19&Signature=GBUn-4z3PMBTqT0NdT3H-NyZxNMGcN4zDNzK8ql%7ESwLF8pXzkH783GSCZQQYWwE-v1g90JTulsOt7z5szigK49ApFju6bkS2zwUAYxNttcl3c-VYrxGuFWYnkHpTQ73qbs3ELF2-5LzDy1ARpj3BOlSEXtH9ShwCRm-R0llQJ6EDx2eOyBIDg-Pgrx%7EKIxrdAZCNln9tJk74TrSN5survdIvcSZrSIGXc3tpFLm-BwpY6qtID3ltrPEHYWDrQ5ALV8lXqKmpVlFSq3lOEFlSa-opFJwe%7E8FIIwP5mJgtCZzlQQylRhsVLxDQ2cJYpTbZSvEVkfjyTxOP4dc%7EDz1tVQ__&Key-Pair-Id=K24J24Z295AEI9"
alt="AI App Icon" width="100" height="50"
style="border-radius: 20px; border: 2px solid #333;">
</div>
<div>
<p style="font-size: 50px; font-weight: bold; text-align: center; margin: 0;">
Spacy Model Creator
</p>
</div>
</div>
<hr>

# Overview:
This project is a resume parsing tool written in Python. It uses the Mistral-Nemo-Instruct-2407 model for primary parsing.

# Installation Guide:
1. Create and Activate a Virtual Environment

       python -m venv venv
       source venv/bin/activate   # For Linux/Mac
       # or
       venv\Scripts\activate      # For Windows

   NOTE: If the virtual environment (venv) has already been created, skip the creation step and just activate it.
   - For Linux/Mac: `source venv/bin/activate`
   - For Windows: `venv\Scripts\activate`

2. Install Required Libraries

       pip install -r requirements.txt

   Ensure the following dependencies are included:
   - Flask
   - spaCy
   - huggingface_hub
   - PyMuPDF
   - python-docx
   - Tesseract-OCR (for image-based parsing; the OCR engine itself is a system program installed separately from pip)

   NOTE: If any model or library is not installed, you can install it using:

       pip install <model_name>

   _Replace `<model_name>` with the specific model or library you need to install._

3. Set up Hugging Face Token
   - Add your Hugging Face token to the .env file as:

         HF_TOKEN=<your_huggingface_token>

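The three installation steps can be verified with a small preflight script. The following is a minimal sketch rather than part of the project: the function names (`in_virtualenv`, `missing_modules`, `read_env_file`, `preflight`) are hypothetical, and the import names `fitz` (for PyMuPDF) and `docx` (for python-docx) are assumed to match the versions pinned in requirements.txt. It uses only the standard library, so it runs even when the dependencies are missing.

```python
import importlib.util
import os
import shutil
import sys
from pathlib import Path

# Assumed import names: PyMuPDF is imported as "fitz", python-docx as "docx".
REQUIRED_MODULES = ["flask", "spacy", "huggingface_hub", "fitz", "docx"]


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv."""
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def missing_modules(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def read_env_file(path=".env"):
    """Parse simple KEY=VALUE lines; blank lines and '#' comments are skipped."""
    values = {}
    env_path = Path(path)
    if not env_path.is_file():
        return values
    for line in env_path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def preflight(env_path=".env"):
    """Collect setup problems; an empty list means all checks passed."""
    problems = []
    if not in_virtualenv():
        problems.append("virtual environment is not active")
    for name in missing_modules(REQUIRED_MODULES):
        problems.append(f"missing Python module: {name}")
    # Tesseract is a system binary, so it is looked up on PATH instead of imported.
    if shutil.which("tesseract") is None:
        problems.append("tesseract binary not found on PATH")
    token = os.environ.get("HF_TOKEN") or read_env_file(env_path).get("HF_TOKEN")
    if not token:
        problems.append("HF_TOKEN is not set in the environment or in .env")
    return problems


if __name__ == "__main__":
    for problem in preflight():
        print("-", problem)
```

Run from the project root after activating the environment, the script prints one line per detected problem and nothing when every check passes. The `.env` parser here is deliberately simple; a dedicated package such as python-dotenv could replace it if the project already depends on one.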
# File Structure Overview:

    Spacy_Model_creator/
    │
    ├── Models/
    │   └── ner_model_05_3        # Pretrained spaCy model directory for resume parsing
    │
    ├── data/
    │   ├── Json_data.json
    │   ├── resume_text.txt
    │   └── Spacy_data.spacy
    │
    ├── templates/
    │   ├── anoter.html
    │   ├── result.html
    │   ├── guide.html
    │   ├── savejson.html
    │   ├── savespacy.html
    │   ├── text.html
    │   ├── upload.html
    │   └── data_files.html
    │
    ├── JSON/
    │   └── Json_data.json
    │
    ├── utils/
    │   ├── model.py              # Code for calling Mistral API and handling responses
    │   ├── json_to_spacy.py      # spaCy fallback model for parsing resumes
    │   ├── anoter_to_json.py     # Error handling utilities
    │   └── file_To_text.py       # Functions to extract text from different file formats (PDF, DOCX, etc.)
    │
    ├── venv/                     # Virtual environment
    │
    ├── .env                      # Environment variables file (contains Hugging Face token)
    │
    ├── app.py                    # Flask app handling API routes for uploading and processing resumes
    │
    └── requirements.txt          # Dependencies required for the project

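To illustrate the kind of work described for `utils/file_To_text.py`, here is a hedged sketch of extension-based text extraction. It is not the project's actual code: the function names, the `EXTRACTORS` table, and the plain-text branch are assumptions made for illustration. The PDF branch relies on PyMuPDF (`fitz`) and the DOCX branch on python-docx (`docx`); both are imported lazily so that one format can be handled without the other library installed.

```python
from pathlib import Path


def extract_pdf_text(path):
    """Concatenate the text of every page using PyMuPDF."""
    import fitz  # PyMuPDF, imported lazily

    with fitz.open(str(path)) as doc:
        return "\n".join(page.get_text() for page in doc)


def extract_docx_text(path):
    """Join the text of all body paragraphs using python-docx."""
    import docx  # python-docx, imported lazily

    document = docx.Document(str(path))
    return "\n".join(paragraph.text for paragraph in document.paragraphs)


def extract_plain_text(path):
    """Read a plain-text file (hypothetical extra format for this sketch)."""
    return Path(path).read_text(encoding="utf-8", errors="replace")


# Hypothetical dispatch table: lower-cased file suffix -> extractor function.
EXTRACTORS = {
    ".pdf": extract_pdf_text,
    ".docx": extract_docx_text,
    ".txt": extract_plain_text,
}


def extract_text(path):
    """Pick an extractor from the file suffix and return the extracted text."""
    suffix = Path(path).suffix.lower()
    try:
        extractor = EXTRACTORS[suffix]
    except KeyError:
        raise ValueError(f"unsupported file type: {suffix or '(none)'}") from None
    return extractor(path)
```

A dispatch table keeps each format isolated, so adding an image branch backed by Tesseract would only require one more entry. Note that python-docx reads body paragraphs only in this sketch; text inside tables or headers would need extra handling.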
# References:
- [Flask Documentation](https://flask.palletsprojects.com/)
- [spaCy Documentation](https://spacy.io/usage)
- [Hugging Face Hub API](https://huggingface.co/docs/huggingface_hub/index)
- [PyMuPDF (MuPDF) Documentation](https://pymupdf.readthedocs.io/en/latest/)
- [python-docx Documentation](https://python-docx.readthedocs.io/en/latest/)
- [Tesseract OCR Documentation](https://github.com/UB-Mannheim/tesseract/wiki)
- [Virtual Environments in Python](https://docs.python.org/3/tutorial/venv.html)