_\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\_
_\\-------- **Image Data Extractor** -------\\_
_\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\_
# Overview:
The **Image Data Extractor** is a Python-based tool that extracts and structures text from images of visiting cards using **PaddleOCR**. It processes the extracted text to recognize key information (name, designation, contact number, address, and company name) and returns it as structured output. The **Mistral 7B model** performs the text analysis; if it is unavailable, the system falls back to the **Gliner urchade/gliner_mediumv2.1** model.
# Installation Guide:
1. **Create and Activate a Virtual Environment**
```bash
python -m venv venv
source venv/bin/activate # For Linux/Mac
# or
venv\Scripts\activate # For Windows
```
2. **Install Required Libraries**
```bash
pip install -r requirements.txt
```
3. **Set up Hugging Face Token**
- Add your Hugging Face token to the `.env` file before starting the application:
```bash
HF_TOKEN=<your_huggingface_token>
```
4. **Run the Application**
- With Docker:
```bash
docker-compose up --build
```
- Without Docker:
```bash
python app.py
```
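As a minimal sketch of how the application might pick up `HF_TOKEN`, the helper below checks the process environment first and then falls back to a small `.env` parser. The function name `get_hf_token` and the parsing rules are illustrative assumptions, not taken from this repository; a project could equally use the `python-dotenv` package.

```python
import os


def get_hf_token(env_path: str = ".env") -> str:
    """Return HF_TOKEN from the environment, else from a .env file.

    Hypothetical helper for illustration; not part of this repository.
    """
    token = os.environ.get("HF_TOKEN")
    if token:
        return token

    # Minimal .env parsing: KEY=VALUE lines, '#' starts a comment line.
    if os.path.exists(env_path):
        with open(env_path, encoding="utf-8") as fh:
            for line in fh:
                line = line.strip()
                if not line or line.startswith("#") or "=" not in line:
                    continue
                key, _, value = line.partition("=")
                if key.strip() == "HF_TOKEN":
                    value = value.strip().strip('"').strip("'")
                    if value:
                        return value

    raise RuntimeError("HF_TOKEN is not set; add it to .env or the environment")
```

Failing early with a clear error keeps a missing token from surfacing later as an opaque model-loading failure.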
# File Structure Overview:
```
ImageDataExtractor/
│
├── app.py                  # Main Flask app
├── requirements.txt        # Dependencies
├── Dockerfile              # Docker container setup
├── docker-compose.yml      # Docker Compose setup
│
├── utility/
│   └── utils.py            # PaddleOCR integration, Image preprocessing and Mistral model processing
│
├── template/
│   ├── index.html          # UI for image uploads
│   └── result.html         # Display extracted results
│
├── Backup/
│   ├── modules/            # Base classes for data processing models
│   │   ├── base.py
│   │   ├── data_proc.py
│   │   ├── evaluator.py
│   │   ├── layers.py
│   │   ├── run_evaluation.py
│   │   ├── span_rep.py
│   │   └── token_rep.py
│   ├── backup.py           # Backup handling
│   ├── model.py            # Gliner Model integration and backup logic
│   ├── save_load.py        # Mistral 7B model integration and backup logic
│   └── train.py            # Mistral 7B model integration and backup logic
│
└── .env                    # Environment variables (includes Hugging Face token)
```
# Program Overview:
### PaddleOCR Integration (utility/utils.py):
- **Text Extraction**: The tool utilizes **PaddleOCR** to extract text from image-based inputs (PNG, JPG, JPEG) of visiting cards.
- **Preprocessing**: Handles basic image preprocessing to enhance text recognition for OCR.
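The text-extraction step can be sketched as follows. This is an illustration under stated assumptions, not the code in `utility/utils.py`: the function names are hypothetical, and the result shape (one entry per image, each a list of `[bounding_box, (text, confidence)]` items) is the one returned by `PaddleOCR.ocr` in the 2.x releases; newer PaddleOCR releases may differ.

```python
from typing import List


def ocr_lines_to_text(result, min_confidence: float = 0.5) -> List[str]:
    """Flatten a PaddleOCR 2.x style result into a list of text lines.

    Assumed shape: one entry per image, each a list of
    [bounding_box, (text, confidence)] items, or None if nothing was found.
    """
    lines = []
    for page in result or []:
        for _box, (text, confidence) in page or []:
            # Drop low-confidence and whitespace-only detections.
            if confidence >= min_confidence and text.strip():
                lines.append(text.strip())
    return lines


def extract_text(image_path: str) -> List[str]:
    """Run PaddleOCR on one image (requires `pip install paddleocr`)."""
    from paddleocr import PaddleOCR  # imported lazily: heavy dependency

    ocr = PaddleOCR(use_angle_cls=True, lang="en")
    result = ocr.ocr(image_path, cls=True)
    return ocr_lines_to_text(result)
```

The `min_confidence` threshold of 0.5 is an arbitrary starting point; visiting cards with decorative fonts may need a lower value.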
### Mistral 7B Integration (utility/utils.py):
- **Data Structuring**: After text extraction, the **Mistral 7B model** processes the extracted data, structuring it into fields such as name, designation, contact number, address, and company name.
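One way the structuring step could look is sketched below: build a prompt from the OCR lines, send it to a hosted Mistral model, and parse a JSON object out of the reply. Everything named here is an assumption for illustration: the field keys, the prompt wording, the model id `mistralai/Mistral-7B-Instruct-v0.2`, and the use of `huggingface_hub.InferenceClient` are not confirmed by this repository.

```python
import json
import os

# Hypothetical key names for the fields listed in this README.
FIELDS = ["name", "designation", "contact_number", "address", "company_name"]


def build_prompt(ocr_lines):
    """Compose an instruction asking for a JSON object with fixed keys."""
    return (
        "Extract the following fields from this visiting card text and "
        "reply with a single JSON object using exactly these keys: "
        + ", ".join(FIELDS)
        + ". Use null for anything missing.\n\nCard text:\n"
        + "\n".join(ocr_lines)
    )


def parse_response(raw: str) -> dict:
    """Parse the span from the first '{' to the last '}' as JSON."""
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in model output")
    data = json.loads(raw[start:end + 1])
    # Keep only the expected keys; absent keys become None.
    return {field: data.get(field) for field in FIELDS}


def structure_with_mistral(ocr_lines, model_id="mistralai/Mistral-7B-Instruct-v0.2"):
    """Call a hosted model (requires `pip install huggingface_hub`)."""
    from huggingface_hub import InferenceClient  # imported lazily

    client = InferenceClient(model=model_id, token=os.environ["HF_TOKEN"])
    raw = client.text_generation(build_prompt(ocr_lines), max_new_tokens=256)
    return parse_response(raw)
```

Parsing defensively matters because instruction-tuned models often wrap the JSON in extra prose; a `ValueError` from `parse_response` is also a natural trigger for the fallback model.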
### Fallback Mechanism (Backup/backup.py):
- **Gliner urchade/gliner_mediumv2.1 Model**: If the Mistral model is unavailable, the system uses the **Gliner urchade/gliner_mediumv2.1 model** to perform the same task, ensuring continuous service.
- **Error Handling**: Catches failures when the Mistral model is unavailable and routes the request to the fallback model.
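The fallback behaviour described above can be sketched as a small wrapper that tries a primary extractor and, on any exception, calls a secondary one. The names `extract_with_fallback` and `gliner_extract`, and the label strings passed to GLiNER, are hypothetical; the actual logic in `Backup/backup.py` may be organized differently.

```python
import logging
from typing import Callable, Dict, List

logger = logging.getLogger(__name__)

# An extractor maps OCR lines to a dict of field name -> value.
Extractor = Callable[[List[str]], Dict[str, object]]


def extract_with_fallback(ocr_lines, primary: Extractor, fallback: Extractor):
    """Try the primary extractor; on any failure, use the fallback."""
    try:
        return primary(ocr_lines)
    except Exception as exc:
        logger.warning("Primary model failed (%s); using fallback", exc)
        return fallback(ocr_lines)


def gliner_extract(ocr_lines, model_id="urchade/gliner_mediumv2.1"):
    """Fallback extractor using GLiNER (requires `pip install gliner`)."""
    from gliner import GLiNER  # imported lazily: heavy dependency

    # Assumed label names; GLiNER accepts arbitrary label strings.
    labels = ["name", "designation", "phone number", "address", "company name"]
    model = GLiNER.from_pretrained(model_id)
    entities = model.predict_entities(" ".join(ocr_lines), labels)

    fields = {}
    for ent in entities:
        # Keep the first match per label.
        fields.setdefault(ent["label"], ent["text"])
    return fields
```

Passing the extractors in as callables keeps the wrapper testable without loading either model.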
### Web Interface (app.py):
- **Flask API**: Provides endpoints for image uploads and displays the results in a structured manner.
- **HTML Interface**: A frontend for users to upload images of visiting cards and view the parsed results.
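A minimal sketch of the upload flow is shown below, assuming a form field named `file`, an `/upload` route, an `uploads` directory, and a `data` template variable; none of these names are confirmed by the repository. The `template` folder and the two HTML file names follow the file structure listed in this README, and `process_image` stands in for the OCR and structuring pipeline.

```python
import os

ALLOWED_EXTENSIONS = {"png", "jpg", "jpeg"}


def allowed_file(filename: str) -> bool:
    """Accept only the image types the tool supports (PNG, JPG, JPEG)."""
    _, ext = os.path.splitext(filename)
    return ext.lower().lstrip(".") in ALLOWED_EXTENSIONS


def create_app(process_image, upload_dir="uploads"):
    """Build a Flask app; `process_image(path)` returns the parsed fields."""
    from flask import Flask, render_template, request  # imported lazily
    from werkzeug.utils import secure_filename

    app = Flask(__name__, template_folder="template")
    os.makedirs(upload_dir, exist_ok=True)

    @app.route("/", methods=["GET"])
    def index():
        return render_template("index.html")

    @app.route("/upload", methods=["POST"])
    def upload():
        file = request.files.get("file")
        if file is None or not allowed_file(file.filename or ""):
            return "Unsupported or missing file", 400
        # secure_filename strips path components from the client-supplied name.
        path = os.path.join(upload_dir, secure_filename(file.filename))
        file.save(path)
        return render_template("result.html", data=process_image(path))

    return app
```

Checking the extension before saving rejects unsupported uploads early, before any OCR work is done.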
# Tree Map of the Program:
```
app.py
├── Handles Flask API and web interface
├── Manages file upload
├── Extracts text with PaddleOCR
├── Processes text with Mistral 7B
└── Displays structured results
utility/utils.py
├── PaddleOCR for text extraction
└── Mistral 7B for data structuring
Backup/backup.py
├── Gliner urchade/gliner_mediumv2.1 as fallback
└── Backup and error handling
Backup/model.py
└── Gliner model integration and backup logic
```
# Main Task:
The main objective is to extract and structure text data from visiting cards. The system identifies and organizes:
- **Name**
- **Designation**
- **Phone Number**
- **Address**
- **Company Name**
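The five fields above map naturally onto a small record type. The sketch below is one possible representation; the class name `CardData` and the attribute names are hypothetical and not taken from the repository.

```python
from dataclasses import dataclass, asdict
from typing import List, Optional


@dataclass
class CardData:
    """Structured output for one visiting card; every field is optional."""
    name: Optional[str] = None
    designation: Optional[str] = None
    phone_number: Optional[str] = None
    address: Optional[str] = None
    company_name: Optional[str] = None

    def missing_fields(self) -> List[str]:
        """List the fields the extractor could not fill."""
        return [key for key, value in asdict(self).items() if not value]


card = CardData(name="John Doe", phone_number="+1 555 0100")
print(asdict(card))
print(card.missing_fields())
```

Keeping every field optional lets a partially readable card still produce a usable result, while `missing_fields` makes the gaps explicit for the results page.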
# References:
- [PaddleOCR Documentation](https://github.com/PaddlePaddle/PaddleOCR)
- [Mistral 7B Documentation](https://huggingface.co/)
- [Gliner urchade/gliner_mediumv2.1 Documentation](https://huggingface.co/)
- [Flask Documentation](https://flask.palletsprojects.com/)
- [Docker Documentation](https://docs.docker.com/)
- [Virtual Environments in Python](https://docs.python.org/3/tutorial/venv.html)
---