---
title: MinerU PDF Processor
emoji: π
colorFrom: blue
colorTo: indigo
sdk: docker
pinned: false
license: apache-2.0
app_port: 7860
---
# MinerU PDF API
A simple API for extracting text and tables from PDF documents using MinerU's magic-pdf library.
## Features
- Extract text from PDF documents
- Identify and extract tables from PDFs
- Works with both regular and scanned PDFs
- Simple JSON response format
## API Endpoints
### Health Check
```
GET /health
```
Returns the current status of the service.
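As a minimal sketch, the health check can be called from Python using only the standard library. The helper names `health_url` and `check_health` are hypothetical, and the base URL is an assumption based on the port configured for this Space. The response body format is not documented here, so the sketch returns it as raw text rather than parsing it.

```python
import urllib.request


def health_url(base_url: str) -> str:
    """Build the health-check URL from a base URL, tolerating a trailing slash."""
    return base_url.rstrip("/") + "/health"


def check_health(base_url: str, timeout: float = 10.0) -> tuple[int, str]:
    """Call GET /health and return (HTTP status code, raw response body).

    urlopen raises urllib.error.HTTPError for 4xx/5xx responses and
    urllib.error.URLError if the service cannot be reached.
    """
    with urllib.request.urlopen(health_url(base_url), timeout=timeout) as resp:
        return resp.status, resp.read().decode("utf-8")


# Example usage (requires a running service; the URL is an assumption):
# status, body = check_health("http://localhost:7860")
# print(status, body)
```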
### Extract PDF Content
```
POST /extract
```
Upload a PDF file to extract its text and tables.
#### Request
- `file`: The PDF file to process (multipart/form-data)
#### Response
JSON object containing:
- `filename`: Original filename
- `pages`: Array of pages with text and tables
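
The request and response described above could be exercised with a small client like the following. This is a sketch under stated assumptions, not the service's own client: `build_multipart` and `extract_pdf` are hypothetical helpers, the base URL and `sample.pdf` are placeholders, and the multipart encoding is hand-rolled with the standard library (it does not escape quotes in filenames). Only the documented `filename` and `pages` keys are read, since the structure of each page entry is not specified here.

```python
import json
import urllib.request
import uuid
from pathlib import Path


def build_multipart(field: str, filename: str, data: bytes) -> tuple[bytes, str]:
    """Encode one file as a multipart/form-data body; return (body, Content-Type)."""
    boundary = uuid.uuid4().hex  # random boundary, unlikely to occur in the payload
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: application/pdf\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def extract_pdf(base_url: str, pdf_path: str, timeout: float = 300.0) -> dict:
    """POST a PDF to /extract as the `file` field and return the parsed JSON."""
    path = Path(pdf_path)
    body, content_type = build_multipart("file", path.name, path.read_bytes())
    req = urllib.request.Request(
        base_url.rstrip("/") + "/extract",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Example usage (requires a running service and a local file named sample.pdf):
# result = extract_pdf("http://localhost:7860", "sample.pdf")
# print(result["filename"], len(result["pages"]))
```

The generous default timeout is a guess to allow for slow processing of large or scanned PDFs; adjust it to your workload.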
## Deployment
This application is deployed as a Hugging Face Space using Docker.
## Local Development
To run this application locally:
1. Install the requirements:
```
pip install -r requirements.txt
```
2. Run the application:
```
python app.py
```
3. Access the API at `http://localhost:7860`
## Docker
You can also build and run with Docker:
```bash
docker build -t mineru-pdf-api .
docker run -p 7860:7860 mineru-pdf-api
```
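
After `docker run`, the service may need some time before it accepts requests. One way to handle this, sketched below, is to poll `/health` until it answers. The helper names, retry count, and delay are all assumptions chosen for illustration, and treating HTTP 200 as "ready" is a guess, since the health response format is not documented here.

```python
import time
import urllib.request
from typing import Callable


def is_healthy(base_url: str) -> bool:
    """Return True if GET /health answers with HTTP 200."""
    try:
        with urllib.request.urlopen(base_url.rstrip("/") + "/health", timeout=5) as resp:
            return resp.status == 200
    except OSError:
        # URLError, HTTPError, timeouts, and connection errors are all OSError subclasses.
        return False


def wait_until_ready(
    base_url: str,
    attempts: int = 30,
    delay: float = 2.0,
    probe: Callable[[str], bool] = is_healthy,
) -> bool:
    """Poll the service until the probe succeeds or the attempts run out."""
    for attempt in range(attempts):
        if probe(base_url):
            return True
        if attempt < attempts - 1:
            time.sleep(delay)  # no sleep after the final failed attempt
    return False


# Example usage (requires the container from the commands above to be running):
# if wait_until_ready("http://localhost:7860"):
#     print("service is up")
```

The `probe` parameter exists so the polling loop can be tested without a network connection.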
## About
This API is built on MinerU and its magic-pdf library, which handle the PDF parsing and extraction.
## API Documentation
Once deployed, you can access the auto-generated Swagger documentation at:
```
https://marcosremar2-docker-mineru.hf.space/docs
```
For ReDoc documentation:
```
https://marcosremar2-docker-mineru.hf.space/redoc
```