prithivMLmods committed
Commit 5d5a1ee · verified · 1 Parent(s): 16a8f2a

Update README.md

Files changed (1)
  1. README.md +86 -2
README.md CHANGED
@@ -15,5 +15,89 @@ tags:
  - finance
  ---

- > [!Important]
- > Coming Soon ...
+
+ # **Document-Type-Detection**
+
+ > **Document-Type-Detection** is a multi-class image classification model based on `google/siglip2-base-patch16-224`, trained to classify scanned or photographed images by **document type**. It is intended for **automated document sorting**, **OCR pipelines**, and **digital archiving systems**.
+
+ ---
+
+ ## **Label Classes**
+
+ The model classifies images into the following document types:
+
+ ```
+ 0: Advertisement-Doc
+ 1: Hand-Written-Doc
+ 2: Invoice-Doc
+ 3: Letter-Doc
+ 4: News-Article-Doc
+ 5: Resume-Doc
+ ```
+
+ ---
+
+ ## **Installation**
+
+ ```bash
+ pip install transformers torch pillow gradio
+ ```
+
+ ---
+
+ ## **Example Inference Code**
+
+ ```python
+ import gradio as gr
+ from transformers import AutoImageProcessor, SiglipForImageClassification
+ from PIL import Image
+ import torch
+
+ # Load model and processor
+ model_name = "prithivMLmods/Document-Type-Detection"
+ model = SiglipForImageClassification.from_pretrained(model_name)
+ processor = AutoImageProcessor.from_pretrained(model_name)
+
+ # ID to label mapping
+ id2label = {
+     "0": "Advertisement-Doc",
+     "1": "Hand-Written-Doc",
+     "2": "Invoice-Doc",
+     "3": "Letter-Doc",
+     "4": "News-Article-Doc",
+     "5": "Resume-Doc"
+ }
+
+ def detect_doc_type(image):
+     image = Image.fromarray(image).convert("RGB")
+     inputs = processor(images=image, return_tensors="pt")
+
+     with torch.no_grad():
+         outputs = model(**inputs)
+         logits = outputs.logits
+         probs = torch.nn.functional.softmax(logits, dim=1).squeeze().tolist()
+
+     prediction = {id2label[str(i)]: round(probs[i], 3) for i in range(len(probs))}
+     return prediction
+
+ # Gradio Interface
+ iface = gr.Interface(
+     fn=detect_doc_type,
+     inputs=gr.Image(type="numpy"),
+     outputs=gr.Label(num_top_classes=6, label="Document Type"),
+     title="Document-Type-Detection",
+     description="Upload a document image to classify it as one of: Advertisement, Hand-Written, Invoice, Letter, News Article, or Resume."
+ )
+
+ if __name__ == "__main__":
+     iface.launch()
+ ```
+
+ ---
+
+ ## **Applications**
+
+ * **Automated Document Sorting**
+ * **Digital Libraries and Archives**
+ * **OCR Preprocessing**
+ * **Enterprise Document Management**
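
The "Automated Document Sorting" use case listed in the README could be wired up roughly as follows. This is a minimal sketch, not part of the commit: it assumes a prediction dict shaped like the one returned by `detect_doc_type` (label mapped to probability), and the helper names `pick_label` and `sort_document`, the `"Unsorted"` fallback folder, and the `0.5` confidence threshold are all hypothetical choices made for illustration.

```python
from pathlib import Path
import shutil


def pick_label(prediction, min_confidence=0.5, fallback="Unsorted"):
    """Return the highest-scoring label, or `fallback` if the score is too low.

    `prediction` is assumed to map label names to probabilities, like the
    dict returned by `detect_doc_type` in the README.
    """
    if not prediction:
        return fallback
    label, score = max(prediction.items(), key=lambda item: item[1])
    return label if score >= min_confidence else fallback


def sort_document(image_path, prediction, output_root, min_confidence=0.5):
    """Copy `image_path` into `output_root/<label>/` and return the new path."""
    label = pick_label(prediction, min_confidence)
    target_dir = Path(output_root) / label
    target_dir.mkdir(parents=True, exist_ok=True)
    destination = target_dir / Path(image_path).name
    shutil.copy2(image_path, destination)  # copy rather than move, so the source stays intact
    return destination
```

Low-confidence images land in the fallback folder instead of being forced into one of the six classes, which leaves them available for manual review. The threshold would need tuning against real documents.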