apkonsta committed
Commit 097f1bc · verified · 1 Parent(s): 378e6d5

Update README.md

Files changed (1):
  1. README.md +50 -0
README.md CHANGED
@@ -8,3 +8,53 @@ library_name: transformers
 
 This repository contains a fine-tuned version of the Table Transformer model, specifically adapted for detecting tables in IFRS (International Financial Reporting Standards) PDFs. The model is based on the Table Transformer architecture, which is designed to extract tables from unstructured documents such as PDFs and images.
 
+# Usage
+
+```python
+from transformers import TableTransformerForObjectDetection, DetrImageProcessor
+from PIL import Image
+import torch
+
+# Load the image processor
+# DetrImageProcessor is used to preprocess the images before feeding them to the model
+image_processor = DetrImageProcessor()
+
+# Load the fine-tuned Table Transformer model for object detection
+# This model is specifically trained for detecting tables in IFRS documents
+model = TableTransformerForObjectDetection.from_pretrained(
+    "apkonsta/table-transformer-detection-ifrs",
+)
+
+# Prepare the image
+# Open the image file and convert it to RGB format
+image = Image.open("path/to/your/ifrs_pdf_page.png").convert("RGB")
+
+# Table detection threshold
+# Only detections with a confidence score above this threshold will be considered
+TD_th = 0.5
+
+# Preprocess the image using the image processor
+# The image is encoded into the tensor format the model expects
+encoding = image_processor(image, return_tensors="pt")
+
+# Perform inference without computing gradients (saves memory and computation)
+with torch.no_grad():
+    outputs = model(**encoding)
+
+# Get the class probabilities for each detected object
+# Softmax over the logits; the last class ("no object") is dropped
+probas = outputs.logits.softmax(-1)[0, :, :-1]
+
+# Keep only the detections with a confidence score above the threshold
+keep = probas.max(-1).values > TD_th
+
+# Get the target sizes for post-processing
+# The target sizes are the (height, width) of the original image
+target_sizes = torch.tensor(image.size[::-1]).unsqueeze(0)
+
+# Post-process the model outputs to get the final bounding boxes,
+# scaled back to the original image size
+postprocessed_outputs = image_processor.post_process(outputs, target_sizes)
+bboxes_scaled = postprocessed_outputs[0]["boxes"][keep]
+```
+
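
Note: the snippet added in this commit assumes the IFRS PDF page is already available as a PNG image. As context only (not part of the commit), here is a minimal sketch of one way to render a PDF page and crop the detected tables; it assumes the pdf2image package (which requires Poppler) is installed, and the PDF path and output filenames are hypothetical.

```python
from pdf2image import convert_from_path

# Render the first page of an IFRS PDF to a PIL image (hypothetical path)
pages = convert_from_path("path/to/your/ifrs_report.pdf", dpi=200)
image = pages[0].convert("RGB")

# ... run the detection snippet from the README above to obtain bboxes_scaled ...

# Crop each detected table region out of the page image and save it
for i, box in enumerate(bboxes_scaled.tolist()):
    xmin, ymin, xmax, ymax = box
    table_crop = image.crop((xmin, ymin, xmax, ymax))
    table_crop.save(f"table_{i}.png")
```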