How to fine-tune microsoft/table-transformer-detection in Hugging Face?

#16
by Spondon - opened

Dear All,

After reading all the threads available on the internet, I am using the script below to fine-tune table-transformer-detection:
https://github.com/NielsRogge/Transformers-Tutorials/blob/master/DETR/Fine_tuning_DetrForObjectDetection_on_custom_dataset_(balloon).ipynb

I have replaced:

processor = DetrImageProcessor.from_pretrained("facebook/detr-resnet-50")

with:

processor = DetrImageProcessor()
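For reference, preprocessing with the default processor looks roughly like this (a minimal sketch; the dummy image and annotation values are placeholders):

```python
import numpy as np
from PIL import Image
from transformers import DetrImageProcessor

# Default constructor: resizes the shortest edge to 800 px and normalizes
# with ImageNet mean/std, the preprocessing the DETR/TATR checkpoints expect.
processor = DetrImageProcessor()

image = Image.fromarray(np.zeros((480, 640, 3), dtype=np.uint8))  # dummy page
target = {
    "image_id": 0,
    "annotations": [
        # COCO-style object: bbox is [x, y, width, height] in pixels
        {"bbox": [10, 10, 200, 100], "category_id": 0, "area": 20000, "iscrowd": 0},
    ],
}
encoding = processor(images=image, annotations=target, return_tensors="pt")
pixel_values = encoding["pixel_values"]  # (1, 3, H, W), resized and normalized
labels = encoding["labels"][0]           # dict with "class_labels", "boxes", ...
```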

and also replaced:

DetrForObjectDetection.from_pretrained(
    "facebook/detr-resnet-50",
    revision="no_timm",
    num_labels=len(id2label),
    ignore_mismatched_sizes=True,
)

with:

TableTransformerForObjectDetection.from_pretrained(
"microsoft/table-transformer-structure-recognition",
ignore_mismatched_sizes=True,
)
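One thing worth double-checking: this replacement drops the num_labels argument, and it loads the structure-recognition checkpoint while the goal here is detection. A minimal sketch of loading a checkpoint with a custom label set (the label names are made up for illustration; adjust to your dataset):

```python
from transformers import TableTransformerForObjectDetection

# Hypothetical label set for a custom table-detection dataset.
id2label = {0: "table"}
label2id = {v: k for k, v in id2label.items()}

model = TableTransformerForObjectDetection.from_pretrained(
    "microsoft/table-transformer-detection",  # detection checkpoint
    num_labels=len(id2label),
    id2label=id2label,
    label2id=label2id,
    ignore_mismatched_sizes=True,  # re-initializes the classification head
)
```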

After fine-tuning the model using Trainer(max_steps=3000, gradient_clip_val=0.1), I am getting the very low accuracy below:
IoU metric: bbox
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.334
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.539
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.356
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = -1.000
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.334
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.223
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.468
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.487
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = -1.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.487

My dataset size is:
Number of training examples: 159
Number of validation examples: 19

Any thoughts on this?
P.S. I know about https://github.com/microsoft/table-transformer/ and how to fine-tune using that project, and I also know about convert_table_transformer_original_pytorch_checkpoint_to_pytorch.py in the transformers repo. But my question is: why is the fine-tuning above giving me such low accuracy? Should I increase my dataset size, or am I missing anything? I am using a proprietary dataset.

Hi @Spondon. How did you pre-process (create the dataset from) your own custom data? Any code or links for it?
Thanks.

Hi @mali17361 ,
Code is proprietary, but the dataset format is COCO.

Interestingly, when I convert the dataset to PASCAL VOC format and fine-tune using the Table Transformer source script (https://github.com/microsoft/table-transformer/), I get the accuracy below after 16 epochs:

IoU metric: bbox
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.823
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.959
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.880
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = -1.000
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.823
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.533
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.877
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.887
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = -1.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.887
pubmed: AP50: 0.959, AP75: 0.880, AP: 0.823, AR: 0.887

Any thoughts on this?

Hi @Spondon. Thanks for the reply.
I was able to convert my own data to COCO format, create a custom dataset, and fine-tune the model.
I'm also getting very low accuracy scores.
Maybe our datasets are too small. What do you think?
Have you tried any other method?
I also have an issue with the output classes: "table-structure-recognition" has 6 output classes, the balloon dataset has only 1, and my dataset has 2. When I run inference on outside data, the outputs come in only 1 or 2 tensors, i.e. only 2 classes rather than all 6.

Hi @Spondon ,

We just updated our object detection guide (for easier mAP calculation with the Trainer API): https://huggingface.co/docs/transformers/main/en/tasks/object_detection, and we have now also added official object detection scripts (both with the Trainer API and Accelerate): https://github.com/huggingface/transformers/tree/main/examples/pytorch/object-detection.

Definitely recommend these guides for fine-tuning Table Transformer on a custom dataset.

Hi @Spondon , @mali17361

Can you tell me how you created your custom dataset? I am not asking for the code; just general information about the labeling tool and the dataset structure would help. I want to create a custom dataset for fine-tuning. Can you also tell me the minimum number of tables needed to fine-tune the Table Transformer model? Your input would help a lot!

Thanks

Hi @icecandyman

I created my dataset using Label Studio, based on this template: https://labelstud.io/templates/image_bbox

You can customize the labels as you wish; here's my template, if it helps:

<View>
  <View style="display:flex; align-items:start; flex-direction:row; gap:8px;">
    <View display="block" style="width:100%;">
      <Image name="image" value="$ocr" zoom="true" width="100%" height="100%" zoomControl="false" rotateControl="false" horizontalAlignment="center" crosshair="true"/>
    </View>
   </View>

  <RectangleLabels name="label" toName="image">
    <Label value="table" background="green"/>
    <Label value="header" background="red"/>
    <Label value="row" background="blue"/>
    <Label value="column" background="yellow"/>
  </RectangleLabels>
</View>

Once finished, you can export in COCO or PASCAL VOC format.

Hi @Paul21777
Thanks for the reply!
What was the structure of your dataset? And which method/script did you use to fine-tune? There are multiple, and the source script at https://github.com/microsoft/table-transformer/ is not working for me; it is giving import errors!

@icecandyman
I'm currently working on it following https://huggingface.co/docs/transformers/tasks/object_detection. There you can find the format expected by Table Transformer (in the link it is done for DETR, but it works the same for TATR; we just have to change the model name and the way images are processed).

 The examples in the dataset have the following fields:

    image_id: the example image id
    image: a PIL.Image.Image object containing the image
    width: width of the image
    height: height of the image
    objects: a dictionary containing bounding box metadata for the objects in the image:
        id: the annotation id
        area: the area of the bounding box
        bbox: the object’s bounding box (in the COCO format )
        category: the object’s category, with possible values including Coverall (0), Face_Shield (1), Gloves (2), Goggles (3) and Mask (4)
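Concretely, a single example in that format could look like this (values are invented; the category ids are placeholders for a two-class table dataset rather than the CPPE-5 classes quoted above):

```python
import numpy as np
from PIL import Image

# Invented example following the field layout described above
# (0 = table, 1 = header, as hypothetical categories).
example = {
    "image_id": 7,
    "image": Image.fromarray(np.zeros((1024, 768, 3), dtype=np.uint8)),
    "width": 768,
    "height": 1024,
    "objects": {
        "id": [101, 102],
        "area": [120000.0, 30000.0],
        "bbox": [  # COCO format: [x, y, width, height] in pixels
            [50.0, 100.0, 600.0, 200.0],
            [50.0, 100.0, 600.0, 50.0],
        ],
        "category": [0, 1],
    },
}
```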

Apparently, even the COCO-format bounding-box annotations that Label Studio exports are not directly suited for training. I'm currently writing a script to convert them; I will send it here ASAP if you want.
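For reference, one way such a conversion could look (a hedged sketch only: it assumes a standard COCO export with "images" and "annotations" sections, and the output field names follow the guide quoted above):

```python
import json
from collections import defaultdict

def coco_to_examples(coco_path):
    """Regroup a COCO export into per-example 'objects' dicts."""
    with open(coco_path) as f:
        coco = json.load(f)

    # Index annotations by the image they belong to.
    anns_by_image = defaultdict(list)
    for ann in coco["annotations"]:
        anns_by_image[ann["image_id"]].append(ann)

    examples = []
    for img in coco["images"]:
        anns = anns_by_image[img["id"]]
        examples.append({
            "image_id": img["id"],
            "image": img["file_name"],  # load with PIL.Image.open at training time
            "width": img["width"],
            "height": img["height"],
            "objects": {
                "id": [a["id"] for a in anns],
                # Fall back to w*h when the export omits "area".
                "area": [a.get("area", a["bbox"][2] * a["bbox"][3]) for a in anns],
                "bbox": [a["bbox"] for a in anns],  # COCO [x, y, w, h]
                "category": [a["category_id"] for a in anns],
            },
        })
    return examples
```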

@Paul21777
Hi, I probably need that script to convert my current dataset for fine-tuning this model. Could you share it?
