Commit 6adb3be (verified) by tadkt · 1 Parent(s): ad78d7a

Update README.md

Files changed (1): README.md +46 -3
@@ -1,3 +1,46 @@
- ---
- license: apache-2.0
- ---
+ ---
+ license: apache-2.0
+ language:
+ - vi
+ - en
+ pipeline_tag: image-text-to-text
+ library_name: transformers
+ tags:
+ - got
+ - vision-language
+ - ocr2.0
+ - got_vietnamese
+ ---
+ ## Usage
+ Inference uses Hugging Face Transformers on NVIDIA GPUs. The following requirements were tested on Python 3.10:
+ ```
+ torch==2.0.1
+ torchvision==0.15.2
+ transformers==4.37.2
+ tiktoken==0.6.0
+ verovio==4.3.1
+ accelerate==0.28.0
+ ```
+ ```python
+ from transformers import AutoModel, AutoTokenizer
+ tokenizer = AutoTokenizer.from_pretrained('tadkt/GOT_Vietnamese', trust_remote_code=True)
+ model = AutoModel.from_pretrained('tadkt/GOT_Vietnamese', trust_remote_code=True, low_cpu_mem_usage=True, device_map='cuda', use_safetensors=True, pad_token_id=tokenizer.eos_token_id)
+ model = model.eval().cuda()
+ # path to your test image
+ image_file = 'xxx.jpg'
+ # plain-text OCR:
+ res = model.chat(tokenizer, image_file, ocr_type='ocr')
+ # formatted-text OCR:
+ # res = model.chat(tokenizer, image_file, ocr_type='format')
+ # fine-grained OCR:
+ # res = model.chat(tokenizer, image_file, ocr_type='ocr', ocr_box='')
+ # res = model.chat(tokenizer, image_file, ocr_type='format', ocr_box='')
+ # res = model.chat(tokenizer, image_file, ocr_type='ocr', ocr_color='')
+ # res = model.chat(tokenizer, image_file, ocr_type='format', ocr_color='')
+ # multi-crop OCR:
+ # res = model.chat_crop(tokenizer, image_file, ocr_type='ocr')
+ # res = model.chat_crop(tokenizer, image_file, ocr_type='format')
+ # render the formatted OCR results:
+ # res = model.chat(tokenizer, image_file, ocr_type='format', render=True, save_render_file='./demo.html')
+ print(res)
+ ```
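
The snippet in the diff runs OCR on a single image. As a rough sketch that is not part of this commit, the same call could be applied to every image in a folder. The helper name `ocr_folder`, the `ocr_fn` callable, and the suffix list are hypothetical and only illustrate the idea; the OCR call itself is passed in, so the helper does not depend on the model being loaded.

```python
from pathlib import Path
from typing import Callable, Dict, Iterable

# Hypothetical default; adjust to the image types you actually have.
IMAGE_SUFFIXES = (".jpg", ".jpeg", ".png")


def ocr_folder(
    folder: str,
    ocr_fn: Callable[[str], str],
    suffixes: Iterable[str] = IMAGE_SUFFIXES,
) -> Dict[str, str]:
    """Apply ocr_fn to each image file in folder, keyed by file name."""
    allowed = {s.lower() for s in suffixes}
    results: Dict[str, str] = {}
    # Sort so the output order is stable across runs.
    for path in sorted(Path(folder).iterdir()):
        if path.is_file() and path.suffix.lower() in allowed:
            results[path.name] = ocr_fn(str(path))
    return results


# Assumed usage with the model and tokenizer loaded as in the README snippet:
# texts = ocr_folder("scans", lambda p: model.chat(tokenizer, p, ocr_type="ocr"))
```

Passing the OCR call as a function keeps the folder traversal separate from the model, so the same helper would work with `ocr_type='format'` or `model.chat_crop` without changes.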