prithivMLmods committed
Commit 748ca0d · verified · 1 Parent(s): 6ee72d6

Update README.md

  1. README.md +11 -16
README.md CHANGED
@@ -11,43 +11,42 @@ tags:
  - text-generation-inference
  - label
  ---
- ![10.png](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/tCeb4MVz79GCiOmMxkLcN.png)
- # **LabelPRO**
+ ![VSXzdfgvsdxf.png](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/nNF_6UCnmgHKjNmLaA2QA.png)
 
- **LabelPRO** is an advanced image caption and annotation generator optimized for generating detailed, structured JSON outputs. Built upon a powerful vision-language architecture with enhanced OCR and multilingual support, LabelPRO extracts high-quality captions and annotations from images for seamless integration into your applications.
+ # **Caption-Pro**
+
+ **Caption-Pro** is an advanced image caption and annotation generator optimized for generating detailed, structured JSON outputs. Built upon a powerful vision-language architecture with enhanced OCR and multilingual support, Caption-Pro extracts high-quality captions and annotations from images for seamless integration into your applications.
 
  #### Key Enhancements:
 
- * **Advanced Image Understanding**: Fine-tuned on millions of annotated images, LabelPRO delivers precise comprehension and interpretation of visual content.
+ * **Advanced Image Understanding**: Fine-tuned on millions of annotated images, Caption-Pro delivers precise comprehension and interpretation of visual content.
  * **Optimized for JSON Output**: Produces structured JSON data containing captions and detailed annotations—perfect for integration with databases, APIs, and automation pipelines.
  * **Enhanced OCR Capabilities**: Accurately extracts textual content from images in multiple languages, including English, Chinese, Japanese, Korean, Arabic, and more.
  * **Multimodal Processing**: Seamlessly handles both image and text inputs, generating comprehensive annotations based on the provided image.
  * **Multilingual Support**: Recognizes and processes text within images across various languages.
  * **Secure and Optimized Model Weights**: Employs safetensors for efficient and secure model loading.
 
- ---
-
  ### How to Use
 
  ```python
  from transformers import Qwen2VLForConditionalGeneration, AutoTokenizer, AutoProcessor
  from qwen_vl_utils import process_vision_info
 
- # Load the LabelPRO model with optimized parameters
+ # Load the Caption-Pro model with optimized parameters
  model = Qwen2VLForConditionalGeneration.from_pretrained(
- "prithivMLmods/LabelPRO", torch_dtype="auto", device_map="auto"
+ "prithivMLmods/Caption-Pro", torch_dtype="auto", device_map="auto"
  )
 
  # Recommended acceleration for performance optimization:
  # model = Qwen2VLForConditionalGeneration.from_pretrained(
- # "prithivMLmods/LabelPRO",
+ # "prithivMLmods/Caption-Pro",
  # torch_dtype=torch.bfloat16,
  # attn_implementation="flash_attention_2",
  # device_map="auto",
  # )
 
- # Load the default processor for LabelPRO
- processor = AutoProcessor.from_pretrained("prithivMLmods/LabelPRO")
+ # Load the default processor for Caption-Pro
+ processor = AutoProcessor.from_pretrained("prithivMLmods/Caption-Pro")
 
  # Define the input messages with both an image and a text prompt
  messages = [
@@ -88,8 +87,6 @@ output_text = processor.batch_decode(
  print(output_text)
  ```
 
- ---
-
  ### **Key Features**
 
  1. **Annotation-Ready Training Data**
@@ -110,6 +107,4 @@ print(output_text)
  6. **Secure and Efficient Model Weights**
  - Uses safetensors for enhanced security and optimized model performance.
 
- ---
-
- **LabelPRO** streamlines the process of generating image captions and annotations, making it an ideal solution for applications that require detailed visual content analysis and structured data integration.
+ **Caption-Pro** streamlines the process of generating image captions and annotations, making it an ideal solution for applications that require detailed visual content analysis and structured data integration.
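
The structured JSON output described in this README still arrives as decoded text (the `output_text` printed in the usage snippet), so downstream code usually has to locate and parse the JSON itself. Below is a minimal sketch of that step using only the Python standard library. The helper name `extract_json`, the sample string, and the `caption` / `annotations` field names are illustrative assumptions, not the model's documented schema.

```python
import json


def extract_json(text: str):
    """Return the first JSON object or array embedded in `text`, or None.

    Hypothetical helper: generated text may wrap the JSON in extra prose,
    so we scan for an opening brace or bracket and try to decode from there.
    """
    decoder = json.JSONDecoder()
    for index, char in enumerate(text):
        if char in "{[":
            try:
                parsed, _end = decoder.raw_decode(text, index)
                return parsed
            except json.JSONDecodeError:
                # Not valid JSON at this position; keep scanning.
                continue
    return None


# Assumed example of decoded model output; the real schema may differ.
sample_output = (
    'Here is the annotation: '
    '{"caption": "A cat sleeping on a sofa", "annotations": ["cat", "sofa"]}'
)

result = extract_json(sample_output)
if result is not None:
    print(result["caption"])
```

Returning `None` instead of raising lets a pipeline decide for itself whether to retry generation or skip the image when no valid JSON is found.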