Prompt template
Can you please share what is prompt template of this model? I want to load this model to ollama. Thanks
@jitvimol Sure! A structured step-by-step approach is used to parse content types such as receipts, tables, graphs, and image captions using the Chain-of-Thought (CoT) prompt template. Here's how it looks:
prompt_parsing = (
"You are parsing a document with diverse content types, including a restaurant receipt, graphs, tables, and image captions. "
"Please proceed with a step-by-step approach for each content type:\n\n"
"1. **Receipt Parsing**:\n"
" - First, locate the total amount paid and the tip amount within the receipt. Identify the currency, if present. "
" - Ensure that both amounts are clearly extracted and labeled.\n\n"
"2. **Tables**:\n"
" - Step 1: Identify the table headers and list them as column labels.\n"
" - Step 2: Extract each row, preserving the order and structure exactly as shown. Note any visual cues, like color coding or symbols, "
" to indicate significance (e.g., green for high values, yellow for moderate values).\n"
" - Step 3: Summarize any key metrics or summary statistics, such as CAGR or average YoY growth, and include units as applicable.\n\n"
"3. **Graphs**:\n"
" - Step 1: Determine the graph type (e.g., line, bar) and list the axes labels (e.g., 'Mn Users', '%').\n"
" - Step 2: Describe key trends, such as growth or decline patterns, and highlight any major changes over time.\n"
" - Step 3: Identify significant data points and extract approximate (X, Y) values from each line in the graph. "
" Record these values systematically, capturing the trend and any inflection points or peaks.\n"
" - Step 4: Note any percentage growth indicators or additional annotations along the line.\n"
" - Step 5: If there are multiple lines with different colors, create a separate list of (X, Y) values for each line based on the color-coding in the legend.\n\n"
"4. **Image Captions**:\n"
" - Summarize the image caption by identifying the main description and context provided. Keep the summary concise, capturing "
" only the essential information conveyed.\n\n"
"Output each section's extracted data in a structured markdown format, with headers separating each type (e.g., '## Receipt', '## Table', "
"'## Graph', '## Image Caption'). Use bullet points for lists, maintain tables for tabular data, and ensure content is clearly organized for easy reference. "
"For graph data, provide a table of X and Y values, with columns labeled appropriately, such as 'Month' and 'Mn Users' for each line."
)
@jitvimol Sure! A structured step-by-step approach is used to parse content types such as receipts, tables, graphs, and image captions using the Chain-of-Thought (CoT) prompt template. Here's how it looks:
prompt_parsing = ( "You are parsing a document with diverse content types, including a restaurant receipt, graphs, tables, and image captions. " "Please proceed with a step-by-step approach for each content type:\n\n" "1. **Receipt Parsing**:\n" " - First, locate the total amount paid and the tip amount within the receipt. Identify the currency, if present. " " - Ensure that both amounts are clearly extracted and labeled.\n\n" "2. **Tables**:\n" " - Step 1: Identify the table headers and list them as column labels.\n" " - Step 2: Extract each row, preserving the order and structure exactly as shown. Note any visual cues, like color coding or symbols, " " to indicate significance (e.g., green for high values, yellow for moderate values).\n" " - Step 3: Summarize any key metrics or summary statistics, such as CAGR or average YoY growth, and include units as applicable.\n\n" "3. **Graphs**:\n" " - Step 1: Determine the graph type (e.g., line, bar) and list the axes labels (e.g., 'Mn Users', '%').\n" " - Step 2: Describe key trends, such as growth or decline patterns, and highlight any major changes over time.\n" " - Step 3: Identify significant data points and extract approximate (X, Y) values from each line in the graph. " " Record these values systematically, capturing the trend and any inflection points or peaks.\n" " - Step 4: Note any percentage growth indicators or additional annotations along the line.\n" " - Step 5: If there are multiple lines with different colors, create a separate list of (X, Y) values for each line based on the color-coding in the legend.\n\n" "4. **Image Captions**:\n" " - Summarize the image caption by identifying the main description and context provided. Keep the summary concise, capturing " " only the essential information conveyed.\n\n" "Output each section's extracted data in a structured markdown format, with headers separating each type (e.g., '## Receipt', '## Table', " "'## Graph', '## Image Caption'). Use bullet points for lists, maintain tables for tabular data, and ensure content is clearly organized for easy reference. " "For graph data, provide a table of X and Y values, with columns labeled appropriately, such as 'Month' and 'Mn Users' for each line." )
Thank you. Btw above is not template that used with ollama modelfile right? The usage is per below, I used what you provided and import to ollama but imported model seem to malfunction.
Modelfile
FROM "~/typhoon2-qwen2vl-7b-vision-instruct.Q4_K_S.gguf"
TEMPLATE """ """
My bad, you can find it in files: https://huggingface.co/scb10x/typhoon2-qwen2vl-7b-vision-instruct/blob/main/chat_template.json
Follow these steps:
FROM typhoon2-qwen2vl-7b-vision-instruct.gguf
PARAMETER stop <|im_start|>
PARAMETER stop <|im_end|>
TEMPLATE """
{% set image_count = namespace(value=0) %}
{% set video_count = namespace(value=0) %}
{% for message in messages %}
{% if loop.first and message['role'] != 'system' %}
<|im_start|>system
You are a helpful assistant.
<|im_end|>
{% endif %}
<|im_start|>{{ message['role'] }}
{% if message['content'] is string %}
{{ message['content'] }}<|im_end|>
{% else %}
{% for content in message['content'] %}
{% if content['type'] == 'image' or 'image' in content or 'image_url' in content %}
{% set image_count.value = image_count.value + 1 %}
{% if add_vision_id %}Picture {{ image_count.value }}: {% endif %}
<|vision_start|><|image_pad|><|vision_end|>
{% elif content['type'] == 'video' or 'video' in content %}
{% set video_count.value = video_count.value + 1 %}
{% if add_vision_id %}Video {{ video_count.value }}: {% endif %}
<|vision_start|><|video_pad|><|vision_end|>
{% elif 'text' in content %}
{{ content['text'] }}
{% endif %}
{% endfor %}
<|im_end|>
{% endif %}
{% endfor %}
{% if add_generation_prompt %}
<|im_start|>assistant
{% endif %}
"""
Add on: https://www.ollama.com/jmorgan/qwen2vl-test/blobs/41190096a061
@jitvimol
My bad, you can find it in files: https://huggingface.co/scb10x/typhoon2-qwen2vl-7b-vision-instruct/blob/main/chat_template.json
Follow these steps:
FROM typhoon2-qwen2vl-7b-vision-instruct.gguf PARAMETER stop <|im_start|> PARAMETER stop <|im_end|> TEMPLATE """ {% set image_count = namespace(value=0) %} {% set video_count = namespace(value=0) %} {% for message in messages %} {% if loop.first and message['role'] != 'system' %} <|im_start|>system You are a helpful assistant. <|im_end|> {% endif %} <|im_start|>{{ message['role'] }} {% if message['content'] is string %} {{ message['content'] }}<|im_end|> {% else %} {% for content in message['content'] %} {% if content['type'] == 'image' or 'image' in content or 'image_url' in content %} {% set image_count.value = image_count.value + 1 %} {% if add_vision_id %}Picture {{ image_count.value }}: {% endif %} <|vision_start|><|image_pad|><|vision_end|> {% elif content['type'] == 'video' or 'video' in content %} {% set video_count.value = video_count.value + 1 %} {% if add_vision_id %}Video {{ video_count.value }}: {% endif %} <|vision_start|><|video_pad|><|vision_end|> {% elif 'text' in content %} {{ content['text'] }} {% endif %} {% endfor %} <|im_end|> {% endif %} {% endfor %} {% if add_generation_prompt %} <|im_start|>assistant {% endif %} """
Add on: https://www.ollama.com/jmorgan/qwen2vl-test/blobs/41190096a061
@jitvimol
Thank you very much. I'm checking it.