---
base_model: ytu-ce-cosmos/Turkish-LLaVA-v0.1
license: mit
language:
- tr
tags:
- LLaVA
- llava_llama
pipeline_tag: image-text-to-text
---

# Turkish-LLaVA-v0.1-Q4_K_M-GGUF

This is a GGUF conversion and Q4_K_M quantization of the [ytu-ce-cosmos/Turkish-LLaVA-v0.1](https://huggingface.co/ytu-ce-cosmos/Turkish-LLaVA-v0.1) vision-language model, produced with [llama.cpp](https://github.com/ggerganov/llama.cpp).

## Usage

You can use the model with the [`llama-cpp-python`](https://github.com/abetlen/llama-cpp-python) package as follows:

```py
from llama_cpp import Llama
from llama_cpp.llama_chat_format import Llama3VisionAlphaChatHandler

llm = Llama(
  model_path="Turkish-LLaVA-v0.1-Q4_K_M.gguf", # path to the language model
  n_gpu_layers=-1, # offload all layers to the GPU
  chat_handler=Llama3VisionAlphaChatHandler(
      # path to the image encoder (CLIP projector)
      clip_model_path="Turkish-LLaVA-v0.1-mmproj-F16.gguf",
  ),
  seed=1337, # fixed seed for reproducible outputs
  n_ctx=4096, # increase the context size to accommodate the image embedding
  verbose=False, # silence llama.cpp logging
)

# url for the input image
url = "https://huggingface.co/ytu-ce-cosmos/Turkish-LLaVA-v0.1/resolve/main/example.jpg"

messages = [
    {"role": "system", "content": "Sen yardımsever bir asistansın."},
    {
        "role": "user",
        "content": [
            {"type" : "text", "text": "Bu resimde neler görüyorsun?"},
            {"type": "image_url", "image_url": {"url": url}}
        ]
    },
]

response = llm.create_chat_completion(
    messages=messages,
    max_tokens=64,
)

print(response["choices"][0]["message"]["content"])
# Output: Resimde, sarı çiçeklerle çevrili bir köpek yavrusu görülüyor.
```
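If your input image lives on disk rather than at a URL, `llama-cpp-python` also accepts base64 data URIs in the `image_url` field. A minimal helper for that (the function name and the hard-coded JPEG MIME type are illustrative assumptions, not part of this model's API):

```python
import base64

def image_to_data_uri(path: str) -> str:
    # Read the image bytes and encode them as a base64 data URI,
    # which can be passed to "image_url" in place of an http(s) URL.
    with open(path, "rb") as f:
        encoded = base64.b64encode(f.read()).decode("utf-8")
    return f"data:image/jpeg;base64,{encoded}"

# Use it in the message exactly like the remote URL above:
# {"type": "image_url", "image_url": {"url": image_to_data_uri("example.jpg")}}
```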