KevinHuSh committed
Commit 640c593 · 1 Parent(s): 8f9784a

Support new feature about Ollama (#262)

### What problem does this PR solve?

Issue link: #221

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

Files changed (4)
  1. README.md +3 -1
  2. README_ja.md +3 -1
  3. README_zh.md +3 -1
  4. api/apps/conversation_app.py +8 -5
README.md CHANGED
@@ -101,6 +101,7 @@
  
  ```bash
  $ cd ragflow/docker
+ $ chmod +x ./entrypoint.sh
  $ docker compose up -d
  ```
  
@@ -165,12 +166,13 @@ $ git clone https://github.com/infiniflow/ragflow.git
  $ cd ragflow/
  $ docker build -t infiniflow/ragflow:v1.0 .
  $ cd ragflow/docker
+ $ chmod +x ./entrypoint.sh
  $ docker compose up -d
  ```
  
  ## 🆕 Latest Features
  
- - Support [Ollam](./docs/ollama.md) for local LLM deployment.
+ - Support [Ollama](./docs/ollama.md) for local LLM deployment.
  - Support Chinese UI.
  
  ## 📜 Roadmap
README_ja.md CHANGED
@@ -101,6 +101,7 @@
  
  ```bash
  $ cd ragflow/docker
+ $ chmod +x ./entrypoint.sh
  $ docker compose up -d
  ```
  
@@ -165,12 +166,13 @@ $ git clone https://github.com/infiniflow/ragflow.git
  $ cd ragflow/
  $ docker build -t infiniflow/ragflow:v1.0 .
  $ cd ragflow/docker
+ $ chmod +x ./entrypoint.sh
  $ docker compose up -d
  ```
  
  ## 🆕 最新の新機能
  
- - [Ollam](./docs/ollama.md) を使用した大規模モデルのローカライズされたデプロイメントをサポートします。
+ - [Ollama](./docs/ollama.md) を使用した大規模モデルのローカライズされたデプロイメントをサポートします。
  - 中国語インターフェースをサポートします。
  
  ## 📜 ロードマップ
README_zh.md CHANGED
@@ -101,6 +101,7 @@
  
  ```bash
  $ cd ragflow/docker
+ $ chmod +x ./entrypoint.sh
  $ docker compose -f docker-compose-CN.yml up -d
  ```
  
@@ -165,12 +166,13 @@ $ git clone https://github.com/infiniflow/ragflow.git
  $ cd ragflow/
  $ docker build -t infiniflow/ragflow:v1.0 .
  $ cd ragflow/docker
+ $ chmod +x ./entrypoint.sh
  $ docker compose up -d
  ```
  
  ## 🆕 最近新特性
  
- - 支持用 [Ollam](./docs/ollama.md) 对大模型进行本地化部署。
+ - 支持用 [Ollama](./docs/ollama.md) 对大模型进行本地化部署。
  - 支持中文界面。
  
  ## 📜 路线图
api/apps/conversation_app.py CHANGED
@@ -20,7 +20,7 @@ from flask_login import login_required
  from api.db.services.dialog_service import DialogService, ConversationService
  from api.db import LLMType
  from api.db.services.knowledgebase_service import KnowledgebaseService
- from api.db.services.llm_service import LLMService, LLMBundle
+ from api.db.services.llm_service import LLMService, LLMBundle, TenantLLMService
  from api.settings import access_logger, stat_logger, retrievaler, chat_logger
  from api.utils.api_utils import server_error_response, get_data_error_result, validate_request
  from api.utils import get_uuid
@@ -184,8 +184,11 @@ def chat(dialog, messages, **kwargs):
      assert messages[-1]["role"] == "user", "The last content of this conversation is not from user."
      llm = LLMService.query(llm_name=dialog.llm_id)
      if not llm:
-         raise LookupError("LLM(%s) not found" % dialog.llm_id)
-     llm = llm[0]
+         llm = TenantLLMService.query(tenant_id=dialog.tenant_id, llm_name=dialog.llm_id)
+         if not llm:
+             raise LookupError("LLM(%s) not found" % dialog.llm_id)
+         max_tokens = 1024
+     else: max_tokens = llm[0].max_tokens
      questions = [m["content"] for m in messages if m["role"] == "user"]
      embd_mdl = LLMBundle(dialog.tenant_id, LLMType.EMBEDDING)
      chat_mdl = LLMBundle(dialog.tenant_id, LLMType.CHAT, dialog.llm_id)
@@ -227,11 +230,11 @@ def chat(dialog, messages, **kwargs):
      gen_conf = dialog.llm_setting
      msg = [{"role": m["role"], "content": m["content"]}
             for m in messages if m["role"] != "system"]
-     used_token_count, msg = message_fit_in(msg, int(llm.max_tokens * 0.97))
+     used_token_count, msg = message_fit_in(msg, int(max_tokens * 0.97))
      if "max_tokens" in gen_conf:
          gen_conf["max_tokens"] = min(
              gen_conf["max_tokens"],
-             llm.max_tokens - used_token_count)
+             max_tokens - used_token_count)
      answer = chat_mdl.chat(
          prompt_config["system"].format(
              **kwargs), msg, gen_conf)
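
Taken together, the conversation_app.py hunks make chat() resolve dialog.llm_id against the built-in LLMService table first and, failing that, fall back to the tenant-scoped TenantLLMService table (where locally added models, such as Ollama deployments, are registered), defaulting the token budget to 1024 when only a tenant record is found. Below is a minimal, self-contained sketch of that resolution logic; the two *_query helpers and their in-memory registries are hypothetical stand-ins for the project's actual services, not the real API.

```python
# Hedged sketch of the new lookup/fallback behaviour in chat().
# builtin_llm_query / tenant_llm_query are hypothetical stand-ins for
# LLMService.query and TenantLLMService.query, which are not reproduced here.
from typing import List


def builtin_llm_query(llm_name: str) -> List[dict]:
    """Stand-in for LLMService.query: factory-registered models with known limits."""
    registry = {"gpt-3.5-turbo": [{"llm_name": "gpt-3.5-turbo", "max_tokens": 4096}]}
    return registry.get(llm_name, [])


def tenant_llm_query(tenant_id: str, llm_name: str) -> List[dict]:
    """Stand-in for TenantLLMService.query: tenant-added models (e.g. an Ollama model)."""
    registry = {("tenant-1", "llama2"): [{"llm_name": "llama2"}]}
    return registry.get((tenant_id, llm_name), [])


def resolve_max_tokens(tenant_id: str, llm_id: str) -> int:
    """Mirror the PR's logic: built-in table first, then the tenant table,
    defaulting to a 1024-token budget when only a tenant record exists."""
    llm = builtin_llm_query(llm_id)
    if not llm:
        llm = tenant_llm_query(tenant_id, llm_id)
        if not llm:
            raise LookupError("LLM(%s) not found" % llm_id)
        return 1024
    return llm[0]["max_tokens"]


if __name__ == "__main__":
    print(resolve_max_tokens("tenant-1", "gpt-3.5-turbo"))  # 4096 from the built-in table
    print(resolve_max_tokens("tenant-1", "llama2"))          # 1024 fallback for a tenant-only model
```

The resolved budget then replaces the direct llm.max_tokens reads in the second hunk, feeding both message_fit_in(msg, int(max_tokens * 0.97)) and the gen_conf["max_tokens"] clamp, so a model known only to the tenant table no longer aborts the chat with a LookupError.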