add using jina deploy local llm in deploy_local_llm.mdx (#1872)
### What problem does this PR solve?
Adds instructions for deploying a local LLM with Jina to deploy_local_llm.mdx.
### Type of change
- [x] Documentation Update
---------
Co-authored-by: Zhedong Cen <[email protected]>
docs/guides/deploy_local_llm.mdx
CHANGED
|
@@ -15,6 +15,40 @@ RAGFlow seamlessly integrates with Ollama and Xinference, without the need for f
|
| 15 |
This user guide does not intend to cover much of the installation or configuration details of Ollama or Xinference; its focus is on configurations inside RAGFlow. For the most current information, you may need to check out the official site of Ollama or Xinference.
|
| 16 |
:::
|
| 17 |
|
| 18 |
+
## Deploy a local model using Jina
|
| 19 |
+
|
| 20 |
+
[Jina](https://github.com/jina-ai/jina) lets you build AI services and pipelines that communicate via gRPC, HTTP and WebSockets, then scale them up and deploy to production.
|
| 21 |
+
|
| 22 |
+
To deploy a local model, e.g., **gpt2**, using Jina:
|
| 23 |
+
|
| 24 |
+
### 1. Check firewall settings
|
| 25 |
+
|
| 26 |
+
Ensure that your host machine's firewall allows inbound connections on port 12345.
|
| 27 |
+
|
| 28 |
+
```bash
|
| 29 |
+
sudo ufw allow 12345/tcp
|
| 30 |
+
```
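Once the rule is in place and the server is running, reachability can be checked with a short script. The following is a minimal sketch, not part of RAGFlow: `port_is_open` is a hypothetical helper name, and the only value taken from this guide is port 12345. It attempts a plain TCP connection and reports whether it succeeded.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper for illustration; it only tests TCP reachability
    and says nothing about which service is listening.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

Calling `port_is_open("<host-ip>", 12345)` from another machine exercises the firewall rule; calling it against `127.0.0.1` on the host itself only confirms that something is listening, because loopback traffic usually does not pass through the inbound rule.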
|
| 31 |
+
|
| 32 |
+
### 2. Install the Jina package
|
| 33 |
+
|
| 34 |
+
```bash
|
| 35 |
+
pip install jina
|
| 36 |
+
```
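To confirm that the install landed in the Python environment you intend to run the server from, the standard library can report the installed version. This is a small sketch; `installed_version` is a hypothetical helper name, and it assumes the distribution is published under the name `jina`, as in the `pip install` command above.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("jina")
    print(f"jina version: {version}" if version else "jina is not installed")
```

A `None` result usually means `pip` installed into a different interpreter or virtual environment than the one running the script.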
|
| 37 |
+
|
| 38 |
+
### 3. Deploy the local model
|
| 39 |
+
|
| 40 |
+
Step 1: Navigate to the `rag/svr` directory.
|
| 41 |
+
|
| 42 |
+
```bash
|
| 43 |
+
cd rag/svr
|
| 44 |
+
```
|
| 45 |
+
|
| 46 |
+
Step 2: Run the `jina_server.py` script with Python, passing in either the model name or the local path of the model. The script only supports loading models downloaded from Hugging Face.
|
| 47 |
+
|
| 48 |
+
```bash
|
| 49 |
+
python jina_server.py --model_name gpt2
|
| 50 |
+
```
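The `--model_name` value can be either a model name such as `gpt2` or a local path. The sketch below illustrates one way to tell the two apart before launching the server. It is an assumption for illustration only and does not reproduce the logic inside `jina_server.py`; `describe_model_source` is a hypothetical helper. It follows the common Hugging Face convention that an existing directory is treated as a local model and anything else as a Hub model ID.

```python
from pathlib import Path


def describe_model_source(model_name: str) -> str:
    """Classify a --model_name value as a local directory or a Hub model ID.

    Hypothetical helper: an existing directory is reported as a local
    model path; any other string is assumed to be a Hub model ID.
    """
    path = Path(model_name).expanduser()
    if path.is_dir():
        return f"local directory: {path.resolve()}"
    return f"Hugging Face model ID: {model_name}"
```

Running this check first helps catch a mistyped local path, which would otherwise be interpreted as a model ID to download.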
|
| 51 |
+
|
| 52 |
## Deploy a local model using Ollama
|
| 53 |
|
| 54 |
[Ollama](https://github.com/ollama/ollama) enables you to run open-source large language models that you deployed locally. It bundles model weights, configurations, and data into a single package, defined by a Modelfile, and optimizes setup and configurations, including GPU usage.
|