---
title: LLaMA 7B Server
emoji: 🤖
colorFrom: blue
colorTo: purple
sdk: docker
sdk_version: "1.0.0"
app_file: app.py
pinned: false
---

# LLaMA 7B Server

A FastAPI-based server for interacting with the LLaMA 7B model.

## Features

- [x] Text generation
- [x] Configurable generation parameters
- [x] REST API interface

## API Usage

Make a POST request to `/generate` with the following JSON body:

```json
{
    "prompt": "your prompt here",
    "max_length": 2048,
    "num_beams": 3,
    "early_stopping": true,
    "no_repeat_ngram_size": 3
}
```

Example using curl:

```bash
curl -X POST http://localhost:7860/generate \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Hello, how are you?"}'
```
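The curl example above sends only `prompt`; the other fields fall back to the defaults shown in the JSON body. A minimal sketch of that defaulting logic (a hypothetical helper, not the actual code in `app.py`):

```python
# Documented defaults for /generate (see the JSON body above).
DEFAULTS = {
    "max_length": 2048,
    "num_beams": 3,
    "early_stopping": True,
    "no_repeat_ngram_size": 3,
}


def build_generation_params(body: dict) -> dict:
    """Merge a request body with the defaults; only 'prompt' is required."""
    if "prompt" not in body:
        raise ValueError("'prompt' is required")
    # User-supplied fields override the defaults.
    return {**DEFAULTS, **body}


# Only 'prompt' supplied, as in the curl example:
params = build_generation_params({"prompt": "Hello, how are you?"})
print(params["max_length"])  # 2048
```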

Example using Python:

```python
import requests

url = "http://localhost:7860/generate"
data = {
    "prompt": "Hello, how are you?",
    "max_length": 2048,
    "num_beams": 3,
    "early_stopping": True,
    "no_repeat_ngram_size": 3
}

response = requests.post(url, json=data)
result = response.json()
print(result["generated_text"])  # the generated text
```

## Model Details

- Model: LLaMA 7B
- Parameters: 7 billion
- Language: Multilingual

## Technical Details

- Framework: FastAPI
- Python Version: 3.9+
- Dependencies: See requirements.txt