# Fine-Tuned LLM API
This is a FastAPI-based API service for the fine-tuned model "ManojINaik/Strength_weakness". The model generates text and is loaded with 4-bit quantization for memory-efficient inference.
## API Endpoints
### GET /
Health check endpoint that confirms the API is running.
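
Because a hosted service can take a while to start, a client may want to poll this endpoint before sending generation requests. The following is a minimal sketch using only the standard library. The helper names `fetch_status` and `wait_until_healthy` are hypothetical, and the sketch assumes that a healthy service answers `GET /` with HTTP 200; the body of the health response is not documented here, so it is not inspected.

```python
import time
import urllib.error
import urllib.request


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code of a GET request, or 0 if no response arrives."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status (e.g. 503 while starting).
        return err.code
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, timeout, and similar network errors.
        return 0


def wait_until_healthy(base_url: str, attempts: int = 5, delay: float = 2.0,
                       fetch=fetch_status) -> bool:
    """Poll GET / until it returns HTTP 200 or the attempts run out."""
    url = base_url.rstrip("/") + "/"
    for attempt in range(attempts):
        if fetch(url) == 200:
            return True
        if attempt < attempts - 1:
            time.sleep(delay)  # Wait before the next try
    return False
```

A call such as `wait_until_healthy("https://your-space-name.hf.space")` would then gate the first request to `/generate/`. The `fetch` parameter exists only so the polling logic can be exercised without network access.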
### POST /generate/
Generate text based on a prompt with optional parameters.
#### Request Body
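The request body is a JSON object with the following fields:
- `prompt`: The input text to generate from.
- `history`: Optional. List of previous conversation messages.
- `system_prompt`: Optional. System instruction for the model.
- `max_length`: Optional. Maximum length of the generated text.
- `temperature`: Optional. Controls randomness (0.0 to 1.0).
```json
{
  "prompt": "What are the strengths of Python?",
  "history": [],
  "system_prompt": "You are a very powerful AI assistant.",
  "max_length": 200,
  "temperature": 0.7
}
```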
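
On the client side, a small helper can assemble this request body and leave out any optional field that is not set. The sketch below is illustrative only: `build_payload` is a hypothetical name, and the checks reflect the ranges stated above (temperature between 0.0 and 1.0) plus the assumption that `max_length` must be a positive integer. The server may validate differently.

```python
from typing import Any, Dict, List, Optional


def build_payload(prompt: str,
                  history: Optional[List[Any]] = None,
                  system_prompt: Optional[str] = None,
                  max_length: Optional[int] = None,
                  temperature: Optional[float] = None) -> Dict[str, Any]:
    """Build a /generate/ request body, omitting optional fields left as None."""
    if not prompt or not prompt.strip():
        raise ValueError("prompt must be a non-empty string")

    payload: Dict[str, Any] = {"prompt": prompt}

    if history is not None:
        payload["history"] = list(history)
    if system_prompt is not None:
        payload["system_prompt"] = system_prompt
    if max_length is not None:
        # Assumption: the API expects a positive integer here.
        if max_length <= 0:
            raise ValueError("max_length must be positive")
        payload["max_length"] = int(max_length)
    if temperature is not None:
        # Range taken from the field description: 0.0 to 1.0.
        if not 0.0 <= temperature <= 1.0:
            raise ValueError("temperature must be between 0.0 and 1.0")
        payload["temperature"] = float(temperature)

    return payload
```

The returned dictionary can be passed directly as the `json=` argument of `requests.post`.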
#### Response
```json
{
"response": "Generated text response..."
}
```
## Model Details
- Model: ManojINaik/Strength_weakness
- Quantization: 4-bit quantization using bitsandbytes
- Device: Automatically uses GPU if available, falls back to CPU
- Memory Efficient: Uses device mapping for optimal resource utilization
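
The points above could be realized roughly as follows. This is a sketch under stated assumptions, not the service's actual loading code: the function name `load_quantized_model` is hypothetical, the NF4 quantization type and float16 compute dtype are illustrative choices, and the CPU branch assumes the model is loaded unquantized when no GPU is present. It also assumes the repository can be loaded directly with `AutoModelForCausalLM`; if it stores only a PEFT adapter, the loading path may differ.

```python
def load_quantized_model(model_id: str = "ManojINaik/Strength_weakness"):
    """Load the tokenizer and model, using 4-bit quantization when a GPU is available."""
    # Heavy imports stay inside the function so importing this module is cheap.
    import torch
    from transformers import (
        AutoModelForCausalLM,
        AutoTokenizer,
        BitsAndBytesConfig,
    )

    tokenizer = AutoTokenizer.from_pretrained(model_id)

    if torch.cuda.is_available():
        # 4-bit weights via bitsandbytes; the settings below are assumptions.
        quant_config = BitsAndBytesConfig(
            load_in_4bit=True,
            bnb_4bit_quant_type="nf4",
            bnb_4bit_compute_dtype=torch.float16,
        )
        model = AutoModelForCausalLM.from_pretrained(
            model_id,
            quantization_config=quant_config,
            device_map="auto",  # Let accelerate place layers on available devices
        )
    else:
        # Assumed CPU fallback: load without quantization.
        model = AutoModelForCausalLM.from_pretrained(model_id)

    model.eval()  # Inference only
    return tokenizer, model
```

Loading once at application startup and reusing the returned objects across requests avoids paying the load cost on every call.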
## Technical Details
- Framework: FastAPI
- Python Version: 3.9+
- Key Dependencies:
  - transformers
  - torch
  - bitsandbytes
  - accelerate
  - peft
## Example Usage
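```python
import requests

# Replace "your-space-name" with the actual Space subdomain.
# The trailing slash matches the POST /generate/ route.
url = "https://your-space-name.hf.space/generate/"

payload = {
    "prompt": "What are the strengths of Python?",
    "temperature": 0.7,
    "max_length": 200,
}

# Generation can be slow, so allow a generous timeout.
response = requests.post(url, json=payload, timeout=120)
response.raise_for_status()  # Surface HTTP errors before reading the body
print(response.json()["response"])
```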