---
base_model: unsloth/Llama-3.2-3B-Instruct
tags:
- text-generation
- mongodb
- query-generation
- transformers
- unsloth
- llama
- trl
- gguf
- quantized
license: apache-2.0
language:
- en
datasets:
- skshmjn/mongo_prompt_query
pipeline_tag: text-generation
library_name: transformers
---
# MongoDB Query Generator - Llama-3.2-3B (Fine-tuned)
- **Developed by:** skshmjn
- **License:** apache-2.0
- **Finetuned from model:** [unsloth/Llama-3.2-3B-Instruct](https://huggingface.co/unsloth/Llama-3.2-3B-Instruct)
- **Dataset Used:** [skshmjn/mongodb-chat-query](https://huggingface.co/datasets/skshmjn/mongodb-chat-query)
- **Supports:** Transformers & GGUF (for fast inference on CPU/GPU)
## **Model Overview**
This model is designed to **generate MongoDB queries** from natural language prompts. It supports:
- **Basic CRUD operations:** `find`, `insert`, `update`, `delete`
- **Aggregation Pipelines:** `$group`, `$match`, `$lookup`, `$sort`, etc.
- **Indexing & Performance Queries**
- **Nested Queries & Joins (`$lookup`)**
The model was fine-tuned with **Unsloth** for efficient training and is also published with **GGUF quantization** for fast inference on CPU/GPU.
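
For a sense of the query shapes the model targets, here is an illustrative (hypothetical, not actual model output) aggregation pipeline written as a PyMongo-style list of stage dicts; the collection and field names are invented for the example:

```python
# Hypothetical target output for a prompt such as:
# "For each department, compute the average salary of its employees, sorted descending."
pipeline = [
    {"$lookup": {                      # join employees into departments ($lookup)
        "from": "employees",
        "localField": "_id",
        "foreignField": "department_id",
        "as": "employees",
    }},
    {"$unwind": "$employees"},         # one document per joined employee
    {"$group": {                       # aggregate per department ($group)
        "_id": "$_id",
        "avg_salary": {"$avg": "$employees.salary"},
    }},
    {"$sort": {"avg_salary": -1}},     # highest average first ($sort)
]

# With PyMongo this could be executed as:
# db.departments.aggregate(pipeline)
```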
---
## **Example Usage (Transformers)**
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "skshmjn/Llama-3.2-3B-Mongo-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Pass your MongoDB schema here; leave empty for generic queries.
# A sample schema is available in the Hugging Face repository.
schema = {}

prompt = f"Here is the MongoDB schema {schema}. Find all employees older than 30 in the 'employees' collection."
inputs = tokenizer(prompt, return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=100)
query = tokenizer.decode(output[0], skip_special_tokens=True)
print(query)
```
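## **Example Usage (GGUF)**
Since the card lists GGUF support, a minimal sketch of loading a quantized build with `llama-cpp-python` is shown below; the GGUF filename pattern is an assumption, so check the repository's file listing for the actual artifact name.
```python
from llama_cpp import Llama

# Assumed filename pattern; verify against the GGUF files published in the repository.
llm = Llama.from_pretrained(
    repo_id="skshmjn/Llama-3.2-3B-Mongo-Instruct",
    filename="*Q4_K_M.gguf",  # pick the quantization level you want
    n_ctx=2048,
)

prompt = "Here is the MongoDB schema {}. Find all employees older than 30 in the 'employees' collection."
out = llm(prompt, max_tokens=128)
print(out["choices"][0]["text"])
```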