# Gemma 2B - MongoDB Query Generator (LoRA)
This is a LoRA fine-tuned version of `unsloth/gemma-2b-it` that converts natural-language instructions into MongoDB query strings, for example:

```js
db.users.find({ "isActive": true, "age": { "$gt": 30 } })
```

The model is instruction-tuned to support a text-to-query use case for MongoDB across typical collections such as `users`, `orders`, and `products`.
## Model Details
- Base model: `unsloth/gemma-2b-it`
- Fine-tuned with: LoRA (4-bit quantized)
- Framework: Unsloth + PEFT
- Dataset: Synthetic instructions paired with MongoDB queries (300+ examples)
- Use case: Text-to-MongoDB query generation
## How to Use
```python
import torch
from unsloth import FastLanguageModel

# Load the 4-bit quantized base model
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/gemma-2b-it",
    max_seq_length = 1024,
    dtype = torch.float16,
    load_in_4bit = True,
)

# Attach a LoRA adapter with the same configuration used during training
model = FastLanguageModel.get_peft_model(
    model,
    r = 16,
    lora_alpha = 32,
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_dropout = 0.05,
    bias = "none",
)

# Load the trained adapter weights
model.load_adapter("kihyun1998/gemma-2b-it-mongodb-lora", adapter_name="default")

prompt = """### Instruction:
Convert to MongoDB query string.
### Input:
Collection: users
Fields:
- name (string)
- age (int)
- isActive (boolean)
- country (string)
Question: Show all active users from Korea older than 30.
### Response:
"""

inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
output = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```
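Rather than hand-writing the prompt for every collection, the instruction template above can be generated from a schema description. The following is a minimal sketch; `build_prompt` is a hypothetical helper, not part of this repository, and it simply reproduces the Instruction/Input/Response layout shown above.

```python
# Hypothetical helper (not shipped with the adapter): builds the
# instruction-tuning prompt from a collection name, a field->type
# mapping, and a natural-language question.
def build_prompt(collection: str, fields: dict, question: str) -> str:
    field_lines = "\n".join(f"- {name} ({ftype})" for name, ftype in fields.items())
    return (
        "### Instruction:\n"
        "Convert to MongoDB query string.\n"
        "### Input:\n"
        f"Collection: {collection}\n"
        "Fields:\n"
        f"{field_lines}\n"
        f"Question: {question}\n"
        "### Response:\n"
    )

prompt = build_prompt(
    "users",
    {"name": "string", "age": "int", "isActive": "boolean", "country": "string"},
    "Show all active users from Korea older than 30.",
)
```

Keeping the template in one place makes it easy to stay byte-for-byte consistent with the format the adapter was trained on, which matters for small fine-tunes like this one.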
## Example Output

```js
db.users.find({ "isActive": true, "country": "Korea", "age": { "$gt": 30 } })
```
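Before executing a generated query, a downstream application will typically want to split it into a collection name and a filter document. A minimal sketch, assuming the model emits the strict-JSON style shown above (double-quoted keys, so `json.loads` applies; shell-style unquoted keys would need a relaxed parser). `parse_find_query` is a hypothetical helper, not part of this repository.

```python
import json
import re

def parse_find_query(query: str):
    """Split a generated `db.<collection>.find({...})` string into
    (collection_name, filter_dict). Raises ValueError on unexpected shapes."""
    m = re.fullmatch(r"db\.(\w+)\.find\((\{.*\})\)", query.strip(), re.DOTALL)
    if m is None:
        raise ValueError(f"unexpected query shape: {query!r}")
    # Double-quoted keys and JSON literals (true/false/null) parse as strict JSON.
    return m.group(1), json.loads(m.group(2))

collection, flt = parse_find_query(
    'db.users.find({ "isActive": true, "country": "Korea", "age": { "$gt": 30 } })'
)
```

The parsed pieces could then be passed to a driver call such as pymongo's `db[collection].find(flt)`, keeping the model's raw string out of any eval-style execution path.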
## Intended Use
- Converting business-friendly questions into executable MongoDB queries
- Powering internal dashboards, query builders, or no-code tools
- Works best on structured fields and simple query logic
Out-of-scope:
- Complex joins or aggregation pipelines
- Nested or dynamic schema reasoning
## Training Details
- LoRA rank: 16
- Epochs: 3
- Dataset: 300+ synthetic natural language → MongoDB query pairs
- Training hardware: Google Colab (T4 GPU)
## Limitations
- Model assumes collection and fields are already known (RAG context required)
- May hallucinate field names not present in context
- Limited handling of advanced MongoDB features such as `$lookup` and other aggregation-pipeline stages
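Since the model may hallucinate field names, generated filters are worth validating against the schema that was supplied in the prompt. A minimal sketch, assuming the filter has already been parsed into a dict; `unknown_fields` is a hypothetical helper, not part of this repository.

```python
def unknown_fields(filter_doc: dict, schema_fields: set) -> set:
    """Recursively collect filter keys that are neither known schema
    fields nor MongoDB operators (operator keys start with '$')."""
    unknown = set()
    for key, value in filter_doc.items():
        if not key.startswith("$") and key not in schema_fields:
            unknown.add(key)
        if isinstance(value, dict):
            unknown |= unknown_fields(value, schema_fields)
        elif isinstance(value, list):  # e.g. clauses under $or / $and
            for item in value:
                if isinstance(item, dict):
                    unknown |= unknown_fields(item, schema_fields)
    return unknown

schema = {"name", "age", "isActive", "country"}
flt = {"isActive": True, "city": "Seoul", "age": {"$gt": 30}}
bad = unknown_fields(flt, schema)  # flags the hallucinated "city" field
```

Rejecting or re-prompting on non-empty results is a cheap guard before any query reaches a live database.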
## License

The base model is released under the Gemma license; this LoRA adapter inherits the same conditions.
## Author

- @kihyun1998
- Questions? Open an issue or contact via Hugging Face.
## Citation

```bibtex
@misc{kihyun2025mongodb,
  title={Gemma 2B MongoDB Query Generator (LoRA)},
  author={Kihyun Park},
  year={2025},
  howpublished={\url{https://huggingface.co/kihyun1998/gemma-2b-it-mongodb-lora}}
}
```