---
base_model: unsloth/DeepSeek-R1-Distill-Llama-8B
tags:
- text-generation-inference
- transformers
- unsloth
- llama
- trl
license: apache-2.0
language:
- en
---
# About Model
This model translates SQL queries into natural language, making it easier for users to understand the business meaning of a query. It was fine-tuned with the unsloth framework from the unsloth/DeepSeek-R1-Distill-Llama-8B pre-trained model.
# Dataset
[b-mc2/sql-create-context](https://huggingface.co/datasets/b-mc2/sql-create-context)
# Model training
![](https://github.com/datalablife/DeepSeek-R1-Distill-Llama-sql-8B/blob/main/images/train.png?raw=true)
1. **train/loss**: This chart shows the model's loss during training. As the training steps (global step) increase, the loss value drops sharply and then stabilizes, indicating that the model is gradually converging.
2. **train/learning_rate**: This chart shows how the learning rate changes over training steps. From the chart, we can see that the learning rate decreases as training progresses, which is likely part of a learning rate decay strategy to prevent the model from oscillating in the later stages of training.
3. **train/grad_norm**: This chart displays the change in gradient norm over training steps. The decrease in gradient norm suggests that the gradients are stabilizing, reducing instability during training.
4. **train/global_step**: This chart shows the increase in global training steps. As the training progresses, the step count gradually increases, indicating the progress of the training process.
5. **train/epoch**: This chart represents the progress of each training epoch. As the global steps increase, the epoch count also steadily grows.
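
The decay pattern described for train/learning_rate can be illustrated with a simple schedule. This is a minimal sketch, assuming linear warmup followed by linear decay to zero; the model card does not state which scheduler was actually used, and `linear_decay_lr`, `peak_lr`, and the step counts below are hypothetical names and values chosen for illustration.

```python
def linear_decay_lr(step: int, total_steps: int, peak_lr: float,
                    warmup_steps: int = 0) -> float:
    """Learning rate at `step`: linear warmup, then linear decay to zero.

    Assumption: this mirrors a common "linear" scheduler shape; it is not
    taken from the training configuration of this model.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    if warmup_steps > 0 and step < warmup_steps:
        # Ramp up linearly from 0 to peak_lr during warmup.
        return peak_lr * step / warmup_steps
    # Decay linearly from peak_lr (end of warmup) to 0 (end of training).
    remaining = max(total_steps - step, 0)
    return peak_lr * remaining / max(total_steps - warmup_steps, 1)


# Hypothetical run: 100 steps, 10 warmup steps, peak learning rate 2e-4.
for s in (0, 5, 10, 55, 100):
    print(s, linear_decay_lr(s, total_steps=100, peak_lr=2e-4, warmup_steps=10))
```

A smaller learning rate late in training shrinks each update, which is one reason the loss curve tends to flatten rather than oscillate.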
# Inference results before and after model training
## Prompt
![](https://github.com/datalablife/DeepSeek-R1-Distill-Llama-sql-8B/blob/1b7d2b95c5a1df0a753fa362fc1d7130f828308c/images/train-prompt.jpeg)
## Define SQL query for testing
This is a complex customer analysis query used to test the model's understanding.
query1 = """
```sql
SELECT
pc.category_name,
p.product_name,
COUNT(DISTINCT o.customer_id) AS unique_customers,
COUNT(oi.order_id) AS total_sales,
SUM(oi.quantity) AS total_quantity_sold,
ROUND(AVG(oi.unit_price), 2) AS avg_selling_price,
SUM(oi.quantity * oi.unit_price) AS total_revenue,
ROUND(SUM(oi.quantity * oi.unit_price) / COUNT(DISTINCT o.customer_id), 2) AS revenue_per_customer,
MAX(o.order_date) AS last_sale_date,
MIN(o.order_date) AS first_sale_date
FROM product_categories pc
JOIN products p ON pc.category_id = p.category_id
JOIN order_items oi ON p.product_id = oi.product_id
JOIN orders o ON oi.order_id = o.order_id
WHERE
o.order_date >= '2024-01-01'
AND o.order_status = 'completed'
GROUP BY
pc.category_name,
p.product_name
HAVING
total_revenue > 10000
ORDER BY
total_revenue DESC,
unique_customers DESC
LIMIT 15;
```
Explain use case of this query.
"""
<think>
CREATE TABLE product_categories (category_name VARCHAR, product_name VARCHAR, customer_id INT, order_id INT, order_date DATETIME, status VARCHAR)
FROM product_categories
JOIN products
JOIN order_items
JOIN orders
WHERE order_date >= '2024-01-01' AND order_status = 'completed'
GROUP BY category_name, product_name
HAVING total_revenue > 10000
This query analyzes the relationship between product categories and customer orders to identify categories and products with the highest revenue in completed orders since 2024. It can help a business understand which products are generating significant revenue and which products are performing well in the category to inform inventory planning and pricing strategies.
</think>
Below is the business use case for the query:
### Use Case:
This query analyzes the relationship between product categories and customer orders to identify categories and products with the highest revenue in completed orders since 2024. It can help a business understand which products are generating significant revenue in the category and which products are performing well in the category to inform inventory planning and pricing strategies.<|end▁of▁sentence|>
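
The raw output above mixes the model's reasoning (inside the `<think>` tags), the final answer, and an end-of-sentence token. A small helper can separate them before showing the explanation to a user. This is a minimal sketch using only the standard library; `split_reasoning` is a hypothetical name, and the pattern accepts both ASCII and full-width bars in the end token as an assumption about how it may be rendered.

```python
import re

# Matches the end-of-sentence token with either ASCII or full-width bars.
_EOS_RE = re.compile(r"<[|｜]end▁of▁sentence[|｜]>")
# Captures everything between the first <think> ... </think> pair.
_THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(raw: str) -> tuple[str, str]:
    """Return (reasoning, answer) extracted from raw model output."""
    text = _EOS_RE.sub("", raw)
    match = _THINK_RE.search(text)
    if match is None:
        # No reasoning block: treat the whole output as the answer.
        return "", text.strip()
    reasoning = match.group(1).strip()
    answer = text[match.end():].strip()
    return reasoning, answer


sample = (
    "<think>\nGROUP BY category_name, product_name\n</think>\n"
    "### Use Case:\nThis query analyzes revenue.<|end▁of▁sentence|>"
)
reasoning, answer = split_reasoning(sample)
print(answer)
```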
# Model Download
| **Model** | **Base Model** | **Download** |
| -------------------------------- | ------------------------------------------------------------ | ------------------------------------------------------------ |
| unsloth/DeepSeek-R1-Distill-Llama-8B | [DeepSeek-R1-Distill-Llama-8B](https://huggingface.co/unsloth/DeepSeek-R1-Distill-Llama-8B) | [🤗 HuggingFace](https://huggingface.co/jackcwf/DeepSeek-R1-Distill-Llama-4bit-sql-model/raw/main/unsloth.Q4_K_M.gguf) |
# Usage
If you are unsure how to use GGUF files, refer to one of [TheBloke's READMEs](https://huggingface.co/TheBloke/KafkaLM-70B-German-V0.1-GGUF) for more details, including how to concatenate multi-part files.
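
One possible way to run the GGUF file locally is with the `llama-cpp-python` package. The following is a sketch under stated assumptions, not the author's documented workflow: the repository id and file name are taken from the download link above, while `build_prompt`, `explain_sql`, the context size, and the plain prompt layout are hypothetical (the actual training prompt template is only shown as an image in this card). It assumes `llama-cpp-python` and `huggingface_hub` are installed.

```python
REPO_ID = "jackcwf/DeepSeek-R1-Distill-Llama-4bit-sql-model"
GGUF_FILE = "unsloth.Q4_K_M.gguf"

# Built at runtime so this example does not contain a literal code fence.
FENCE = "`" * 3


def build_prompt(sql: str, question: str = "Explain use case of this query.") -> str:
    """Wrap a SQL query in a fenced block followed by the question.

    Assumption: this layout mirrors the test query in this card; it is not
    the exact template used during fine-tuning.
    """
    return f"{FENCE}sql\n{sql.strip()}\n{FENCE}\n{question}\n"


def explain_sql(sql: str, max_tokens: int = 512) -> str:
    """Download the GGUF file (on first use) and generate an explanation."""
    # Imported lazily so build_prompt works without llama-cpp-python installed.
    from llama_cpp import Llama

    llm = Llama.from_pretrained(
        repo_id=REPO_ID,
        filename=GGUF_FILE,
        n_ctx=4096,      # hypothetical context size
        verbose=False,
    )
    result = llm(build_prompt(sql), max_tokens=max_tokens)
    return result["choices"][0]["text"]
```

Calling `explain_sql("SELECT COUNT(*) FROM orders;")` would download the model file on first use and return the raw generated text, including any reasoning block the model emits.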
# Uploaded model
- **Developed by:** datalabs-ai
- **License:** apache-2.0
- **Finetuned from model:** unsloth/DeepSeek-R1-Distill-Llama-8B
This llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.
[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)