|
--- |
|
base_model: unsloth/DeepSeek-R1-Distill-Llama-8B |
|
tags: |
|
- text-generation-inference |
|
- transformers |
|
- unsloth |
|
- llama |
|
- trl |
|
license: apache-2.0 |
|
language: |
|
- en |
|
--- |
|
|
|
# About the Model
|
|
|
This model was fine-tuned to translate SQL queries into natural language, making it easier for users to understand the business meaning of a query. The fine-tune was produced with the Unsloth framework, starting from the unsloth/DeepSeek-R1-Distill-Llama-8B pre-trained model.
|
|
|
|
|
# DataSet |
|
|
|
[b-mc2/sql-create-context](https://huggingface.co/datasets/b-mc2/sql-create-context) |
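The dataset provides `question`/`context`/`answer` triples, where `context` is a `CREATE TABLE` statement and `answer` is the corresponding SQL. A minimal sketch of turning one record into a training string is below; the sample record and the prompt template are illustrative assumptions, since this card does not document the exact format used during fine-tuning:

```python
# Illustrative record in the b-mc2/sql-create-context schema: each row has
# "question" (natural language), "context" (CREATE TABLE DDL), and
# "answer" (the target SQL). The values below are a hand-written example,
# not an actual row from the dataset.
sample = {
    "question": "How many heads of the departments are older than 56?",
    "context": "CREATE TABLE head (age INTEGER)",
    "answer": "SELECT COUNT(*) FROM head WHERE age > 56",
}

# Hypothetical prompt template: because this model explains SQL rather than
# generating it, the SQL goes in the input and the natural-language text
# becomes the target.
PROMPT_TEMPLATE = """### Schema:
{context}

### SQL:
{answer}

### Explanation:
{question}"""

def build_training_example(record: dict) -> str:
    """Render one dataset record into a single training string."""
    return PROMPT_TEMPLATE.format(**record)

print(build_training_example(sample))
```

In practice you would map such a function over the dataset (e.g. with `datasets.load_dataset("b-mc2/sql-create-context")`) before tokenization.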
|
|
|
# Model Training
|
|
|
 |
|
|
|
1. **train/loss**: This chart shows the model's loss during training. As the training steps (global step) increase, the loss drops sharply and then stabilizes, indicating that the model is gradually converging.
|
2. **train/learning_rate**: This chart shows how the learning rate changes over training steps. From the chart, we can see that the learning rate decreases as training progresses, which is likely part of a learning rate decay strategy to prevent the model from oscillating in the later stages of training. |
|
3. **train/grad_norm**: This chart displays the change in gradient norm over training steps. The decrease in gradient norm suggests that the gradients are stabilizing, reducing instability during training. |
|
4. **train/global_step**: This chart shows the increase in global training steps. As the training progresses, the step count gradually increases, indicating the progress of the training process. |
|
5. **train/epoch**: This chart represents the progress of each training epoch. As the global steps increase, the epoch count also steadily grows. |
|
|
|
|
|
# Inference results before and after model training
|
|
|
## Prompt |
|
 |
|
|
|
## Define SQL query for testing |
|
This is a complex customer-analysis query used to test the model's understanding.
|
|
|
```python
query1 = """
SELECT
    pc.category_name,
    p.product_name,
    COUNT(DISTINCT o.customer_id) AS unique_customers,
    COUNT(oi.order_id) AS total_sales,
    SUM(oi.quantity) AS total_quantity_sold,
    ROUND(AVG(oi.unit_price), 2) AS avg_selling_price,
    SUM(oi.quantity * oi.unit_price) AS total_revenue,
    ROUND(SUM(oi.quantity * oi.unit_price) / COUNT(DISTINCT o.customer_id), 2) AS revenue_per_customer,
    MAX(o.order_date) AS last_sale_date,
    MIN(o.order_date) AS first_sale_date
FROM product_categories pc
JOIN products p ON pc.category_id = p.category_id
JOIN order_items oi ON p.product_id = oi.product_id
JOIN orders o ON oi.order_id = o.order_id
WHERE
    o.order_date >= '2024-01-01'
    AND o.order_status = 'completed'
GROUP BY
    pc.category_name,
    p.product_name
HAVING
    total_revenue > 10000
ORDER BY
    total_revenue DESC,
    unique_customers DESC
LIMIT 15;

Explain use case of this query.
"""
```
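As a hedged sketch, a prompt like `query1` can also be assembled programmatically; the helper and instruction wording below are illustrative assumptions, not the exact prompt used for the results that follow:

```python
def build_sql_explanation_prompt(sql: str) -> str:
    """Wrap an arbitrary SQL statement in the same explanation instruction
    as query1 above. Hypothetical helper for illustration only."""
    return f"{sql.strip()}\n\nExplain use case of this query."

prompt = build_sql_explanation_prompt(
    "SELECT customer_id, SUM(amount) FROM orders GROUP BY customer_id;"
)
print(prompt)
```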
|
|
|
## Model response

<think>
|
CREATE TABLE product_categories (category_name VARCHAR, product_name VARCHAR, customer_id INT, order_id INT, order_date DATETIME, status VARCHAR) |
|
FROM product_categories |
|
JOIN products |
|
JOIN order_items |
|
JOIN orders |
|
WHERE order_date >= '2024-01-01' AND order_status = 'completed' |
|
GROUP BY category_name, product_name |
|
HAVING total_revenue > 10000 |
|
|
|
This query analyzes the relationship between product categories and customer orders to identify categories and products with the highest revenue in completed orders since 2024. It can help a business understand which products are generating significant revenue and which products are performing well in the category to inform inventory planning and pricing strategies. |
|
</think> |
|
Below is the business use case for the query: |
|
|
|
### Use Case: |
|
This query analyzes the relationship between product categories and customer orders to identify categories and products with the highest revenue in completed orders since 2024. It can help a business understand which products are generating significant revenue in the category and which products are performing well in the category to inform inventory planning and pricing strategies.
|
|
|
# Model Download |
|
|
|
| **Model** | **Base Model** | **Download** |
|
| -------------------------------- | ------------------------------------------------------------ | ------------------------------------------------------------ | |
|
| unsloth/DeepSeek-R1-Distill-Llama-8B | [DeepSeek-R1-Distill-Llama-8B](https://huggingface.co/unsloth/DeepSeek-R1-Distill-Llama-8B) | [🤗 HuggingFace](https://huggingface.co/jackcwf/DeepSeek-R1-Distill-Llama-4bit-sql-model/raw/main/unsloth.Q4_K_M.gguf) | |
|
|
|
|
|
# Usage |
|
|
|
If you are unsure how to use GGUF files, refer to one of [TheBloke's READMEs](https://huggingface.co/TheBloke/KafkaLM-70B-German-V0.1-GGUF) for more details, including on how to concatenate multi-part files. |
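As a hedged sketch, the quantized file can be fetched and run with llama.cpp; the file name comes from the download table above, and the flags shown are standard but should be verified against your llama.cpp build:

```shell
# Fetch the Q4_K_M quantization from the Hub
huggingface-cli download jackcwf/DeepSeek-R1-Distill-Llama-4bit-sql-model \
  unsloth.Q4_K_M.gguf --local-dir .

# Run a one-shot completion with llama.cpp's CLI
./llama-cli -m unsloth.Q4_K_M.gguf \
  -p "Explain use case of this query: SELECT COUNT(*) FROM orders;" \
  -n 256
```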
|
|
|
|
|
# Uploaded model |
|
|
|
- **Developed by:** datalabs-ai |
|
- **License:** apache-2.0 |
|
- **Finetuned from model:** unsloth/DeepSeek-R1-Distill-Llama-8B
|
|
|
|
|
|
|
This Llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.
|
|
|
[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth) |