
	---
	base_model: unsloth/DeepSeek-R1-Distill-Llama-8B
	tags:
	- text-generation-inference
	- transformers
	- unsloth
	- llama
	- trl
	license: apache-2.0
	language:
	- en
	---

	# About Model

	This model is fine-tuned to translate SQL queries into natural language, making it easier for users to understand the business meaning of a query. It was fine-tuned with the Unsloth framework, starting from the unsloth/DeepSeek-R1-Distill-Llama-8B pre-trained model.


	# Dataset

	[b-mc2/sql-create-context](https://huggingface.co/datasets/b-mc2/sql-create-context)

	# Model training

	![](https://github.com/datalablife/DeepSeek-R1-Distill-Llama-sql-8B/blob/main/images/train.png?raw=true)

	1. train/loss: This chart shows the model's loss during training. As the training steps (global step) increase, the loss value drops sharply and then stabilizes, indicating that the model is gradually converging.
	2. train/learning_rate: This chart shows how the learning rate changes over training steps. From the chart, we can see that the learning rate decreases as training progresses, which is likely part of a learning rate decay strategy to prevent the model from oscillating in the later stages of training.
	3. train/grad_norm: This chart displays the change in gradient norm over training steps. The decrease in gradient norm suggests that the gradients are stabilizing, reducing instability during training.
	4. train/global_step: This chart shows the increase in global training steps. As the training progresses, the step count gradually increases, indicating the progress of the training process.
	5. train/epoch: This chart represents the progress of each training epoch. As the global steps increase, the epoch count also steadily grows.
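
The quantities plotted above can be related with a small sketch. It assumes a linear warmup followed by a linear decay to zero, which is only one possible decay strategy; the scheduler actually used for this run is not stated here. All names and numbers (`PEAK_LR`, `TOTAL_STEPS`, `WARMUP_STEPS`, `STEPS_PER_EPOCH`) are hypothetical and chosen for illustration.

```python
import math


def linear_decay_lr(step: int, total_steps: int, peak_lr: float, warmup_steps: int = 0) -> float:
    """Learning rate at `step`: linear warmup to peak_lr, then linear decay to zero."""
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    if warmup_steps > 0 and step < warmup_steps:
        return peak_lr * step / warmup_steps
    remaining = max(total_steps - step, 0)
    return peak_lr * remaining / max(total_steps - warmup_steps, 1)


def epoch_at_step(step: int, steps_per_epoch: int) -> float:
    """Fractional epoch reached after `step` optimizer steps (train/epoch)."""
    return step / steps_per_epoch


def global_grad_norm(grads: list[list[float]]) -> float:
    """L2 norm over all gradient values, the quantity tracked as train/grad_norm."""
    return math.sqrt(sum(g * g for tensor in grads for g in tensor))


# Hypothetical hyperparameters, not taken from the actual training run.
PEAK_LR = 2e-4
TOTAL_STEPS = 60
WARMUP_STEPS = 5
STEPS_PER_EPOCH = 30

for step in (0, 5, 30, 60):
    lr = linear_decay_lr(step, TOTAL_STEPS, PEAK_LR, WARMUP_STEPS)
    print(f"step={step:3d}  epoch={epoch_at_step(step, STEPS_PER_EPOCH):.2f}  lr={lr:.2e}")
```

Under these assumptions the learning rate rises during warmup, peaks, and then falls steadily while the epoch counter grows in proportion to the global step.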


	# Inference results before and after model training

	## Prompt
	![](https://github.com/datalablife/DeepSeek-R1-Distill-Llama-sql-8B/blob/1b7d2b95c5a1df0a753fa362fc1d7130f828308c/images/train-prompt.jpeg)

	## Define SQL query for testing
	This is a complex customer analysis query used to test the model's understanding.

	query1 = """
	```sql
	SELECT
	pc.category_name,
	p.product_name,
	COUNT(DISTINCT o.customer_id) AS unique_customers,
	COUNT(oi.order_id) AS total_sales,
	SUM(oi.quantity) AS total_quantity_sold,
	ROUND(AVG(oi.unit_price), 2) AS avg_selling_price,
	SUM(oi.quantity * oi.unit_price) AS total_revenue,
	ROUND(SUM(oi.quantity * oi.unit_price) / COUNT(DISTINCT o.customer_id), 2) AS revenue_per_customer,
	MAX(o.order_date) AS last_sale_date,
	MIN(o.order_date) AS first_sale_date
	FROM product_categories pc
	JOIN products p ON pc.category_id = p.category_id
	JOIN order_items oi ON p.product_id = oi.product_id
	JOIN orders o ON oi.order_id = o.order_id
	WHERE
	o.order_date >= '2024-01-01'
	AND o.order_status = 'completed'
	GROUP BY
	pc.category_name,
	p.product_name
	HAVING
	total_revenue > 10000
	ORDER BY
	total_revenue DESC,
	unique_customers DESC
	LIMIT 15;
	```
	Explain use case of this query.
	"""

	<think>
	CREATE TABLE product_categories (category_name VARCHAR, product_name VARCHAR, customer_id INT, order_id INT, order_date DATETIME, status VARCHAR)
	FROM product_categories
	JOIN products
	JOIN order_items
	JOIN orders
	WHERE order_date >= '2024-01-01' AND order_status = 'completed'
	GROUP BY category_name, product_name
	HAVING total_revenue > 10000

	This query analyzes the relationship between product categories and customer orders to identify categories and products with the highest revenue in completed orders since 2024. It can help a business understand which products are generating significant revenue and which products are performing well in the category to inform inventory planning and pricing strategies.
	</think>
	Below is the business use case for the query:

	### Use Case:
	This query analyzes the relationship between product categories and customer orders to identify categories and products with the highest revenue in completed orders since 2024. It can help a business understand which products are generating significant revenue in the category and which products are performing well in the category to inform inventory planning and pricing strategies.<｜end▁of▁sentence｜>
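
The raw output above contains a `<think>...</think>` reasoning block followed by the final answer and an end-of-sentence token. A minimal sketch of separating the two is shown below; the helper name `split_reasoning` is hypothetical, and it assumes the output follows the same layout as the sample above.

```python
import re

# Token that terminates the sample output shown above.
END_TOKEN = "<｜end▁of▁sentence｜>"
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(raw: str) -> tuple[str, str]:
    """Return (reasoning, answer) extracted from a raw model completion."""
    text = raw.replace(END_TOKEN, "")
    match = THINK_RE.search(text)
    if match is None:
        # No reasoning block: treat the whole completion as the answer.
        return "", text.strip()
    reasoning = match.group(1).strip()
    answer = text[match.end():].strip()
    return reasoning, answer


sample = "<think>\nscratch notes\n</think>\nFinal explanation." + END_TOKEN
reasoning, answer = split_reasoning(sample)
print(answer)
```

Showing end users only the `answer` part keeps the intermediate reasoning out of the displayed explanation.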

	# Model Download

	| Model | Base Model | Download |
	| -------------------------------- | ------------------------------------------------------------ | ------------------------------------------------------------ |
	| unsloth/DeepSeek-R1-Distill-Llama-8B | [DeepSeek-R1-Distill-Llama-8B](https://huggingface.co/unsloth/DeepSeek-R1-Distill-Llama-8B) | [🤗 HuggingFace](https://huggingface.co/jackcwf/DeepSeek-R1-Distill-Llama-4bit-sql-model/raw/main/unsloth.Q4_K_M.gguf) |


	# Usage

	If you are unsure how to use GGUF files, refer to one of [TheBloke's READMEs](https://huggingface.co/TheBloke/KafkaLM-70B-German-V0.1-GGUF) for more details, including how to concatenate multi-part files.
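
As one possible way to run the GGUF file locally, the sketch below uses the `llama-cpp-python` package (`pip install llama-cpp-python`). It assumes `unsloth.Q4_K_M.gguf` has already been downloaded to the working directory. The prompt template in `build_prompt` is hypothetical, since the prompt used for training is only shown as an image in this README, and the helper names are illustrative.

```python
def build_prompt(sql: str) -> str:
    """Hypothetical prompt template; adjust it to match the training prompt."""
    return (
        "Below is a SQL query. Explain its business use case.\n\n"
        f"### Query:\n{sql.strip()}\n\n"
        "### Response:\n"
    )


def load_model(model_path: str, n_ctx: int = 2048):
    """Load a local GGUF file with llama-cpp-python (imported lazily)."""
    from llama_cpp import Llama

    return Llama(model_path=model_path, n_ctx=n_ctx)


def explain_sql(llm, sql: str, max_tokens: int = 512) -> str:
    """Run one completion and return the generated text."""
    result = llm(build_prompt(sql), max_tokens=max_tokens)
    return result["choices"][0]["text"]


# Example (requires the downloaded GGUF file):
# llm = load_model("unsloth.Q4_K_M.gguf")
# print(explain_sql(llm, "SELECT COUNT(*) FROM orders WHERE order_status = 'completed';"))
```

`explain_sql` accepts any callable that returns a completion dictionary with a `choices` list, so it can be exercised with a stub before loading the real model.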


	# Uploaded model

	- Developed by: datalabs-ai
	- License: apache-2.0
	- Finetuned from model: unsloth/DeepSeek-R1-Distill-Llama-8B



	This Llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.

	[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)