---
license: apache-2.0
datasets:
- ekshat/text-2-sql-with-context
language:
- en
library_name: transformers
pipeline_tag: text-generation
tags:
- text-2-sql
- text-generation
- text2sql
---

# Inference

```python
# In a notebook, install the dependencies first:
# !pip install transformers accelerate xformers bitsandbytes

from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig, pipeline

tokenizer = AutoTokenizer.from_pretrained("ekshat/Llama-2-7b-chat-finetune-for-text2sql")

# Load the model in 4-bit precision
model = AutoModelForCausalLM.from_pretrained(
    "ekshat/Llama-2-7b-chat-finetune-for-text2sql",
    quantization_config=BitsAndBytesConfig(load_in_4bit=True),
    device_map="auto",
)

context = "CREATE TABLE head (name VARCHAR, born_state VARCHAR, age VARCHAR)"
question = "List the name, born state and age of the heads of departments ordered by age."

# Prompt template expected by the model
prompt = f"""Below is a context that describes a SQL query, paired with a question that provides further information. Write an answer that appropriately completes the request.
### Context:
{context}
### Question:
{question}
### Answer:"""

pipe = pipeline(task="text-generation", model=model, tokenizer=tokenizer, max_length=200)
result = pipe(prompt)
print(result[0]["generated_text"])
```
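
By default the `text-generation` pipeline returns the prompt together with the completion, so the generated SQL can be recovered by splitting on the final `### Answer:` marker:

```python
# The pipeline output includes the prompt itself; keep only the completion.
sql = result[0]["generated_text"].split("### Answer:")[-1].strip()
print(sql)
```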

# Model Information

- **model_name = "NousResearch/Llama-2-7b-chat-hf"**
- **dataset_name = "ekshat/text-2-sql-with-context"**
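
Both are available on the Hugging Face Hub. A minimal sketch of loading the dataset (the `train` split name is an assumption; check the dataset card):

```python
from datasets import load_dataset

# Text-to-SQL examples paired with their table-schema context;
# the "train" split name is assumed here, not taken from this card.
dataset = load_dataset("ekshat/text-2-sql-with-context", split="train")
print(dataset[0])
```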

# QLoRA parameters

- **lora_r = 64**
- **lora_alpha = 16**
- **lora_dropout = 0.1**
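
These values map directly onto peft's `LoraConfig`; a minimal sketch (the `bias` and `task_type` arguments are common QLoRA defaults, not taken from this card):

```python
from peft import LoraConfig

peft_config = LoraConfig(
    r=64,               # lora_r: rank of the low-rank update matrices
    lora_alpha=16,      # scaling factor applied to the LoRA update
    lora_dropout=0.1,   # dropout on the LoRA layers during training
    bias="none",            # assumption: typical QLoRA setting
    task_type="CAUSAL_LM",  # assumption: causal LM fine-tuning for Llama-2
)
```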

# BitsAndBytes parameters

- **use_4bit = True**
- **bnb_4bit_compute_dtype = "float16"**
- **bnb_4bit_quant_type = "nf4"**
- **use_nested_quant = False**
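
Expressed as the transformers `BitsAndBytesConfig` used when loading the base model, these settings look roughly like this:

```python
import torch
from transformers import BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                     # use_4bit
    bnb_4bit_quant_type="nf4",             # NormalFloat4 quantization
    bnb_4bit_compute_dtype=torch.float16,  # bnb_4bit_compute_dtype
    bnb_4bit_use_double_quant=False,       # use_nested_quant
)
```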

# Training Arguments parameters

- **num_train_epochs = 1**
- **fp16 = False**
- **bf16 = False**
- **per_device_train_batch_size = 8**
- **per_device_eval_batch_size = 4**
- **gradient_accumulation_steps = 1**
- **gradient_checkpointing = True**
- **max_grad_norm = 0.3**
- **learning_rate = 2e-4**
- **weight_decay = 0.001**
- **optim = "paged_adamw_32bit"**
- **lr_scheduler_type = "cosine"**
- **max_steps = -1**
- **warmup_ratio = 0.03**
- **group_by_length = True**
- **save_steps = 0**
- **logging_steps = 25**
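
As a transformers `TrainingArguments` object (only `output_dir` is an assumption; everything else is listed above):

```python
from transformers import TrainingArguments

training_arguments = TrainingArguments(
    output_dir="./results",  # assumption: not specified on this card
    num_train_epochs=1,
    fp16=False,
    bf16=False,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=1,
    gradient_checkpointing=True,
    max_grad_norm=0.3,
    learning_rate=2e-4,
    weight_decay=0.001,
    optim="paged_adamw_32bit",   # paged optimizer to cope with 4-bit memory spikes
    lr_scheduler_type="cosine",
    max_steps=-1,                # -1: train for num_train_epochs, not a step count
    warmup_ratio=0.03,
    group_by_length=True,        # batch sequences of similar length together
    save_steps=0,
    logging_steps=25,
)
```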

# SFT parameters

- **max_seq_length = None**
- **packing = False**
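
These two values are passed to trl's `SFTTrainer`, which ties together the model, dataset, LoRA config, and training arguments from the sketches above. A sketch assuming the pre-1.0 trl API that accepted these keyword arguments directly (the `dataset_text_field="text"` column name is an assumption about the formatted dataset):

```python
from trl import SFTTrainer

# model, tokenizer, dataset, peft_config, and training_arguments
# are built as in the earlier sketches on this card.
trainer = SFTTrainer(
    model=model,
    train_dataset=dataset,
    peft_config=peft_config,
    dataset_text_field="text",  # assumption: column holding the formatted prompts
    max_seq_length=None,        # fall back to the default sequence length
    tokenizer=tokenizer,
    args=training_arguments,
    packing=False,              # one example per sequence; no packing
)
trainer.train()
```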