simone-papicchio committed
Commit 1aa7668 · verified · 1 Parent(s): 690d314

Create README.md

Files changed (1)
  1. README.md +70 -0
README.md ADDED
@@ -0,0 +1,70 @@
+ ---
+ license: apache-2.0
+ base_model:
+ - Qwen/Qwen2.5-Coder-7B-Instruct
+ ---
+
+ ## Model Information
+ This is the reasoning model for the Text2SQL task introduced in [Think2SQL: Reinforce LLM Reasoning Capabilities for Text2SQL](https://arxiv.org/abs/2504.15077).
+
+ ## Intended use
+ The model performs best with the system and user prompts shown below.
+ It is intended to be used with three inputs: the question, the evidence, and the database schema.
+
+ With `transformers >= 4.43.0`, you can run conversational inference using the Transformers `pipeline` abstraction or by using the Auto classes with the `generate()` function.
+
+ Make sure to update your transformers installation via `pip install --upgrade transformers`.
+
+ ```python
+ import torch
+ import transformers
+
+ model_id = "simone-papicchio/Think2SQL-7B"
+
+ # Load the model in bfloat16 and let Accelerate place it on the available devices.
+ pipeline = transformers.pipeline(
+     "text-generation",
+     model=model_id,
+     model_kwargs={"torch_dtype": torch.bfloat16},
+     device_map="auto",
+ )
+
+ # System prompt: asks for reasoning in <think> and the final result in <answer>.
+ system_message = (
+     "You are a helpful AI Assistant that provides well-reasoned and detailed responses. "
+     "You first think about the reasoning process as an internal monologue and then provide the user with the answer. "
+     "Respond in the following format: <think>\n...\n</think>\n<answer>\n...\n</answer>"
+ ).strip()
+
+ # User prompt template with three placeholders: question, evidence, schema.
+ user_message = (
+     "Answer the following question with the SQL code. Use the piece of evidence and base your answer on the database schema. "
+     "Given the question, the evidence and the database schema, return in the <answer> tags only the SQL script that addresses the question.\n"
+     "Question:\n{question}\n\n"
+     "Evidence:\n{evidence}\n\n"
+     "Database Schema:\n{schema}\n\n"
+     "Return only the SQL script enclosed in <answer> tags."
+ ).strip()
+
+ # Hypothetical inputs for illustration only; replace them with your own.
+ question = "How many singers are there?"
+ evidence = ""
+ schema = "CREATE TABLE singer (singer_id INTEGER PRIMARY KEY, name TEXT);"
+
+ messages = [
+     {"role": "system", "content": system_message},
+     # Fill the template placeholders before sending the prompt.
+     {"role": "user", "content": user_message.format(question=question, evidence=evidence, schema=schema)},
+ ]
+
+ outputs = pipeline(
+     messages,
+     max_new_tokens=256,  # raise this if the reasoning trace gets truncated
+ )
+ # The last message in the returned conversation is the assistant's reply.
+ print(outputs[0]["generated_text"][-1])
+ ```
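
Because the system prompt asks for a `<think>` block followed by an `<answer>` block, the SQL usually has to be pulled out of the generated text before it can be executed. The following is a minimal sketch, assuming the response follows that format. `extract_sql` is a hypothetical helper introduced here for illustration; it is not part of the model card or of `transformers`, and the example response is made up.

```python
import re
from typing import Optional


def extract_sql(response: str) -> Optional[str]:
    """Return the text inside the last <answer>...</answer> pair, or None if absent."""
    # DOTALL lets "." match newlines, since the answer usually spans several lines.
    matches = re.findall(r"<answer>(.*?)</answer>", response, flags=re.DOTALL)
    if not matches:
        return None
    return matches[-1].strip()


# Hypothetical response text, for illustration only.
example = (
    "<think>\nCount the rows in singer.\n</think>\n"
    "<answer>\nSELECT COUNT(*) FROM singer;\n</answer>"
)
print(extract_sql(example))
```

With the pipeline call above, the string to parse would be `outputs[0]["generated_text"][-1]["content"]`. Returning `None` when no tags are found lets the caller tell a missing answer apart from an empty one, for example when generation stops before `</answer>`.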
+
+ ## Citation
+ ```bibtex
+ @misc{papicchio2025think2sqlreinforcellmreasoning,
+       title={Think2SQL: Reinforce LLM Reasoning Capabilities for Text2SQL},
+       author={Simone Papicchio and Simone Rossi and Luca Cagliero and Paolo Papotti},
+       year={2025},
+       eprint={2504.15077},
+       archivePrefix={arXiv},
+       primaryClass={cs.LG},
+       url={https://arxiv.org/abs/2504.15077},
+ }
+ ```
+