---
language:
- en
---
## Model Description

Graphix-T5 is a graph-aware, semi-pretrained text-to-text PLM specifically designed to improve multi-hop reasoning for the complex text-to-SQL task.
The architecture enhances the structural encoding capabilities of the T5 model while preserving its powerful contextual encoding ability.
Experimental results demonstrate the effectiveness of Graphix-T5 and underscore the importance of incorporating structural information into text-to-text PLMs for tackling intricate text-to-SQL challenges.
The smaller performance gap between the dev and test sets indicates the stronger generalization capability of Graphix-T5.

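Conceptually, each Graphix layer mixes the semantic (contextual) state from a pretrained transformer block with a structural state obtained by relation-aware message passing over the question/schema graph. The sketch below is only a hypothetical illustration of that idea in PyTorch; the class names, shapes, and additive fusion are assumptions for exposition, not the released implementation (defer to the official repository for the real layer):

```py
import torch
import torch.nn as nn

class RelationalGraphEncoder(nn.Module):
    """Toy relation-aware message passing over question/schema nodes."""
    def __init__(self, hidden_size: int, num_relations: int):
        super().__init__()
        self.rel_emb = nn.Embedding(num_relations, hidden_size)
        self.proj = nn.Linear(hidden_size, hidden_size)

    def forward(self, nodes, edge_index, edge_type):
        # nodes: (num_nodes, H); edge_index: (2, E); edge_type: (E,)
        src, dst = edge_index
        msg = self.proj(nodes[src] + self.rel_emb(edge_type))  # relation-conditioned messages
        agg = torch.zeros_like(nodes).index_add_(0, dst, msg)  # sum incoming messages per node
        return nodes + torch.relu(agg)                         # residual structural update

class GraphixLayer(nn.Module):
    """Fuses a transformer block's semantic state with a structural state."""
    def __init__(self, transformer_block: nn.Module, hidden_size: int, num_relations: int):
        super().__init__()
        self.transformer_block = transformer_block
        self.graph = RelationalGraphEncoder(hidden_size, num_relations)

    def forward(self, hidden_states, edge_index, edge_type):
        # hidden_states: (1, num_nodes, H) -- batch of one for simplicity.
        semantic = self.transformer_block(hidden_states)
        structural = self.graph(hidden_states[0], edge_index, edge_type)
        return semantic + structural.unsqueeze(0)  # assumption: fuse by addition

# Smoke test with a stand-in block (nn.Identity) and a 5-node toy graph.
layer = GraphixLayer(nn.Identity(), hidden_size=8, num_relations=3)
h = torch.randn(1, 5, 8)
edges = torch.tensor([[0, 1, 2], [3, 3, 4]])  # src -> dst pairs
types = torch.tensor([0, 1, 2])
print(layer(h, edges, types).shape)  # torch.Size([1, 5, 8])
```
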
## Training Data
Graphix-3B is trained on SPIDER, a cross-domain text-to-SQL benchmark, and is evaluated on the vanilla SPIDER dev and test sets as well as the variants SPIDER-SYN, SPIDER-DK, and
SPIDER-REALISTIC **without additional training**. This model will continue to be fine-tuned on more complex text-to-SQL data,
e.g. BIRD, to handle harder and more realistic applications.

## Getting Started

You can load the tokenizer and model directly with the Transformers library:
```py
from transformers import AutoTokenizer, AutoModel

# Download the Graphix-3B checkpoint from the Hugging Face Hub.
tokenizer = AutoTokenizer.from_pretrained("patrickNLP/Graphix-3B")
model = AutoModel.from_pretrained("patrickNLP/Graphix-3B")
```
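Note that the full Graphix pipeline builds a question/schema graph during preprocessing, so loading the checkpoint with plain `AutoModel` does not by itself reproduce end-to-end text-to-SQL inference. As a rough sketch of what generation could look like, assuming a T5-style seq2seq head can be loaded and assuming a hypothetical question-plus-flattened-schema input format (neither is guaranteed by this checkpoint; defer to the official implementation):

```py
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("patrickNLP/Graphix-3B")
# Assumption: the checkpoint exposes a T5-style LM head; the real model may
# require the custom classes from the official Graphix implementation.
model = AutoModelForSeq2SeqLM.from_pretrained("patrickNLP/Graphix-3B")

# Hypothetical serialization: natural-language question plus flattened schema.
question = "How many singers do we have?"
schema = "concert_singer | singer : singer_id, name, country, age"
inputs = tokenizer(f"{question} | {schema}", return_tensors="pt")

output_ids = model.generate(**inputs, max_length=128)  # greedy decoding to SQL
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```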

## Performance
Graphix-3B with PICARD maintains state-of-the-art (SOTA) semantic parsing performance, as demonstrated on the
[`SPIDER`](https://yale-lily.github.io/spider) leaderboard: its single submission achieves **74.0%** exact-match (EM) accuracy and **77.6%** execution (EX) accuracy on the test set.
Please see [`Graphix Official Implementation`]() for details.

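For reference, EM scores the predicted SQL against the gold query's structure, while EX compares the results of actually executing both queries on the database. A minimal, simplified illustration of the distinction (the official Spider evaluator performs component-level matching, not string comparison):

```py
import sqlite3

def exact_match(pred_sql: str, gold_sql: str) -> bool:
    """Crude EM-style check on normalized text; the real metric parses the SQL."""
    normalize = lambda s: " ".join(s.lower().replace(";", "").split())
    return normalize(pred_sql) == normalize(gold_sql)

def execution_match(pred_sql: str, gold_sql: str, db_path: str) -> bool:
    """EX-style check: queries are equivalent if they return the same rows."""
    with sqlite3.connect(db_path) as conn:
        pred_rows = conn.execute(pred_sql).fetchall()
        gold_rows = conn.execute(gold_sql).fetchall()
    return sorted(pred_rows) == sorted(gold_rows)  # order-insensitive comparison
```
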
## Reference
1. [`Graphix-T5: Mixing Pre-Trained Transformers with Graph-Aware Layers for Text-to-SQL Parsing`](https://arxiv.org/abs/2301.07507)
2. [`Can LLM Already Serve as A Database Interface? A BIg Bench for Large-Scale Database Grounded Text-to-SQLs`](https://arxiv.org/abs/2305.03111)
3. [`Spider: A Large-Scale Human-Labeled Dataset for Complex and Cross-Domain Semantic Parsing and Text-to-SQL Task`](https://arxiv.org/abs/1809.08887)
4. [`PICARD: Parsing Incrementally for Constrained Auto-Regressive Decoding from Language Models`](https://arxiv.org/abs/2109.05093)

## Citation
```
@misc{li2023graphixt5,
      title={Graphix-T5: Mixing Pre-Trained Transformers with Graph-Aware Layers for Text-to-SQL Parsing},
      author={Jinyang Li and Binyuan Hui and Reynold Cheng and Bowen Qin and Chenhao Ma and Nan Huo and Fei Huang and Wenyu Du and Luo Si and Yongbin Li},
      year={2023},
      eprint={2301.07507},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
```