arxiv:2403.02962

WikiTableEdit: A Benchmark for Table Editing by Natural Language Instruction

Published on Mar 5, 2024

Authors:

Abstract

Tabular data, as a crucial form of data representation, exists in diverse formats on the Web. When confronted with complex and irregular tables, manual modification becomes a laborious task. This paper investigates the performance of Large Language Models (LLMs) in the context of table editing tasks. Existing research mainly focuses on regular-shaped tables, wherein instructions are used to generate code in SQL, Python, or Excel Office-script for manipulating the tables. Nevertheless, editing tables with irregular structures, particularly those containing merged cells spanning multiple rows, poses a challenge when using code. To address this, we introduce the WikiTableEdit dataset. Leveraging 26,531 tables from the WikiSQL dataset, we automatically generate natural language instructions for six distinct basic operations and the corresponding outcomes, resulting in over 200,000 instances. Subsequently, we evaluate several representative large language models on the WikiTableEdit dataset to demonstrate the challenge of this task. The dataset will be released to the community to promote related researches.

View arXiv page View PDF Add to collection

Community

Upload images, audio, and videos by dragging in the text input, pasting, or clicking here.

Tap or paste here to upload images

· Sign up or log in to comment

Upvote

Models citing this paper 0

No model linking this paper

Cite arxiv.org/abs/2403.02962 in a model README.md to link it from this page.

Datasets citing this paper 1

Spaces citing this paper 0

No Space linking this paper

Cite arxiv.org/abs/2403.02962 in a Space README.md to link it from this page.

Collections including this paper 0

No Collection including this paper

Add this paper to a collection to link it from this page.