Beaver: An Enterprise Benchmark for Text-to-SQL


Peter Baile Chen¹, Fabian Wenz¹,³, Yi Zhang², Devin Yang¹, Justin Choi¹, Nesime Tatbul¹, Michael Cafarella¹, Çağatay Demiralp¹,², Michael Stonebraker¹

¹MIT, ²AWS AI Labs, ³Technical University of Munich

arXiv

Dataset and Code

BEAVER

BEAVER is an enterprise text-to-SQL dataset consisting of xxx queries and xxx tables across xxx databases. Queries and databases were collected from private organizations. Compared to previous text-to-SQL datasets focusing on public tables and

We also encourage work on the problem of open-domain text-to-SQL.

Usage

We have created a unified MySQL version of our dataset. A free MySQL installation can be found here. After installation, import the MySQL dump files from Google Drive into your local MySQL databases using:

mysql -u root -p

To execute a SQL statement, you can either log in to the MySQL interface or use mysql-connector-python.
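As a rough illustration of the mysql-connector-python route, the sketch below wraps a single query in a helper. The helper name `run_query` and its `connect` parameter are hypothetical (the latter exists only so the function can be exercised without a live server), and the connection settings you pass depend on your local setup. It assumes the package was installed with `pip install mysql-connector-python`.

```python
from typing import Any, Callable, List, Optional, Tuple


def run_query(
    sql: str,
    connect: Optional[Callable[..., Any]] = None,
    **conn_kwargs: Any,
) -> List[Tuple[Any, ...]]:
    """Execute one SQL statement and return all rows.

    `connect` defaults to mysql.connector.connect; it is a parameter only so
    the helper can be tried out with a stand-in connection (an assumption of
    this sketch, not part of the dataset tooling).
    """
    if connect is None:
        # Imported lazily so the module loads even without the driver installed.
        import mysql.connector

        connect = mysql.connector.connect

    conn = connect(**conn_kwargs)
    try:
        cursor = conn.cursor()
        try:
            cursor.execute(sql)
            return cursor.fetchall()
        finally:
            cursor.close()
    finally:
        # Always release the connection, even if the query fails.
        conn.close()
```

For example, `run_query("SHOW TABLES", host="localhost", user="root", password="your_password", database="your_database")` would list the tables of an imported database; the password and database name here are placeholders, not values from the dataset.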

If you want to use the Oracle version of the DW queries, you can download the free Oracle database and import the CSVs.


Citation

If you find our data or the paper helpful, please cite the paper:

@article{chen2024beaver,
  title={BEAVER: an enterprise benchmark for text-to-sql},
  author={Chen, Peter Baile and Wenz, Fabian and Zhang, Yi and Yang, Devin and Choi, Justin and Tatbul, Nesime and Cafarella, Michael and Demiralp, {\c{C}}a{\u{g}}atay and Stonebraker, Michael},
  journal={arXiv preprint arXiv:2409.02038},
  year={2024}
}


Leaderboard

Table Retrieval

Column Mapping

Join Key Detection

SQL Generation

Open-Domain SQL Generation

Rank | Method | Score
