BEAVER: An Enterprise Benchmark for Text-to-SQL
Peter Baile Chen1,
Fabian Wenz1,3,
Yi Zhang2,
Devin Yang1,
Justin Choi1,
Nesime Tatbul1,
Michael Cafarella1,
Çağatay Demiralp1,2,
Michael Stonebraker1
1MIT, 2AWS AI Labs, 3Technical University of Munich
arXiv
Dataset and Code
BEAVER
BEAVER is an enterprise text-to-SQL dataset consisting of xxx queries and xxx tables across xxx databases. Queries and databases were collected from private organizations. Compared to previous text-to-SQL datasets focusing on public tables and
We also encourage work on the open-domain text-to-SQL problem.
Usage
We have created a unified MySQL version of our dataset. A free MySQL installation can be found here. After installation, import the MySQL dump files from the Google Drive into your local MySQL databases using
mysql -u root -p
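The import step above can also be scripted. The following is a minimal Python sketch, assuming the dumps are plain `.sql` files and that the target database already exists (or that the dump creates it). The names `build_import_command`, `import_dump`, `beaver_dump.sql`, and `beaver` are hypothetical placeholders, not names taken from the dataset.

```python
import subprocess
from pathlib import Path


def build_import_command(database, user="root"):
    """Return the mysql CLI argument list for importing into `database`."""
    # -p with no value makes the client prompt for the password interactively.
    return ["mysql", "-u", user, "-p", database]


def import_dump(dump_path, database, user="root"):
    """Feed one .sql dump file to the mysql client on standard input."""
    dump_path = Path(dump_path)
    if not dump_path.is_file():
        raise FileNotFoundError(f"dump file not found: {dump_path}")
    with dump_path.open("rb") as dump:
        # Same effect as the shell form: mysql -u root -p DATABASE < DUMP.sql
        subprocess.run(build_import_command(database, user), stdin=dump, check=True)


# Example call (hypothetical file and database names):
# import_dump("beaver_dump.sql", "beaver")
```

Passing the argument list directly to `subprocess.run` avoids shell quoting issues with file paths, and `check=True` raises an error if the client exits with a failure status.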
To execute a SQL statement, you can either log in to the MySQL interface or use mysql-connector-python.
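As an illustration of the mysql-connector-python route, here is a minimal sketch for running a single SELECT statement. The helper name `run_query`, its `connect` parameter, and the connection values in the usage comment are assumptions made for this example and are not part of the dataset's tooling.

```python
def run_query(sql, params=None, connect=None, **conn_kwargs):
    """Run one SELECT statement and return all rows as a list of tuples."""
    if connect is None:
        # Imported lazily so this module loads even if the package is missing.
        import mysql.connector
        connect = mysql.connector.connect
    conn = connect(**conn_kwargs)
    try:
        cursor = conn.cursor()
        try:
            cursor.execute(sql, params)
            # fetchall() is only valid for statements that return a result set.
            return cursor.fetchall()
        finally:
            cursor.close()
    finally:
        conn.close()


# Example call (hypothetical credentials and database name):
# rows = run_query("SELECT 1", host="localhost", user="root",
#                  password="...", database="beaver")
```

The `connect` parameter exists only so a stand-in connection can be supplied for testing; by default the helper uses `mysql.connector.connect` and closes both the cursor and the connection even when the query fails.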
If you want to use the Oracle version of the DW queries, you can download the free Oracle database and import the CSVs.
Changelog
Citation
If you find our data or the paper helpful, please cite the paper:
@article{chen2024beaver,
  title={BEAVER: an enterprise benchmark for text-to-sql},
  author={Chen, Peter Baile and Wenz, Fabian and Zhang, Yi and Yang, Devin and Choi, Justin and Tatbul, Nesime and Cafarella, Michael and Demiralp, {\c{C}}a{\u{g}}atay and Stonebraker, Michael},
  journal={arXiv preprint arXiv:2409.02038},
  year={2024}
}
Leaderboard
Table Retrieval
Column Mapping
Join Key Detection
SQL Generation
Open-Domain SQL Generation
Rank | Method | Score