BEAVER: Enterprise benchmark for LLM Text-to-SQL from private data warehouses



🦫 BEAVER: An Enterprise Benchmark for Text-to-SQL

Peter Baile Chen¹, Devin Yang¹, Weiyue Li², Fabian Wenz¹,³, Yi Zhang⁴, Nesime Tatbul¹,⁵, Michael Cafarella¹, Çağatay Demiralp¹,⁶, Michael Stonebraker¹

¹MIT, ²Harvard University, ³TU Munich, ⁴Greenshoe, Inc., ⁵Intel, ⁶AWS AI Labs


Overview

Leaderboard

Submission Instructions

How to Submit:

Please email peterbc@mit.edu with your method name, a brief description of the method, and, optionally, a link to your paper or codebase. We will follow up with detailed instructions.


Rank | Submission Date | Method | Model | Execution Accuracy

Subtask Metrics Across Settings

Citation

If you find our data, code, or the paper helpful, please cite the paper:

@article{chen2024beaver,
  title={BEAVER: an enterprise benchmark for text-to-sql},
  author={Chen, Peter Baile and Yang, Devin and Li, Weiyue and Wenz, Fabian and Zhang, Yi and Tatbul, Nesime and Cafarella, Michael and Demiralp, {\c{C}}a{\u{g}}atay and Stonebraker, Michael},
  journal={arXiv preprint arXiv:2409.02038},
  year={2024}
}


BEAVER is a large-scale enterprise text-to-SQL dataset containing 9128 queries spanning 812 tables across 19 diverse domains. Of these, 7978 queries are publicly released, while the remaining portion is held out as a private test set. Queries and databases were collected from private organizations.

To facilitate fine-grained evaluation and analysis, we provide:

Annotations for five subtasks: multi-table retrieval, join key detection, column mapping, domain knowledge extraction, and query decomposition.

Three categories of queries: complex queries without domain knowledge, domain-specific queries with minimal complexity, and domain-specific complex queries.

Example data

Representative BEAVER tasks with question, SQL, and subtask annotations.

Each task lists its Task id, db source, Question, and SQL, followed by annotations for the five subtasks:

Multi-table retrieval: tables used in the SQL

Join keys: connections among the used tables

Column mapping: mapping from question phrases to table column(s)

Domain knowledge: domain-specific predicates used in the SQL

Subquery decomposition: decomposition of the SQL into simpler sub-queries
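As a rough illustration of how one annotated task could be held in memory, the sketch below mirrors the fields listed above. The class name `BeaverTask`, its attribute names and types, and the helper `table_recall` are all assumptions made for illustration; they are not the dataset's actual schema, and the recall helper is only one plausible way to score the multi-table retrieval subtask, not necessarily the benchmark's official metric.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class BeaverTask:
    """Hypothetical container for one annotated task (all names are assumptions)."""
    task_id: str
    db_source: str
    question: str
    sql: str
    # Multi-table retrieval: tables used in the SQL.
    tables: List[str] = field(default_factory=list)
    # Join keys: pairs of "table.column" strings that connect the used tables.
    join_keys: List[Tuple[str, str]] = field(default_factory=list)
    # Column mapping: question phrase -> table column(s).
    column_mapping: Dict[str, List[str]] = field(default_factory=dict)
    # Domain knowledge: domain-specific predicates used in the SQL.
    domain_knowledge: List[str] = field(default_factory=list)
    # Subquery decomposition: simpler sub-queries that make up the SQL.
    subqueries: List[str] = field(default_factory=list)


def table_recall(task: BeaverTask, retrieved_tables: List[str]) -> float:
    """Fraction of the annotated tables found in `retrieved_tables`.

    Table names are compared case-insensitively. A task with no annotated
    tables scores 1.0 because there is nothing to miss.
    """
    gold = {name.lower() for name in task.tables}
    if not gold:
        return 1.0
    found = {name.lower() for name in retrieved_tables}
    return len(gold & found) / len(gold)
```

Keeping each subtask annotation in its own attribute lets a retrieval or join-detection component be scored in isolation, without executing the full SQL.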

Usage

We provide a unified MySQL version of the dataset. A free MySQL installation is available from the MySQL website. After installing it, import the MySQL dump files from the Google Drive into your local MySQL databases using:

mysql -u root -p
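The command above is shown without a database name or dump file. A minimal sketch of one common way to load a dump from Python follows, assuming the `mysql` client is on your PATH. The database name `beaver_dw` and file name `beaver_dw.sql` used in the note after the code are hypothetical placeholders, not names taken from the dataset. Depending on how a dump was produced, the target database may need to exist before the import.

```python
import subprocess
from pathlib import Path
from typing import List


def build_import_command(database: str, user: str = "root") -> List[str]:
    """Return the argument list for `mysql -u <user> -p <database>`.

    A bare `-p` makes the mysql client prompt for the password.
    """
    return ["mysql", "-u", user, "-p", database]


def import_dump(dump_path: str, database: str, user: str = "root") -> None:
    """Feed a SQL dump file to the mysql client on standard input.

    This mirrors the shell form `mysql -u <user> -p <database> < <dump_path>`.
    """
    dump = Path(dump_path)
    if not dump.is_file():
        raise FileNotFoundError(dump_path)
    with dump.open("rb") as handle:
        # check=True raises CalledProcessError if the client exits with an error.
        subprocess.run(build_import_command(database, user), stdin=handle, check=True)
```

With the placeholder names, a call would look like `import_dump("beaver_dw.sql", "beaver_dw")`, repeated once per dump file.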

To execute a SQL statement, either log in to the MySQL command-line interface or use mysql-connector-python.
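A minimal sketch of the second option follows, assuming mysql-connector-python is installed (`pip install mysql-connector-python`). The helper names `connect_mysql`, `run_query`, and `same_result` are hypothetical, as are the default credentials. `same_result` is a simple order-insensitive comparison of two result sets, offered as a stand-in for checking a predicted query against a reference query; it is not necessarily how the leaderboard computes execution accuracy.

```python
from collections import Counter
from typing import Any, List, Tuple


def connect_mysql(database: str, user: str = "root", password: str = "",
                  host: str = "localhost"):
    """Open a connection with mysql-connector-python."""
    # Imported lazily so the helpers below also work with any other
    # DB-API 2.0 connection when the MySQL driver is not installed.
    import mysql.connector
    return mysql.connector.connect(
        host=host, user=user, password=password, database=database
    )


def run_query(conn, sql: str) -> List[Tuple[Any, ...]]:
    """Execute one SQL statement on a DB-API 2.0 connection and return all rows."""
    cursor = conn.cursor()
    try:
        cursor.execute(sql)
        return [tuple(row) for row in cursor.fetchall()]
    finally:
        cursor.close()


def same_result(conn, predicted_sql: str, gold_sql: str) -> bool:
    """Return True if both queries yield the same rows, ignoring row order.

    Rows are compared through their repr so that unhashable values
    (for example bytearray columns) do not raise an error.
    """
    predicted = Counter(repr(row) for row in run_query(conn, predicted_sql))
    gold = Counter(repr(row) for row in run_query(conn, gold_sql))
    return predicted == gold
```

Because `run_query` relies only on the standard cursor interface, the same helper can be tried against any DB-API 2.0 connection before pointing it at the imported MySQL databases.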

To use the Oracle version of the DW queries, download the free Oracle database and import the CSVs.


