Data Engineering Concepts: 101 Concepts Every Data Engineer Should Know

Data Engineering Concepts

Last updated: Jun 3, 2026

by Simon Späti · Created: Aug 21, 2023 · 3 min read

If you want to be a data engineer, learn these concepts. The table below is a Map of Content (MOC) with 101 entry-point ideas that, between them, link out to the rest of the Data Engineering Vault via backlinks. Read it top-to-bottom as a learning path, or jump in where you need depth.

| Category | Data Engineering Concepts |
| --- | --- |
| Storage & Warehousing | OLTP, OLAP, ACID Transactions, Data Warehouse, Data Lake, Data Lakehouse, Cloud Data Warehouses, Compute and Storage Separation, Storage Layer, Three Vs |
| Open Table & File Formats | Data Lake File Formats, Data Lake Table Format, Apache Parquet, Apache Arrow, Apache Avro, Apache Iceberg, Delta Lake, Time Travel |
| Data Architectures & Stacks | Data Engineering Architecture, Modern Data Stack, Open Data Stack, Declarative Data Stack, Medallion Architecture, Data Mesh, Microsoft Fabric, Inmon vs Kimball |
| Data Modeling | Data Modeling, Different Levels of Data Modeling, Entity Relationship Diagram (ERD), Dimensional Modeling, Data Vault, Anchor Modeling, Entity-Centric Data Modeling (ECM), Activity Schema, Relational Model, Normalization, Denormalization, Fact Table, Dimensions, Slowly Changing Dimension, Bus Matrix, Granularity, Cardinality |
| Ingestion, Movement & CDC | ETL, ELT, EtLT, Reverse ETL, ETL vs ELT, Change Data Capture (CDC), Delta-Load Options, Stream Processing, Apache Kafka, Backfill, MapReduce, Hadoop |
| Transformation, Orchestration & Pipelines | Data Orchestration, DAG, Software-Defined Asset, Functional Data Engineering, Declarative vs Imperative, Apache Airflow, Dagster, dbt, SQLMesh, Notebooks, Idempotency, AI Orchestrators |
| Data Management & Governance | Data Governance, Data Catalog, Data Contracts, Data Lineage, Data Product, Master Data Management, Schema Evolution, Schema Registry, Schema Drift, DataOps |
| Semantic Layer, Metrics & Federation | Semantic Layer, Modern Semantic Layer, Metrics Layer, Metrics, SQL Query Engine, Analytics API, Semantic SQL, Data Federation |
| OLAP, BI & Analytics Evolution | Traditional OLAP Cubes, Modern OLAP Systems, Cube, Business Intelligence, BI Tools, Pivot Table |
| Performance & Engine Internals | CAP Theorem, Push-downs, Materialized Views, CTE, Full Table Scan |
| Practice, Role & Career | Data Engineering Lifecycle, Data Engineering Approaches, The Role of a Data Engineer, Data Engineer vs Software Engineer, Retrieval-Augmented Generation (RAG) |

# How to Use This Map

If you are new to data engineering, read what interests you most, or work from top to bottom, covering topics such as the data engineering lifecycle, the challenges of DE, and convergent evolutions in these concepts. A gentle storyline from the beginning can be found in my book, starting with the Introduction to the Field of Data Engineering.

If you are already in the field, pick and choose the detailed topics you want to learn more about. Read Data Engineering Whitepapers, and learn about Data Engineering Architecture and Data Engineering Approaches. Or take the opposite approach and learn from the bottom up. Every note in this vault is itself an entry point: follow its backlinks to reach the rest of the 1000+ interlinked notes in the Data Engineering Vault.

Again, read my book, Patterns of Data Engineering: Timeless Practices from Convergent Evolution, where I try to condense the field of data engineering and its major patterns into one book, or use the Data Engineering Vault with its learning paths, the history of business intelligence, and the full Data Engineering Lifecycle.

Related to this map is the Data Engineering Toolkit, a comprehensive guide to 70+ essential tools every data engineer should master. The Toolkit lists the tools you install; the concepts explain why they exist and how they fit together.

Origin: Data Engineering, the future of Data Warehousing?

References: Data Engineering Toolkit, Data Engineering Vault, Data Engineering Approaches

Backlinks

The Data Engineering Toolkit

Data Warehouse

The Data Warehouse Toolkit by Ralph Kimball

Book: The History and State of Data Engineering
