Data Engineering Concepts
Last updated: Jun 3, 2026
by Simon Späti · Created: Aug 21, 2023 · 3 min read
If you want to be a data engineer, learn these concepts. The table below is a Map of Content (MOC) with 101 entry-point ideas that, between them, link out to the rest of the Data Engineering Vault via backlinks. Read it top-to-bottom as a learning path, or jump in where you need depth.
| Category | Data Engineering Concepts |
| --- | --- |
| Storage & Warehousing | OLTP, OLAP, ACID Transactions, Data Warehouse, Data Lake, Data Lakehouse, Cloud Data Warehouses, Compute and Storage Separation, Storage Layer, Three Vs |
| Open Table & File Formats | Data Lake File Formats, Data Lake Table Format, Apache Parquet, Apache Arrow, Apache Avro, Apache Iceberg, Delta Lake, Time Travel |
| Data Architectures & Stacks | Data Engineering Architecture, Modern Data Stack, Open Data Stack, Declarative Data Stack, Medallion Architecture, Data Mesh, Microsoft Fabric, Inmon vs Kimball |
| Data Modeling | Data Modeling, Different Levels of Data Modeling, Entity Relationship Diagram (ERD), Dimensional Modeling, Data Vault, Anchor Modeling, Entity-Centric Data Modeling (ECM), Activity Schema, Relational Model, Normalization, Denormalization, Fact Table, Dimensions, Slowly Changing Dimension, Bus Matrix, Granularity, Cardinality |
| Ingestion, Movement & CDC | ETL, ELT, EtLT, Reverse ETL, ETL vs ELT, Change Data Capture (CDC), Delta-Load Options, Stream Processing, Apache Kafka, Backfill, MapReduce, Hadoop |
| Transformation, Orchestration & Pipelines | Data Orchestration, DAG, Software-Defined Asset, Functional Data Engineering, Declarative vs Imperative, Apache Airflow, Dagster, dbt, SQLMesh, Notebooks, Idempotency, AI Orchestrators |
| Data Management & Governance | Data Governance, Data Catalog, Data Contracts, Data Lineage, Data Product, Master Data Management, Schema Evolution, Schema Registry, Schema Drift, DataOps |
| Semantic Layer, Metrics & Federation | Semantic Layer, Modern Semantic Layer, Metrics Layer, Metrics, SQL Query Engine, Analytics API, Semantic SQL, Data Federation |
| OLAP, BI & Analytics Evolution | Traditional OLAP Cubes, Modern OLAP Systems, Cube, Business Intelligence, BI Tools, Pivot Table |
| Performance & Engine Internals | CAP Theorem, Push-downs, Materialized Views, CTE, Full Table Scan |
| Practice, Role & Career | Data Engineering Lifecycle, Data Engineering Approaches, The Role of a Data Engineer, Data Engineer vs Software Engineer, Retrieval-Augmented Generation (RAG) |
# How to Use This Map
If you are new to data engineering, read what interests you most, or start at the top and work down, covering topics such as the data engineering lifecycle, the challenges of DE, and convergent evolutions in these concepts. A gentle storyline from the beginning can be found in my book, starting with the Introduction to the Field of Data Engineering.
If you are already in the field, pick and choose the detailed topics you want to learn more about. Read Data Engineering Whitepapers, and learn about Data Engineering Architecture and Data Engineering Approaches. Or take the opposite approach and learn from the bottom up. Keep in mind that every note in this vault is itself an entry point: follow its backlinks to reach the rest of the 1000+ interlinked notes in the Data Engineering Vault.
Again, read my book, Patterns of Data Engineering: Timeless Practices from Convergent Evolution, where I try to condense the field of data engineering and its major patterns into one book, or use the Data Engineering Vault with its learning paths, the history of business intelligence, and the full Data Engineering Lifecycle.
Relatedly, the Data Engineering Toolkit offers a comprehensive guide to 70+ essential tools every data engineer should master. The Toolkit lists the tools you install; the concepts explain why they exist and how they fit together.
Origin: Data Engineering, the future of Data Warehousing?
References: Data Engineering Toolkit, Data Engineering Vault, Data Engineering Approaches
Backlinks
The Data Engineering Toolkit
Data Warehouse
The Data Warehouse Toolkit by Ralph Kimball
Book: The History and State of Data Engineering