#56: ETL & Data Warehouse Architecture with David Dimmen

Overview

In this episode of The Dashboard Effect, Brick Thompson and David Dimmen of Blue Margin provide a technical walkthrough of their ETL and data warehouse architecture approach, breaking down a process that is often described in abstract terms into a clear, linear sequence that both technical and non-technical audiences can follow. The conversation covers each layer of the architecture, from extraction through staging, transformation, and modeling, with enough specificity to make the decisions at each stage understandable rather than assumed.

For data engineers looking for a reference implementation of best practices and for business stakeholders who want to understand what is happening between source data and their dashboards, this episode provides an accessible and technically grounded account of how well-architected data pipelines actually work. See how Blue Margin’s Managed Data Service applies this architecture across client environments to deliver reliable, performant data infrastructure that supports the reporting and analytics the business depends on.

What This Episode Covers

The Three Pillars of ETL (0:39 – 1:12)

The hosts organize the ETL process around three fundamental pillars: data storage, which manages where data resides before and during processing; data movement and orchestration, which is the automated execution of moving data from source systems into the warehouse; and visualization, the final layer where processed data is displayed for business intelligence. The three-pillar framing gives both technical and non-technical audiences a simple map for the more detailed discussion that follows.

Extract and Load with Azure Tools (0:40 – 4:15)

The extraction layer uses tools like Azure Data Factory and Synapse Workspace to pull data from client source systems including NetSuite, Workday, and SQL Server into an Azure data warehouse. The hosts explain the tooling choices in terms of what each tool is optimized for and why those choices reflect the specific requirements of reliable, scheduled data movement rather than just available options in the Azure ecosystem.

Persistent Staging Layer (7:10 – 12:20)

The persistent staging layer creates an exact replica of the source data, which serves two critical purposes: it allows analysts to query source data without affecting the performance of the live transactional system, and it provides a reliable reference point for data validation. The team uses checksums and row counts to verify that data integrity is maintained between the source and the staging layer, catching discrepancies before they propagate into the transformation and reporting layers where they would be harder to diagnose and more impactful to correct.

Incremental Updates for Performance (10:55 – 11:25)

By focusing on incremental updates that bring in only new or changed records since the last run rather than reprocessing the full dataset, the architecture maintains efficiency as data volumes grow. The performance difference between incremental and full loads compounds over time as tables get larger, and the architectural decision to build for incremental from the start rather than retrofitting it later is one of the practices that keeps pipeline runtimes manageable as the platform matures.

Transformation and Modeling (13:25 – 18:05)

The transformation layer converts the normalized OLTP structure optimized for transactional efficiency into a denormalized OLAP structure optimized for analytical performance. SQL views and stored procedures trim unnecessary data, perform required calculations, and create the clean star schema dimensional model that reporting tools like Power BI are designed to work with. The distinction between OLTP and OLAP structures is explained in terms of their respective purposes, which makes the transformation step logical rather than arbitrary for audiences who have not encountered the distinction before.

Agility Through the View Layer (18:15 – 19:25)

The final view layer provides the agility that makes the architecture responsive to changing reporting needs without requiring structural rebuilds. When a developer needs to add a flag, adjust a calculation, or accommodate a new reporting requirement, the change can be made in the view layer without touching the underlying tables. That separation between the stable underlying structure and the flexible presentation layer is what allows the architecture to absorb business changes quickly rather than treating every reporting requirement change as a pipeline modification project.

Who It’s For

This episode is worth your time if you are a data engineer or solutions architect evaluating a reference architecture for ETL and data warehouse design and wanting a practitioner’s account of how the layers fit together and why each decision was made, a BI developer who works at the reporting layer and wants a clearer picture of the architecture upstream that determines the quality and structure of the data they work with, a technology leader responsible for data infrastructure who wants to understand the architectural principles behind the pipeline design choices your team is making, or any business stakeholder who wants a clearer mental model of what is happening between their source systems and their dashboards before the next conversation with their data team.

Why It’s Worth a Listen

The persistent staging layer discussion is the most technically important part of the episode for practitioners evaluating their own architecture. The decision to maintain an exact replica of source data rather than transforming in transit addresses a performance and validation problem that many pipelines encounter as they scale, and understanding why that layer exists changes how teams approach both pipeline design and data quality management.

The OLTP to OLAP transformation explanation is the most conceptually accessible part of the episode for non-technical stakeholders. The distinction between a database designed for transactional efficiency and one designed for analytical performance is foundational to understanding why data needs to be transformed before it reaches a reporting tool, and this episode explains it clearly enough to inform intelligent conversations between business and technical teams about what the pipeline is doing and why.

And the view layer agility discussion is the most practically valuable piece for organizations that need their data infrastructure to respond to changing business requirements without treating every change as a significant engineering effort. The architectural principle of separating stable structure from flexible presentation is what makes a data platform genuinely maintainable over time, and hearing it explained in the context of a specific implementation makes it more actionable than a general best practice statement would be.

Get Expert Insights
in Your Inbox

To subscribe, submit the short form below.

Related Insights