Overview
In this episode of The Dashboard Effect, Brick Thompson and Caleb Oaks continue their accessible terminology series, this time focusing on the concepts that describe how data is structured and organized for analytics. The episode is aimed at non-technical listeners who encounter terms like star schema, semantic layer, and dimensional modeling in data conversations and want a working understanding of what they mean and why they matter.
The explanations are clear, concrete, and grounded in how these concepts actually function in practice rather than in abstract definitions. See how Blue Margin’s Managed Data Platform applies Kimball-style dimensional modeling and semantic layer architecture to build data environments that are organized for performance, intuitive for business users, and ready for the analytics and AI use cases that depend on clean, well-structured data.
What This Episode Covers
Kimball Style and Dimensional Modeling (0:22 – 3:20)
The Kimball methodology is one of the foundational approaches to organizing data for analytics. It uses two types of tables: fact tables, which store transactional data like individual sales records, and dimension tables, which provide the descriptive context around those transactions, such as customer details, product information, or time periods. Separating facts from dimensions avoids redundant data storage and creates a structure that is both efficient and intuitive for reporting.
Data Models and Relationships (3:20 – 4:59)
A data model is the structure that defines how tables are organized and how they relate to each other. The relationships between tables, often called joins, are what allow data from multiple sources to be combined and analyzed together. Without a well-defined model, cross-table analysis becomes either impossible or unreliable.
Star vs. Snowflake Schemas (5:00 – 6:57)
A star schema is a denormalized model where a central fact table is surrounded directly by dimension tables, creating a structure that resembles a star. It is straightforward to query and generally performs well for most analytics use cases. A snowflake schema adds further normalization by breaking dimension tables into additional sub-tables, creating a more complex multi-level structure that can reduce data redundancy but introduces additional relationship traversal that can affect query performance.
Semantic Models and Layers (6:57 – 8:40)
A semantic layer sits on top of raw data, such as data stored in a lake, and provides a user-friendly view that translates technical table structures into business-meaningful terms. Rather than requiring a tool like Power BI to interpret raw, cryptically named tables directly, the semantic layer presents the data in a form that both the tool and the business users working with it can navigate intuitively. It is the layer that bridges the technical reality of how data is stored and the practical reality of how it needs to be used.
What Is a View (8:40 – 10:44)
A view is stored SQL code that acts as a virtual table. Rather than physically duplicating data, a view provides an abstraction layer that presents a specific, pre-defined slice of the underlying data as though it were its own table. Views add flexibility between raw, complex data and the reporting layer, allowing transformation logic to be centralized and updated without changing the underlying storage or requiring report developers to work directly with raw source tables.
Who It’s For
This episode is worth your time if you are a business stakeholder who participates in data architecture conversations and wants a clearer understanding of what is being described, a newer analyst or BI developer building foundational knowledge of how data is structured before working with it in a modeling tool, a project manager or product owner working with data teams who wants to follow technical design discussions more effectively, or anyone preparing to evaluate a data platform or vendor proposal and wanting to understand the architectural concepts that will inevitably come up.
Why It’s Worth a Listen
The gap between how data practitioners talk about architecture and how business stakeholders understand it creates real friction in data projects. Decisions get made without full context, requirements get lost in translation, and reviews of technical work become exercises in nodding along rather than genuine evaluation. This episode closes that gap in a format that is accessible without being condescending.
The star versus snowflake distinction is one of those comparisons that comes up frequently in design conversations and rarely gets explained in terms of the practical trade-offs rather than just the structural difference. Understanding why one schema performs better for most analytics use cases while the other reduces redundancy at the cost of complexity gives non-technical stakeholders a basis for asking better questions when those choices are being made.
And the explanation of views as stored SQL that acts as a virtual table is a foundational concept that unlocks a lot of subsequent understanding about how modern data architectures manage the gap between raw source data and the clean, interpretable outputs that reporting depends on. Getting that concept clearly is worth the time this episode requires to deliver it.