How to Structure Your Data for Microsoft Power BI Copilot

Overview

In this episode of The Dashboard Effect, Brick Thompson and Caleb Oaks make a point that cuts through a lot of the noise around AI analytics tools: the requirements for making Power BI Copilot and similar tools work reliably are not new. They are the same dimensional modeling best practices that have defined good BI work for decades, applied with the additional rigor that AI demands. An AI tool has no tolerance for the ambiguity that a human analyst would quietly resolve on their own.

The episode is a practical checklist for any data team preparing a semantic model for AI-assisted querying, grounded in the Kimball methodology and organized around the specific failure modes that poor data structure creates when an LLM tries to interpret it. See how Blue Margin’s Managed Data Platform builds the well-structured, consistently named, and properly governed semantic models that give AI tools like Copilot the foundation they need to produce results that can actually be trusted.

What This Episode Covers

Table Linking (0:45)

A clean, well-defined semantic model with proper relationships between fact and dimension tables is the foundation everything else depends on. AI tools navigate the model through those relationships, and gaps or ambiguities in how tables are connected produce queries that either fail or return results that cannot be trusted.
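One way to spot a gap in table linking before an AI tool runs into it is to check that every key in a fact table has a match in its dimension table. The following is a minimal sketch in plain Python, not Power BI functionality; the table names, column names, and sample rows are hypothetical.

```python
# Hypothetical sample data; table and column names are illustrative only.
dim_customer = [
    {"customer_key": 1, "customer_name": "Acme"},
    {"customer_key": 2, "customer_name": "Globex"},
]
fact_sales = [
    {"sale_id": 100, "customer_key": 1, "amount": 250.0},
    {"sale_id": 101, "customer_key": 2, "amount": 100.0},
    {"sale_id": 102, "customer_key": 9, "amount": 75.0},  # no matching customer
]


def find_orphaned_rows(fact_rows, dim_rows, key):
    """Return fact rows whose key value has no match in the dimension."""
    dim_keys = {row[key] for row in dim_rows}
    return [row for row in fact_rows if row[key] not in dim_keys]


orphans = find_orphaned_rows(fact_sales, dim_customer, "customer_key")
print([row["sale_id"] for row in orphans])
```

Rows returned by this check would fall outside any customer grouping, which is the kind of silent gap that makes results hard to trust.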

Standardized Calculation Logic (1:47)

Metrics like total sales need to be defined once and used consistently throughout the model. When the same concept is calculated differently in different measures, an AI tool generating queries against that model will produce conflicting outputs depending on which calculation it references. Standardization is not just good practice. It is what makes AI-generated results coherent.
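As an illustration of the "define it once" idea, the sketch below keeps a single definition of total sales and builds a dependent metric on top of it rather than re-deriving the sum. It is plain Python rather than DAX, and the rule that cancelled orders are excluded is an assumption made up for the example.

```python
# Hypothetical rows; the "cancelled orders are excluded" rule is an assumption.
sales = [
    {"amount": 100.0, "status": "completed"},
    {"amount": 50.0, "status": "cancelled"},
    {"amount": 25.0, "status": "completed"},
]


def counted_rows(rows):
    """The single place where 'which rows count as sales' is decided."""
    return [r for r in rows if r["status"] != "cancelled"]


def total_sales(rows):
    """Canonical total sales: every other metric reuses this definition."""
    return sum(r["amount"] for r in counted_rows(rows))


def average_sale(rows):
    """Built on total_sales, so it cannot drift from the canonical rule."""
    counted = counted_rows(rows)
    return total_sales(rows) / len(counted) if counted else 0.0


print(total_sales(sales), average_sale(sales))
```

If a second measure summed every row, including the cancelled one, it would report 175.0 instead of 125.0, and a generated query could pick either number.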

Naming Conventions (2:48)

LLMs translate natural language queries into model references by matching user intent to the names they find in the semantic layer. Descriptive, user-friendly names like Average Customer Rating give the model significantly more to work with than cryptic abbreviations like AVG Rating. The names in a model are effectively the vocabulary an AI uses to understand what it is working with, and that vocabulary should be written for clarity.
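A rough way to audit names is to flag any that contain short all-caps tokens, which tend to be abbreviations. The heuristic below is a sketch under that assumption only. It will also flag legitimate acronyms, so it is a starting point for review rather than a rule.

```python
import re


def looks_cryptic(name, max_abbrev_len=4):
    """Flag names containing a short all-caps token such as 'AVG'.

    Heuristic assumption: short all-caps tokens are abbreviations.
    """
    tokens = [t for t in re.split(r"[\s_]+", name.strip()) if t]
    return any(t.isupper() and len(t) <= max_abbrev_len for t in tokens)


measure_names = ["Average Customer Rating", "AVG Rating", "Total Sales"]
flagged = [n for n in measure_names if looks_cryptic(n)]
print(flagged)
```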

Fact vs. Dimension Delineation (4:45)

Clearly separating fact and dimension tables using naming prefixes such as F_ and D_ helps the AI understand the structural role each table plays. This delineation reduces the risk of the model treating a dimension as a fact or vice versa, which produces aggregation errors that are difficult to diagnose after the fact.

Hierarchies (6:02)

Explicitly defined hierarchies, such as Year, Quarter, and Month in sequence, enable logical drill-down behavior that AI tools can navigate correctly. Without those definitions, Copilot has no reliable basis for understanding the natural groupings and relationships within time or categorical dimensions, which limits what it can do with drill-down queries.
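To make the idea of an explicit hierarchy concrete, the sketch below declares the level order once and uses it to roll the same rows up at different depths. It is a plain Python illustration with hypothetical data, not how Power BI stores hierarchies.

```python
from collections import defaultdict
from datetime import date

# The hierarchy is declared explicitly, in drill-down order.
HIERARCHY = ("Year", "Quarter", "Month")


def date_levels(d):
    """Map a date to its value at each level of the hierarchy."""
    return {"Year": d.year, "Quarter": (d.month - 1) // 3 + 1, "Month": d.month}


def roll_up(rows, depth):
    """Sum amounts grouped by the first `depth` levels of HIERARCHY."""
    totals = defaultdict(float)
    for d, amount in rows:
        levels = date_levels(d)
        key = tuple(levels[name] for name in HIERARCHY[:depth])
        totals[key] += amount
    return dict(totals)


# Hypothetical (date, amount) rows.
rows = [
    (date(2024, 1, 15), 100.0),
    (date(2024, 2, 10), 50.0),
    (date(2024, 8, 1), 25.0),
]
by_year = roll_up(rows, 1)
by_quarter = roll_up(rows, 2)
print(by_year, by_quarter)
```

Because the level order lives in one place, drilling from year to quarter to month is just a change in depth, not a new piece of logic.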

Data Quality (7:36 – 9:15)

Inconsistent data types and unstandardized values are among the most common causes of AI output failures. An entry that appears as both "on-hold" and "on hold" in the same field, for example, will be treated as two distinct values by an aggregation engine that has no context for resolving the inconsistency. AI tools surface these issues rather than quietly correcting them, which means data quality problems that a human analyst might overlook become visible and disruptive in an AI-assisted environment.
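The "on-hold" versus "on hold" problem can be shown in a few lines. The sketch below counts distinct values before and after a simple normalization step; the normalization rule (lowercase, trim, and treat hyphens, underscores, and spaces as equivalent) is an assumption chosen for the example, and a real pipeline would need rules agreed with the data owners.

```python
import re
from collections import Counter


def normalize_status(value):
    """Lowercase, trim, and collapse hyphens, underscores, and spaces."""
    return re.sub(r"[\s_-]+", " ", value.strip().lower())


# Hypothetical raw values from a single status column.
raw = ["on-hold", "on hold", "On Hold ", "active"]

before = Counter(raw)
after = Counter(normalize_status(v) for v in raw)
print(len(before), dict(after))
```

Before normalization a grouped count sees four different statuses. Afterward it sees two, which is what a person reading the column would have assumed all along.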

KPI Definitions (10:21)

Key performance indicators need to be clearly documented and defined within the model itself, not just understood by the team that built it. When an AI tool is asked about a KPI, it should be able to locate an unambiguous definition rather than inferring one from how the metric is used elsewhere in the model.

Refresh Schedules and Data Freshness (10:38)

Including metadata about when data was last refreshed allows the AI to communicate the relevance of its outputs accurately. Without that context, a user asking about current performance may receive an answer drawn from data that is days or weeks old with no indication that this is the case.
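A freshness note can be as simple as a timestamp plus a staleness label that travels with the answer. The sketch below is a hypothetical helper, not a Power BI or Copilot API, and the 24-hour threshold is an assumption for illustration.

```python
from datetime import datetime, timedelta, timezone


def freshness_note(last_refreshed, now, max_age=timedelta(hours=24)):
    """Describe how old the data is and whether it exceeds max_age."""
    age = now - last_refreshed
    return {
        "last_refreshed": last_refreshed.isoformat(),
        "age_hours": round(age.total_seconds() / 3600, 1),
        "status": "fresh" if age <= max_age else "stale",
    }


# Hypothetical refresh time and query time, both in UTC.
last_refreshed = datetime(2024, 1, 1, 0, 0, tzinfo=timezone.utc)
asked_at = datetime(2024, 1, 3, 12, 0, tzinfo=timezone.utc)
note = freshness_note(last_refreshed, asked_at)
print(note)
```

Attaching a note like this to each answer lets a user asking about current performance see that the underlying data is 60 hours old.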

Security and Metadata (11:41)

Robust row-level security and rich metadata are not optional additions. They are what allow an AI tool to handle queries correctly across different user contexts and data access levels. Metadata provides the contextual scaffolding that helps an LLM interpret what it is working with, and security ensures that the AI does not surface data to users who should not have access to it.

Who It’s For

This episode is worth your time if you are a Power BI developer or data modeler preparing a semantic model for Copilot or similar AI-assisted querying tools; a data team lead evaluating how ready your current model is to support AI analytics and where the gaps are most likely to cause problems; a technology leader trying to understand why AI analytics tools are producing unreliable outputs despite a significant infrastructure investment; or part of any organization that has enabled Copilot or a similar tool and found the results inconsistent or difficult to trust.

Why It’s Worth a Listen

The framing that AI readiness is really just disciplined data modeling applied more rigorously is one of the most useful reframes available to teams that feel overwhelmed by the AI preparation conversation. It converts an unfamiliar challenge into a familiar one and points directly at the work that needs to be done rather than leaving teams to interpret vague guidance about being AI-ready.

The data quality section is particularly valuable because it highlights a category of problem that human analysts have always been able to work around silently. AI tools do not have that capability. They treat "on-hold" and "on hold" as genuinely different things, and the outputs reflect that. Organizations that surface and fix those inconsistencies in preparation for AI adoption will find that the same fixes also improve the reliability of their existing reporting, a benefit that arrives immediately rather than waiting for AI to deliver it.

And the overall conclusion holds up as a principle worth internalizing: poor data structure produces poor AI outputs, and no amount of AI sophistication compensates for a semantic model that lacks the rigor the technology requires to function as intended.

Here is the Microsoft documentation referenced: https://learn.microsoft.com/en-us/power-bi/create-reports/copilot-evaluate-data
