Is your Fabric Deployment Pipeline Changing your Datasets?

As part of our Digital Transformation and COE training services, we work with a variety of organizations to familiarize their teams with the Power Platform, Dynamics, and Microsoft’s Modern Work tools more generally. These days, I’m no longer surprised to find employees (or even whole departments) developing, deploying, and indeed relying on an ecosystem of Power BI reports, whether managed through apps, embed, or other experiences. Woo!

While many of our clients are familiar with the concept of a dev/test/prod setup – or at least dev/prod – and may even have implemented one themselves, I’ve found that a surprising number of their production bugs can be traced back to poor ALM practices, and specifically to subpar deployment standards. Data sources misconfigured during deployment, incomplete testing of the item in dev (“I’m going to have to change it for deployment anyways.”), or changes pushed straight into prod without testing all make frequent appearances. I attribute many of these problems to the need to download the dev report (or use the local, in-dev copy) and publish it directly into a prod workspace.

But wait, no longer! For the last two-plus years, Microsoft has had a tool called – well, likely various things over time, but currently – Fabric Deployment Pipelines. These are nifty little process maps married to a CI/CD deployment tool: you link one to a series of workspaces, use it to define your application lifecycle, and deploy Fabric components from cloud workspace to cloud workspace, no local files involved!
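For teams that would rather script these deployments than click through the portal, the Power BI REST API also exposes deployment pipelines. Here's a rough sketch of assembling a "deploy all" call – the pipeline GUID is a placeholder, you'd still need to acquire an Azure AD access token yourself, and the exact options accepted may vary, so treat this as illustrative rather than a drop-in script:

```python
import json

# Base URL for the Power BI REST API.
POWER_BI_BASE = "https://api.powerbi.com/v1.0/myorg"

def build_deploy_all_request(pipeline_id: str, source_stage_order: int = 0):
    """Assemble the URL and JSON body for the 'Pipelines - Deploy All' call.

    sourceStageOrder 0 deploys from the first stage (dev) to the next stage.
    pipeline_id is a placeholder for your own pipeline's GUID.
    """
    url = f"{POWER_BI_BASE}/pipelines/{pipeline_id}/deployAll"
    body = json.dumps({
        "sourceStageOrder": source_stage_order,
        # Allow the deployment to create or overwrite items in the target stage.
        "options": {
            "allowCreateArtifact": True,
            "allowOverwriteArtifact": True,
        },
    })
    return url, body

# Placeholder GUID - substitute your real pipeline ID.
url, body = build_deploy_all_request("00000000-0000-0000-0000-000000000000")
```

In practice you would POST that URL with an `Authorization: Bearer <token>` header and then poll the long-running operation it returns for completion.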

Our Issue

In the case of our client’s data analytics team, while they were excited to use these new pipelines in theory, there was some understandable caution about pointing new functionality at a report that is both highly visible to senior management and consumed by several thousand employees on a daily basis.

Accordingly, we walked them through the process, set up a sample workspace and pipeline, and deployed some legacy copies of the reports they had shared with us. In doing so, however, we ran into some very odd behavior. Despite our swearing up and down that there was nothing to worry about – that the pipeline would function nearly identically to a regular publish (just without uploading the locally cached data) and that nothing would change – well… something was changing. Consistently, copies of one specific dataset would arrive with the data types of the same two fields changed! This, of course, broke a number of visuals involving those fields in the linked reports, and absolutely sapped any organizational confidence in the tool.

Keep in mind, this was a bog-standard pipeline, with only two stages and no deployment rules on any of the datasets, so seeing changes like this was certainly surprising to us – and, given the lack of any mention in the documentation, we expected it would be surprising to Microsoft as well.

The Response

Accordingly, we raised the issue with Microsoft Support and, after a lengthy back-and-forth, were surprised to discover that this is “expected behavior” (their words, not mine). Essentially, a deployment pipeline can change the data types in a dataset during deployment, even when no rules are configured on it, because the data itself is not carried along in the deployment.

The way it was explained to us: if we think of a dataset with data in it as a bucket full of water, ‘Publish from local’ takes a bucket filled with water from your local machine and creates a copy of it – water and all – in a target workspace in the cloud.

The deployment pipeline, on the other hand, takes that same bucket of water but copies only the bucket (no water!) into the target workspace.

So far, so good – this squares with the behavior experienced by anyone who has used one of these before.

However, it appears that something about the deployment pipeline is akin to passing the bucket through a vacuum – and with no water inside to hold its shape, the bucket can end up a little deformed on the other side.

The Resolution

Luckily, Microsoft had a simple resolution for us: running a refresh after deployment should correct the data type issues. Or, to continue our analogy, filling our mangled bucket back up with water fixes the crushed walls.

To their credit, they were exactly right – after running a refresh post-deployment, we saw the appropriate data types return. Oddly enough, these were intentionally and manually set data types, so I’m still not clear why those particular fields changed, but as our issue was fixed, we weren’t able to get much more ‘why’ out of the Product Team.
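If you’d rather not rely on someone remembering to click Refresh after each deployment, the refresh can also be triggered programmatically through the datasets REST API. A minimal sketch, assuming you already have the target workspace and dataset GUIDs plus a valid access token (all placeholders here), and hedging that your tenant may want different notification options:

```python
import json

# Base URL for the Power BI REST API.
POWER_BI_BASE = "https://api.powerbi.com/v1.0/myorg"

def build_refresh_request(workspace_id: str, dataset_id: str, access_token: str):
    """Assemble the POST request that kicks off a dataset refresh
    ('Datasets - Refresh Dataset In Group' in the REST API).

    All three arguments are placeholders you would supply from your own tenant.
    """
    url = f"{POWER_BI_BASE}/groups/{workspace_id}/datasets/{dataset_id}/refreshes"
    headers = {
        "Authorization": f"Bearer {access_token}",
        "Content-Type": "application/json",
    }
    # Skip the failure-notification email for an automated post-deployment refresh.
    body = json.dumps({"notifyOption": "NoNotification"})
    return url, headers, body
```

Chaining this immediately after the deployment step in whatever automation you use would bake the workaround into the process itself, rather than leaving it as tribal knowledge.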

Support recognized that this is something that should already have been documented, and in fact had referred the point to the Product Team even before we could raise it. However, as that documentation update doesn’t appear to be forthcoming anytime soon, here we are.

Setting up a refresh on the receiving (target) side of the deployment appears to have no adverse impact, and redeploying the same dataset doesn’t seem to reintroduce the issue, so this is just something to keep an eye out for on initial deployments of specific datasets. We’ll keep our fingers crossed that it won’t happen to you – but if it does, now you know!

