Pro DevEx for Power BI

Power BI Azure Pipelines This is the first article in a series about creating a professional developer experience for working with Power BI. (If that doesn't mean much to you, take a look at the short video at the end of this post 😊).

The series is aimed at people bringing DevOps or DataOps approaches to Power BI report development. Expect version control, deployment pipelines and developer workflow – Microsoft's Enterprise content publishing usage scenario for Power BI gives a great overview of many of the approaches I'll be talking about in detail here. Don't expect beautiful Power BI reports – other people are much better at that than I am!

Developing Power BI reports can be a very manual process. After building a report, you (or someone else in your team) has to publish the report into the right workspace at the right time. You might have multiple workspaces featuring the same content, for example to allow stakeholders to perform user acceptance testing (UAT) before a report is released. To manage this you might by keeping multiple versions of the report's PBIX file, having to track what version is published and where.

DevOps practices, first developed in relation to software engineering projects, use automation to make these processes easier, faster and less error-prone. Frequent, smaller deployments are less likely to fail – and if they do, failure causes fewer problems which can more easily be put right.

In this series I'll be talking a lot about workflow. Loosely speaking, report development workflow is the set of steps you have to take to create or update a Power BI report. This typically starts in Power BI desktop, and doesn't end until users are able to get business value out of a report in the Power BI service.

A good workflow frees a report developer to do the things they like and are good at – understanding business problems, and crafting beautiful reports to help solve them. Anything that that doesn't contribute to that just gets in the way – a bad workflow clutters up the development experience with peripheral tasks like version control, managing environments and performing deployments.

I'll also be talking a lot about tools. Tools aren't workflow! Workflow is a human-centred process – we need to create processes that work for humans, and build tools to support those processes. If the way you work is dictated by your tooling, that's a good indicator that the tools need improvement.

Key components of a DevOps workflow include:

  • Version control. Keeping a history of everything you've done gives you the freedom to experiment in safety. You can always revert to the last “good” version of a report.
  • Automation. Using automation to continuously integrate new development work and deploy it safely removes the fear of change. What you've published and where is always clear.
  • Quality assurance. Automating testing of report features enables you to identify errors before your users do. It also allows you to prevent previous problems from coming back (regression errors).

There are (at least) two good reasons to care about workflow:

  • Developer happiness. Supported by a good workflow, a report developer can focus on the things they're good at – things they like, and things which add value. Happy developers are more engaged, make fewer mistakes, stick around for longer and contribute to (even) better workflow 😊.

  • Business happiness. A good workflow enables responsive development – allowing frequent, well-controlled improvements to reports for business users. Built-in testing reduces errors, bolstering users' trust in reports and keeping them engaged.

Here's an example Power BI development workflow, intended to enable report developers to focus on their core activity – developing reports. I'm purposely ignoring specific tools for a moment, so that I can concentrate on the process.

  1. A report developer obtains a copy of the latest version of a report from the team's version control system.

  2. The dev makes changes to the report, isolated from the main code base. Isolation means that:
    • The latest, published version of the report won't change until report revisions are approved in UAT.
    • In the meantime, other devs can be working on different parts of the same report.

  3. The dev saves the changed report into version control, which triggers a number of tests to run automatically. If the tests pass, the report is deployed automatically into a private workspace for review by other team members.

  4. After peer review and approval, the report is automatically published into a UAT workspace for review by business stakeholders.

  5. After stakeholder review and approval, the report is automatically published into its production workspace.

In this workflow, the report developer's efforts are focussed on building the report, and on working to improve it with colleagues and stakeholders – everything else happens automatically.

A workflow for Power BI like the one I just described might sound impossible! Common obstacles include:

  • Version control is often manual – maybe you have to store multiple copies of a report in a folder somewhere.
  • Version control is often linear. Maybe you're using OneDrive or SharePoint to store PBIX files, but version numbers only go one way!
  • Parallel development isn't possible because you can't merge changes back into a single PBIX file.
  • The burden of testing only ever goes up, because you're continually having to check that previous faults haven't come back. Every bug you fix adds to the number of regression tests you need to perform.
  • You have to publish reports manually – you need to make sure you're publishing the right version, and keep track of which version is deployed to where.
  • You have to publish reports multiple times, with the same manual steps each time. Even if you're using Power BI deployment pipelines you have plenty of opportunities to deploy the wrong report at the wrong time. You're only human, after all!

The good news is that there are already a bunch of tools available to remove many of these obstacles 🎉 – you can get much closer to a workflow like the one I've described than you might think.1) Doing that is the goal of this series of articles 😊.

In the rest of this post I'll set up automation for a very basic workflow:

  1. A report developer makes changes to a report in Power BI desktop.

  2. The dev saves the changed report into version control, which publishes it automatically into a Power BI workspace.

This won't overcome the obstacles I've described above, but it illustrates what's possible and provides a foundation we can build on.

I'll be deploying this Power BI report, based on Microsoft's Adventure Works sales sample:

The original report uses a spreadsheet data source – I've refactored my version to use the AdventureWorksLT sample database. (The report looks empty because the database doesn't contain much data, but it's enough to demonstrate a report deployment process). You can create the sample DB as an Azure SQL Database directly in the Azure portal.

The PBIX file for this report is available on GitHub – there's a link at the end of this post.

GitYou're probably already finding some way of version controlling your Power BI reports – maybe you have a folder full of backup copies, maybe you're using SharePoint or OneDrive. Maybe you're already using a fully-fledged version control system – for example Git. I'll be using Git for a variety of reasons:

  • it's well-supported by common automated deployment tools like Azure Pipelines
  • it supports easy branching and merging for parallel development
  • it's fast
  • it's what pretty much everyone is using.

I'll give a brief introduction here in case you're new to Git – if you need more information there are many great tutorials already available.

From https://www.atlassian.com/git/tutorials/using-branches Git stores files in a repository (or “repo”) which tracks changes to individual files. A central copy of a repo is usually shared between team members by hosting it in a Git service like GitHub or Azure Repos. A branch contains a history of changes for files – a repo's default branch, usually called main, contains the history of completed development work.

It's good practice to avoid making changes directly in main – instead, development work takes place in other branches, in your own local copy of the repo. Branches are created specifically for individual developments and are referred to as feature branches. Like main, feature branches also track history, giving you a safety net for your own development – you can make changes experimentally, knowing that you can always roll back or abandon them without affecting main.

When a piece of development work is complete, your changes can be integrating by pushing your local branch to the Git service, then merging the feature branch into main, usually via a pull request. A pull request gives other team members the opportunity to review your proposed changes, helping you to find additional improvements or identify errors you may have overlooked.

To work with Git you will need first to install a Git client, such as Git for Windows.

In this section, I'll build an Azure DevOps pipeline to deploy my sample Power BI report into the Power BI service. Azure DevOps pipelines are triggered by changes to specified branches and folders in a Git repository – I'll be using this behaviour to deploy my report automatically as soon as it's committed to Git. Azure Pipelines can work with multiple Git service providers, and I'll be using GitHub to host my reports repository and share it.

I'm using Azure DevOps for pipelines instead of GitHub Actions simply because it has a few features I prefer. You can implement almost everything I'm doing in here with GitHub Actions, and the service is rapidly evolving.

You can write an Azure DevOps pipeline using the online “classic” editor or by defining it in a YAML file. A major advantage of using YAML is that your pipeline definition is stored as a “.yaml” file in your git repo – so the pipeline definition itself is kept under source control.

All reports & code for this series of articles is available on GitHub, in the powerbi-pro-devex-series folder of my “community” repo (link below).

In this article I'll be creating a DevOps pipeline for Power BI report deployment using a pre-existing YAML pipeline definition file. The pipeline I create in Azure DevOps is a “wrapper” for the pipeline definition – it's what tells the Azure Pipelines service that the YAML file is to be used as a pipeline definition.

DevOps pipelines are created in projects which in turn belong to organizations (collections of projects). If you don't already have an Azure DevOps organization you can sign up for free – if like me you're using a GitHub repo you should sign up with a GitHub account.

Once you have created and/or chosen your Azure DevOps organization and project, browse to the project homepage and click the Pipelines button in the left-hand sidebar, followed by New pipeline. This launches the four-step pipeline creation wizard:

Step 1: Connect. I select GitHub as the location of my git repo. You might be redirected to GitHub to sign in – if so, enter your GitHub credentials.

Step 2: Select. I choose the repo where the YAML pipeline definition is stored, alongside the report's PBIX file. You might be redirected to GitHub to install the Azure Pipelines app in your own repo – if so, select Approve and install.

Step 3: Configure. I choose Existing Azure Pipelines YAML file – this will allow me to choose an existing YAML pipeline definition file from my GitHub repo.

Step 4: Review. I've selected file “pipeline.yaml” in the repo's “powerbi-pro-devex-series/01-FirstPipeline folder” – this is a file I created earlier and committed to the repo. In the top right I have the option to Run the new pipeline, but for now I just click on the down arrow so I can Save the pipeline without running it.

Here's the contents of the “pipeline.yaml” file (with line numbers here for reference):

  1. trigger:
  2.   branches:
  3.   include:
  4. - '*'
  5.   paths:
  6.   include:
  7. - powerbi-pro-devex-series/01-FirstPipeline/ExecutiveSummary.pbix
  8.  
  9. variables:
  10.   folderPath: $(System.DefaultWorkingDirectory)/powerbi-pro-devex-series/01-FirstPipeline
  11.  
  12. pool:
  13.   vmImage: ubuntu-latest
  14.  
  15. steps:
  16. - task: PowerShell@2
  17.   displayName: Publish Power BI report
  18.   inputs:
  19.   targetType: filePath
  20.   filePath: $(folderPath)/Deploy-PbiReport.ps1
  21.   arguments: >
  22. -ReportName "Executive Summary"
  23.   -WorkspaceName "AdventureWorks Reports"
  24.   -PbixFilePath "$(folderPath)/ExecutiveSummary.pbix"
  25.   failOnStderr: true
  26.   env:
  27.   AZURE_TENANT_ID: $(AzureTenantId)
  28.   AZURE_CLIENT_ID: $(AzureClientId)
  29.   AZURE_CLIENT_SECRET: $(AzureClientSecret)

The trigger section at the top of the pipeline definition controls when the pipeline will be run. The paths value configured here means that the a pipeline run is only triggered when a change is made to the file “powerbi-pro-devex-series/01-FirstPipeline/ExecutiveSummary.pbix” in the GitHub repo – changes to any other file will not cause the pipeline to run. (The branches filter is required when a paths filter is specified, but the wildcard value * given here matches every branch name).

When a pipeline run is triggered, each job in the pipeline's definition is executed by an agent. An agent is a virtual machine – of the type indicated by pool (lines 12-13) – containing the software required to run the pipeline. A pipeline consists of one or more jobs grouped into stages. This simple pipeline specifies no stage or job details, so its steps are executed in a single job.

The actions taken by the pipeline are defined in the steps section (from line 15). There are two steps in this pipeline.

  1. The contents of the Git repository are downloaded onto the build agent. This first step happens automatically (without being specified explicitly).

  2. A PowerShell@2 task is executed to run a PowerShell script called Deploy-PbiReport.ps1 – the agent obtained this script file and the report PBIX when the repo was cloned in step 1.

    The task uses four variables: folderPath, AzureTenantId, AzureClientId and AzureClientSecret, referenced using the syntax $(variableName). The first of these is defined in the variables section on lines 9 & 10 (which in turn references system variable System.DefaultWorkingDirectory). The other three variables contain secret credential information – I'll define them later, before I run the pipeline.

The script Deploy-PbiReport.ps1 called by the pipeline's PowerShell@2 task is responsible for deploying the PBIX file to the Power BI service. As the task definition suggests, it requires as parameters the name of the report, the workspace into which it will be published and the PBIX file containing the report definition. Credential values – contained in the AzureTenantId, AzureClientId and AzureClientSecret variables – are provided as environment variables, which prevents them from appearing in the pipeline's execution log.

  1. param(
  2. [Parameter(Mandatory = $true)]
  3. [ValidateNotNullOrEmpty()]
  4. [string] $ReportName,
  5.  
  6. [Parameter(Mandatory = $true)]
  7. [ValidateNotNullOrEmpty()]
  8. [string] $WorkspaceName,
  9.  
  10. [Parameter(Mandatory = $true)]
  11. [ValidateNotNullOrEmpty()]
  12. [string] $PbixFilePath
  13. )
  14.  
  15.  
  16. # install PowerShell modules for Power BI
  17. Install-Module -Name "MicrosoftPowerBIMgmt.Workspaces" -AllowClobber -Force -Scope CurrentUser
  18. Install-Module -Name "MicrosoftPowerBIMgmt.Reports" -AllowClobber -Force -Scope CurrentUser
  19.  
  20. # log into Power BI
  21. $secureClientSecret = ConvertTo-SecureString $Env:AZURE_CLIENT_SECRET -AsPlainText -Force
  22. $credentials = New-Object PSCredential($Env:AZURE_CLIENT_ID, $secureClientSecret)
  23. Connect-PowerBIServiceAccount -Tenant $Env:AZURE_TENANT_ID -ServicePrincipal -Credential $credentials | Out-Null
  24.  
  25. # publish the report
  26. New-PowerBIReport `
  27. -Path $PbixFilePath `
  28. -Name $ReportName `
  29. -Workspace (Get-PowerBIWorkspace -Name $WorkspaceName) `
  30. -ConflictAction CreateOrOverwrite

The script:

The credentials specified by pipeline variables AzureTenantId, AzureClientId and AzureClientSecret identify and authenticate an Azure service principal which is authorised to make Power BI deployments. Here's how to set that up.

  1. Create a service principal by creating an app registration in Azure Active Directory. Make a note of its application (client) ID. Add a client secret and make a note of that somewhere safe. You don't need to add a redirect URI or to create any other credentials.

  2. Enable service principal access to your Power BI tenant (in Admin portal → Tenant settings → Developer settings → Allow service principals to use Power BI APIs):

    It's good practice to restrict access to specific security groups – if you're going to do that, you'll first need to create an AAD group and add the service principal from step 1 as a member. In my tenant I'm using a group called “PowerBiDeploymentPrincipals”.

  3. Assign Contributor access to the service principal in the target Power BI workspace:

    I've assigned access to the individual service principal, not to the group shown in step 2 – I may want to create additional deployment principals later, without allowing them all to access the same workspaces.

  4. Finally, create secret variables in the Azure DevOps deployment pipeline. Edit the pipeline, then use the Variables button (next to the Run button) to modify variables:

    Create three secret variables:

    • AzureTenantId containing the ID of your Azure tenant (available from the Azure Active Directory blade in the Azure portal)
    • AzureClientId containing the application (client) ID of the service principal created in step 1
    • AzureClientSecret containing the client secret you created for the service principal, also in step 1.

Now all the plumbing is in place, I can update my sample report in Power BI desktop. After making my changes, I sync the changes to GitHub, and the Azure DevOps pipeline automatically deploys them into Power BI. For simplicity at this stage, I'm going to make changes directly to the main branch, but I'll improve on that in a future post.

This short video shows the workflow in action 😃:

In this article I outlined a possible workflow for Power BI development, intended to reduce the overhead of report management, releasing developers to concentrate on the core task of creating great reports.

I started to develop tooling to support the workflow, using version control in Git and Azure DevOps pipelines, but as I anticipated earlier, the pipeline I've built here has its problems:

  • The PBIX file in version control contains data. This makes it larger than necessary and would pose a security risk for sensitive data.
  • The PBIX file is a binary blob – Git can't show me meaningful differences between versions, and parallel development by other report developers isn't possible.
  • I have to use a separate Git client – in the video I used VS Code to sync my changes to GitHub. This is a tooling problem! A better solution would be if PBI Desktop was aware of my Git repo and had integrated Sync capability. (Microsoft have said that this is on the roadmap – you can add your voice by upvoting the idea).
  • The workflow here doesn't include any testing, whether manual or automated.
  • I've only considered one environment – I want to support multiple environments, for example so that I can enable UAT to take place before a report is published.

The good news is that many of these problems can be solved 😊. In the next post I start to do that by decoupling the report's data model from its visualizations, allowing it to be stored without data and modified by multiple developers simultaneously.

  • Next up: In the next post, I look at maintaining dataset definitions as code, without data. These can be developed in parallel, deployed automatically and shared between multiple reports.

  • Code: The code for the series is available on Github. The three files specific to this article (the Power BI report, deployment pipeline YAML file and PowerShell script) are in the powerbi-pro-devex-series/01-FirstPipeline folder.

  • Share: If you found this article useful, please share it!


1)
I say like the one I've described, simply because workflow is team-specific – your team needs to determine the benefits required from its workflow, design a workflow that delivers them, and choose tools that support it.