Catch-22: Automating MSI access to an Azure SQL Database
A problem I ran into recently is how to automate granting access to SQL databases in different environments. I'm using Terraform to build parallel data engineering environments which – amongst other things – include one or more SQL databases and instances of Azure Data Factory (ADF). ADF pipelines need access to databases, and I want to authenticate linked service connections using the factory's managed identity (MSI). Building this automatically is harder than it sounds!
Building the environment
Here's an extract from an example Terraform file that creates some Azure resources:
- a logical SQL Server
- an Azure SQL database
- an instance of Azure Data Factory
- a resource group containing everything.
resource "azurerm_sql_server" "sql" { name = "${var.environment_name}-sql" location = azurerm_resource_group.rg.location resource_group_name = azurerm_resource_group.rg.name version = "12.0" administrator_login = "sqladmin" administrator_login_password = random_password.sql_admin_pwd.result tags = var.tags } resource "azurerm_sql_database" "db" { name = var.database_name resource_group_name = azurerm_sql_server.sql.resource_group_name location = azurerm_sql_server.sql.location server_name = azurerm_sql_server.sql.name edition = "Basic" tags = var.tags } resource "azurerm_data_factory" "adf" { name = "${var.environment_name}-adf" location = azurerm_resource_group.rg.location resource_group_name = azurerm_resource_group.rg.name tags = var.tags identity { type = "SystemAssigned" } } resource "azurerm_resource_group" "rg" { name = "${var.environment_name}-rg" location = var.resource_location tags = var.tags }
There's a bit more in the full Terraform definition (defining variables, generating a random password), but this is the major stuff.
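For completeness, the omitted pieces might look roughly like this. This is a sketch based only on the references above (var.environment_name, random_password.sql_admin_pwd and so on), not the real definition, so treat the variable defaults and password settings as illustrative:

```hcl
# Illustrative sketch only – defaults and password settings are guesses,
# not copied from the real definition.
variable "environment_name" {}
variable "database_name" {}
variable "resource_location" {
  default = "uksouth" # placeholder region
}
variable "tags" {
  type    = map(string)
  default = {}
}

# Random administrator password, referenced by azurerm_sql_server.sql above.
resource "random_password" "sql_admin_pwd" {
  length  = 24
  special = true
}
```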
I'm running the Terraform from a DevOps pipeline – this is convenient because I can trigger it automatically, for example when a feature branch is created or a change is made to a standard environment definition. These three TerraformCLI@0 pipeline tasks apply the Terraform to create a brand new environment:
```yaml
- task: TerraformCLI@0
  displayName: Initialise Terraform
  inputs:
    command: init
    workingDirectory: $(WorkingDirectory)/terraform

- task: TerraformCLI@0
  displayName: Create Terraform plan
  inputs:
    command: plan
    workingDirectory: $(WorkingDirectory)/terraform
    environmentServiceName: $(PipelineServiceConnection)
    commandOptions: -out=tf.plan
  env:
    TF_VAR_environment_name: $(Environment)
    TF_VAR_database_name: $(DatabaseName)

- task: TerraformCLI@0
  displayName: Terraform environment
  inputs:
    command: apply
    workingDirectory: $(WorkingDirectory)/terraform
    environmentServiceName: $(PipelineServiceConnection)
    commandOptions: tf.plan
```
Again, the rest of the pipeline definition is available in the code.
Granting SQL access to ADF
One thing that I haven't done in the Terraform is to grant Azure SQL database access to the data factory MSI. The reason for this is simply that I can't – SQL database users are created inside the SQL database, and Terraform doesn't provide for modification of anything below the level of the database itself.
So how can I grant it afterwards? An attractive approach is to use the SqlAzureDacpacDeployment@1 task to do something like this:
```yaml
- task: TerraformCLI@0
  displayName: Get Terraform outputs
  inputs:
    command: output
    workingDirectory: $(WorkingDirectory)/terraform
    environmentServiceName: $(PipelineServiceConnection)

- task: SqlAzureDacpacDeployment@1
  displayName: Grant SQL access to Data Factory MSI
  inputs:
    azureSubscription: $(PipelineServiceConnection)
    authenticationType: aadAuthenticationIntegrated
    serverName: $(TF_OUT_sql_server_name).database.windows.net
    databaseName: $(DatabaseName)
    deployType: InlineSqlTask
    sqlInline: |
      CREATE USER [$(TF_OUT_data_factory_name)] FROM EXTERNAL PROVIDER
      ALTER ROLE [db_owner] ADD MEMBER [$(TF_OUT_data_factory_name)]
```
This fails.
The error message reported in the DevOps pipeline output (you'll like this 😂) is “One or more errors occurred.” After some investigation I found that the SqlAzureDacpacDeployment@1 task just doesn't support Azure AD integrated authentication.
This isn't the only authenticationType option available to the task, so I tried again using the SQL server's administrator SQL login. This fails too, with the better message “Only connections established with Active Directory accounts can create other Active Directory users.” – SQL logins don't have access to AAD. This is the Catch-22 from the post's title:
- I need to connect as an AAD service principal to create an AAD database user
- The task won't allow me to connect as an AAD service principal.
The problem isn't unique to Azure Data Factory – it crops up when I want to automate granting Azure SQL database-level access to any AAD service principal.
Solutions
I call these solutions but really they're workarounds – what I'd like most is a SqlAzureDacpacDeployment@1 task that works with AAD integration out of the box.
Grant MSI access manually
Depending on how repeatable you need your process to be, granting MSI access manually in the Azure portal may be the simplest solution – this is the classic automation tradeoff between the work required to automate a task and the time saved by doing so.
If you're just Terraforming your test and production environments so that you can version control their definitions, granting access manually might not be much overhead. If you're building a disposable dev environment automatically every time you create a feature branch, this probably isn't a great choice.
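For reference, the manual grant amounts to connecting to the database as an AAD administrator (for example in the portal's query editor or SSMS) and running the same T-SQL the pipeline would have run. The data factory name below is a placeholder:

```sql
-- Run while connected to the target database as an Azure AD administrator.
-- 'my-environment-adf' is a placeholder for the data factory's name.
CREATE USER [my-environment-adf] FROM EXTERNAL PROVIDER;
ALTER ROLE [db_owner] ADD MEMBER [my-environment-adf];
```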
Use SQL authentication from Azure Data Factory
The reason I want to grant SQL access to ADF's MSI is so that I can use the identity for authentication in Azure SQL Database linked services – but I could choose SQL Server authentication instead. A convenient way to manage this is to Terraform a key vault secret containing the administrator password or a full connection string.
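As a sketch, the secret might be defined something like this. It assumes a key vault already exists in the configuration as azurerm_key_vault.kv, which isn't part of the example above:

```hcl
# Illustrative only – assumes azurerm_key_vault.kv is defined elsewhere.
resource "azurerm_key_vault_secret" "sql_connection_string" {
  name         = "sql-connection-string"
  key_vault_id = azurerm_key_vault.kv.id
  value        = "Server=tcp:${azurerm_sql_server.sql.fully_qualified_domain_name},1433;Database=${azurerm_sql_database.db.name};User ID=${azurerm_sql_server.sql.administrator_login};Password=${random_password.sql_admin_pwd.result};Encrypt=true;"
}
```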
If you don't want ADF to use the administrator account, you can create a dedicated SQL account for it using the SqlAzureDacpacDeployment@1 task with admin credentials (downloaded from the vault into the pipeline using the AzureKeyVault@1 task).
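Roughly, the two tasks might fit together like this – the key vault name, secret name, new login and its password variable are placeholders of mine, not taken from the real pipeline:

```yaml
# Illustrative sketch – KeyVaultName, the secret name, the adf_user login and
# the AdfSqlUserPassword variable are all placeholders.
- task: AzureKeyVault@1
  displayName: Get SQL admin password from key vault
  inputs:
    azureSubscription: $(PipelineServiceConnection)
    KeyVaultName: $(KeyVaultName)
    SecretsFilter: sql-admin-password

- task: SqlAzureDacpacDeployment@1
  displayName: Create a dedicated SQL user for ADF
  inputs:
    azureSubscription: $(PipelineServiceConnection)
    authenticationType: server
    serverName: $(TF_OUT_sql_server_name).database.windows.net
    databaseName: $(DatabaseName)
    sqlUsername: sqladmin
    sqlPassword: $(sql-admin-password)
    deployType: InlineSqlTask
    sqlInline: |
      CREATE USER [adf_user] WITH PASSWORD = '$(AdfSqlUserPassword)'
      ALTER ROLE [db_datareader] ADD MEMBER [adf_user]
      ALTER ROLE [db_datawriter] ADD MEMBER [adf_user]
```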
Use a private build agent
This is, after all, what Microsoft advise. If you find yourself granting access manually all the time – or forgetting to – this is probably the simpler of the automation choices open to you: provision an Azure VM, configure it as a build agent, and use it to run the SqlAzureDacpacDeployment@1 task with integrated AAD security.
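On a private agent the grant step shown earlier can stay exactly as it was; the only pipeline change is pointing the job at your self-hosted pool (the pool name here is a placeholder):

```yaml
# 'MyPrivateAgentPool' is a placeholder for your self-hosted agent pool name.
pool:
  name: MyPrivateAgentPool
```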
Authenticate using OAuth2
Vidar Kongsli put together this imaginative solution. I haven't tried it (that automation tradeoff again!), but it does seem to solve the general problem of granting AAD access to SQL databases.
Your pipeline service connection needs to be an AAD admin for your Azure SQL Server – that would be necessary anyway if it were possible simply to use SqlAzureDacpacDeployment@1 with AAD-integrated authentication.
Lobby Microsoft
OK, this won't solve your problem right away, but you, Vidar & I are not the only people who've run into this. DevOps pipeline tasks are open-source and hosted on GitHub, so why not go and upvote this issue? 😀
Summary
You can't use the SqlAzureDacpacDeployment@1 task to create AAD database users from a DevOps pipeline. This can be inconvenient, but while we wait for Microsoft to fix it I've summarised a few workarounds here.
Code: The full Terraform and DevOps pipeline YAML for this post are available on GitHub.
Comments
I've run into this issue too – creating a managed identity user inside an Azure SQL database in an automated way. It didn't seem possible: the scripts that create the MSI user only work with SQL authentication, which isn't supported, and I haven't yet found a way to automate AAD-authenticated MSI user creation.