How to deploy and configure Databricks using Bicep and Microsoft Desired State Configuration
Learn how to build an end-to-end solution using Bicep and Microsoft DSC
When managing Databricks infrastructure and configurations, teams often use Terraform as the main option. While the tool is effective, it couples infrastructure provisioning with configuration management.
An alternative approach separates these concerns: Azure Bicep handles infrastructure provisioning, whilst Microsoft Desired State Configuration (DSC) manages Databricks-specific settings.
This how-to guide demonstrates how to:
- Deploy a Databricks workspace using Bicep and Azure Verified Modules (AVM).
- Configure Databricks resources using the DatabricksDsc PowerShell module.
- Automate the entire process with Azure Pipelines.
Prerequisites
Before you begin, ensure you have:
- An active Azure subscription with permissions to create Azure resources.
- Azure CLI or Azure PowerShell installed locally.
- PowerShell 7.2 or later.
- An Azure DevOps organization if you want to automate the deployment with Azure Pipelines.
Deploy Databricks workspace with Bicep
The first step is to provision the Databricks workspace infrastructure. To simplify the process, you can use Azure Verified Modules (AVM), which offer pre-built, tested modules for many Azure services.
Create the following Bicep file, which references the AVM resource group and Databricks workspace modules.

```bicep
// main.bicep
targetScope = 'subscription'

@description('The name of the Databricks workspace')
param workspaceName string

@description('The Azure region for deployment')
param location string = 'westeurope'

@description('The name of the resource group')
param resourceGroupName string = 'rg-databricks'

module rg 'br/public:avm/res/resources/resource-group:0.4.2' = {
  scope: subscription()
  params: {
    name: resourceGroupName
    location: location
  }
}

module databricksWorkspace 'br/public:avm/res/databricks/workspace:0.11.2' = {
  scope: resourceGroup(resourceGroupName)
  // Bicep can't infer this dependency from the resourceGroupName string,
  // so make it explicit to ensure the resource group exists first
  dependsOn: [
    rg
  ]
  params: {
    name: workspaceName
    location: location
    skuName: 'premium'
  }
}

output workspaceUrl string = 'https://${databricksWorkspace.outputs.workspaceUrl}'
```
Save this file as main.bicep in your repository. The Bicep template creates both the resource group and the Databricks workspace.
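Before deploying, you can optionally compile the template locally to catch authoring errors early. A quick check using the Bicep CLI bundled with the Azure CLI:

```powershell
# Compile main.bicep to ARM JSON; any syntax or reference errors
# surface here, before a deployment is attempted
az bicep build --file main.bicep
```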
To deploy the template, use the Azure CLI or Azure PowerShell. Note that subscription-level deployments require a location for the deployment metadata.

Using the Azure CLI:

```powershell
az deployment sub create `
  --location westeurope `
  --template-file main.bicep `
  --parameters workspaceName=<your-workspace-name>
```

Using Azure PowerShell:

```powershell
New-AzSubscriptionDeployment `
  -Location westeurope `
  -TemplateFile main.bicep `
  -workspaceName <your-workspace-name>
```

The deployment creates the workspace and outputs the workspace URL, which you'll use for configuration in the next step.
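If you need the workspace URL again later in the same session, you can read it back from the deployment outputs. A small sketch, assuming the default deployment name that Azure PowerShell derives from the template file name (main):

```powershell
# Read the workspaceUrl output back from the subscription-level deployment
$deployment = Get-AzSubscriptionDeployment -Name 'main'
$workspaceUrl = $deployment.Outputs.workspaceUrl.Value
Write-Host "Workspace URL: $workspaceUrl"
```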
Install DatabricksDsc module
After the infrastructure is deployed, you can use the DatabricksDsc PowerShell module to manage the workspace configuration. The module provides resources for managing:
- Users
- Service principals
- Cluster policies
- Permissions
Install the modules from the PowerShell Gallery:

```powershell
Install-PSResource -Name DatabricksDsc, PSDesiredStateConfiguration -Repository PSGallery -TrustRepository -Scope AllUsers
```

The PSDesiredStateConfiguration module provides Get-DscResource, which you can use to verify the installation by listing the available DSC resources:

```powershell
Get-DscResource -Module DatabricksDsc
```

The command displays resources such as DatabricksUser, DatabricksServicePrincipal, DatabricksClusterPolicy, and DatabricksClusterPolicyPermission.
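To see which properties a resource supports before writing a configuration document, you can expand its schema. For example, for DatabricksUser:

```powershell
# List the properties (names, types, mandatory flags) exposed by DatabricksUser
Get-DscResource -Name DatabricksUser -Module DatabricksDsc |
    Select-Object -ExpandProperty Properties
```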
Install Microsoft DSC
With the DatabricksDsc module installed, you can now install dsc.exe itself with the help of one more module.
Install the PSDSC PowerShell module from the PowerShell Gallery and execute Install-DscExe:

```powershell
# Install module
Install-PSResource -Name PSDSC -Repository PSGallery

# Install dsc.exe
Install-DscExe
```

When the installation is successful, you can run dsc --version to display the installed version:

```powershell
dsc --version
```

Create DSC configuration document
Microsoft DSC configurations define the desired state of your Databricks workspace. Configuration documents can be written in YAML or JSON; YAML is often preferred because it's easier to read.
You can create a configuration document that defines workspace users and service principals. This example shows how to configure both types of identities with entitlements:
```yaml
# databricks-config.dsc.yaml
$schema: https://aka.ms/dsc/schemas/v3/bundled/config/document.json
resources:
  - name: Configure workspace identities
    type: Microsoft.DSC/PowerShell
    properties:
      resources:
        - name: Data engineering user
          type: DatabricksDsc/DatabricksUser
          properties:
            WorkspaceUrl: 'https://adb-1234567890123456.12.azuredatabricks.net'
            AccessToken: '[envvar("DATABRICKS_TOKEN")]'
            UserName: 'dataengineer@youremail.com'
            DisplayName: 'Data Engineer'
            Active: true
            Emails:
              - Value: 'dataengineer@youremail.com'
                Type: 'work'
                Primary: true
            Name:
              GivenName: 'Data'
              FamilyName: 'Engineer'
            Entitlements:
              - Value: 'allow-cluster-create'
              - Value: 'workspace-access'
        - name: Analytics service principal
          type: DatabricksDsc/DatabricksServicePrincipal
          properties:
            WorkspaceUrl: 'https://adb-1234567890123456.12.azuredatabricks.net'
            AccessToken: '[envvar("DATABRICKS_TOKEN")]'
            ApplicationId: '12345678-1234-1234-1234-123456789012'
            DisplayName: 'Analytics Service Principal'
            Active: true
            Entitlements:
              - Value: 'allow-cluster-create'
        - name: Data science cluster policy
          type: DatabricksDsc/DatabricksClusterPolicy
          properties:
            WorkspaceUrl: 'https://adb-1234567890123456.12.azuredatabricks.net'
            AccessToken: '[envvar("DATABRICKS_TOKEN")]'
            PolicyName: 'data-science-policy'
            Definition:
              spark_version:
                type: unlimited
                defaultValue: 'auto:latest-lts'
              node_type_id:
                type: allowlist
                values:
                  - Standard_DS3_v2
                  - Standard_DS4_v2
              autotermination_minutes:
                type: fixed
                value: 30
        - name: Data science policy permissions
          type: DatabricksDsc/DatabricksClusterPolicyPermission
          properties:
            WorkspaceUrl: 'https://adb-1234567890123456.12.azuredatabricks.net'
            AccessToken: '[envvar("DATABRICKS_TOKEN")]'
            ClusterPolicyId: '[reference(resourceId("DatabricksDsc/DatabricksClusterPolicy", "data-science-policy")).Id]'
            AccessControlList:
              - UserName: 'dataengineer@youremail.com'
                PermissionLevel: 'CAN_USE'
              - ServicePrincipalName: '12345678-1234-1234-1234-123456789012'
                PermissionLevel: 'CAN_USE'
```

Save this file as databricks-config.dsc.yaml. The configuration uses the envvar() function, an ARM template-style expression, to read the environment variable where you store your authentication token.
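The example hardcodes the workspace URL in every resource. If you target multiple environments, you could read the URL from an environment variable too, the same way the document reads the token. A hedged sketch; the DATABRICKS_WORKSPACE_URL variable name is an assumption, not something the module requires:

```powershell
# Hypothetical variant: export the workspace URL before running dsc...
$env:DATABRICKS_WORKSPACE_URL = 'https://adb-1234567890123456.12.azuredatabricks.net'

# ...and in databricks-config.dsc.yaml, replace each hardcoded value with:
#   WorkspaceUrl: '[envvar("DATABRICKS_WORKSPACE_URL")]'
```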
Apply DSC configuration document
With the configuration document created, you can now apply the document to your Databricks workspace using dsc.exe. The engine reads the YAML file, evaluates the current state, and makes the changes to achieve the desired state.
Before you can apply the configuration document, you have to set the environment variable. You can use the Get-AzAccessToken command or az account get-access-token:

Using the Azure CLI:

```powershell
$env:DATABRICKS_TOKEN = az account get-access-token --resource 2ff814a6-3304-4ab8-85cb-cd0e6f879c1d --query accessToken -o tsv
```

Using Azure PowerShell:

```powershell
$env:DATABRICKS_TOKEN = (Get-AzAccessToken -ResourceUrl '2ff814a6-3304-4ab8-85cb-cd0e6f879c1d').Token
```
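Before running dsc, you can optionally confirm the token works by calling the workspace's SCIM Me endpoint. A quick sanity check, using the example workspace URL from the configuration document:

```powershell
# Call the Databricks SCIM API as the token's identity; a successful response
# confirms both the token and the workspace URL
$headers = @{ Authorization = "Bearer $env:DATABRICKS_TOKEN" }
Invoke-RestMethod -Uri 'https://adb-1234567890123456.12.azuredatabricks.net/api/2.0/preview/scim/v2/Me' -Headers $headers
```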
Time to apply the configuration document:

```powershell
dsc config set --file databricks-config.dsc.yaml
```

The command processes each resource in order: first the user is created, then the service principal, and finally the cluster policy and its permissions.
To verify the configuration without making changes, use the test operation:

```powershell
dsc config test --file databricks-config.dsc.yaml
```

This command reports whether the current state matches the desired state, which is helpful for validation.
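Because dsc emits JSON when its output is redirected, you can also turn the test results into objects and check each resource programmatically. A small sketch, assuming the documented result shape where each entry exposes result.inDesiredState:

```powershell
# Convert the test output into PowerShell objects
$test = dsc config test --file databricks-config.dsc.yaml | ConvertFrom-Json

# Print one line per resource with its drift status
$test.results | ForEach-Object {
    '{0}: inDesiredState = {1}' -f $_.name, $_.result.inDesiredState
}
```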
Automate with Azure Pipelines
Combining Bicep and the DSC configuration in an Azure Pipeline completes the solution. The pipeline deploys the workspace and applies the DSC configuration in a single automated workflow.
The following YAML defines an Azure Pipeline that orchestrates both stages:

```yaml
variables:
  azureSubscription: 'your-service-connection-name'
  resourceGroupName: 'rg-databricks-prod'
  workspaceName: 'dbw-prod-001'
  location: 'westeurope'
  vmImage: 'ubuntu-latest'

stages:
  - stage: DeployInfrastructure
    displayName: 'Deploy infrastructure'
    jobs:
      - job: DeployBicep
        displayName: 'Deploy Bicep template'
        pool:
          vmImage: $(vmImage)
        steps:
          - checkout: self

          - task: AzurePowerShell@5
            displayName: 'Deploy infrastructure'
            name: deploymentOutput
            inputs:
              azureSubscription: $(azureSubscription)
              azurePowerShellVersion: 'LatestVersion'
              ScriptType: 'InlineScript'
              Inline: |
                # The template's 'location' parameter conflicts with the cmdlet's own
                # -Location parameter, so it's passed with the FromTemplate suffix
                $deployment = New-AzSubscriptionDeployment `
                  -Location $(location) `
                  -TemplateFile main.bicep `
                  -workspaceName $(workspaceName) `
                  -locationFromTemplate $(location) `
                  -resourceGroupName $(resourceGroupName)

                $workspaceUrl = $deployment.Outputs.workspaceUrl.Value
                Write-Host "Workspace URL: $workspaceUrl"
                Write-Host "##vso[task.setvariable variable=workspaceUrl;isOutput=true]$workspaceUrl"

  - stage: ConfigureWorkspace
    displayName: 'Configure Databricks Workspace'
    dependsOn: DeployInfrastructure
    condition: succeeded()
    jobs:
      - job: ApplyDSC
        displayName: 'Apply DSC Configuration'
        pool:
          vmImage: $(vmImage)
        variables:
          workspaceUrl: $[ stageDependencies.DeployInfrastructure.DeployBicep.outputs['deploymentOutput.workspaceUrl'] ]
        steps:
          - checkout: self

          - task: PowerShell@2
            displayName: 'Install dsc executable and DatabricksDsc'
            inputs:
              targetType: 'inline'
              pwsh: true
              script: |
                # Install required modules
                Install-PSResource -Name PSDSC, DatabricksDsc -Repository PSGallery -TrustRepository -Scope AllUsers -Reinstall

                # Install dsc.exe
                Install-DscExe -Force
                Write-Host "DSC executable and DatabricksDsc module installed."

          - task: AzurePowerShell@5
            displayName: 'Apply DSC configuration document'
            inputs:
              azureSubscription: $(azureSubscription)
              azurePowerShellVersion: 'LatestVersion'
              ScriptType: 'InlineScript'
              Inline: |
                # Get a Databricks access token using Azure authentication
                $token = (Get-AzAccessToken -ResourceUrl '2ff814a6-3304-4ab8-85cb-cd0e6f879c1d').Token
                $env:DATABRICKS_TOKEN = $token

                Write-Host "Applying DSC configuration to workspace: $(workspaceUrl)"
                $result = dsc config set --file databricks-config.dsc.yaml

                if ($LASTEXITCODE -eq 0) {
                    Write-Host "##[section]DSC configuration applied successfully"
                    Write-Host $result
                }
                else {
                    Write-Error "##[error]DSC configuration failed with exit code: $LASTEXITCODE"
                    Write-Host $result
                    exit 1
                }
```
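One refinement worth noting: the pipeline captures $(workspaceUrl), but the sample configuration document hardcodes its URLs. Building on the hypothetical envvar() variant sketched earlier, the apply step could export the deployed URL so the document picks it up at run time:

```powershell
# Inside the 'Apply DSC configuration document' step: hand the deployed URL to
# the configuration document, which would read it via '[envvar("DATABRICKS_WORKSPACE_URL")]'
$env:DATABRICKS_WORKSPACE_URL = "$(workspaceUrl)"
dsc config set --file databricks-config.dsc.yaml
```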
Summary
This guide showcased how you can use Bicep, DatabricksDsc, and Microsoft DSC to build an end-to-end solution for Databricks.
It leveraged Azure Verified Modules to deploy the Azure resources. Then Microsoft DSC, the declarative configuration management tool, was used to configure Databricks users, cluster policies, and service principals.
By separating these concerns, you gain clear ownership boundaries and the ability to use native Azure tooling alongside Microsoft DSC for Databricks-specific settings.