Configure RBAC with identity provider (for example, Azure AD) groups and Databricks ACLs
This feature extends Unravel RBAC for Databricks environments where Azure Active Directory (Azure AD) groups are synchronized to Databricks workspaces. It aligns Unravel RBAC filtering with Databricks access control lists (ACLs), so Unravel mirrors Databricks access decisions for jobs and clusters.
Note
For general Unravel RBAC concepts and configuration, see RBAC Configuration in the Unravel documentation.
Before you start
Make sure the following conditions hold in your environment.
User groups are created in Azure AD and represent the teams or access scopes you want to use.
Azure AD groups are synchronized to the Databricks workspace (for example, using SCIM-based provisioning).
Databricks jobs and clusters use users and groups, including the synced Azure AD groups, in their permission lists via Databricks ACLs.
The following Databricks REST APIs are accessible from the Unravel host:
https://{workspace-url}/api/2.0/permissions/jobs/{job_id}
https://{workspace-url}/api/2.0/permissions/clusters/{cluster_id}
https://{workspace-url}/api/2.0/preview/scim/v2/Groups (or the current Groups SCIM endpoint)
With these prerequisites in place, Unravel can collect Databricks ACLs and group information and use them to drive RBAC filters.
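Before you enable the feature, it can help to confirm that the endpoints listed above respond from the Unravel host. The following is a minimal sketch and not part of Unravel: the function names, the sample values, and the use of a bearer token for authentication are assumptions made for illustration.

```python
import urllib.error
import urllib.request


def build_endpoints(workspace_url, job_id, cluster_id):
    """Return the three Databricks REST URLs named in the prerequisites."""
    # Accept the workspace host with or without a scheme or trailing slash.
    host = workspace_url.strip().removeprefix("https://").rstrip("/")
    base = "https://" + host
    return {
        "job_permissions": f"{base}/api/2.0/permissions/jobs/{job_id}",
        "cluster_permissions": f"{base}/api/2.0/permissions/clusters/{cluster_id}",
        "scim_groups": f"{base}/api/2.0/preview/scim/v2/Groups",
    }


def check_endpoint(url, token, timeout=10):
    """Return the HTTP status code of a GET request, or None if unreachable.

    Assumption: the workspace accepts a bearer token in the Authorization header.
    """
    request = urllib.request.Request(
        url, headers={"Authorization": f"Bearer {token}"}
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # host reachable, but the request was rejected
    except (urllib.error.URLError, OSError):
        return None  # DNS failure, refused connection, timeout, and so on
```

A status of None points to a network problem between the Unravel host and the workspace, while a 401 or 403 status suggests that the host is reachable but the credentials lack access.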
How the integration works
In this model, Unravel uses Databricks ACLs and Azure AD group membership to determine which jobs and clusters each user can see.
Azure AD groups are synced into the Databricks workspace as Databricks groups.
Databricks ACLs on jobs and clusters use those groups and users in their permission lists.
While a job or cluster is running, Unravel retrieves its ACL and permissions data, along with group metadata, from Databricks and converts that information into RBAC tags. Unravel then caches the resulting RBAC data for a configured period of time.
When a user signs in, Unravel dynamically builds RBAC filters that contain the user identifier and the identifiers of groups that are synced to the Databricks workspace, and applies those filters to all subsequent queries against Unravel data.
As a result, a user’s visibility of jobs and clusters in Unravel automatically matches their Databricks ACL-based access.
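To make the conversion step concrete, the following sketch turns one Databricks permissions response and one SCIM Groups listing into RBAC tags. It illustrates the idea only and is not Unravel's implementation: the function names are hypothetical, and storing the result under the rbac_users and rbac_groups keys on each entity is an assumption based on the filter names described in this topic.

```python
def scim_group_ids(scim_groups_response):
    """Map group display names to group IDs from a SCIM Groups list response."""
    return {
        group["displayName"]: group["id"]
        for group in scim_groups_response.get("Resources", [])
    }


def acl_to_rbac_tags(permissions_response, group_ids):
    """Convert one permissions response into rbac_users and rbac_groups tags."""
    users, groups = set(), set()
    for entry in permissions_response.get("access_control_list", []):
        if "user_name" in entry:
            users.add(entry["user_name"])
        elif "group_name" in entry:
            group_id = group_ids.get(entry["group_name"])
            if group_id is not None:  # skip groups missing from the SCIM listing
                groups.add(group_id)
        # Other principal types (for example, service principals) are ignored here.
    return {"rbac_users": sorted(users), "rbac_groups": sorted(groups)}


# Sample data shaped like the Databricks responses; the values are made up.
sample_groups = {"Resources": [{"id": "1001", "displayName": "data-eng"}]}
sample_acl = {
    "object_id": "/jobs/42",
    "access_control_list": [
        {"user_name": "alice@example.com"},
        {"group_name": "data-eng"},
    ],
}
tags = acl_to_rbac_tags(sample_acl, scim_group_ids(sample_groups))
```

Storing group IDs rather than display names in the tags means that the tags stay valid when a group is renamed.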
User sign-in and RBAC filters
When RBAC with Azure AD and Databricks ACLs is configured, the user sign-in flow works as follows:
The user signs in to the Unravel UI using SAML and Azure AD (or another SAML-based identity provider that supplies Azure AD groups).
The Unravel UI extracts the username and list of group names from the SAML response and sends them to the Unravel backend.
The Unravel backend evaluates RBAC configuration and maps the group names from SAML to Databricks group IDs obtained via the SCIM Groups API.
The backend creates user-specific RBAC filters that include two special tag filters based on the SAML response:
rbac_groups: list of group IDs associated with the user
rbac_users: username for the user
These RBAC filters are applied to every subsequent request the user makes to the Unravel backend.
Using IDs instead of group display names makes the mapping resilient to group renames in Azure AD or Databricks.
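The sign-in flow can be sketched as two small functions: one that builds the per-user filters from the SAML attributes, and one that checks an entity's tags against those filters. This is a simplified illustration, not Unravel's code. The function names are hypothetical, and the matching rule (an entity is visible when either the username or any group ID matches) is an assumption.

```python
def build_rbac_filters(username, saml_group_names, group_ids):
    """Build the rbac_users and rbac_groups filters for one signed-in user.

    group_ids maps Databricks group display names to group IDs, as obtained
    from the SCIM Groups API. Names without a synced group are dropped.
    """
    ids = sorted({group_ids[name] for name in saml_group_names if name in group_ids})
    return {"rbac_users": [username], "rbac_groups": ids}


def is_visible(entity_tags, filters):
    """Return True if the user or one of the user's groups appears in the tags."""
    user_match = set(entity_tags.get("rbac_users", [])) & set(filters["rbac_users"])
    group_match = set(entity_tags.get("rbac_groups", [])) & set(filters["rbac_groups"])
    return bool(user_match or group_match)


# A job whose ACL granted access to group ID "1001" (sample values).
job_tags = {"rbac_users": [], "rbac_groups": ["1001"]}

# The group is later renamed, but its ID is unchanged, so the filter still matches.
filters_before = build_rbac_filters("alice@example.com", ["data-eng"], {"data-eng": "1001"})
filters_after = build_rbac_filters(
    "alice@example.com", ["data-engineering"], {"data-engineering": "1001"}
)
```

In both cases the filters contain group ID 1001, so the job remains visible to the user after the rename.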
Configuration properties
Use the following properties to enable and control the feature that integrates Azure AD groups and Databricks ACLs with Unravel RBAC.
Core RBAC switches
| Property name | Default value | Description |
|---|---|---|
| com.unraveldata.rbac.enabled | false | Main RBAC switch in Unravel. Enable RBAC by running ./manager config rbac enable from the Unravel installation directory. |
When RBAC is enabled, Unravel evaluates RBAC roles and filters for each request instead of giving all users read-only admin access.
Databricks ACL tag collection and filtering
| Property name | Default value (Databricks) | Description |
|---|---|---|
| com.unraveldata.databricks.rbac_tags.rbac_tags_collector.enabled | The same value as com.unraveldata.rbac.enabled if Unravel is configured for the Databricks platform. | When true, Unravel collects job and cluster permissions from Databricks and stores them as RBAC-related tags together with the corresponding data for later filtering. |
| com.unraveldata.databricks.rbac_tags.rbac_tags_filters.enabled | The same value as com.unraveldata.rbac.enabled if Unravel is configured for the Databricks platform. | When true, Unravel automatically adds tag filters based on Databricks ACLs to each user’s RBAC filters. |
Shorter cache durations make RBAC reflect Databricks ACL and group changes more quickly at the cost of more frequent Databricks API calls. Longer cache durations reduce calls to Databricks but delay propagation of permission changes into Unravel.
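The trade-off can be illustrated with a small time-based cache. This is a generic sketch and not Unravel's cache: the class name, the loader callback, and the injectable clock are assumptions made so that the behavior is easy to observe.

```python
import time


class ExpiringCache:
    """Cache values for a fixed number of minutes, then reload them on demand."""

    def __init__(self, expire_after_minutes, clock=time.monotonic):
        self._ttl_seconds = expire_after_minutes * 60
        self._clock = clock  # injectable so expiry can be simulated
        self._entries = {}  # key -> (stored_at, value)

    def get(self, key, loader):
        """Return the cached value, calling loader(key) if it is missing or expired."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl_seconds:
            return entry[1]  # still fresh: no call to Databricks
        value = loader(key)  # expired or missing: one API call
        self._entries[key] = (now, value)
        return value


# Simulated clock and a loader that counts how often it is called.
fake_now = [0.0]
api_calls = []


def load_permissions(key):
    api_calls.append(key)
    return f"permissions-v{len(api_calls)}"


cache = ExpiringCache(expire_after_minutes=5, clock=lambda: fake_now[0])
first = cache.get("job-42", load_permissions)   # loads from the API
fake_now[0] = 299.0
second = cache.get("job-42", load_permissions)  # served from the cache
fake_now[0] = 300.0
third = cache.get("job-42", load_permissions)   # expired, loads again
```

With a five-minute expiry, a permission change made just after the first load is not seen until the entry expires. A shorter expiry would pick up the change sooner but call the loader more often.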
Example configuration steps
The following high-level steps show how to enable RBAC with Azure AD groups and Databricks ACLs.
Enable RBAC in Unravel.
Run: ./manager config rbac enable.
Stop Unravel, apply configuration changes, and restart, as described in the RBAC configuration documentation.
Verify Azure AD and Databricks integration.
Confirm Unravel is integrated with Azure AD for authentication and group mapping.
Confirm Azure AD groups are synced into the Databricks workspace and used in job and cluster ACLs.
Enable Databricks RBAC tag collection and filtering if needed.
Set com.unraveldata.databricks.rbac_tags.rbac_tags_collector.enabled=true if you want to force collection independently of the global default.
Set com.unraveldata.databricks.rbac_tags.rbac_tags_filters.enabled=true to enable ACL-based tag filters.
Adjust cache expiration values.
Tune entity_permissions.cache.expire_after_minutes and user_groups.cache.expire_after_minutes to balance freshness of permissions with API usage, based on how often ACLs and group membership change in your environment.
Apply changes.
Apply configuration and restart Unravel services, following the general Unravel configuration workflow.
After configuration, sign in with a test user to confirm that the jobs and clusters visible in Unravel match the user’s Databricks access.
If you use Unravel cost center or business unit reports that rely on user groups, ensure that group-based mappings use the correct group IDs when RBAC is enabled.
When filtering is based on user groups, configure cost centers and business units with the Databricks group IDs that correspond to the relevant groups, not just the display names.
This keeps cost and business reporting aligned with the same group-based RBAC filters that control access to jobs and clusters in Unravel.