[azure-ml] Serverless Spark compute fails with missing Synapse cluster identifier #39646
Labels: Client, customer-reported, Machine Learning, needs-team-attention, question, Service Attention
Describe the bug
I am trying to execute a Python script on serverless Spark compute in Azure Machine Learning.
I have attached a user-assigned managed identity to the AML workspace and defined the Spark component accordingly.
I submitted the pipeline and the Spark job using the CLI.
Soon after the component starts running, I get the following Native Error:
I haven't seen any mention of setting the environment variable AZUREML_SYNAPSE_CLUSTER_IDENTIFIER for the Spark component.
The managed identity has the AI Developer role assigned.
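To make the setup described above concrete, here is a minimal sketch of a pre-submission check on a Spark job spec. It is not the component definition from this report. The field names (`type`, `resources.instance_type`, `resources.runtime_version`, `compute`, `identity.type`) follow the Azure ML CLI v2 Spark job YAML schema as I understand it; the function name `check_spark_job_spec`, the example paths, and the example values are hypothetical placeholders. It only checks that the spec targets serverless compute and requests a managed identity, and it does not explain the AZUREML_SYNAPSE_CLUSTER_IDENTIFIER error itself.

```python
from typing import Any, Dict, List


def check_spark_job_spec(spec: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in a Spark job spec (hypothetical helper)."""
    problems: List[str] = []

    if spec.get("type") != "spark":
        problems.append("type must be 'spark'")

    # Assumption: serverless Spark is selected via a `resources` block,
    # while an attached Synapse Spark pool is selected via `compute`.
    resources = spec.get("resources")
    if resources is None and "compute" not in spec:
        problems.append("neither 'resources' (serverless) nor 'compute' (attached pool) is set")
    elif resources is not None:
        for key in ("instance_type", "runtime_version"):
            if not resources.get(key):
                problems.append(f"resources.{key} is missing")

    # Assumption: `identity.type: managed` requests the workspace's managed identity.
    identity_type = (spec.get("identity") or {}).get("type")
    if identity_type != "managed":
        problems.append("identity.type is not 'managed'")

    return problems


# Placeholder values for illustration only.
example_spec = {
    "type": "spark",
    "code": "./src",
    "entry": {"file": "main.py"},
    "resources": {"instance_type": "standard_e4s_v3", "runtime_version": "3.3"},
    "identity": {"type": "managed"},
}

print(check_spark_job_spec(example_spec))
```

An empty list means the spec at least declares serverless resources and a managed identity; any remaining failure would then point at the service side rather than at the job definition.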
To Reproduce
Steps to reproduce the behavior:
Expected behavior
I expected the component to run its code using a managed identity.
I am not sure why the component is failing. I assumed that with serverless Spark compute there would be no need to specify a Synapse cluster manually.
Any thoughts or help will be deeply appreciated.