
dbt Databricks Workflow with SQL Warehouse Pro crashes with HTTP error 503 #893

Closed
spenaustin opened this issue Dec 30, 2024 · 1 comment
Labels
bug Something isn't working


@spenaustin

Describe the bug

Our dbt job terminates with the error `Max retries exceeded with url: ... (Caused by ResponseError('too many 503 error responses'))` before executing any SQL commands.

This only occurs when our SQL Warehouse is in the "Stopped" status.

This is very similar to #570, which was fixed by pull request #578 by pinning the databricks-sql-connector package back to an older version. I suspect this is a similar problem: databricks-sql-connector 3.7 was released on December 23rd, and we started seeing this issue on December 24th.

Looking into it further, version 3.7 changed the library's retry backoff behavior, which was also the root cause in #570. Pinning our databricks-sql-connector to version 3.6 appears to have resolved the problem for us, but I wanted to raise this anyway because the current dependency constraint in dbt-databricks is `databricks-sql-connector>=3.5.0, <4.0.0`, which includes the version that caused our problems.
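For anyone hitting the same error, a minimal sketch of the workaround described above, assuming a pip requirements file (the exact pin syntax depends on your dependency tooling):

```
# Workaround, not a fix: stay below 3.7 to avoid the changed retry backoff behavior
databricks-sql-connector>=3.5.0,<3.7.0
```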

Steps To Reproduce

1. Create a workflow with a dbt task in a Databricks workspace matching the specifications in the "System information" section below.
2. Make sure the SQL Warehouse being referenced is in the "Stopped" status.
3. Run a dbt command (e.g. `dbt build` or `dbt run`) via the workflow. This should yield the error.
4. Start the SQL Warehouse and run the dbt task again. This time, the dbt task should succeed.

Expected behavior

I would expect dbt to recognize that the SQL Warehouse is starting up and keep polling it until it reports a "ready" state.
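For context, the "too many 503 error responses" message comes from urllib3's `Retry` machinery, which the SQL connector uses under the hood. The sketch below illustrates how a policy that treats 503 as retryable backs off exponentially before giving up; the parameters (`total=5`, `backoff_factor=1`) are assumptions for the demo, not the connector's actual settings:

```python
from urllib3 import HTTPResponse
from urllib3.util.retry import Retry

# Illustrative retry policy: treat HTTP 503 as retryable, back off
# exponentially. Numbers here are assumed, not what
# databricks-sql-connector actually configures.
retry = Retry(total=5, status_forcelist=[503], backoff_factor=1)

backoffs = []
for _ in range(5):
    # Simulate receiving another 503 while the warehouse is starting up;
    # each increment consumes one attempt from the budget.
    retry = retry.increment(method="GET", url="/", response=HTTPResponse(status=503))
    backoffs.append(retry.get_backoff_time())

print(backoffs)  # exponential growth: 0s, then 2s, 4s, 8s, 16s between attempts
```

Once the attempt budget is exhausted, the next `increment` raises `MaxRetryError`, which surfaces as the "Max retries exceeded" message above. A warehouse can easily take longer to start than this whole backoff sequence, which is why a change in backoff behavior matters.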

Screenshots and log output

N/A

System information

Our Databricks workspace is running on AWS using Unity Catalog.

This runs as a Databricks Workflow, so we can't easily run `dbt --version` on it, but these are the Databricks/dbt libraries installed on the cluster being used:

PyPI libraries:

```
dbt-core==1.9.0
dbt-databricks==1.9.0
```

The cluster that the Workflow is using is running Databricks Runtime 15.4 LTS with Photon Acceleration, and utilizes i3.xlarge machines for the driver and workers. Autoscaling is enabled.

The SQL Warehouse is a Medium sized "Pro" type, with a spot instance policy of "Cost optimized".

Additional context

N/A

@spenaustin spenaustin added the bug Something isn't working label Dec 30, 2024
@spenaustin
Author

Just realized this is an exact duplicate of #892. Whoops! I'll send my findings over there.
