feat(databricks): add the databricks backend #10223

Open
wants to merge 1 commit into
base: main

cpcloud
Member

@cpcloud cpcloud commented Sep 25, 2024

Description of changes

Add support for the databricks backend.

Notes

  • The PySpark compiler is almost entirely reused. Naturally there are a couple of
    cases where things differ, and they get overridden in the databricks compiler.
  • Databricks seems to be aggressive about turning SQL NULLs into NaNs,
    which breaks a number of array and map tests that expect None in the output of to_pandas/execute.
  • Naturally, databricks pins pyarrow to <17 (one version behind the latest) and numpy to <2. It's not as bad as it was with the snowflake connector, but we shouldn't merge this until we can figure out a sane workaround to avoid the pin in CI.
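
To make the NULL-versus-NaN note concrete, here is a minimal sketch of why such comparisons fail. The `nan_to_none` helper is hypothetical: it is not part of this PR or of the ibis test suite, and the sample values are made up for illustration. It only shows one way a test could normalize results before comparing them.

```python
import math


def nan_to_none(value):
    """Recursively replace float NaN with None in lists, tuples, and dicts.

    Hypothetical helper for illustration only; not part of this PR.
    """
    if isinstance(value, float) and math.isnan(value):
        return None
    if isinstance(value, dict):
        return {key: nan_to_none(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return type(value)(nan_to_none(item) for item in value)
    return value


# What a test might expect for an array value containing a SQL NULL element
expected = [1.0, None, 3.0]
# What comes back if the NULL was turned into NaN on the way out
actual = [1.0, float("nan"), 3.0]

print(actual == expected)               # False
print(nan_to_none(actual) == expected)  # True
```

Note that this kind of normalization cannot tell a genuine NaN from a converted NULL, so it would only be appropriate for tests whose data contains no legitimate NaN values.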

Issues closed

Resolves #9248.

@cpcloud cpcloud added this to the 10.0 milestone Sep 25, 2024
@cpcloud cpcloud added feature Features or general enhancements new backend PRs or issues related to adding new backends labels Sep 25, 2024
@github-actions github-actions bot added tests Issues or PRs related to tests ci Continuous Integration issues or PRs dependencies Issues or PRs related to dependencies labels Sep 25, 2024
@cpcloud
Member Author

cpcloud commented Sep 25, 2024

Tests won't run until merge due to cloudiness.

I'll post the results from a local run, and then fix the CI (if needed) once this is merged.

@github-actions github-actions bot added the sql Backends that generate SQL label Sep 27, 2024
@cpcloud cpcloud force-pushed the databricks branch 4 times, most recently from d229f75 to 279322d Compare September 28, 2024 11:09
@github-actions github-actions bot added the flink Issues or PRs related to Flink label Sep 28, 2024
@github-actions github-actions bot added clickhouse The ClickHouse backend mssql The Microsoft SQL Server backend labels Sep 28, 2024
@cpcloud cpcloud force-pushed the databricks branch 2 times, most recently from 3fc3e74 to b14d84e Compare September 28, 2024 13:59