Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DBT failing to execute CREATE TABLE command because it assigned a column DEFAULT value #1085

Closed
2 tasks done
shloktech opened this issue Aug 2, 2024 · 7 comments
Closed
2 tasks done
Labels
bug Something isn't working store_failures Related to storing data test failures in the database triage

Comments

@shloktech
Copy link

Is this a new bug in dbt-core?

  • I believe this is a new bug in dbt-core
  • I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

I am using DBT for Data Testing. Over here I have a few not_null checks on delta tables which are in databricks and I am running them by using the below command:
dbt build --profile {our_data_testing_profile} -t {our_env} --select "models/models_to_test+" --store-failures| tee dbt_issues.log || true

Here I am storing the output of data testing into delta tables in databricks which are under dbt_test__audit schema. While doing it we are getting the below error stating:

Runtime Error in test our_internal_test_name WRONG_COLUMN_DEFAULTS_FOR_DELTA_FEATURE_NOT_ENABLED] Failed to execute CREATE TABLE command because it assigned a column DEFAULT value, but the corresponding table feature was not enabled. Please retry the command again after executing ALTER TABLE tableName SET
 TBLPROPERTIES('delta.feature.allowColumnDefaults' = 'supported')

On debugging I found that it dbt is using the below query to create table in dbt_test__audit schema from dbt.log file:
image

This is occurring because databricks does not let us create a table which is based on an empty dataframe.
Output of the select query:
image

To resolve in query we need to add the below property in the above query which is suggested by databricks as well in the error message. After this the query works as per expectations:
image

Even after creating the above table when I rerun the dbt test then it fails because it uses the query in first screenshot above which does not have the TBLPROPERTIES command in it. I searched in dbt documentations but did not find any option that let's me add table properties. I I run only the dbt build query then the data testing works without any error but when I run it with commands to store the results then it fails due to above TBLPROPERTIES not present in the create table query generated by dbt-core.

Expected Behavior

The expected behavior can be divided in two parts:

  1. DBT should add this table property by default in the create table query for databricks as a source.
  2. There should be an option to configure table properties when we are storing the failures.

Steps To Reproduce

  1. Configure a not null check on a delta table in databricks which will pass.

  2. Run the below command: dbt build --profile {your_data_testing_profile} -t {your_environment} --select "models/models_to_test+" --store-failures| tee dbt_issues.log || true

  3. Make sure that the target for storing the results is databricks. Secondly the source table which is undergoing data test has a column which has default value assigned to it like below:
    image

  4. You will get the below error for your test case:

Runtime Error in test our_internal_test_name WRONG_COLUMN_DEFAULTS_FOR_DELTA_FEATURE_NOT_ENABLED] Failed to execute CREATE TABLE command because it assigned a column DEFAULT value, but the corresponding table feature was not enabled. Please retry the command again after executing ALTER TABLE tableName SET
 TBLPROPERTIES('delta.feature.allowColumnDefaults' = 'supported')

Relevant log output

No response

Environment

- OS: Ubuntu 22.04.4 LTS
- Python: 3.8.18
- dbt-core:1.8.4

Which database adapter are you using with dbt?

other (mention it in "Additional Context")

Additional Context

The adapter used is Databricks

@shloktech shloktech added bug Something isn't working triage labels Aug 2, 2024
@dbeatty10 dbeatty10 transferred this issue from dbt-labs/dbt-core Aug 2, 2024
@shloktech
Copy link
Author

Hello @dbeatty10 ,
Any updates on the above issue? Feel free to reach out to me in case of any queries. Also, I'll be happy to assist you in resolving the above and contribute to this repo :)

@amychen1776
Copy link
Contributor

@shloktech curious if you would be willing to share with me why you're using the dbt-spark adapter for Databricks rather than the dbt-databricks adapter?

@shloktech
Copy link
Author

Hello @amychen1776 , I am using dbt-databricks 1.8.4 at my end. Below is a sample of my profiles.yml:

data_testing:
  outputs:
    dev:
      host: databricks_workspace
      http_path: path_to_warehouse
      schema: my_schema
      token: xxxx
      **type: databricks**
      connect_timeout: xx
      connect_retries: xx
      threads: x
  target: dev

cc: @dbeatty10

@amychen1776
Copy link
Contributor

Oh thank you - @dpguthrie this issue needs to be tracked with dbt-databricks then.

@shloktech
Copy link
Author

Hello @amychen1776 , @dpguthrie can you please transfer the issue to dbt-databricks repo as I don't have the subsequent write access in either of the repo:
image
Github article: https://docs.github.com/en/issues/tracking-your-work-with-issues/transferring-an-issue-to-another-repository

cc: @dbeatty10

@dbeatty10
Copy link
Contributor

@shloktech since dbt-databricks is in a different GitHub organization, we can't transfer it to that repo either. So I'm going to close this issue and you'll need to open up a new issue there instead.

@shloktech
Copy link
Author

Thank you @dbeatty10 created a issue in dbt-databricks: databricks/dbt-databricks#758

@dbeatty10 dbeatty10 added the store_failures Related to storing data test failures in the database label Aug 9, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working store_failures Related to storing data test failures in the database triage
Projects
None yet
Development

No branches or pull requests

3 participants