Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow hostname to be provided with or without a trailing slash #784

Conversation

tim-steinkuhler
Copy link
Contributor

@tim-steinkuhler tim-steinkuhler commented May 21, 2023

resolves #302

Description

Allow hostname to be provided with or without trailing slash

When you copy in the hostname, it may or may not include trailing /
We could avoid confusion by allowing users to include or exclude these elements.

Before this change, dbt debug would get you something like this:

dbt was unable to connect to the specified database.
The database returned the following error:

  >Runtime Error
  Database Error
    failed to connect

Output while using a trailing slash after change (given credentials are correct):

Connection test: [OK connection ok]

Checklist

@cla-bot
Copy link

cla-bot bot commented May 21, 2023

Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: Tim Steinkühler.
This is most likely caused by a git client misconfiguration; please make sure to:

  1. check if your git client is configured with an email to sign commits git config --list | grep email
  2. If not, set it up using git config --global user.email [email protected]
  3. Make sure that the git commit email is configured in your GitHub account settings, see https://github.com/settings/emails

@tim-steinkuhler tim-steinkuhler force-pushed the tst/ct-410-handle-host-name-with-backslash branch from ccc479f to c1af8c8 Compare May 24, 2023 09:00
@cla-bot cla-bot bot added the cla:yes label May 24, 2023
@tim-steinkuhler tim-steinkuhler force-pushed the tst/ct-410-handle-host-name-with-backslash branch from c1af8c8 to 96c29f0 Compare May 24, 2023 11:45
@tim-steinkuhler
Copy link
Contributor Author

Hi @colin-rogers-dbt (I see you are assigned as reviewer), I'm finding it a bit difficult to relate the CI error message "at least one column must be specified for the table" to the changes I made. If I did mess this up, I would expect a lot more tests to fai, not just that one, and I would expect a different error message. Could you help me get a better understanding?

@tim-steinkuhler tim-steinkuhler force-pushed the tst/ct-410-handle-host-name-with-backslash branch from 96c29f0 to 558910d Compare May 25, 2023 07:21
@tim-steinkuhler
Copy link
Contributor Author

@Fokko, are you able to have a look at this? After a rebase on the latest version of main, the tests have been passed, although there are still 3 "expected" checks, how does this work? :-D

Copy link
Contributor

@Fokko Fokko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tim-steinkuhler A maintainer (someone from dbt-labs) has to kick off the jobs. This is to avoid someone injecting malicious code into the CI. This isn't needed anymore once one PR is in. Unfortunately, it is a bit quiet on the dbt-spark repository. I'm also trying to get a few PRs in. @VersusFacit would you have time to dig into this?

dbt/adapters/spark/connections.py Outdated Show resolved Hide resolved
@@ -278,7 +289,7 @@ class SparkConnectionManager(SQLConnectionManager):

SPARK_CLUSTER_HTTP_PATH = "/sql/protocolv1/o/{organization}/{cluster}"
SPARK_SQL_ENDPOINT_HTTP_PATH = "/sql/1.0/endpoints/{endpoint}"
SPARK_CONNECTION_URL = "{host}:{port}" + SPARK_CLUSTER_HTTP_PATH
SPARK_CONNECTION_URL = "https://{host}:{port}" + SPARK_CLUSTER_HTTP_PATH
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Keep in mind that by changing this, we cannot use http:// prefixes anymore. IDK if anyone uses this (maybe in test environments?).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, wasn't that already the case before, with this code?

                    # Prepend https:// if it is missing
                    host = creds.host
                    if not host.startswith("https://"):
                        host = "https://" + creds.host

I tried to "standardize" the host name in SparkCredentials. __post_init__(), so it wouldn't matter if you added the prefix and / or the trailing /, and then added this here, because you had the https:// prefix for the http connection method

@tim-steinkuhler
Copy link
Contributor Author

By the way, thank you @Fokko! Your feedback is much appreciated!

@tim-steinkuhler tim-steinkuhler force-pushed the tst/ct-410-handle-host-name-with-backslash branch 3 times, most recently from 6b793ce to 160d365 Compare May 25, 2023 19:51
@tim-steinkuhler
Copy link
Contributor Author

tim-steinkuhler commented May 25, 2023

I don't think my change could lead to this error message in the integration-spark-session pipeline. @colin-rogers-dbt could you have a look?

@JCZuurmond
Copy link
Collaborator

@tim-steinkuhler : Could you force a new commit to restart the CI? I do not understand why the integration-spark-session CI fails; I would like to rerun the CI to be sure

@JCZuurmond
Copy link
Collaborator

@tim-steinkuhler : Ok, that did not do the trick unfortunately. Let's discuss an approach for this in person!

@tim-steinkuhler tim-steinkuhler force-pushed the tst/ct-410-handle-host-name-with-backslash branch from 21c99d8 to cae2e08 Compare June 15, 2023 07:15
@tim-steinkuhler tim-steinkuhler force-pushed the tst/ct-410-handle-host-name-with-backslash branch from 5a0aa24 to 54dde46 Compare June 23, 2023 09:22
@tim-steinkuhler
Copy link
Contributor Author

After a short discussion with @JCZuurmond , we decided to not remove the https:// prefix

@tim-steinkuhler tim-steinkuhler force-pushed the tst/ct-410-handle-host-name-with-backslash branch from 54dde46 to f7b9bf0 Compare June 23, 2023 09:24
@tim-steinkuhler tim-steinkuhler changed the title Allow hostname to be provided with or without leading https and / or trailing slash Allow hostname to be provided with or without a trailing slash Jun 23, 2023
The trailing / is often added automatically when copy pasting
Before this change, you would get "Database Error - failed to connect"

After this change, the trailing backslash won't prevent your connection
@tim-steinkuhler tim-steinkuhler force-pushed the tst/ct-410-handle-host-name-with-backslash branch from f7b9bf0 to 7aa2460 Compare June 23, 2023 17:10
@JCZuurmond
Copy link
Collaborator

@colin-rogers-dbt : Could you help with the CI again? Given the code changes and that the CI passed before, I expect that the CI fails because it is a bit flacky.

@Fleid: This PR is ready to be merged. Could you merge it once the CI is fixed?

@colin-rogers-dbt colin-rogers-dbt merged commit b4a2f94 into dbt-labs:main Jun 26, 2023
@colin-rogers-dbt
Copy link
Contributor

Thanks for the contribution!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[CT-410] Handle host name with trailing backslash
4 participants