Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add new API connection for SiS/SPCS #1236

Closed

Conversation

sfc-gh-smirzaei
Copy link
Contributor

Please answer these questions before submitting your pull requests. Thanks!

  1. What GitHub issue is this PR addressing? Make sure that there is an accompanying issue to your PR.

    Fixes #NNNN

  2. Fill out the following pre-review checklist:

    • I am adding a new automated test(s) to verify correctness of my new code
    • I am adding new logging messages
    • I am adding a new telemetry message
    • I am adding new credentials
    • I am adding a new dependency
  3. Please describe how your code solves the related issue.

    The goal is to add a new API connect for SiS and SPCS where we can create a session using the default connection params.

else:
try:
# 2) check if we are running in SPCS
with open("/snowflake/session/token") as file:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could replace this block by adding a helper method like _get_default_SPSC_connection_params in connector (similar to _get_default_connection_params) to get SPCS default params and move the whole thing in _create_internal. Please lemme know which one is a better option/pattern.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have no preference, either will work!

): # (will be removed before merging) no need for `"streamlit" in sys.modules` check since is_in_stored_procedure() must be true for SiS as well
# Basically noop and return the existing session
return _get_active_session()
else:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need for the else-clause since you return in the then-clause. Just get rid of this line and de-dent the rest

Comment on lines 2779 to 2781
except Exception:
# a general error happened, re-raise it
raise
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to catch an exception just to throw it again, let it bubble up

Suggested change
except Exception:
# a general error happened, re-raise it
raise

else:
try:
# 2) check if we are running in SPCS
with open("/snowflake/session/token") as file:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have no preference, either will work!

return _get_active_session()
try:
# 2) check if we are running in SPCS
with open("/snowflake/session/token") as file:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any suggestion/idea on how I can test this for SPCS?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is probably even more challenging than stored proc. In stored proc, we have a test job that creates Anaconda package, and does some repodata hack to use the new package (test job: https://ci-dev-142.int.snowflakecomputing.com/job/PythonStoredProcBuildPrecommitTest/).

Do you know how SPCS does its regression test today?

@sfc-gh-smirzaei sfc-gh-smirzaei marked this pull request as ready for review February 3, 2024 00:40
@sfc-gh-smirzaei sfc-gh-smirzaei requested a review from a team as a code owner February 3, 2024 00:40
@@ -586,7 +585,10 @@ def test_create_session_from_default_config_file(monkeypatch, db_parameters):
with monkeypatch.context() as m:
m.setenv("SNOWFLAKE_CONNECTIONS", tomlkit.dumps(doc))
m.setenv("SNOWFLAKE_DEFAULT_CONNECTION_NAME", "default")
with Session.builder.create() as new_session:
if not IS_IN_STORED_PROC:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm now I'm not sure if it works (haven't tested this yet)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some tests failed. Looks like a rebase with main is needed before we do another run

@@ -2749,4 +2749,32 @@ def _explain_query(self, query: str) -> Optional[str]:
_logger.warning("query `%s` cannot be explained", query)
return None

@staticmethod
def create() -> "Session":
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is create the right action name for "gets existing session if one exists", or should this be fetch instead?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there is already a method called getOrCreate

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So basically this comes from a request from field/PM: https://docs.google.com/document/d/12FGgc9NeH-xa31FMZa_lO2N5Xq71Cm0aquTLI2-G0IE/edit#heading=h.hvtde6rw3c3p

They want a unified connect method in session (which I mistakenly named the method created) which works across different use-cases (i.e., when Snowpark is used as is or in SiS, SP, SPCS)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. @sfc-gh-mkeller @sfc-gh-yixie : Do you have concerns for introducing a connect method in session?

src/snowflake/snowpark/session.py Show resolved Hide resolved
return _get_active_session()
try:
# 2) check if we are running in SPCS
with open("/snowflake/session/token") as file:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is probably even more challenging than stored proc. In stored proc, we have a test job that creates Anaconda package, and does some repodata hack to use the new package (test job: https://ci-dev-142.int.snowflakecomputing.com/job/PythonStoredProcBuildPrecommitTest/).

Do you know how SPCS does its regression test today?

Comment on lines +2760 to +2761
): # (will be removed before merging) no need for `"streamlit" in sys.modules` check since is_in_stored_procedure() must be true for SiS as well
# Basically noop and return the existing session
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it would be removed before merging to main? I don't understand why have it then.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is already Session.builder.getOrCreate()

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getOrCreate as is, doesn't work (since we expect to have token in the toml file and not as a separate file). But I had a conversion with Srikanth, he is tasked with adding connect for SPCS for all connectors. He will write a design doc based on the existing connect API in Python connector where we take into account a new connect argument (like token_file) to be added to the toml file. I'll hold off working on this for now since his general solution is better/cleaner than what I have here.

@@ -501,6 +501,10 @@ def is_in_stored_procedure():
return PLATFORM == "XP"


def is_in_spcs():
return os.path.exists("/snowflake/session/token")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there another way?

Comment on lines +2760 to +2761
): # (will be removed before merging) no need for `"streamlit" in sys.modules` check since is_in_stored_procedure() must be true for SiS as well
# Basically noop and return the existing session
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is already Session.builder.getOrCreate()

"protocol": os.getenv("SNOWFLAKE_PROTOCOL"),
"authenticator": "oauth",
"token": token,
"warehouse": os.getenv("SNOWFLAKE_WAREHOUSE"),

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SNOWFLAKE_WAREHOUSE may or may not be set. Setting the QUERY_WAREHOUSE on the SERVICE is optional.

@github-actions github-actions bot locked and limited conversation to collaborators Apr 29, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants