Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SNOW-1183322: [Local Testing] Add support for registering sprocs #1338

Merged
merged 10 commits into from
Apr 15, 2024

Conversation

sfc-gh-jrose
Copy link
Contributor

Please answer these questions before submitting your pull requests. Thanks!

  1. What GitHub issue is this PR addressing? Make sure that there is an accompanying issue to your PR.

    Fixes #1183322

  2. Fill out the following pre-review checklist:

    • I am adding a new automated test(s) to verify correctness of my new code
    • I am adding new logging messages
    • I am adding a new telemetry message
    • I am adding new credentials
    • I am adding a new dependency
  3. Please describe how your code solves the related issue.

    Please write a short description of how your code change solves the related issue.

@sfc-gh-jrose sfc-gh-jrose marked this pull request as ready for review March 28, 2024 22:27
@sfc-gh-jrose sfc-gh-jrose requested a review from a team as a code owner March 28, 2024 22:27
src/snowflake/snowpark/mock/_stored_procedure.py Outdated Show resolved Hide resolved
Comment on lines 61 to 66
calculate_expression(
expr,
ColumnEmulator(data=[None]),
session._analyzer,
{},
keep_literal=True,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have two dumb questions here:

  1. what is Column type args for sproc and how this is used by users?
  2. why we set ColumnEmulator(data=[None]), here, this gives us an empty column with just the name, is this intended? this is kinda relavant to the first question on how Column is used in sproc.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Some of our tests pass in lit values rather than raw values. Those are the only Column objects I've observed being passed to sproc.
  2. The way calculate_expression works for lit values requires there to be some input data even if it is all null. I think I can refactor this so that it is more clear.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are good questions actually. In fact I feel like we shouldn't support column inputs to stored procedures, although that might be a limitation from DataFrame.select...

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 for refactor to be more clear

src/snowflake/snowpark/mock/_stored_procedure.py Outdated Show resolved Hide resolved
src/snowflake/snowpark/mock/_stored_procedure.py Outdated Show resolved Hide resolved
@sfc-gh-stan
Copy link
Collaborator

Can we enable more tests in tests/integ/test_stored_procedure.py like test_type_hints?

@sfc-gh-jrose
Copy link
Contributor Author

I've enabled all the tests that look relevant to local testing. There's several more tests related to imports and loading from file that can't be enabled yet. Turning on those tests revealed a couple of gaps.

  1. session.call didn't work previously, but should now.
  2. Calling a sproc with a nested column expression that returns a single value appears to be supported in live so I've reverted the refactor I made to add support for calling calculate_expression back. I've added a comment so that it's a little more understandable.

In order to implement tests 1:1 I also implemented functions.sqrt and functions.pow. They are trivial so shouldn't expand the scope of this PR by much.

CHANGELOG.md Outdated Show resolved Hide resolved

# If expression does not define its datatype we cannot verify it's compatibale.
# This is potentially unsafe.
if expr.datatype and not sproc_types_are_compatible(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question: in what case is expr.datatype empty here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

elif isinstance(x, _IntegralType) and isinstance(y, _IntegralType):
return True
elif isinstance(x, _FractionalType) and isinstance(y, _FractionalType):
if (
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When we work on coercion, this function should be replaced by like DataType.can_coerce_to(DataType). @sfc-gh-aling

MapType,
StructType,
),
) and not isinstance(result, DataFrame):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have no support for table sprocs yet, why do we need not isinstance(result, DataFrame)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've seen examples in the tests where a sproc returns a Dataframe, but it's signature says it returns a Structtype. If it's truly a struct type it needs to be serialized, but if it's just a dataframe it does not.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. Could you please point me to the test?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See here.

@sfc-gh-stan
Copy link
Collaborator

LGTM overall, had a couple of questions and comments.

@sfc-gh-aling
Copy link
Contributor

I noticed the oob telemetry test is flaky these days thus blocking the merge gate.
I created a PR to fix the flakiness: #1382

@sfc-gh-stan sfc-gh-stan merged commit 12d8620 into main Apr 15, 2024
62 checks passed
@sfc-gh-stan sfc-gh-stan deleted the jrose_snow_1183322_local_testing_sproc_support branch April 15, 2024 21:20
@github-actions github-actions bot locked and limited conversation to collaborators Apr 15, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants