-
Notifications
You must be signed in to change notification settings - Fork 120
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SNOW-1183322: [Local Testing] Add support for registering sprocs #1338
SNOW-1183322: [Local Testing] Add support for registering sprocs #1338
Conversation
calculate_expression( | ||
expr, | ||
ColumnEmulator(data=[None]), | ||
session._analyzer, | ||
{}, | ||
keep_literal=True, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have two dumb questions here:
- what is
Column
type args for sproc and how this is used by users? - why we set
ColumnEmulator(data=[None]),
here, this gives us an empty column with just the name, is this intended? this is kinda relavant to the first question on howColumn
is used in sproc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Some of our tests pass in
lit
values rather than raw values. Those are the only Column objects I've observed being passed to sproc. - The way calculate_expression works for lit values requires there to be some input data even if it is all null. I think I can refactor this so that it is more clear.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are good questions actually. In fact I feel like we shouldn't support column inputs to stored procedures, although that might be a limitation from DataFrame.select
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 for refactor to be more clear
Can we enable more tests in |
I've enabled all the tests that look relevant to local testing. There's several more tests related to imports and loading from file that can't be enabled yet. Turning on those tests revealed a couple of gaps.
In order to implement tests 1:1 I also implemented functions.sqrt and functions.pow. They are trivial so shouldn't expand the scope of this PR by much. |
|
||
# If expression does not define its datatype we cannot verify it's compatibale. | ||
# This is potentially unsafe. | ||
if expr.datatype and not sproc_types_are_compatible( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question: in what case is expr.datatype empty here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The datatype of the sqrt function in this example is None:
https://github.com/snowflakedb/snowpark-python/pull/1338/files#diff-4963488bfdaad538e6af7222d4db977758e66d05b4c11b600583449152416f50L221
elif isinstance(x, _IntegralType) and isinstance(y, _IntegralType): | ||
return True | ||
elif isinstance(x, _FractionalType) and isinstance(y, _FractionalType): | ||
if ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When we work on coercion, this function should be replaced by like DataType.can_coerce_to(DataType)
. @sfc-gh-aling
MapType, | ||
StructType, | ||
), | ||
) and not isinstance(result, DataFrame): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have no support for table sprocs yet, why do we need not isinstance(result, DataFrame)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've seen examples in the tests where a sproc returns a Dataframe, but it's signature says it returns a Structtype. If it's truly a struct type it needs to be serialized, but if it's just a dataframe it does not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. Could you please point me to the test?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See here.
LGTM overall, had a couple of questions and comments. |
I noticed the oob telemetry test is flaky these days thus blocking the merge gate. |
Please answer these questions before submitting your pull requests. Thanks!
What GitHub issue is this PR addressing? Make sure that there is an accompanying issue to your PR.
Fixes #1183322
Fill out the following pre-review checklist:
Please describe how your code solves the related issue.
Please write a short description of how your code change solves the related issue.