Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SNOW-1043123: Add greatest/least support to local testing #1247

Merged
merged 14 commits into from
Mar 7, 2024

Conversation

sfc-gh-jrose
Copy link
Contributor

Please answer these questions before submitting your pull requests. Thanks!

  1. What GitHub issue is this PR addressing? Make sure that there is an accompanying issue to your PR.

    Fixes #1043123

  2. Fill out the following pre-review checklist:

    • I am adding a new automated test(s) to verify correctness of my new code
    • I am adding new logging messages
    • I am adding a new telemetry message
    • I am adding new credentials
    • I am adding a new dependency
  3. Please describe how your code solves the related issue.

This commit adds support for greatest and least in local testing. There are also a bunch of format changes that pre-commit made as well.

@sfc-gh-jrose sfc-gh-jrose marked this pull request as ready for review February 13, 2024 16:59
@sfc-gh-jrose sfc-gh-jrose requested a review from a team as a code owner February 13, 2024 16:59


@patch("least")
def mock_least(*exprs: ColumnEmulator):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The first argument determines the return type:

If the first type is numeric, then the return type is ‘widened’ according to the numeric types in the list of all arguments.

If the first type is not numeric, then all other arguments must be convertible to the first type.

If any of the argument values is NULL, the result is NULL.

from https://docs.snowflake.com/en/sql-reference/functions/least#usage-notes

The first two cases since those require coercion (implicit type conversion, "123" can be implicitly evaluated to 123) for this PR, we should leave a TODO in the docstring.

Does the current implementation support the third case?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've modified logic and added test cases to show that coercion works correctly.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I had a typo in the comment above, I meant that coercion is not supported in Local Testing in general so we could ignore the first two cases in this PR and simply leave a TODO in the docstring (including a JIRA ticket)



@patch("greatest")
def mock_greatest(*exprs: ColumnEmulator):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as below.

Copy link
Collaborator

@sfc-gh-stan sfc-gh-stan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a TODO comment for type coercion in the docstrings and check the edge case when the input column (only) has null values.

@sfc-gh-jrose sfc-gh-jrose requested a review from a team as a code owner February 21, 2024 18:11
Comment on lines 936 to 950
def _compare(x: Any, y: Any) -> Tuple[Any, Any]:
"""
Compares two values based on the rules described for greatest/least
https://docs.snowflake.com/en/sql-reference/functions/least#usage-notes
"""
if x is None or y is None:
return (None, None)

_x = x if isinstance(x, Number) else float(x)
_y = y if isinstance(y, Number) else float(y)

if _x > _y:
return (_x, _y)
else:
return (_y, _x)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we consider non-numeric types here?
I see in the description https://docs.snowflake.com/en/sql-reference/functions/least:

LEAST supports all data types, including VARIANT.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can add some more test cases to be sure. From what I can tell anything non numeric gets cast to a numeric type.

Copy link
Contributor

@sfc-gh-aling sfc-gh-aling Feb 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

interesting, if I compare a string "abc" with a date "2024-01-01", both would be cast to numeric first?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That case actually generates an exception:

SELECT greatest("A", "B") FROM ( SELECT "A", to_timestamp("B") AS "B" FROM ( SELECT $1 AS "A", $2 AS "B" FROM  VALUES ('abc' :: STRING, '2024-01-01 00:00:00' :: STRING)));
Timestamp 'abc' is not recognized

Copy link
Contributor Author

@sfc-gh-jrose sfc-gh-jrose Feb 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although if they are all strings it does look like they aren't converted to numerics so the _compare function is incorrect. I'll fix it up.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks! do you know what exception class it's raising?
I'm thinking of raising the same error to align the error experience in live mode.
for example, I noticed that in the code type(_x)(y), this might raise ValueError, I'm thinking if we should wrap this into a snowpark client exception

@@ -933,6 +933,45 @@ def mock_to_variant(expr: ColumnEmulator):
return res


def _compare(x: Any, y: Any) -> Tuple[Any, Any]:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's use TypeVar here to indicate that input and output needs to be the same type.

Copy link
Contributor

@sfc-gh-aling sfc-gh-aling left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have left a comment about the error experience, otherwise the change LGTM

CHANGELOG.md Outdated Show resolved Hide resolved
Comment on lines 936 to 950
def _compare(x: Any, y: Any) -> Tuple[Any, Any]:
"""
Compares two values based on the rules described for greatest/least
https://docs.snowflake.com/en/sql-reference/functions/least#usage-notes
"""
if x is None or y is None:
return (None, None)

_x = x if isinstance(x, Number) else float(x)
_y = y if isinstance(y, Number) else float(y)

if _x > _y:
return (_x, _y)
else:
return (_y, _x)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks! do you know what exception class it's raising?
I'm thinking of raising the same error to align the error experience in live mode.
for example, I noticed that in the code type(_x)(y), this might raise ValueError, I'm thinking if we should wrap this into a snowpark client exception

Copy link
Contributor

@sfc-gh-aling sfc-gh-aling left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, @sfc-gh-jrose can you help create a JIRA to track the error experience?

@sfc-gh-jrose sfc-gh-jrose enabled auto-merge (squash) March 7, 2024 00:52
@sfc-gh-jrose sfc-gh-jrose merged commit 3ce2356 into main Mar 7, 2024
55 of 57 checks passed
@sfc-gh-jrose sfc-gh-jrose deleted the jrose_snow_1043123_greatest_least_support branch March 7, 2024 20:04
@github-actions github-actions bot locked and limited conversation to collaborators Mar 7, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants