SNOW-1694649: [Local testing] equal_null fails in merge statement #2365

Closed
tvdboom opened this issue Sep 27, 2024 · 2 comments
Labels: bug (Something isn't working), local testing (Local Testing issues/PRs), status-triage_done (Initial triage done, will be further handled by the driver team)

@tvdboom

tvdboom commented Sep 27, 2024

Please answer these questions before submitting your issue. Thanks!

  1. What version of Python are you using?

    Python 3.11.6 (tags/v3.11.6:8b6ee5b, Oct 2 2023, 14:57:12) [MSC v.1935 64 bit (AMD64)]

  2. What operating system and processor architecture are you using?

Windows-10-10.0.22631-SP0

  3. What are the component versions in the environment (pip freeze)?

    pandas==2.2.2
    snowflake-snowpark-python==1.22.1

  4. What did you do?

import pandas as pd
from snowflake.snowpark import Session
from snowflake.snowpark.functions import lit, when_matched

mock_session = Session.builder.config("local_testing", True).create()

df1 = mock_session.create_dataframe(pd.DataFrame({"A": [0, 1], "B": ['a', 'b']}))
df2 = mock_session.create_dataframe(pd.DataFrame({"A": [0, 1], "B": ['a', 'c']}))

result = df1.merge(
    source=df2,
    join_expr=df1["A"].equal_null(df2["A"]),
    clauses=[when_matched(~(df1["A"].equal_null(df2["A"])) & (df1["B"].equal_null(df2["B"]))).update({"A": lit(3)})],
)

  5. What did you expect to see?

    No error, but got:

  File "C:\repos\hippolib\test.py", line 14, in <module>
    result = df1.merge(
             ^^^^^^^^^^
  File "C:\repos\hippolib\venv\Lib\site-packages\snowflake\snowpark\table.py", line 676, in merge
    result = new_df._internal_collect_with_tag(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\repos\hippolib\venv\Lib\site-packages\snowflake\snowpark\_internal\telemetry.py", line 156, in wrap
    result = func(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^
  File "C:\repos\hippolib\venv\Lib\site-packages\snowflake\snowpark\dataframe.py", line 651, in _internal_collect_with_tag_no_telemetry
    return self._session._conn.execute(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\repos\hippolib\venv\Lib\site-packages\snowflake\snowpark\mock\_connection.py", line 563, in execute
    res = execute_mock_plan(plan, plan.expr_to_alias)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\repos\hippolib\venv\Lib\site-packages\snowflake\snowpark\mock\_plan.py", line 1289, in execute_mock_plan
    condition = calculate_expression(
                ^^^^^^^^^^^^^^^^^^^^^
  File "C:\repos\hippolib\venv\Lib\site-packages\snowflake\snowpark\mock\_plan.py", line 1700, in calculate_expression
    calculate_expression(exp.left, input_data, analyzer, expr_to_alias)
  File "C:\repos\hippolib\venv\Lib\site-packages\snowflake\snowpark\mock\_plan.py", line 1679, in calculate_expression
    child_column = calculate_expression(
                   ^^^^^^^^^^^^^^^^^^^^^
  File "C:\repos\hippolib\venv\Lib\site-packages\snowflake\snowpark\mock\_plan.py", line 1766, in calculate_expression
    new_column[either_isna] = False
    ~~~~~~~~~~^^^^^^^^^^^^^
  File "C:\repos\hippolib\venv\Lib\site-packages\pandas\core\series.py", line 1338, in __setitem__
    key = check_bool_indexer(self.index, key)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\repos\hippolib\venv\Lib\site-packages\pandas\core\indexing.py", line 2662, in check_bool_indexer
    raise IndexingError(
pandas.errors.IndexingError: Unalignable boolean Series provided as indexer (index of the boolean Series and of the indexed object do not match).
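
The last frames of the traceback point at a boolean-mask assignment where the mask and the target Series carry different indexes. Below is a minimal pandas-only sketch of that failure mode, not the Snowpark code itself: the names `new_column` and `either_isna` are borrowed from the traceback, while the values and index labels are made up for illustration. It also shows one possible way to sidestep the alignment check, by passing the mask as a plain NumPy array; whether that is the right fix for the mock plan is an assumption.

```python
import pandas as pd
from pandas.errors import IndexingError

# Hypothetical stand-ins for the objects in mock/_plan.py; the real data differs.
new_column = pd.Series([True, True], index=[0, 1])
# Same length as new_column, but index labels that do not overlap with it.
either_isna = pd.Series([False, True], index=[5, 6])

try:
    # pandas tries to align the boolean mask on index labels and fails.
    new_column[either_isna] = False
    raised = False
    message = ""
except IndexingError as exc:
    raised = True
    message = str(exc)

# Possible workaround (an assumption, not the confirmed fix):
# a NumPy mask is applied by position, so index labels are ignored.
fixed = new_column.copy()
fixed[either_isna.to_numpy()] = False

print(raised, message)
print(fixed.tolist())
```

A positional mask is only safe when both Series are known to be in the same row order; if the rows can be reordered by the join, resetting or reindexing both sides to a shared index would be the more careful option.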
tvdboom added the bug and needs triage labels Sep 27, 2024
github-actions bot changed the title from "[Local testing] equal_null fails in merge statement" to "SNOW-1694649: [Local testing] equal_null fails in merge statement" Sep 27, 2024
sfc-gh-sghosh self-assigned this Oct 1, 2024
sfc-gh-sghosh added the status-triage label and removed the needs triage label Oct 1, 2024
sfc-gh-jrose added the local testing label Oct 1, 2024
@sfc-gh-sghosh

Hello @tvdboom ,

We can reproduce the issue with a local testing session, whereas it works as expected with a regular session. We will work on fixing it.

Regards,
Sujan

sfc-gh-sghosh added the status-triage_done label and removed the status-triage label Oct 3, 2024
@tvdboom
Author

tvdboom commented Oct 3, 2024

Hi @sfc-gh-sghosh,

I already created PR #2373 in an attempt to resolve this and various other bugs in the local testing framework. I need some help to resolve #2305, though.
