SNOW-1227759: ambiguous overload typing causing typing error false positives with DataFrameWriter.save_as_table() #1296

Closed
TedCha opened this issue Mar 8, 2024 · 6 comments · Fixed by #1395
Assignees
Labels
status-triage_done Initial triage done, will be further handled by the driver team

Comments

@TedCha
Contributor

TedCha commented Mar 8, 2024

Please answer these questions before submitting your issue. Thanks!

  1. What version of Python are you using?

Python 3.8.10 (tags/v3.8.10:3d8993a, May 3 2021, 11:48:03) [MSC v.1928 64 bit (AMD64)]

  2. What operating system and processor architecture are you using?

Windows-10-10.0.19045-SP0

  3. What are the component versions in the environment (pip freeze)?

asn1crypto==1.5.1
certifi==2023.11.17      
cffi==1.16.0
charset-normalizer==3.3.2
cloudpickle==2.2.1       
colorama==0.4.6
cryptography==41.0.7     
exceptiongroup==1.2.0    
filelock==3.13.1
idna==3.6
iniconfig==2.0.0
numpy==1.24.4
packaging==23.2
pandas==2.0.3
platformdirs==3.11.0     
pluggy==1.3.0
ply==3.11
pyarrow==14.0.2
pycparser==2.21
PyJWT==2.8.0
pyOpenSSL==23.3.0
pytest==7.4.4
pytest-mock==3.12.0
python-dateutil==2.8.2
python-dotenv==1.0.0
pytz==2023.3.post1
PyYAML==6.0.1
requests==2.31.0
six==1.16.0
snowflake-connector-python==3.6.0
snowflake-snowpark-python==1.11.1
sortedcontainers==2.4.0
sqlglot==22.2.0
sqlglotrs==0.1.2
tomli==2.0.1
tomlkit==0.12.3
typing_extensions==4.9.0
tzdata==2023.4
urllib3==1.26.18

  4. What did you do?

When DataFrameWriter.save_as_table() is called without the clustering_keys parameter, the Pylance type checker reports the following error at the call site:

No overloads for "save_as_table" match the provided arguments

Snippet:

from snowflake.snowpark import Session


def main(session: Session):
    test_df_1 = session.create_dataframe([])

    # No overloads for "save_as_table" match the provided arguments Argument types: (Literal['table_name'], Literal['temporary'])
    test_df_1.write.mode("overwrite").save_as_table(
        "table_name",
        table_type="temporary"
    )

    test_df_2 = session.create_dataframe([])
    
    # No error
    test_df_2.write.mode("overwrite").save_as_table(
        "table_name",
        table_type="temporary",
        clustering_keys=[]
    )

  5. What did you expect to see?

No type checking error when calling save_as_table as described in the documentation.

  6. Can you set logging to DEBUG and collect the logs?

NA; static type checking.

Note:

I think this issue could be resolved by making the clustering_keys parameter optional on all overloads:

Ex:

    @overload
    def save_as_table(
        self,
        table_name: Union[str, Iterable[str]],
        *,
        mode: Optional[str] = None,
        column_order: str = "index",
        create_temp_table: bool = False,
        table_type: Literal["", "temp", "temporary", "transient"] = "",
        clustering_keys: Iterable[Column], # Change to Optional[Iterable[ColumnOrName]] = None
        statement_params: Optional[Dict[str, str]] = None,
        block: bool = True,
    ) -> None:
        ...  # pragma: no cover

    @overload
    def save_as_table(
        self,
        table_name: Union[str, Iterable[str]],
        *,
        mode: Optional[str] = None,
        column_order: str = "index",
        create_temp_table: bool = False,
        table_type: Literal["", "temp", "temporary", "transient"] = "",
        clustering_keys: Iterable[Column], # Change to Optional[Iterable[ColumnOrName]] = None
        statement_params: Optional[Dict[str, str]] = None,
        block: bool = False,
    ) -> AsyncJob:
        ...  # pragma: no cover
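
For illustration, here is a minimal, self-contained sketch of the suggested change. `Writer` and this `AsyncJob` are hypothetical stand-ins, not the real Snowpark classes, and the parameter list is trimmed. Narrowing `block` with `Literal` so that the two overloads do not overlap is an extra assumption of this sketch, not something the suggestion above requires.

```python
from typing import Iterable, Literal, Optional, overload


class AsyncJob:
    """Hypothetical stand-in for the real Snowpark AsyncJob."""


class Writer:
    """Hypothetical stand-in for DataFrameWriter with a trimmed signature."""

    @overload
    def save_as_table(
        self,
        table_name: str,
        *,
        clustering_keys: Optional[Iterable[str]] = None,  # optional here
        block: Literal[True] = True,
    ) -> None:
        ...

    @overload
    def save_as_table(
        self,
        table_name: str,
        *,
        clustering_keys: Optional[Iterable[str]] = None,  # and here
        block: Literal[False],
    ) -> AsyncJob:
        ...

    def save_as_table(
        self,
        table_name: str,
        *,
        clustering_keys: Optional[Iterable[str]] = None,
        block: bool = True,
    ) -> Optional[AsyncJob]:
        # Placeholder body: the real method would write the table.
        return None if block else AsyncJob()


writer = Writer()
# Omitting clustering_keys should now match the first overload.
blocking_result = writer.save_as_table("table_name")
# Passing block=False should match the second overload.
async_result = writer.save_as_table("table_name", block=False)
```

With `clustering_keys` defaulted in every overload, a call that omits it should have a matching overload, so a checker such as Pyright would be expected to accept both calls.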
@TedCha TedCha added bug Something isn't working needs triage Initial RCA is required labels Mar 8, 2024
@github-actions github-actions bot changed the title ambiguous overload typing causing false positives with DataFrameWriter.save_as_table() SNOW-1227759: ambiguous overload typing causing false positives with DataFrameWriter.save_as_table() Mar 8, 2024
@TedCha TedCha changed the title SNOW-1227759: ambiguous overload typing causing false positives with DataFrameWriter.save_as_table() SNOW-1227759: ambiguous overload typing causing typing error false positives with DataFrameWriter.save_as_table() Mar 8, 2024
@sfc-gh-sghosh sfc-gh-sghosh self-assigned this Mar 10, 2024
@sfc-gh-sghosh sfc-gh-sghosh added status-triage Issue is under initial triage and removed bug Something isn't working needs triage Initial RCA is required labels Mar 10, 2024
@sfc-gh-sghosh

Hello @TedCha,

Thank you for raising the issue.
I tried the code snippet you provided in a Jupyter notebook with Snowpark Python 1.11.1. It works fine and no error is thrown. Could you please check?

test_df_1 = session.create_dataframe([[1,2],[3,4]], schema=["a", "b"])

test_df_1.write.mode("overwrite").save_as_table(
    "table_name",
    table_type="temporary"
)
session.table("table_name").collect()

Output: [Row(A=1, B=2), Row(A=3, B=4)]

test_df_2 = session.create_dataframe([[5,6],[7,8]], schema=["a", "b"])

test_df_2.write.mode("overwrite").save_as_table(
    "table_name",
    table_type="temporary",
    clustering_keys=[]
)
session.table("table_name").collect()

Output: [Row(A=5, B=6), Row(A=7, B=8)]

Regards,
Sujan

@TedCha
Contributor Author

TedCha commented Mar 11, 2024

Hello @sfc-gh-sghosh. No error is thrown at runtime; the error is reported during static type checking. Please see the attached screenshot.

image
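
To illustrate why the call can succeed at runtime while a type checker still rejects it: `@overload` stubs are only read by type checkers, and at runtime the final, undecorated definition replaces them. Below is a minimal sketch in which `Writer`, `save`, and `keys` are hypothetical stand-ins for the Snowpark names. It assumes the implementation gives the keyword a default while the overloads do not, which is consistent with the runtime behaviour reported above.

```python
import inspect
from typing import Iterable, overload


class Writer:
    """Hypothetical stand-in; not the real DataFrameWriter."""

    @overload
    def save(self, name: str, *, keys: Iterable[str], block: bool = True) -> None:
        ...

    @overload
    def save(self, name: str, *, keys: Iterable[str], block: bool = False) -> str:
        ...

    # Only this definition exists at runtime, and it has a default for `keys`.
    def save(self, name, *, keys=None, block=True):
        return None if block else "async-job"


# Runs without error, because the runtime signature allows omitting `keys`.
runtime_result = Writer().save("table_name")

# The runtime signature shows the default that the overload stubs do not declare.
keys_default = inspect.signature(Writer.save).parameters["keys"].default
```

A static checker only looks at the two stubs, both of which require `keys`, so it is expected to report that no overload matches the call even though the call runs.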

@sfc-gh-sghosh

Hello @TedCha ,

I tried with snowflake-connector-python 3.7.0 and Snowpark Python 1.11.1, and there is no static type checking error. It also runs successfully at runtime.

Could you please paste the full message, and could you try again from a fresh environment in another IDE such as Jupyter?

image
Python_jupyter

Regards,
Sujan

@TedCha
Contributor Author

TedCha commented Mar 18, 2024

Hello Sujan,

Thank you for looking into this issue. The issue is not a runtime error; it is an error reported during static type checking, before the code runs.

Jupyter cannot perform static type checking without a language server installed. I was able to reproduce the issue in JupyterLab by installing the LSP integration for Jupyter and then the Pyright language server. Pyright is the language server I use in my IDE (VS Code), but other Python language servers may report the same error.

Please see attached screenshots for reference.

image
image
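
For anyone who wants to reproduce this without an editor, a sketch along these lines may help. It assumes the `pyright` command-line tool is installed and on PATH (for example via `pip install pyright`) and that snowflake-snowpark-python is installed in the environment Pyright inspects. The helper names and the `repro.py` file name are made up for this example.

```python
import shutil
import subprocess
from pathlib import Path
from typing import List, Optional

# The snippet from the issue description, with the import it needs.
REPRO_SOURCE = '''\
from snowflake.snowpark import Session


def main(session: Session):
    df = session.create_dataframe([])
    df.write.mode("overwrite").save_as_table(
        "table_name",
        table_type="temporary",
    )
'''


def write_repro(directory: Path) -> Path:
    """Write the reproduction snippet to repro.py inside `directory`."""
    path = directory / "repro.py"
    path.write_text(REPRO_SOURCE)
    return path


def pyright_command(path: Path) -> List[str]:
    """Build the command line that checks a single file with Pyright."""
    return ["pyright", str(path)]


def run_pyright(path: Path) -> Optional[subprocess.CompletedProcess]:
    """Run Pyright on `path`; return None if Pyright is not installed."""
    if shutil.which("pyright") is None:
        return None
    return subprocess.run(pyright_command(path), capture_output=True, text=True)
```

Calling `run_pyright(write_repro(Path(".")))` and printing the result's `stdout` should show the same "No overloads" diagnostic on affected versions, with a non-zero return code when Pyright reports errors.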

@TedCha
Contributor Author

TedCha commented Mar 19, 2024

I believe #1058 is the same issue.

@sfc-gh-sghosh

Hello @TedCha ,

Thanks for the update, we are checking.

Regards,
Sujan

@sfc-gh-sghosh sfc-gh-sghosh added status-triage_done Initial triage done, will be further handled by the driver team and removed status-triage Issue is under initial triage labels Apr 14, 2024