SNOW-1269037: [Local Testing] Add support for NaT and NaN values #1393

Merged 8 commits on Apr 29, 2024
4 changes: 4 additions & 0 deletions CHANGELOG.md
@@ -12,6 +12,10 @@

- Added support for StringType, TimestampType and VariantType data conversion in the mocked function `to_date`.

#### Bug Fixes

- Fixed a bug that caused NaT and NaN values to not be recognized.


## 1.15.0 (2024-04-24)

9 changes: 9 additions & 0 deletions src/snowflake/snowpark/_internal/type_utils.py
@@ -264,6 +264,15 @@ def convert_sp_to_sf_type(datatype: DataType) -> str:
    datetime.time: TimeType,
    bytes: BinaryType,
}
if installed_pandas:
Review comment from @sfc-gh-stan (Collaborator), Apr 25, 2024:

Elegant fix :chef_kiss:

    import numpy

    PYTHON_TO_SNOW_TYPE_MAPPINGS.update(
        {
            type(pandas.NaT): TimestampType,
            numpy.float64: DecimalType,
        }
    )


VALID_PYTHON_TYPES_FOR_LITERAL_VALUE = (
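
To illustrate why the two extra keys matter, here is a minimal sketch that assumes the mapping is looked up by exact type. `toy_mapping` is a hypothetical stand-in with string values, not the real `PYTHON_TO_SNOW_TYPE_MAPPINGS`, and the lookup shown may differ from how Snowpark actually consults its mapping.

```python
import datetime

import numpy as np
import pandas as pd

# Hypothetical stand-in for a type-keyed mapping; the values are plain strings
# rather than Snowpark DataType classes.
toy_mapping = {float: "DoubleType", datetime.datetime: "TimestampType"}

nan_value = np.float64("nan")

# numpy.float64 subclasses float, but a dict lookup uses the exact type,
# so neither the NaN scalar nor pandas.NaT is found.
before = (toy_mapping.get(type(nan_value)), toy_mapping.get(type(pd.NaT)))

# Register the same two keys that the change above adds.
toy_mapping.update({type(pd.NaT): "TimestampType", np.float64: "DecimalType"})
after = (toy_mapping.get(type(nan_value)), toy_mapping.get(type(pd.NaT)))

print(before)
print(after)
```

Under this assumption, both lookups miss before the update and both resolve afterwards.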
31 changes: 30 additions & 1 deletion tests/mock_unit/test_create_df_from_pandas.py
@@ -10,7 +10,15 @@
import pytz

from snowflake.snowpark import Row, Table
from snowflake.snowpark.types import BooleanType, DoubleType, LongType, StringType
from snowflake.snowpark.types import (
    BooleanType,
    DoubleType,
    LongType,
    StringType,
    StructField,
    StructType,
    TimestampType,
)

try:
    import pandas as pd
@@ -344,3 +352,24 @@ def test_na_and_null_data(session):
    )
    sp_df = session.create_dataframe(data=pandas_df)
    assert sp_df.select("A").collect() == [Row("abc"), Row(None), Row("a"), Row("")]


@pytest.mark.localtest
def test_datetime_nat_nan(session):
    df = pd.DataFrame(
        {
            "date": pd.to_datetime(
                [None, "2020-01-13", "2020-02-01", "2020-02-23", "2020-03-05"], utc=True
            ),
            "num": [None, 1.0, 2.0, 3.0, 4.0],
        }
    )

    expected_schema = StructType(
        [
            StructField('"date"', TimestampType(), nullable=True),
            StructField('"num"', DoubleType(), nullable=True),
        ]
    )
    sf_df = session.create_dataframe(data=df)
    assert sf_df.schema == expected_schema
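
For context on what the test above feeds into `create_dataframe`, the following sketch inspects a shortened version of the same DataFrame using pandas and NumPy only, with no Snowpark session. The variable names are illustrative and not taken from the pull request.

```python
import numpy as np
import pandas as pd

# Shortened version of the test data: one missing entry per column.
df = pd.DataFrame(
    {
        "date": pd.to_datetime([None, "2020-01-13"], utc=True),
        "num": [None, 1.0],
    }
)

date_missing = df["date"].iloc[0]
num_missing = df["num"].iloc[0]

# pandas stores the missing timestamp as NaT and the missing float as NaN.
nat_type_name = type(date_missing).__name__
num_is_float64 = isinstance(num_missing, np.float64)

print(nat_type_name, num_is_float64, pd.isna(date_missing), pd.isna(num_missing))
```

The missing entries therefore arrive as `pandas.NaT` and a `numpy.float64` NaN rather than as `None`, which is why the type mapping needs entries for those two types.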