SNOW-1000284: Add schema support for structure types #1323
@@ -229,6 +294,170 @@ def test_dtypes(session):
]


@pytest.mark.parametrize(
Should we also skip these tests in stored procs? Does the stored proc connector have the corresponding change for struct types?
By the time I merge this change it should have the corresponding change.
@@ -76,6 +76,12 @@
IS_NOT_ON_GITHUB = os.getenv("GITHUB_ACTIONS") != "true"
# this env variable is set in regression test
IS_IN_STORED_PROC_LOCALFS = IS_IN_STORED_PROC and os.getenv("IS_LOCAL_FS")
STRUCTURED_TYPE_ENVIRONMENTS = {"dev", "aws"}
IS_STRUCTURED_TYPES_SUPPORTED = (
    os.getenv("cloud_provider", "dev") in STRUCTURED_TYPE_ENVIRONMENTS
cloud_provider is set on GitHub, right? We also have some tests running on Jenkins, so will they just be skipped there?
I'm not sure how that gets set. I know that in tox.ini it's passed in regardless of environment, but that doesn't mean it is set.
So we will skip these tests on Jenkins for now? E.g., https://ci-dev-142.int.snowflakecomputing.com/job/SnowparkPythonClientRegressRunner/
Yes. Once the parameter for enabling structured types is on by default, we will need to make sure these tests work in all environments, but for now sfctest0 is the only environment that has them enabled. It is also the only environment where we have iceberg test infrastructure set up.
Got it, thanks. Then maybe create a Jira and add a TODO here so that these conditions eventually get removed?
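For reference, the visible part of the gating condition above can be sketched as a small function. This is only a sketch: `structured_types_supported` is a hypothetical helper, not something in the PR, and it covers only the `cloud_provider` check shown in the diff (the original expression is cut off, so it may have further conditions).

```python
import os

# Copied from the diff above.
STRUCTURED_TYPE_ENVIRONMENTS = {"dev", "aws"}


def structured_types_supported(environ=None):
    """Hypothetical helper: evaluate the visible cloud_provider check.

    TODO: remove this gate once structured types are enabled by default
    in every deployment (tracking ticket not yet filed).
    """
    env = os.environ if environ is None else environ
    # When cloud_provider is unset, the default "dev" is used,
    # and "dev" is in the supported set.
    return env.get("cloud_provider", "dev") in STRUCTURED_TYPE_ENVIRONMENTS


if __name__ == "__main__":
    print(structured_types_supported({}))
    print(structured_types_supported({"cloud_provider": "azure"}))
```

Passing a plain dict instead of reading `os.environ` directly makes it easy to check what a given runner would do without changing its environment. Note that with the default shown in the diff, a runner that never sets `cloud_provider` is treated as `dev`.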
),
id="structured-types-enabled",
),
False: pytest.param(
Will we eventually remove this case once all deployments support struct types?
Yes, at some point in the future we will make a BCR that turns structured types on by default. That will probably be the best time to remove this.
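The pattern under discussion, a mapping keyed by `True`/`False` from which the test picks the case matching the current deployment, might look roughly like the sketch below. It uses plain tuples instead of `pytest.param` so it runs without pytest; the `structured-types-disabled` id and both expected-value strings are hypothetical placeholders, not taken from the PR.

```python
# Hypothetical stand-ins for the pytest.param entries in the diff.
# Only the "structured-types-enabled" id appears in the original.
CASES_BY_SUPPORT = {
    True: ("structured-types-enabled", "structured map schema"),
    False: ("structured-types-disabled", "unstructured map schema"),
}


def select_case(is_supported):
    """Return the (id, expected) pair for the current deployment."""
    return CASES_BY_SUPPORT[bool(is_supported)]


if __name__ == "__main__":
    case_id, expected = select_case(True)
    print(case_id, expected)
```

Once structured types are on everywhere, the `False` entry and the lookup can be dropped and the remaining case inlined.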
Co-authored-by: Jianzhun Du <[email protected]>
[
StructField(
"MAP",
MapType(StringType(16777216), LongType(), structured=True),
Do we have to explicitly use the length 16777216? This number may change soon.
Without a BCR we have to use the length for now. I expect that whenever I make the lob change I'll need to update this test and others that depend on this constant.
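One way to limit the later update to a single place would be to name the length once and reference it from the tests. A minimal sketch, assuming a hypothetical module-level constant `MAX_STRING_LENGTH` and a hypothetical helper (neither is in the PR):

```python
# Assumption: 16777216 is the string length used in the diff above (2**24).
# The constant name is hypothetical.
MAX_STRING_LENGTH = 2 ** 24


def describe_string_type(length=MAX_STRING_LENGTH):
    """Hypothetical helper: render a string type with its length."""
    return f"string({length})"


if __name__ == "__main__":
    print(describe_string_type())
```

Tests that build the expected schema from the constant would then need only a one-line change when the limit moves.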
Please answer these questions before submitting your pull requests. Thanks!
What GitHub issue is this PR addressing? Make sure that there is an accompanying issue to your PR.
Fixes #SNOW-1000284
Fill out the following pre-review checklist:
Please describe how your code solves the related issue.
This PR adds relevant changes needed to support Structured Types. The goal of this commit is to enable reading structured data from iceberg tables. Writing new tables and support for non-iceberg tables is a future goal of the project.
The change does the following:
I've added test cases that test the following:
Merge Notes:
This change cannot merge until the next python-connector version is released. I will need to add it as a dependency to support the new pandas changes.
This change also will not pass merge gates until I figure out how to add permissions to the test runner in order to access the external volume that we're using for iceberg tests. Maybe @sfc-gh-yixie can help me figure that out.