-
Notifications
You must be signed in to change notification settings - Fork 166
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
partitioned write support #353
base: main
Are you sure you want to change the base?
Conversation
pyiceberg/table/__init__.py
Outdated
arrow_table_partition: pa.Table | ||
|
||
|
||
def get_partition_sort_order(partition_columns: list[str], reverse: bool = False) -> dict[str, Any]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
def get_partition_sort_order(partition_columns: list[str], reverse: bool = False) -> dict[str, Any]: | |
def _get_partition_sort_order(partition_columns: list[str], reverse: bool = False) -> dict[str, Any]: |
This is a nit: should we use _single_leading_underscore to indicate that these are internal functions according to PEP-8?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, thanks for the reminder! More refactoring is on the way.
d03f058
to
b74521e
Compare
pyiceberg/manifest.py
Outdated
@@ -308,6 +308,7 @@ def data_file_with_partition(partition_type: StructType, format_version: Literal | |||
field_id=field.field_id, | |||
name=field.name, | |||
field_type=partition_field_to_data_file_partition_field(field.field_type), | |||
required=False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Default value is True, which breaks the avro writer encoding manifest entry when the partition value is null. (Because the attribute of required being True in the nested field of the Avro schema will be used by the writer to infer and choose a non-optional field writer that could not encode None value.)
This is for encoding only, so set it to False for now.
TODO: Set it properly according to table schema.
805c19c
to
7a33b23
Compare
63a68b4
to
0e57e95
Compare
abc4ca3
to
0a199a3
Compare
As discussed in the monthly community sync, this will be broken down into 4 prs of:
|
Todo