[Feature](iceberg-writer) Implements iceberg partition transform. #37692
…apache#36289)

apache#31442

Adds an Iceberg operator so that Doris can write directly into the lake.

1. Support inserting data into Iceberg by appending HDFS files.
2. Implement Iceberg partition routing through partitionTransform.
   2.1) Serialize the spec and schema into JSON on the FE side, then deserialize them on the BE side to obtain the schema and partition information of the Iceberg table.
   2.2) Implement Iceberg's Identity, Bucket, Year/Month/Day and other partition strategies through partitionTransform and template classes.
3. Manage transactions through IcebergTransaction.
   3.1) After the BE has written its files, it reports CommitData to the FE at partition granularity.
   3.2) After receiving the CommitData, the FE commits the metadata to Iceberg in IcebergTransaction.

### Future work

- Add unit tests for the partition transform function.
- Implement the partition transform function with the exchange sink turned on.
- The partition transform function omits the processing of the bigint type.

---------

Co-authored-by: lik40 <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
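As an illustration of the Bucket strategy named in 2.2, here is a minimal sketch of how the Iceberg table spec defines bucketing for integer values: the value is widened to a 64-bit long, serialized as 8 little-endian bytes, hashed with 32-bit MurmurHash3 (seed 0), and the non-negative hash is taken modulo the bucket count. This is not the code from this PR; the function names `murmur3_32`, `iceberg_hash_long`, and `iceberg_bucket_int` are hypothetical and exist only for this example.

```cpp
#include <cstddef>
#include <cstdint>

// Rotate a 32-bit value left by r bits (0 < r < 32).
inline uint32_t rotl32(uint32_t x, int r) {
    return (x << r) | (x >> (32 - r));
}

// MurmurHash3 x86 32-bit. Bytes are assembled explicitly so the result
// does not depend on the endianness of the host.
inline uint32_t murmur3_32(const uint8_t* data, size_t len, uint32_t seed = 0) {
    constexpr uint32_t c1 = 0xcc9e2d51U;
    constexpr uint32_t c2 = 0x1b873593U;
    uint32_t h = seed;
    const size_t nblocks = len / 4;
    for (size_t i = 0; i < nblocks; ++i) {
        const uint8_t* p = data + i * 4;
        uint32_t k = static_cast<uint32_t>(p[0]) | (static_cast<uint32_t>(p[1]) << 8) |
                     (static_cast<uint32_t>(p[2]) << 16) | (static_cast<uint32_t>(p[3]) << 24);
        k *= c1;
        k = rotl32(k, 15);
        k *= c2;
        h ^= k;
        h = rotl32(h, 13);
        h = h * 5 + 0xe6546b64U;
    }
    // Mix in the 1 to 3 trailing bytes, if any.
    const uint8_t* tail = data + nblocks * 4;
    const size_t rem = len & 3;
    uint32_t k = 0;
    if (rem >= 3) k ^= static_cast<uint32_t>(tail[2]) << 16;
    if (rem >= 2) k ^= static_cast<uint32_t>(tail[1]) << 8;
    if (rem >= 1) {
        k ^= tail[0];
        k *= c1;
        k = rotl32(k, 15);
        k *= c2;
        h ^= k;
    }
    // Finalization mix.
    h ^= static_cast<uint32_t>(len);
    h ^= h >> 16;
    h *= 0x85ebca6bU;
    h ^= h >> 13;
    h *= 0xc2b2ae35U;
    h ^= h >> 16;
    return h;
}

// Hash a 64-bit value as 8 little-endian bytes.
inline uint32_t iceberg_hash_long(int64_t value) {
    const auto u = static_cast<uint64_t>(value);
    uint8_t bytes[8];
    for (int i = 0; i < 8; ++i) {
        bytes[i] = static_cast<uint8_t>(u >> (8 * i));
    }
    return murmur3_32(bytes, sizeof(bytes));
}

// Bucket an int: widen to long first so int and long columns hash alike,
// then clear the sign bit before taking the modulus.
inline int32_t iceberg_bucket_int(int32_t value, int32_t num_buckets) {
    const uint32_t h = iceberg_hash_long(static_cast<int64_t>(value));
    return static_cast<int32_t>((h & 0x7fffffffU) % static_cast<uint32_t>(num_buckets));
}
```

The Iceberg spec lists 2017239379 as the hash of the int value 34, so `iceberg_hash_long(34)` returns that value and `iceberg_bucket_int(34, 16)` returns 3. Widening before hashing is what keeps bucket assignments stable if a column is later promoted from int to long.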
Thank you for your contribution to Apache Doris. Since 2024-03-18, the Document has been moved to doris-website.
run buildall
clang-tidy made some suggestions
ColumnPtr null_map_column_ptr;
bool is_nullable = false;
if (column_ptr->is_nullable()) {
    const ColumnNullable* nullable_column =
warning: use auto when initializing with a cast to avoid duplicating the type name [modernize-use-auto]
- const ColumnNullable* nullable_column =
+ const auto* nullable_column =
ColumnPtr null_map_column_ptr;
bool is_nullable = false;
if (column_ptr->is_nullable()) {
    const ColumnNullable* nullable_column =
warning: use auto when initializing with a cast to avoid duplicating the type name [modernize-use-auto]
- const ColumnNullable* nullable_column =
+ const auto* nullable_column =
ColumnPtr null_map_column_ptr;
bool is_nullable = false;
if (column_ptr->is_nullable()) {
    const ColumnNullable* nullable_column =
warning: use auto when initializing with a cast to avoid duplicating the type name [modernize-use-auto]
- const ColumnNullable* nullable_column =
+ const auto* nullable_column =
Int32* __restrict p_out = out_data.data();

while (p_in < end_in) {
    Int64 long_value = static_cast<Int64>(*p_in);
warning: use auto when initializing with a cast to avoid duplicating the type name [modernize-use-auto]
- Int64 long_value = static_cast<Int64>(*p_in);
+ auto long_value = static_cast<Int64>(*p_in);
ColumnPtr null_map_column_ptr;
bool is_nullable = false;
if (column_ptr->is_nullable()) {
    const ColumnNullable* nullable_column =
warning: use auto when initializing with a cast to avoid duplicating the type name [modernize-use-auto]
- const ColumnNullable* nullable_column =
+ const auto* nullable_column =
ColumnPtr null_map_column_ptr;
bool is_nullable = false;
if (column_ptr->is_nullable()) {
    const ColumnNullable* nullable_column =
warning: use auto when initializing with a cast to avoid duplicating the type name [modernize-use-auto]
- const ColumnNullable* nullable_column =
+ const auto* nullable_column =
ColumnPtr null_map_column_ptr;
bool is_nullable = false;
if (column_ptr->is_nullable()) {
    const ColumnNullable* nullable_column =
warning: use auto when initializing with a cast to avoid duplicating the type name [modernize-use-auto]
- const ColumnNullable* nullable_column =
+ const auto* nullable_column =
ColumnPtr null_map_column_ptr;
bool is_nullable = false;
if (column_ptr->is_nullable()) {
    const ColumnNullable* nullable_column =
warning: use auto when initializing with a cast to avoid duplicating the type name [modernize-use-auto]
- const ColumnNullable* nullable_column =
+ const auto* nullable_column =
ColumnPtr null_map_column_ptr;
bool is_nullable = false;
if (column_ptr->is_nullable()) {
    const ColumnNullable* nullable_column =
warning: use auto when initializing with a cast to avoid duplicating the type name [modernize-use-auto]
- const ColumnNullable* nullable_column =
+ const auto* nullable_column =
ColumnPtr col_ptr = partition_column.column->convert_to_full_column_if_const();
CHECK(col_ptr != nullptr);
if (col_ptr->is_nullable()) {
    const ColumnNullable* nullable_column =
warning: use auto when initializing with a cast to avoid duplicating the type name [modernize-use-auto]
- const ColumnNullable* nullable_column =
+ const auto* nullable_column =
… fix some issues. (apache#36889)

- Add iceberg partition transform unit tests.
- Change `ColumnWithTypeAndName apply(Block& block, int column_pos)` to `ColumnWithTypeAndName apply(const Block& block, int column_pos)`.
- Fix and change string truncate partition transform issue.
- Fix bucket partition transform calculation error.
- Fix year/month partition transform calculation error due to leap year issue.
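To show why leap years matter for the year/month transforms, here is a minimal sketch of one leap-safe way to compute them from a date stored as days since 1970-01-01. The commit message does not describe the actual fix, so this is not the PR's code: it swaps in the well-known civil-from-days algorithm for the proleptic Gregorian calendar, and the names `CivilDate`, `civil_from_days`, `iceberg_year_from_days`, and `iceberg_month_from_days` are hypothetical. Per the Iceberg spec, the year transform is the number of years from 1970 and the month transform is the number of months from 1970-01.

```cpp
#include <cstdint>

struct CivilDate {
    int64_t year;
    unsigned month;  // 1..12
    unsigned day;    // 1..31
};

// Convert days since 1970-01-01 into a proleptic Gregorian date.
// The computation works in 400-year eras with the year shifted to start
// on March 1, so the leap day always falls at the end of a shifted year.
inline CivilDate civil_from_days(int64_t z) {
    z += 719468;  // move the epoch to 0000-03-01
    const int64_t era = (z >= 0 ? z : z - 146096) / 146097;
    const auto doe = static_cast<unsigned>(z - era * 146097);                    // [0, 146096]
    const unsigned yoe = (doe - doe / 1460 + doe / 36524 - doe / 146096) / 365;  // [0, 399]
    const int64_t y = static_cast<int64_t>(yoe) + era * 400;
    const unsigned doy = doe - (365 * yoe + yoe / 4 - yoe / 100);                // [0, 365]
    const unsigned mp = (5 * doy + 2) / 153;                                     // [0, 11], March = 0
    const unsigned d = doy - (153 * mp + 2) / 5 + 1;
    const unsigned m = mp < 10 ? mp + 3 : mp - 9;
    return {y + (m <= 2 ? 1 : 0), m, d};
}

// Year transform: whole years since 1970 (negative before the epoch).
inline int32_t iceberg_year_from_days(int32_t days) {
    return static_cast<int32_t>(civil_from_days(days).year - 1970);
}

// Month transform: whole months since 1970-01 (negative before the epoch).
inline int32_t iceberg_month_from_days(int32_t days) {
    const CivilDate c = civil_from_days(days);
    return static_cast<int32_t>((c.year - 1970) * 12 + (static_cast<int64_t>(c.month) - 1));
}
```

For example, day 19782 is 2024-02-29 and maps to month 649, while day 19783 (2024-03-01) maps to month 650. A shortcut such as `days / 365` drifts by one day for every leap year that has passed, which puts dates near a year or month boundary into the wrong partition.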
d7d31d3 to 9a2c922
run buildall
TeamCity be ut coverage result:
run buildall
TeamCity be ut coverage result:
Proposed changes
Cherry-pick iceberg partition transform functionality. #36289 #36889