[fix](orc) fix predicate filter failed when use hive 1.x version #43185
Thank you for your contribution to Apache Doris. Since 2024-03-18, the Document has been moved to doris-website.
clang-tidy review says "All clean, LGTM! 👍"
Hello, thank you for your reply. We have refactored the construction of ORC pushdown predicates in PR #43255, which now uses _col_name_to_file_col_name_low_case to find the corresponding column name in the file when constructing OrcLiteral. You can try the latest code on the master branch to check whether the problem still exists :)
@fantasy12345zsq This PR will be picked to the branch-2.1 because the branch-2.1 has not yet used the new orc predicate pushdown code. Thanks again for the contribution :)
…on (#45809)

### What problem does this PR solve?

Issue Number: close #xxx

Related PR: #43185

Pick the pr to branch-2.1 to fix predicate filter failed when use hive 1.x version

Co-authored-by: fantasy12345zsq <[email protected]>
Proposed changes
We found a large gap in scan bytes between Presto and Doris for the same query, for example:
select l_orderkey from dev.lineitem_orc_100g_backup1 where l_orderkey = 7192579;
This query returns only 1 row. Scan bytes, Presto vs. Doris:
Presto:
Doris:
We use Hive 1.x, where the column name in the query predicate does not match the column name stored in the ORC file. In this case, the schema column name is l_orderkey, while the ORC field name is _col1.
The fix maps l_orderkey to _col1 in _init_search_argument.
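The renaming described above can be illustrated with a minimal C++ sketch. It is not the actual Doris code: `ColumnNameMap`, `to_lower`, and `resolve_file_column` are hypothetical names used only for illustration, and the sketch assumes a table-to-file column name map is already available when the search argument is built.

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <unordered_map>

// Hypothetical mapping from (lower-cased) table column name to the
// field name actually stored in the ORC file, e.g. l_orderkey -> _col1.
using ColumnNameMap = std::unordered_map<std::string, std::string>;

// Hypothetical helper: lower-case a name so lookups are case-insensitive.
std::string to_lower(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return s;
}

// Resolve the column name used in a predicate to the name found in the file.
// If no mapping exists, the predicate name is returned unchanged, which
// matches the case where table and file column names already agree.
std::string resolve_file_column(const ColumnNameMap& table_to_file,
                                const std::string& predicate_col) {
    auto it = table_to_file.find(to_lower(predicate_col));
    return it != table_to_file.end() ? it->second : predicate_col;
}
```

With a map containing `{"l_orderkey", "_col1"}`, resolving `l_orderkey` yields `_col1`, so a pushed-down predicate would reference a field that exists in files written by Hive 1.x instead of being silently dropped.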
query latency:
scan bytes: