[fix](orc) fix predicate filter failed when use hive 1.x version #43185

Open: fantasy12345zsq wants to merge 1 commit into master

@fantasy12345zsq fantasy12345zsq commented Nov 4, 2024

Proposed changes

  1. Original issue:
    We found that the scan bytes reported by Presto and Doris differ greatly for the same query, for example:
    select l_orderkey from dev.lineitem_orc_100g_backup1 where l_orderkey = 7192579;
    This query returns just 1 row. Scan bytes, Presto vs. Doris:
    Presto:
    image
    Doris:
    image
    image
  2. Root cause:
    We use Hive 1.x, and the column name in the query predicate does not match the column name in the ORC file. In this case,
    the schema column is named l_orderkey, but the field name in the ORC file is _col1.
  3. How to fix:
    In _init_search_argument, change l_orderkey -> _col1.
  4. Profile after the fix:
    query latency:
    image
    scan bytes:
    image
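
To illustrate step 3, here is a minimal sketch of the name translation, not the actual Doris code. It assumes the table schema and the ORC file list their columns in the same order, so names can be paired by position. The helper names `build_table_to_file_names` and `resolve_predicate_column` are hypothetical, as is the `l_linenumber` column used in the usage note.

```cpp
#include <algorithm>
#include <cstddef>
#include <optional>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical helper: pair table column names with ORC file column names
// by position. Assumption: both vectors list columns in the same order,
// as when the file stores _col0, _col1, ... instead of the real names.
std::unordered_map<std::string, std::string> build_table_to_file_names(
        const std::vector<std::string>& table_cols,
        const std::vector<std::string>& file_cols) {
    std::unordered_map<std::string, std::string> mapping;
    const std::size_t n = std::min(table_cols.size(), file_cols.size());
    for (std::size_t i = 0; i < n; ++i) {
        mapping.emplace(table_cols[i], file_cols[i]);
    }
    return mapping;
}

// Hypothetical helper: translate a predicate column name into the name used
// inside the file. std::nullopt means the column is unknown, in which case
// the caller should skip pushdown for that predicate.
std::optional<std::string> resolve_predicate_column(
        const std::unordered_map<std::string, std::string>& mapping,
        const std::string& predicate_col) {
    const auto it = mapping.find(predicate_col);
    if (it == mapping.end()) {
        return std::nullopt;
    }
    return it->second;
}
```

With table columns `{"l_linenumber", "l_orderkey"}` and file columns `{"_col0", "_col1"}`, resolving `l_orderkey` yields `_col1`, which is the name the search argument has to be built with.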

@doris-robot

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR

Since 2024-03-18, the Document has been moved to doris-website.
See Doris Document.

Contributor

github-actions bot commented Nov 4, 2024

clang-tidy review says "All clean, LGTM! 👍"

@fantasy12345zsq fantasy12345zsq force-pushed the fix_orc_predicate branch 2 times, most recently from 6f5bf4b to 1bacb15 Compare November 4, 2024 04:30

@suxiaogang223
Contributor

Hello, thank you for your reply. We have reworked the construction of ORC pushdown predicates in PR #43255, which now uses _col_name_to_file_col_name_low_case to look up the corresponding column name in the file when constructing the OrcLiteral. Could you check with the latest code on the master branch whether the problem still exists? :)
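
A lookup of this kind could look roughly like the sketch below. Only the map name `_col_name_to_file_col_name_low_case` comes from the comment above. The map's type, the helper names `to_lower_ascii` and `file_col_name_for`, and the fallback of returning the input name when no entry exists are all assumptions made for illustration, not the code in #43255.

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <unordered_map>

// Hypothetical helper: lower-case an ASCII column name so lookups are
// case-insensitive.
std::string to_lower_ascii(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(), [](unsigned char c) {
        return static_cast<char>(std::tolower(c));
    });
    return s;
}

// Hypothetical helper: find the file column name for a table column.
// Assumption: the map is keyed by lower-cased table column names. If no
// entry exists, the original name is returned unchanged.
std::string file_col_name_for(
        const std::unordered_map<std::string, std::string>& lower_to_file,
        const std::string& table_col) {
    const auto it = lower_to_file.find(to_lower_ascii(table_col));
    return it == lower_to_file.end() ? table_col : it->second;
}
```

Keying the map by lower-cased names means a predicate written as `L_ORDERKEY` resolves to the same file column as `l_orderkey`.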

@suxiaogang223
Contributor

@fantasy12345zsq This PR will be picked to branch-2.1, because branch-2.1 does not yet use the new ORC predicate pushdown code. Thanks again for the contribution :)

morningman pushed a commit that referenced this pull request Dec 25, 2024
…on (#45809)

### What problem does this PR solve?

Issue Number: close #xxx

Related PR: #43185 

Pick the pr to branch-2.1 to fix predicate filter failed when use hive
1.x version

Co-authored-by: fantasy12345zsq <[email protected]>