Implement construct PrefilterExpression from SparqlExpression. #1573

realHannes
Collaborator

@realHannes realHannes commented Oct 22, 2024

The SparqlExpression base class has been extended with the method getPrefilterExpressionForMetadata. For suitable (logical) expressions, this method constructs a corresponding PrefilterExpression (see PR #1503). During the IndexScan, these PrefilterExpressions use the available CompressedBlockMetadata to preselect the relevant data blocks, so the actual expression evaluation later on has to process fewer blocks.
At the moment, the following expressions provide an overridden implementation of getPrefilterExpressionForMetadata: logical-or and logical-and (binary), logical-not (unary), and the standard RelationalExpressions.
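
To illustrate the idea of preselecting blocks from their metadata, here is a minimal sketch. It is not the code of this PR: `BlockMetadata`, `CompOp`, `blockMayMatch`, and `prefilterBlocks` are hypothetical names, the values are plain integers instead of the IDs used in the index, and each block is assumed to store only the smallest and largest value of a sorted column.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for per-block metadata: the smallest and the largest
// value stored in one block of a sorted column.
struct BlockMetadata {
  int64_t first;
  int64_t last;
};

enum class CompOp { LT, LE, EQ, GE, GT };

// Return true if the block may contain a value v for which `v <op> ref` holds.
// The check only looks at the block bounds, so it can keep a block that later
// turns out to have no match, but it never drops a block that has one.
bool blockMayMatch(const BlockMetadata& b, CompOp op, int64_t ref) {
  switch (op) {
    case CompOp::LT: return b.first < ref;
    case CompOp::LE: return b.first <= ref;
    case CompOp::EQ: return b.first <= ref && ref <= b.last;
    case CompOp::GE: return b.last >= ref;
    case CompOp::GT: return b.last > ref;
  }
  return true;  // Unknown operator: keep the block to stay on the safe side.
}

// Keep only the blocks that can contribute to the result of the comparison.
std::vector<BlockMetadata> prefilterBlocks(
    const std::vector<BlockMetadata>& blocks, CompOp op, int64_t ref) {
  std::vector<BlockMetadata> result;
  for (const auto& block : blocks) {
    if (blockMayMatch(block, op, ref)) {
      result.push_back(block);
    }
  }
  return result;
}
```

The prefilter is deliberately conservative: the full expression still has to be evaluated on the remaining blocks, it just sees fewer of them.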

realHannes and others added 30 commits April 25, 2024 19:07
@realHannes realHannes changed the title Implementation construct PrefilterExpression from SparqlExpression. Implementation construct PrefilterExpression from SparqlExpression. Oct 22, 2024
@realHannes realHannes changed the title Implementation construct PrefilterExpression from SparqlExpression. Implement construct PrefilterExpression from SparqlExpression. Oct 22, 2024

codecov bot commented Oct 22, 2024

Codecov Report

Attention: Patch coverage is 90.58172% with 34 lines in your changes missing coverage. Please review.

Project coverage is 89.20%. Comparing base (1bcfeeb) to head (f489b24).
Report is 51 commits behind head on master.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/index/CompressedBlockPrefiltering.cpp | 88.07% | 3 Missing and 10 partials ⚠️ |
| src/engine/IndexScan.cpp | 81.81% | 10 Missing ⚠️ |
| ...engine/sparqlExpressions/RelationalExpressions.cpp | 80.43% | 1 Missing and 8 partials ⚠️ |
| src/engine/Filter.cpp | 91.66% | 0 Missing and 1 partial ⚠️ |
| ...ine/sparqlExpressions/NumericBinaryExpressions.cpp | 98.30% | 0 Missing and 1 partial ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1573      +/-   ##
==========================================
- Coverage   89.21%   89.20%   -0.01%     
==========================================
  Files         372      372              
  Lines       34723    35049     +326     
  Branches     3915     3961      +46     
==========================================
+ Hits        30979    31267     +288     
- Misses       2471     2490      +19     
- Partials     1273     1292      +19     


@realHannes realHannes requested a review from joka921 October 25, 2024 13:51
Member

@joka921 joka921 left a comment

A first thorough round of reviews.
The initial design looks clean and readable.
Please merge the current master in and add tests and fix bugs.

src/engine/sparqlExpressions/NumericBinaryExpressions.cpp (outdated, resolved)
src/engine/sparqlExpressions/NumericBinaryExpressions.cpp (outdated, resolved)
src/engine/sparqlExpressions/NumericBinaryExpressions.cpp (outdated, resolved)
src/engine/sparqlExpressions/NumericBinaryExpressions.cpp (outdated, resolved)
src/engine/sparqlExpressions/NumericBinaryExpressions.cpp (outdated, resolved)
src/engine/sparqlExpressions/NumericUnaryExpressions.cpp (outdated, resolved)
src/engine/sparqlExpressions/RelationalExpressions.cpp (outdated, resolved)
[[maybe_unused]] bool isNegated) const {
return std::nullopt;
};

Member

You have forgotten one case where you can apply a filter
(I also didn't tell you this).
In the RegexExpression there is the special case of a PrefixRegex, which is applied using binary search and can basically be something like `?x begins with "hannes" -> ?x >= hannes && ?x < …` (but of course the logic is a little more involved). You can have a look at it, but maybe this should be a separate PR to avoid conflicts, as I am currently also refactoring the RegexExpression.

Member

And while I am at it: the startswith can probably also be implemented in that way.
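
The prefix-as-range idea from the two comments above can be sketched as follows. This is only an illustration under simplifying assumptions: the strings are sorted bytewise (a real index may use a different collation, which is part of why the actual logic is more involved), and `prefixUpperBound` and `prefixRange` are hypothetical helpers, not functions from the code base.

```cpp
#include <algorithm>
#include <cstddef>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Smallest string that is greater (bytewise) than every string starting with
// `prefix`. Returns std::nullopt if there is no such bound, which happens for
// an empty prefix or a prefix that consists only of 0xFF bytes.
std::optional<std::string> prefixUpperBound(std::string prefix) {
  while (!prefix.empty()) {
    auto last = static_cast<unsigned char>(prefix.back());
    if (last != 0xFF) {
      prefix.back() = static_cast<char>(last + 1);
      return prefix;
    }
    prefix.pop_back();  // A trailing 0xFF cannot be incremented, drop it.
  }
  return std::nullopt;
}

// Half-open index range [begin, end) of the entries in `sorted` that start
// with `prefix`, found with two binary searches instead of a regex match.
std::pair<std::size_t, std::size_t> prefixRange(
    const std::vector<std::string>& sorted, const std::string& prefix) {
  auto lower = std::lower_bound(sorted.begin(), sorted.end(), prefix);
  auto upperKey = prefixUpperBound(prefix);
  auto upper = upperKey
                   ? std::lower_bound(lower, sorted.end(), *upperKey)
                   : sorted.end();
  return {static_cast<std::size_t>(lower - sorted.begin()),
          static_cast<std::size_t>(upper - sorted.begin())};
}
```

For the sorted input `{"anna", "hannah", "hannes", "hannes2", "hans", "ida"}` and the prefix `"hannes"`, the upper bound is `"hannet"` and the resulting range is `[2, 4)`.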

src/global/ValueIdComparators.h (outdated, resolved)
src/parser/data/Variable.h (outdated, resolved)
Member

@joka921 joka921 left a comment

Two major comments:

  1. Consistently drop the optional<>
  2. Don't use De Morgan's laws at all, as they don't work well with the SPARQL semantics.

src/engine/sparqlExpressions/NumericBinaryExpressions.cpp (outdated, resolved)
src/engine/sparqlExpressions/NumericBinaryExpressions.cpp (outdated, resolved)
src/engine/sparqlExpressions/NumericBinaryExpressions.cpp (outdated, resolved)
src/engine/sparqlExpressions/NumericBinaryExpressions.cpp (outdated, resolved)
src/index/CompressedBlockPrefiltering.cpp (outdated, resolved)
src/index/CompressedBlockPrefiltering.cpp (outdated, resolved)
src/index/CompressedBlockPrefiltering.cpp (outdated, resolved)
src/index/CompressedBlockPrefiltering.h (outdated, resolved)
src/engine/sparqlExpressions/NumericBinaryExpressions.cpp (outdated, resolved)
Member

@joka921 joka921 left a comment

The most important requests:

  • Split the PR.
  • Look at my comments, and then contact me for questions/discussions.

src/engine/Filter.h (resolved)
src/engine/IndexScan.cpp (resolved)
src/engine/IndexScan.cpp (resolved)
src/engine/IndexScan.cpp (resolved)
src/engine/IndexScan.h (resolved)
src/engine/Operation.h (resolved)
src/engine/Operation.cpp (resolved)
src/engine/Operation.cpp (resolved)
src/engine/QueryExecutionTree.h (resolved)
if (ptr) {
ptr->setPrefilterExpression(prefilterVec);
}
});
Member

Some detailed comments here:

  1. You may only pass on PrefilterExpressions to a child execution tree, if the variable of the expression is visible in the variable to column map. Otherwise you have the case that subqueries filter out too much.

  2. It is fishy (and leads to unexpected behavior) if you just change the Operation without informing the shared_ptr<ExecutionTree> that owns it.
    So what you need is a `forThisExecutionTreeAndAllDescendents(shared_ptr)` that completely replaces the shared_ptr with a completely new execution tree (via `make_shared` or respectively `ad_utility::createExecutionTree`) that stores the IndexScan. Then you best have a function in the `IndexScan` class that can do something like `shared_ptr makeCopyWithAddedPrefilters(Prefilters) const` (we have something similar in the `TransitivePath` class called `bindLeftOrRightSide`).

  3. You should make two PRs out of this, one for the extraction of the prefilters out of the expression (the first part of this PR), and then this second one here which is based on that one. Because then we can merge the first part earlier (which is further in the review process), and it gets much easier to review.
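
Points 1 and 2 can be sketched together as a small, self-contained example. Everything here is a hypothetical simplification and not the QLever classes: `Node` stands in for an execution tree together with its operation, `visibleVariables` stands in for the variable-to-column map, and `withPrefilter` plays the role of the suggested copy-instead-of-mutate helper.

```cpp
#include <memory>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, heavily simplified stand-ins (not the real classes).
struct Prefilter {
  std::string variable;     // The variable the prefilter constrains.
  std::string description;  // E.g. "?x < 5", only for illustration.
};

struct Node;
using Tree = std::shared_ptr<const Node>;

struct Node {
  bool isIndexScan = false;
  std::set<std::string> visibleVariables;  // Stand-in for the variable-to-column map.
  std::vector<Prefilter> prefilters;       // Only meaningful for index scans.
  std::vector<Tree> children;
};

// Return a tree in which every index scan that exposes the prefilter's
// variable is replaced by a copy with the prefilter added. Existing nodes are
// never modified, and unchanged subtrees are shared with the input tree.
Tree withPrefilter(const Tree& tree, const Prefilter& prefilter) {
  // Point 1: do not pass the prefilter into a subtree that does not expose
  // the variable (e.g. a subquery that hides it).
  if (tree->visibleVariables.count(prefilter.variable) == 0) {
    return tree;
  }
  // Point 2: work on a copy instead of changing the node behind its owner.
  auto copy = std::make_shared<Node>(*tree);
  bool changed = false;
  if (copy->isIndexScan) {
    copy->prefilters.push_back(prefilter);
    changed = true;
  }
  for (auto& child : copy->children) {
    Tree newChild = withPrefilter(child, prefilter);
    if (newChild != child) {
      child = std::move(newChild);
      changed = true;
    }
  }
  return changed ? Tree(std::move(copy)) : tree;
}
```

Because the function returns a new `shared_ptr` whenever something changed, the owner of the tree always learns about the replacement, and other holders of the old tree keep seeing the old, unfiltered scan.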

realHannes added a commit to realHannes/qlever that referenced this pull request Nov 18, 2024
@joka921
Member

joka921 commented Jan 8, 2025

This has been merged already in several other smaller PRs.

@joka921 joka921 closed this Jan 8, 2025