Apply binary search filter expressions directly on the block metadata of `Index Scan`s (#1619)

With this PR, filter expressions that can be evaluated via binary search on a sorted input are evaluated directly on the block metadata of an `IndexScan`. For example, in a query that contains `{ ?s ?p ?o FILTER (?o > 3) }`, the full index scan (sorted by the object) reads from disk only those blocks that, according to their metadata, might contain values `> 3`.
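The idea can be illustrated with a minimal, self-contained sketch. It is not the code from this PR: `BlockMetadata` and `prefilterGreaterThan` are hypothetical names, and the metadata is reduced to the first and last value of the sorted column in each block. Under these assumptions, all blocks that cannot contain a value `> bound` form a prefix of the sorted block list, so a binary search is enough to find the first block that has to be read.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified block metadata: the smallest and largest value of
// the sorted column within one block. This is an assumption for illustration,
// not the metadata type used in the actual code base.
struct BlockMetadata {
  int64_t firstValue;
  int64_t lastValue;
};

// Return the blocks that might contain a value `> bound`. The blocks must be
// sorted, so the blocks with `lastValue <= bound` form a prefix, and
// `std::partition_point` finds the end of that prefix via binary search.
std::vector<BlockMetadata> prefilterGreaterThan(
    const std::vector<BlockMetadata>& blocks, int64_t bound) {
  assert(std::is_sorted(blocks.begin(), blocks.end(),
                        [](const BlockMetadata& a, const BlockMetadata& b) {
                          return a.lastValue < b.lastValue;
                        }));
  auto firstRelevant = std::partition_point(
      blocks.begin(), blocks.end(),
      [bound](const BlockMetadata& block) { return block.lastValue <= bound; });
  return std::vector<BlockMetadata>(firstRelevant, blocks.end());
}
```

Only the metadata is inspected here. The blocks that survive the prefilter still have to be read and filtered row by row, because a surviving block may also contain values that do not satisfy the expression.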

Currently, this mechanism has the following limitations:
1. It can only be applied if the `IndexScan` is the direct child of the `FILTER` clause.
2. It can only be applied to logical expressions (AND/OR/NOT) and to relational expressions (greater than, equal to, etc.) between a variable and a constant. The constant cannot yet be an IRI or a literal.
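One plausible way to combine the block sets of two relational expressions under AND and OR is sketched below. The PR description does not say how the combination is implemented, so this is an assumption: `BlockIndices`, `andBlocks`, and `orBlocks` are hypothetical names, and each set is represented as a sorted list of block indices.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical representation: the sorted, duplicate-free indices of the
// blocks that might contain a match for one relational expression.
using BlockIndices = std::vector<std::size_t>;

// AND: a matching row must lie in a block that is relevant for both operands,
// so the intersection of the two block sets is sufficient.
BlockIndices andBlocks(const BlockIndices& left, const BlockIndices& right) {
  assert(std::is_sorted(left.begin(), left.end()));
  assert(std::is_sorted(right.begin(), right.end()));
  BlockIndices result;
  std::set_intersection(left.begin(), left.end(), right.begin(), right.end(),
                        std::back_inserter(result));
  return result;
}

// OR: a matching row may lie in a block that is relevant for either operand,
// so the union of the two block sets is required.
BlockIndices orBlocks(const BlockIndices& left, const BlockIndices& right) {
  assert(std::is_sorted(left.begin(), left.end()));
  assert(std::is_sorted(right.begin(), right.end()));
  BlockIndices result;
  std::set_union(left.begin(), left.end(), right.begin(), right.end(),
                 std::back_inserter(result));
  return result;
}
```

NOT cannot be handled by complementing a block set, because a block that might contain values `> 3` may also contain values `<= 3`. In a sketch like this one, the negation would instead have to be applied to the comparison itself (for example, `NOT (?o > 3)` becomes `?o <= 3`) before the blocks are selected.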
realHannes authored Dec 2, 2024
1 parent 82ccc51 commit 7680177
Showing 17 changed files with 987 additions and 307 deletions.
16 changes: 15 additions & 1 deletion src/engine/Filter.cpp
@@ -27,7 +27,9 @@ Filter::Filter(QueryExecutionContext* qec,
                sparqlExpression::SparqlExpressionPimpl expression)
     : Operation(qec),
       _subtree(std::move(subtree)),
-      _expression{std::move(expression)} {}
+      _expression{std::move(expression)} {
+  setPrefilterExpressionForChildren();
+}

 // _____________________________________________________________________________
 string Filter::getCacheKeyImpl() const {
@@ -37,10 +39,22 @@ string Filter::getCacheKeyImpl() const {
  return std::move(os).str();
}

//______________________________________________________________________________
string Filter::getDescriptor() const {
  return absl::StrCat("Filter ", _expression.getDescriptor());
}

//______________________________________________________________________________
void Filter::setPrefilterExpressionForChildren() {
  std::vector<PrefilterVariablePair> prefilterPairs =
      _expression.getPrefilterExpressionForMetadata();
  auto optNewSubTree = _subtree->setPrefilterGetUpdatedQueryExecutionTree(
      std::move(prefilterPairs));
  if (optNewSubTree.has_value()) {
    _subtree = std::move(optNewSubTree.value());
  }
}

// _____________________________________________________________________________
ProtoResult Filter::computeResult(bool requestLaziness) {
  LOG(DEBUG) << "Getting sub-result for Filter result computation..." << endl;
11 changes: 10 additions & 1 deletion src/engine/Filter.h
@@ -10,10 +10,11 @@

#include "engine/Operation.h"
#include "engine/QueryExecutionTree.h"
#include "engine/sparqlExpressions/SparqlExpressionPimpl.h"
#include "parser/ParsedQuery.h"

class Filter : public Operation {
  using PrefilterVariablePair = sparqlExpression::PrefilterExprVariablePair;

 private:
  std::shared_ptr<QueryExecutionTree> _subtree;
  sparqlExpression::SparqlExpressionPimpl _expression;
@@ -58,6 +59,14 @@ class Filter : public Operation {
    return _subtree->getVariableColumns();
  }

  // The method is directly invoked with the construction of this `Filter`
  // object. Its implementation retrieves <PrefilterExpression, Variable> pairs
  // from the corresponding `SparqlExpression` and uses them to call
  // `QueryExecutionTree::setPrefilterGetUpdatedQueryExecutionTree()` on the
  // `_subtree`. If necessary the `QueryExecutionTree` for this
  // entity will be updated.
  void setPrefilterExpressionForChildren();

  ProtoResult computeResult(bool requestLaziness) override;

  // Perform the actual filter operation of the data provided.
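The pattern described in the comment on `setPrefilterExpressionForChildren()` (ask the child for an updated tree and swap it in only if one is returned) can be sketched in isolation as follows. Everything in this sketch is hypothetical and simplified: `Tree`, `withPrefilter`, and `FilterSketch` are stand-ins invented for illustration, and the prefilter is passed as a plain string instead of `<PrefilterExpression, Variable>` pairs.

```cpp
#include <cassert>
#include <memory>
#include <optional>
#include <string>
#include <utility>

// Hypothetical stand-in for a query execution tree node.
struct Tree {
  std::string description;
  bool supportsPrefilter;

  // Return a new tree with the prefilter applied, or `std::nullopt` if this
  // kind of node cannot make use of a prefilter. The original tree is never
  // modified.
  std::optional<std::shared_ptr<Tree>> withPrefilter(
      const std::string& prefilter) const {
    if (!supportsPrefilter) {
      return std::nullopt;
    }
    return std::make_shared<Tree>(
        Tree{description + " [prefilter: " + prefilter + "]", true});
  }
};

// Hypothetical stand-in for the `Filter` operation: the constructor offers the
// prefilter to the child and replaces the child only if an updated tree is
// returned.
class FilterSketch {
 public:
  FilterSketch(std::shared_ptr<Tree> child, const std::string& prefilter)
      : child_(std::move(child)) {
    assert(child_ != nullptr);
    if (auto updated = child_->withPrefilter(prefilter); updated.has_value()) {
      child_ = std::move(updated.value());
    }
  }

  const Tree& child() const { return *child_; }

 private:
  std::shared_ptr<Tree> child_;
};
```

Returning an optional new tree instead of mutating the child in place keeps the original child usable elsewhere, which matters if the same subtree is shared with other parts of a query plan. Whether that is the motivation in the actual code is an assumption.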