Use DeltaTriplesManager to process update request #1608

Merged
merged 19 commits into from
Nov 18, 2024

Member

@Qup42 Qup42 commented Nov 7, 2024

Add the missing code in Server.cpp and Server.h to process a parsed update request using the DeltaTriplesManager from #1603 .
As of this commit, QLever has initial beta support for a subset of SPARQL UPDATE, with the following limitations (all of which will be addressed in the future; the list is far from complete):

  1. Updates are not yet persistent, so a restart or crash of the QLever server will delete all updates.
  2. Only a single update request per query is allowed, not the syntax that atomically chains an arbitrary sequence of updates.
  3. Only INSERT and DELETE queries are supported; support for queries that drop or add a complete graph will be added in the future.

Member

@joka921 joka921 left a comment

Some minor requests and thoughts.
We should now have almost everything in place to try out merging all this together for experimenting.

src/engine/Server.cpp Outdated Show resolved Hide resolved
src/engine/Server.cpp Outdated Show resolved Hide resolved
src/engine/Server.cpp Outdated Show resolved Hide resolved
src/engine/Server.cpp Outdated Show resolved Hide resolved
src/engine/Server.cpp Outdated Show resolved Hide resolved
Member

@joka921 joka921 left a comment

Some small changes.

src/engine/Server.h Outdated Show resolved Hide resolved
src/engine/Server.cpp Outdated Show resolved Hide resolved
Comment on lines 866 to 882
auto [queryHub, messageSender] =
    createMessageSender(queryHub_, request, update);

auto [cancellationHandle, cancelTimeoutOnDestruction] =
    setupCancellationHandle(messageSender.getQueryId(), timeLimit);

auto [pinSubtrees, pinResult] = determineResultPinning(params);
LOG(INFO) << "Processing the following SPARQL update:"
          << (pinResult ? " [pin result]" : "")
          << (pinSubtrees ? " [pin subresults]" : "") << "\n"
          << update << std::endl;
QueryExecutionContext qec(index_, &cache_, allocator_,
                          sortPerformanceEstimator_, std::ref(messageSender),
                          pinSubtrees, pinResult);
auto plannedQuery = co_await setupPlannedQuery(
    params, update, qec, cancellationHandle, timeLimit, requestTimer);
auto qet = plannedQuery.queryExecutionTree_;
Member

This code is identically duplicated between the query and the update case (the only difference is "UPDATE" vs. "QUERY" in the log message).
Just make it a function that returns unique_ptr<SomeStructThatcontainsStuffThatWeNeedToAccessAndKeepAlive> and does all this common setup of message senders, cancellation handles, qets, and whatNotElse.
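
One possible shape for such a helper, as a minimal sketch only. Every name below (`MessageSender`, `CancellationHandle`, `PlannedQuery`, `OperationSetup`, `OperationKind`, `setupOperation`) is a hypothetical stand-in and not QLever's actual type or signature; the point is merely that one struct owns everything that has to stay alive, and that the query and update paths differ only in the log label.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-ins for the real server types; only the shape matters.
struct MessageSender {
  std::string queryId;
};
struct CancellationHandle {
  int timeLimitSeconds;
};
struct PlannedQuery {
  std::string text;
};

// Everything that must stay alive for the whole operation lives in one struct.
struct OperationSetup {
  MessageSender messageSender;
  std::shared_ptr<CancellationHandle> cancellationHandle;
  PlannedQuery plannedQuery;
  std::string logLine;
};

enum class OperationKind { Query, Update };

// Shared setup for queries and updates; the kind only changes the log message.
std::unique_ptr<OperationSetup> setupOperation(OperationKind kind,
                                               std::string operation,
                                               int timeLimitSeconds) {
  assert(timeLimitSeconds > 0);
  auto setup = std::make_unique<OperationSetup>();
  setup->messageSender = MessageSender{"id-for-" + operation};
  setup->cancellationHandle = std::make_shared<CancellationHandle>(
      CancellationHandle{timeLimitSeconds});
  const char* label = kind == OperationKind::Update ? "update" : "query";
  setup->logLine =
      std::string("Processing the following SPARQL ") + label + ": " + operation;
  // Move the text last, after all uses of `operation` above.
  setup->plannedQuery = PlannedQuery{std::move(operation)};
  return setup;
}
```

Returning a `unique_ptr` keeps the addresses of the members stable, which matters if later objects hold references to them (as the `std::ref(messageSender)` in the snippet above suggests).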

Member Author

I am struggling with the type of the cancelTimeoutOnDestruction lambda. Can you help with that, @joka921? If skip the first two statements.

What I tried so far:

  • std::function - lambda is not copy constructible, because it captures
  • std::move_only_function - is C++23
  • function pointer - lambda captures
  • auto in the struct - lambda captures
  • obtaining the type through template magic was also difficult because the function is a private member function in the same class
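
For illustration, one way around the options listed above is a small hand-written, move-only type-erased wrapper that works before C++23. This is a sketch under assumptions: `MoveOnlyCallback`, `CleanupGuard`, and `runDemo` are hypothetical names, not QLever code, and the lambda is assumed to be non-copyable because it captures a move-only object.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical helper: a minimal move-only, type-erased `void()` callable.
class MoveOnlyCallback {
  struct Base {
    virtual ~Base() = default;
    virtual void call() = 0;
  };
  template <typename F>
  struct Impl final : Base {
    F f_;
    explicit Impl(F f) : f_(std::move(f)) {}
    void call() override { f_(); }
  };
  std::unique_ptr<Base> impl_;

 public:
  MoveOnlyCallback() = default;
  template <typename F>
  explicit MoveOnlyCallback(F f)
      : impl_(std::make_unique<Impl<F>>(std::move(f))) {}
  MoveOnlyCallback(MoveOnlyCallback&&) noexcept = default;
  MoveOnlyCallback& operator=(MoveOnlyCallback&&) noexcept = default;
  explicit operator bool() const { return impl_ != nullptr; }
  void operator()() {
    assert(impl_);
    impl_->call();
  }
};

// Runs the stored callback on destruction; a moved-from guard does nothing.
class CleanupGuard {
  MoveOnlyCallback onDestruction_;

 public:
  explicit CleanupGuard(MoveOnlyCallback cb) : onDestruction_(std::move(cb)) {}
  CleanupGuard(CleanupGuard&&) noexcept = default;
  ~CleanupGuard() {
    if (onDestruction_) onDestruction_();
  }
};

// Stores a lambda with a move-only capture in a struct with a nameable type.
int runDemo() {
  int counter = 0;
  {
    auto token = std::make_unique<int>(41);  // move-only capture
    CleanupGuard guard{MoveOnlyCallback{
        [&counter, token = std::move(token)]() { counter = *token + 1; }}};
    CleanupGuard moved{std::move(guard)};  // only `moved` fires at scope exit
  }
  return counter;
}
```

The cost is one heap allocation and one virtual call per callback, which is usually negligible for a per-request cleanup action.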

Hannah Bast added 2 commits November 8, 2024 19:13
TODO: This does not yet work for concurrent update requests (which need
to be serialized via a lock).

But good enough for playing around already.
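
A minimal sketch of what serializing updates via a lock could look like. `SerializedUpdates` and `applyUpdate` are hypothetical names, and the vector of strings is only a stand-in for the real delta-triples state.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical wrapper: all modifications go through one mutex, so two
// update requests can never interleave their changes.
class SerializedUpdates {
  std::mutex mutex_;
  std::vector<std::string> triples_;  // stand-in for the delta triples

 public:
  void applyUpdate(
      const std::function<void(std::vector<std::string>&)>& update) {
    assert(update);  // an empty std::function would throw when called
    std::lock_guard<std::mutex> lock(mutex_);
    update(triples_);
  }

  std::size_t numTriples() {
    std::lock_guard<std::mutex> lock(mutex_);
    return triples_.size();
  }
};
```

Because the state is private and reachable only through `applyUpdate`, callers cannot forget to take the lock.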
@hannahbast
Member

@Qup42 I am already playing around with this (in conjunction with #1597) and it works like a charm. I have one request:

Currently, the API call simply returns a string "Update successful". It would be great if for Accept: application/qlever-results+json, it would return a JSON with (for now) the following fields:

  1. A field time with the following subfields, all values in milliseconds: parse (the time for parsing the query), where (the time for evaluating the WHERE clause, or zero if there is none), update (the time for adding the triples to the delta triples and locating them in the blocks), and total (the total time for the update operation).
  2. A field delta-triples with the following subfields: before (the number of delta triples before the update operation), after (the number of delta triples after the update operation), and difference (the value of after minus the value of before).
  3. A field located-triples with the following subfields: pos with two subfields blocks-total (the total number of blocks in that permutation) and blocks-affected (the number of blocks with at least one located triple), and the same for every other permutation.
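
To make the requested layout concrete, here is a sketch that serializes the three fields by hand. The JSON keys come from the list above; the struct and function names (`UpdateTimes`, `PermutationBlocks`, `updateMetadataToJson`) are hypothetical, and the code assumes permutation names need no JSON escaping.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical containers for the requested metadata (times in milliseconds).
struct UpdateTimes {
  std::int64_t parseMs, whereMs, updateMs, totalMs;
};
struct PermutationBlocks {
  std::string name;  // e.g. "pos"; assumed to need no escaping
  std::int64_t blocksTotal, blocksAffected;
};

std::string updateMetadataToJson(const UpdateTimes& t, std::int64_t before,
                                 std::int64_t after,
                                 const std::vector<PermutationBlocks>& perms) {
  assert(before >= 0 && after >= 0);
  std::ostringstream os;
  os << "{\"time\":{\"parse\":" << t.parseMs << ",\"where\":" << t.whereMs
     << ",\"update\":" << t.updateMs << ",\"total\":" << t.totalMs << "},"
     << "\"delta-triples\":{\"before\":" << before << ",\"after\":" << after
     << ",\"difference\":" << (after - before) << "},"
     << "\"located-triples\":{";
  for (std::size_t i = 0; i < perms.size(); ++i) {
    if (i > 0) os << ",";
    os << "\"" << perms[i].name
       << "\":{\"blocks-total\":" << perms[i].blocksTotal
       << ",\"blocks-affected\":" << perms[i].blocksAffected << "}";
  }
  os << "}}";
  return os.str();
}
```

A real implementation would more likely use the JSON library already in the code base; the manual string building here only keeps the sketch self-contained.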

I am assuming that all this information is available relatively easily. If not, please let me know. Please also let me know if you think that this belongs in a separate PR.

And somewhat related: In my original (by now, ancient) draft of this, there was a command to clear all delta triples. For example, it could be activated via cmd=clear-delta-triples (it should require the access-token, of course). Would that be easy to add here or should that (also) be in a separate PR?
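
A sketch of how such a command could be gated, assuming the request parameters arrive as a string map. The value `clear-delta-triples` for `cmd` is taken from the comment above; the parameter name `access-token`, the `CommandResult` enum, and `handleCommand` are assumptions made for illustration.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

enum class CommandResult { Cleared, Unauthorized, UnknownCommand };

// Hypothetical handler: clears the delta triples only if the token matches.
CommandResult handleCommand(const std::map<std::string, std::string>& params,
                            const std::string& expectedToken,
                            std::vector<std::string>& deltaTriples) {
  assert(!expectedToken.empty());  // sketch assumes a token is configured
  auto cmd = params.find("cmd");
  if (cmd == params.end() || cmd->second != "clear-delta-triples") {
    return CommandResult::UnknownCommand;
  }
  auto token = params.find("access-token");
  if (token == params.end() || token->second != expectedToken) {
    return CommandResult::Unauthorized;
  }
  deltaTriples.clear();
  return CommandResult::Cleared;
}
```

The token check comes before any state is touched, so an unauthorized request leaves the delta triples unchanged.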

@hannahbast hannahbast marked this pull request as ready for review November 10, 2024 18:50
@hannahbast hannahbast changed the title Enable processing of Updates in Server Use DeltaTriplesManager to process update request Nov 10, 2024
codecov bot commented Nov 10, 2024

Codecov Report

Attention: Patch coverage is 7.69231% with 108 lines in your changes missing coverage. Please review.

Project coverage is 89.21%. Comparing base (1bcfeeb) to head (42f3846).
Report is 8 commits behind head on master.

Files with missing lines Patch % Lines
src/engine/Server.cpp 6.95% 107 Missing ⚠️
src/index/DeltaTriples.cpp 0.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1608      +/-   ##
==========================================
- Coverage   89.21%   89.21%   -0.01%     
==========================================
  Files         372      374       +2     
  Lines       34723    35387     +664     
  Branches     3915     3992      +77     
==========================================
+ Hits        30979    31569     +590     
- Misses       2471     2529      +58     
- Partials     1273     1289      +16     

@Qup42
Member Author

Qup42 commented Nov 11, 2024

@hannahbast adding some timing information and further metadata to the update response is already on my ToDo list. Points 1 (time) and 2 (delta-triples) should be no problem. I would extend the delta-triples field by the number of added/removed fields, as these are not knowable otherwise. For located-triples I can't estimate how hard this is.

My current plan was to do the additional info in a separate PR. The PR activating the updates in the server was supposed to be small, so that it can be merged quickly. Before adding this additional information I wanted to have a look at what other engines do here. Running updates requires some changes in qlever-ui, so there was little use in having it earlier at that point.
But this can of course change depending on, e.g., how long it will take to merge #1597 and how much value the additional info would provide for you while benchmarking.

Clearing the updates is just very little code, I'll add that to this PR.

Member

@joka921 joka921 left a comment

The general setup is fine, but there are some things to consider concerning thread safety and deadlock prevention.

src/index/DeltaTriples.cpp Outdated Show resolved Hide resolved
src/engine/Server.cpp Outdated Show resolved Hide resolved
src/engine/Server.cpp Outdated Show resolved Hide resolved
src/engine/Server.cpp Show resolved Hide resolved
Member

@joka921 joka921 left a comment

Two very minor changes.

src/engine/Server.cpp Show resolved Hide resolved
src/engine/Server.cpp Outdated Show resolved Hide resolved
@joka921 joka921 merged commit 97d5037 into ad-freiburg:master Nov 18, 2024
20 of 22 checks passed
@Qup42 Qup42 deleted the updateLastStep branch November 19, 2024 16:36
3 participants