Implementation of atomic operation regions #328
    /// compaction.
    EndAtomicRegion {
        timestamp: Timestamp,
        begin_index: u64,
👍
// But this is not enough, because if the retried transactional block succeeds,
// and later we replay it, we need to skip the first attempt and only replay the second.
// So we add a Jump entry to the oplog that registers a deleted region.
Could you explain this part? Not getting it properly
The oplog is append only. So let's imagine that on first run, we fail in the middle of an atomic operation:
x = arbitrary oplog entry
B = atomic region start
F = failure
E = atomic region end
x x x B x x F
Now we restart the worker and replay the oplog. When reaching B, we jump to the end and start writing new entries (now y) but fail again:
x x x B x x F y y F
We restart it again, and when we replay we do the same, writing z entries. This time it does not fail within the block, but fails later:
x x x B x x F y y F z z z z E z z F
Then we restart it again, and when we reach B we find the corresponding E, so we know the region was committed and we don't need to rerun it.
This means that we need to act as if only z z z z is between B and E, otherwise we go out of sync (because the WASM code always runs the same way, and whenever it tries to perform a side effect, we read the next oplog entry).
So we have to skip the entries from the first x after B to the last F in the region, and we can use the same mechanism here as we use for the "time travel jumps".
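To make the skipping concrete, here is a minimal sketch of replaying an oplog with deleted regions. The `Entry` type, the `Jump { deleted }` shape, and the exact position where each jump entry is written are assumptions made for illustration; they are not the types or the layout used in this PR.

```rust
use std::ops::RangeInclusive;

/// Hypothetical, simplified oplog entry (not the PR's actual type).
#[derive(Debug, Clone, PartialEq)]
enum Entry {
    /// A recorded side effect (the x, y, z entries above).
    Effect(&'static str),
    /// B: start of an atomic region.
    BeginAtomicRegion,
    /// E: end of the atomic region that began at `begin_index`.
    EndAtomicRegion { begin_index: usize },
    /// Registers a range of earlier entries as deleted (an abandoned attempt).
    Jump { deleted: RangeInclusive<usize> },
}

/// Returns the side effects a replay would observe, skipping deleted regions.
fn replayed_effects(oplog: &[Entry]) -> Vec<&'static str> {
    // Collect every deleted region up front, because a jump entry is
    // appended after the entries it deletes (the oplog is append only).
    let deleted: Vec<RangeInclusive<usize>> = oplog
        .iter()
        .filter_map(|e| match e {
            Entry::Jump { deleted } => Some(deleted.clone()),
            _ => None,
        })
        .collect();

    oplog
        .iter()
        .enumerate()
        // Drop every entry whose index falls inside a deleted region.
        .filter(|(idx, _)| !deleted.iter().any(|r| r.contains(idx)))
        // Keep only side effects; markers and jumps are not replayed as effects.
        .filter_map(|(_, e)| match e {
            Entry::Effect(name) => Some(*name),
            _ => None,
        })
        .collect()
}

/// The scenario above, assuming a jump entry is appended on each restart
/// that finds B without a matching E.
fn example_oplog() -> Vec<Entry> {
    use Entry::*;
    vec![
        Effect("x0"), Effect("x1"), Effect("x2"), // 0..=2
        BeginAtomicRegion,                        // 3: B
        Effect("x3"), Effect("x4"),               // 4..=5: first attempt, then failure
        Jump { deleted: 4..=5 },                  // 6: first restart
        Effect("y0"), Effect("y1"),               // 7..=8: second attempt, then failure
        Jump { deleted: 4..=8 },                  // 9: second restart
        Effect("z0"), Effect("z1"), Effect("z2"), Effect("z3"), // 10..=13
        EndAtomicRegion { begin_index: 3 },       // 14: E
        Effect("z4"), Effect("z5"),               // 15..=16
    ]
}
```

With this layout, `replayed_effects(&example_oplog())` yields the three leading x entries followed by the six z entries, so the replay sees only the committed attempt between B and E.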
let begin = mark_begin_operation();

remote_side_effect("1"); // repeated 3x
repeated 3x here means?
There is a failure between side effects "2" and "3", which is controlled from the outside so that it fails the first two times and succeeds the third time. Because this side effect is between the begin/end operation markers, it is performed three times, but no more, because after the third attempt the region is committed.
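The count of three can be checked with a small simulation. This is only a sketch of the retry behaviour described in the comment above: `run_atomic_region` and its `fail_first_n` parameter are hypothetical names, and the loop stands in for the worker being restarted and re-entering an uncommitted region.

```rust
/// Simulates an atomic region that performs side effects "1", "2", "3",
/// with an externally injected failure between "2" and "3" on the first
/// `fail_first_n` attempts. Returns every side effect actually performed.
fn run_atomic_region(fail_first_n: u32) -> Vec<&'static str> {
    let mut performed = Vec::new();
    let mut attempt = 0;
    loop {
        attempt += 1;
        // The region is not committed yet, so each attempt reruns it from the start.
        performed.push("1");
        performed.push("2");
        if attempt <= fail_first_n {
            // Injected failure: the worker restarts and retries the region.
            continue;
        }
        performed.push("3");
        // Reaching the end marker commits the region; no further reruns.
        break;
    }
    performed
}
```

Calling `run_atomic_region(2)` performs side effect "1" three times and side effect "3" once, matching the "repeated 3x" annotation.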
Resolves #162
Contains some refactorings related to #321 (it was confusing what goes into the DurableWorkerCtx and what goes into the .private_state level, so I removed some unnecessary duplications and added a macro for looking up the next oplog entry instead of duplicating the same loop for each type).