-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: Do not eat bitmap backfill errors #871
Conversation
block-streamer/src/block_stream.rs
Outdated
@@ -269,7 +269,7 @@ async fn process_bitmap_indexer_blocks( | |||
|
|||
let mut last_published_block_height: u64 = start_block_height; | |||
|
|||
while let Some(Ok(block_height)) = matching_block_heights.next().await { | |||
while let Some(block_height) = matching_block_heights.next().await.transpose()? { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The thing here is we could also trace the error and then proceed to Lake. But I felt this was a situation that deserves a fail fast given that the stored Last Processed Block
will allow for graceful restarts and errors in the backfill process should be fixed immediately.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In what case do we get an error? Can we handle it internally with retries?
I'm on the fence about this. We could log it, monitor it and perhaps set up an alert, and then fix, all while not causing disruptions to users?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed in team meeting to go with a resilient approach where possible. I've changed the code to instead log an error and then proceed to lake.
|
||
last_published_block_height = block_height; | ||
} | ||
Err(err) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should break
right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yep we should do so explicitly. If an error happens, the stream just ends anyway, but a break is more explicit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I was under the assumption that the error handling here would cause it to continue processing.
With the previous syntax it would continue because you are matching against Ok()
, so if it was an Err()
the loop condition wasn't met.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I believe in the new case, the next() would complete as the stream terminates due to the thrown error and thus cannot yield anymore.
If the stream yielding block heights from the bitmap indexer fails, its error will be eaten by the while loop as the loop will simply end if any error takes place. This PR ensures that if the result from the stream is an error, it logs the error before proceeding with Lake.