core: validate chain before writing block to avoid race condition #1319
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Description
This PR is to avoid a race condition explained below.
Any incoming chain is checked against the local whitelisted milestone/checkpoint entries during the initial validation. We don't perform any extra checks once the blocks are executed locally. Hence, there is a possibility of receiving a milestone until the time we received the chain to import till the time we execute and write the blocks to db.
E.g. if the last milestone received was 100. We imported a chain from 101 to 120 which is an honest fork. A milestone is being generated from 101 to 116 (can differ also) but it hasn't yet reached bor. For bor, the last milestone is still 100 but 116 will be correct and whitelisted soon (from heimdall). Meanwhile, we receive a side chain (which seems to be incorrect) from 101 to 120. Because bor still doesn't know if milestone with 116 end block is finalised so it can't comment on the validity of that chain. It will try to import it normally and will pass the initial validation. Now, while the blocks of this side chain are executing, bor imports the 101-116 milestone (this is possible because canonical is still the correct chain we received initially). Now, the latest milestone is 116 (corresponding to correct fork) but our local head has the incorrect side chain and we can't know it until we receive the next milestone (even that's tricky because that will only cause a rewind till 116). Hence, the finality API will fallback to using checkpoint for returning the latest finalised block (because the block hash of whitelisted milestone and the local chain doesn't match). For an end user, this may feel like a finality violation but it's just the nature of occurrence of events which is causing it to use a fallback value (i.e. checkpoint).
Changes
Breaking changes
Please complete this section if any breaking changes have been made, otherwise delete it
Nodes audience
In case this PR includes changes that must be applied only to a subset of nodes, please specify how you handled it (e.g. by adding a flag with a default value...)
Checklist
Cross repository changes
Testing
Manual tests
Please complete this section with the steps you performed if you ran manual tests for this functionality, otherwise delete it
Additional comments
Please post additional comments in this section if you have them, otherwise delete it