das: DASing of past headers #473
Conversation
Left some nits. Overall looks very good.
I think I will address the first TODO in a separate issue.
Generally looks good, but there are two issues:
- The stopping logic will panic if there were two or more catch-ups (see the sketch after this list)
- A few changes still need to be made before share: Cache Availability #180 becomes helpful
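A minimal illustration of how this class of panic can happen, assuming the stop signal is a shared channel that each catch-up closes when it stops; the real cause in the DASer may well be different:

```go
// Hypothetical Go illustration only, not the actual DASer code: if each
// catch-up routine signals its stop by closing a shared channel, the second
// close panics with "close of closed channel".
package main

func main() {
	done := make(chan struct{})

	stopCatchUp := func() {
		close(done)
	}

	stopCatchUp() // first catch-up stops fine
	stopCatchUp() // second catch-up: panic: close of closed channel
}
```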
Ok, so the last two issues are fixed. Here is another review with suggestions, thoughts, and one bug catch.
Also, I think I found a drawback in the solution for the last TODO. Consider a common case:
- A node is started and needs/starts a catch-up
- It finishes catching up while the `sample` routine is, and has been, going for thousands of heights
- The node is stopped along with the DASer, and the very old checkpoint is saved
- On startup, the DASer will load this old checkpoint and start DASing from it

You may think that #180 will fix this, and that is true to some extent, but #180 is about the new ShareAvailable implementation, which stores the fact of availability on disk. Checking it requires an IO operation per height, so over long height ranges this can waste a lot of IO. I think we can justify that cost for the gap case described originally, but for the common case above it is not acceptable IMO. Also, the logs would be misleading, as they would report catching up over headers that were already sampled.
The issue can be fixed with a check: if no catchUp routines were terminated (meaning we fully caught up on everything), then store the checkpoint from `sample`, like it was before, rather than the one from the `catchUpScheduler`.
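A rough sketch of that check, with made-up field names (`terminatedCatchUps`, `sampleCheckpoint`, `schedulerCheckpoint`) standing in for whatever state the DASer actually tracks:

```go
package das

// Hypothetical state for illustration only; the real DASer keeps its
// progress differently and these field names are assumptions.
type checkpointState struct {
	terminatedCatchUps  int    // catch-up routines stopped before finishing
	sampleCheckpoint    uint64 // last height sampled by the sample routine
	schedulerCheckpoint uint64 // checkpoint tracked by the catchUpScheduler
}

// checkpointToStore picks which checkpoint to persist on shutdown.
func (s checkpointState) checkpointToStore() uint64 {
	if s.terminatedCatchUps == 0 {
		// every catch-up finished: the sample routine's checkpoint is the
		// true high-water mark, so store it like before
		return s.sampleCheckpoint
	}
	// a catch-up was cut short: keep the older scheduler checkpoint so the
	// unsampled gap is revisited on the next start
	return s.schedulerCheckpoint
}
```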
CI won't run with merge conflicts IIRC, so could you fix the conflict as you push?
@renaynay, did you get a chance to run DASer manually?
No more serious issues that I am aware of; only nits left. Almost there.
What is with the 2 TODOs mentioned in the opening comment?
I find the naming confusing but the docs clarify it. So LGTM
This PR implements DASing of past headers in a single routine.
Eventually, this will be parallelised.
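For context, a minimal sketch of what "DASing past headers in a single routine" amounts to; the `getter` and `availability` interfaces and the `catchUp` function here are assumptions for illustration, not the actual celestia-node API:

```go
package das

import "context"

// Hypothetical interfaces standing in for the header getter and the share
// availability checker; names and signatures are assumptions.
type getter interface {
	GetByHeight(ctx context.Context, height uint64) (interface{}, error)
}

type availability interface {
	SharesAvailable(ctx context.Context, header interface{}) error
}

// catchUp samples every past header from checkpoint+1 up to head in one
// routine and returns the last height it sampled successfully.
func catchUp(ctx context.Context, g getter, avail availability, checkpoint, head uint64) (uint64, error) {
	sampled := checkpoint
	for height := checkpoint + 1; height <= head; height++ {
		h, err := g.GetByHeight(ctx, height)
		if err != nil {
			return sampled, err
		}
		if err := avail.SharesAvailable(ctx, h); err != nil {
			return sampled, err
		}
		sampled = height
	}
	return sampled, nil
}
```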
TODO
- DASState

Resolves #181.