"cold" exit node startup and missing cfg ref #28

Open
sarnold opened this issue Apr 22, 2020 · 3 comments
Labels: watch item (hard to reproduce or rarely seen)

Comments

sarnold commented Apr 22, 2020

Here "cold" startup means the node has not (recently) orbited the root; so far this issue has not been seen when a node restarts after missing the initial cfg ref. When it happens, the root never gets the cfg msg from the ctlr node because the ctlr node never sees the node msg, so the staging queue is never filled and there is no bootstrap (i.e., no networks are created or configured).

  • mbr node: gets the first msg ref, but fails to get the cfg msg (so keeps trying)
  • root node: sees the cfg request but never responds
  • ctlr node: never sees the pub msg

sarnold added the bug (Something isn't working) label Apr 22, 2020
sarnold self-assigned this Apr 22, 2020

sarnold commented Apr 23, 2020

The "timing" issue seems to be mitigated by some refactoring of the announce message queue handling (we now allow duplicate nodes, but still only if the node ID and msg ID match).
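
As a rough illustration of the duplicate rule described above, here is a minimal sketch. It is one possible reading, not the code from the project: the `Announce` tuple, its field names, and `enqueue_announce` are all hypothetical, and the assumption is that a repeat of an already-queued node is accepted only when its msg ID matches a queued entry for that node.

```python
from collections import deque
from typing import Deque, NamedTuple


class Announce(NamedTuple):
    # Hypothetical field names; the real message layout is not shown in this issue.
    node_id: str
    msg_id: str


def enqueue_announce(queue: Deque[Announce], msg: Announce) -> bool:
    """Queue msg unless it repeats a queued node with a different msg ID.

    Returns True if the message was queued, False if it was dropped.
    """
    same_node = [m for m in queue if m.node_id == msg.node_id]
    if same_node and all(m.msg_id != msg.msg_id for m in same_node):
        # Assumed rule: same node but a non-matching msg ID is rejected.
        return False
    # Either a node not yet queued, or an exact (node ID, msg ID) repeat.
    queue.append(msg)
    return True
```

Under this reading, a member node that re-announces with the same msg ID gets a second entry in the queue instead of being silently discarded, which would give the ctlr node another chance to see it.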

sarnold added the watch item (hard to reproduce or rarely seen) label and removed the bug (Something isn't working) label Apr 23, 2020

sarnold commented Apr 23, 2020

Commit f594a44 has all the mitigation stuff for now; updates will be added here if the issue is seen again.


sarnold commented Jun 10, 2020

* Where "mitigation stuff" includes the hold queue and re-insertion of the node ID into the reg queue if the hold limit is exceeded.
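
The hold queue behaviour can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from commit f594a44: `HOLD_LIMIT`, `check_held_node`, the per-node counter, and the idea that each pass checks whether the cfg is ready are all hypothetical.

```python
from collections import Counter, deque
from typing import Deque

HOLD_LIMIT = 3  # hypothetical value; the issue does not state the real limit


def check_held_node(
    node_id: str,
    cfg_ready: bool,
    hold_q: Deque[str],
    hold_count: Counter,
    reg_q: Deque[str],
) -> str:
    """Process one pass over a node that is already in the hold queue."""
    if cfg_ready:
        # Node got its cfg; it no longer needs to be held.
        hold_q.remove(node_id)
        hold_count.pop(node_id, None)
        return "configured"
    hold_count[node_id] += 1
    if hold_count[node_id] > HOLD_LIMIT:
        # Held too long: drop from hold and re-insert into the reg queue
        # so registration is attempted again from the start.
        hold_q.remove(node_id)
        hold_count.pop(node_id, None)
        reg_q.append(node_id)
        return "requeued"
    return "held"
```

With `HOLD_LIMIT = 3`, a node whose cfg never arrives is reported as "held" on the first three passes and "requeued" on the fourth, at which point its ID is back in the reg queue.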
