Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement Syncer.Synced() #9

Merged
merged 1 commit into from
Jan 30, 2024
Merged

Conversation

mike76-dev
Copy link
Contributor

I need this method, otherwise I have to implement my own Syncer. Later on, it might prove helpful when porting hostd and renterd, too.
It has been working fine in my tests so far. If needed, minSyncedPeers and minTimeUntilSynceed can be added to the config options.

@lukechampine
Copy link
Member

I confess that I have always found Synced() to be kind of a gross hack. It conflates multiple questions:

  • "Have I performed IBD with a reasonable number of peers?"
  • "Does my blockchain tip match the 'real' tip, i.e. the tip I would see on a public explorer?"
  • "Is my wallet balance and transaction history up-to-date?"
  • "If I broadcast a transaction, will my peers accept it?"

Now, it is usually the case that the answer to all of these questions is the same (whether that answer is yes or no). But it is not guaranteed. Maybe you are getting eclipse-attacked. Maybe the public explorer got hacked. Maybe your wallet hasn't had any activity for hundreds of thousands of blocks. Maybe your transaction will be valid despite being a little outdated.

My understanding is that generally, apps want a Synced bool so that they can a) inform the user that their wallet balance is not final, and b) prevent the user from trying sending to send transactions too soon. I suggest handling these cases separately. Instead of "synced/not synced," tell the user that their balance is up-to-date as of block X at timestamp T. And instead of waiting until "synced" to allow transactions, allow them as soon as the last known block is within 1 day. Separately, you can add an (optional) setting to compare the local tip against a public explorer, and display a checkmark or something when it matches.

I realize I'm probably in the minority here, so perhaps @n8maninger or @ChrisSchinnerl could weigh in with their thoughts. Anyway, I do appreciate you taking the time to understand and modify the Syncer code. The implementation itself looks fine to me.

@ChrisSchinnerl
Copy link
Member

I think having a Synced method is very useful and considering that coreutils is implementing the Syncer which has the most information on what other peers currently consider synced it is probably also the place to put it.

That said I think waiting for at least 10 minutes before considering us synced breaks many use-cases. e.g. the Autopilot in renterd relies on Synced and having to wait for 10 minutes every time you start the autopilot ruins UX.

I think a reasonable assumption would be to check 2 things.

  1. do we have a minimum of n peers. e.g. 6
  2. have all of those peers reported that they don't have anymore blocks for us
  3. has the latest block arrived within the last hour

That's probably the best any node in a trustless, distributed environment can do anyway without a trusted entity which we don't want to rely on for the security of our daemons.
We can only hope to make the Syncer as resilient to various attacks as possible via our implementation.

@mike76-dev
Copy link
Contributor Author

mike76-dev commented Jan 22, 2024

The method returns true in two cases:

  • if minimum 5 peers say that we are synced
    OR
  • if 10 minutes have passed since we have started syncing.

In my tests on Zen, I usually gain the minimum peer count within less than a minute. On Anagami, though, I have to wait 10 minutes, because I usually don't get to connect to 5 peers there.

Regarding 3), it's possible to have the last block mined more than 1 hour ago (especially on Anagami), in that case we're not synced anymore.

@n8maninger
Copy link
Member

n8maninger commented Jan 22, 2024

I don't see a reason not to add it. I would suggest a few changes, though:

  1. Make minSyncedPeers configurable as WithMinPeers
  2. Remove minTimeUntilSynced. Not enough peers is a valid reason to consider a node unsynced. Five peers is a low bar to clear, even for Anagami.
  3. Check if the last block timestamp is less than 24 hours ago. If the node hasn't received a recent block from a peer, we're not synced, the system clock is wrong, or the whole network is experiencing issues.

Anagami is a special (temporary) case; nothing should be built with Anagami in mind.

@lukechampine
Copy link
Member

the Autopilot in renterd relies on Synced

@ChrisSchinnerl If your Synced heuristic was "the last known block is <24 hrs old," would that break anything? I can see it causing mild issues, like transactions failing. You might also calculate contract end heights that are a bit shorter than they should be. Anything else?

Separately: what if instead of adding Synced method to Syncer, we added it to gateway.Peer? Then clients can implement whatever heuristic they want.

@ChrisSchinnerl
Copy link
Member

renterd currently operates under the same heuristics that siad provides. Which I think is that peers don't have any blocks for us? We want it to be as accurate as possible when interacting with hosts because once we consider ourselves synced, the renter will only grant +-12 blocks or so as wiggle room to hosts when trying to agree on a block height.
If we considered ourselves synced with the last block being 24 hours old we'd have to increase that threshold with hosts as well since they might be 144 blocks ahead of us when we consider ourselves synced. So I'd rather have it be smaller than that.

Exposing all that info in a different way would be fine as well imo. Maybe even better since we can tweak it ourselves and even have users make their own choices.

@mike76-dev
Copy link
Contributor Author

mike76-dev commented Jan 22, 2024

For the record: this is the logic that siad uses:

// The consensus set will not recognize IBD as complete until it has enough
// peers. After the deadline though, it will recognize the blockchain
// download as complete even with only one peer. This deadline is helpful
// to local-net setups, where a machine will frequently only have one peer
// (and that peer will be another machine on the same local network, but
// within the local network at least one peer is connected to the braod
// network).```

@n8maninger
Copy link
Member

renterd currently operates under the same heuristics that siad provides.

IIRC, renterd switched over to the same heuristic hostd uses: last block timestamp < 1 hour

@lukechampine
Copy link
Member

the renter will only grant +-12 blocks or so as wiggle room to hosts when trying to agree on a block height

I wonder if this is something we could address in RHPv4. Like, if the host thinks the renter is behind, it could send a header chain proving it.

Anyway, I think some combination of:

  • Checking last block timestamp
  • Checking the Synced flag on individual peers
  • Comparing tip against public explorer

should be sufficient for anyone's needs. Unless there are further objections, I suggest we add the Synced flag to gateway.Peer, like so:

type Peer struct {
	UniqueID UniqueID
	Version  string
	Addr     string
	ConnAddr string
	Inbound  bool
	smux     *smux.Session // for v1
	mux      *mux.Mux      // for v2
	mu       sync.Mutex
	synced   bool
	err      error
}

// Synced returns the peer's sync status.
func (p *Peer) Synced() bool {
	p.mu.Lock()
	defer p.mu.Unlock()
	return p.synced
}

// SetSynced sets the peer's sync status.
func (p *Peer) SetSynced(synced bool) {
	p.mu.Lock()
	defer p.mu.Unlock()
	p.synced = synced
}

and call SetSynced in the Syncer code where appropriate. (This would also let us remove the synced field from Syncer.) @mike76-dev, does that sound ok to you?

@n8maninger
Copy link
Member

n8maninger commented Jan 22, 2024

I see no reason to export SetSynced. The only package with enough information to set that is the Syncer.

@lukechampine
Copy link
Member

True, but gateway.Peer is not in the syncer package 😅

@mike76-dev
Copy link
Contributor Author

@mike76-dev, does that sound ok to you?

Yeah, that should be fine. I can take over the PRs, if nobody minds.

@n8maninger
Copy link
Member

n8maninger commented Jan 22, 2024

@mike76-dev, does that sound ok to you?

Yeah, that should be fine. I can take over the PRs, if nobody minds.

We will be moving Peer and the RPC implementations out of core and into coreutils to be more consistent with the rhp package. Hold off on any other changes until after

@lukechampine
Copy link
Member

@mike76-dev now that #10 is merged, feel free to proceed

@n8maninger
Copy link
Member

n8maninger commented Jan 26, 2024

Can you rebase so we get a clean history?

syncer/syncer.go Outdated Show resolved Hide resolved
@mike76-dev mike76-dev reopened this Jan 26, 2024
@lukechampine lukechampine merged commit 5f8451a into SiaFoundation:master Jan 30, 2024
12 checks passed
@lukechampine
Copy link
Member

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants