
Validator epochs/ZIP25 #1298

Open
wants to merge 14 commits into
base: main
Conversation

Contributor
@theo-zil theo-zil commented Aug 14, 2024

Preliminary draft of the new validator state machine, implementing epochs.

  • The validator committee no longer changes except at epoch boundary blocks.
  • Joining or leaving the committee is now subject to a waiting queue of at least 1 epoch (and currently at most 2). Thus, during any given epoch, the validator set for the next epoch is mostly known in advance (see Staking: fix total_stake change on slashing #1338).
  • Epoch changes are handled by a tick_epoch() function in the deposit contract.
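The epoch arithmetic implied by the rules above can be sketched as follows. This is an illustrative Rust sketch: `EPOCH_LENGTH`, its value, and the helper names are assumptions, not the actual names or parameters used in the PR.

```rust
// Hypothetical constants and helpers illustrating the epoch rules above.
const EPOCH_LENGTH: u64 = 100; // blocks per epoch (illustrative value)

fn epoch_of(block: u64) -> u64 {
    block / EPOCH_LENGTH
}

fn block_is_first_in_epoch(block: u64) -> bool {
    block % EPOCH_LENGTH == 0
}

// A deposit made during epoch `e` waits out the rest of `e` plus all of
// `e + 1`, activating at epoch `e + 2`. This is why the committee for the
// next epoch is already (mostly) known during the current one.
fn activation_epoch(deposit_block: u64) -> u64 {
    epoch_of(deposit_block) + 2
}
```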

An important consequence of all this is that our consensus now depends on an EVM call succeeding; see the inline comment in the code below.

Further work out of scope of this PR:

Comment on lines 1056 to 1064
if self.block_is_first_in_epoch(parent.number() + 1) {
    // Here we ignore any errors and proceed with block creation, under the philosophy that
    // it is better to have the network run but with broken epochs, than it would be to
    // crash the entire network on a failing epoch transition.
    match state.tick_epoch() {
        Ok(()) => (),
        Err(e) => error!("Unable to transition the epoch - EVM error: {e}"),
    };
}
Contributor Author

Our consensus now depends on successful EVM calls - and not just read-only checks as before, but a complex write transaction.

This raises the question of what to do if the tick_epoch() call fails for any reason. Should we assume that it must never fail, and crash the node if it does (potentially crashing the network if a buggy state is reached)? The approach taken here is to soft-ignore errors; the network then remains alive, but with a static validator set, since epoch changes will be broken. Measures can then be taken to fix the root cause (e.g. by upgrading the deposit contract) and restore normal functionality.

But, especially in the current early testnet phases, hard-crashing the network on any bug might be preferable, so that issues are discovered immediately.
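The two policies being weighed here can be sketched side by side. The flag and function below are hypothetical, purely to make the trade-off concrete; they are not part of the PR.

```rust
// Hypothetical switch between the two failure policies discussed above.
const CRASH_ON_EPOCH_FAILURE: bool = false;

// Returns true if block production should proceed.
fn proceed_after_tick(result: Result<(), String>) -> bool {
    match result {
        Ok(()) => true,
        Err(e) if CRASH_ON_EPOCH_FAILURE => {
            // Fail fast: surface bugs immediately (early-testnet mode).
            panic!("epoch transition failed: {e}");
        }
        Err(e) => {
            // Soft-ignore: log and keep the chain alive with a static
            // validator set until the root cause is fixed.
            eprintln!("Unable to transition the epoch - EVM error: {e}");
            true
        }
    }
}
```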

@theo-zil theo-zil marked this pull request as ready for review August 23, 2024 18:22
@theo-zil theo-zil changed the title Validator epochs/ZIP25 [DRAFT] Validator epochs/ZIP25 Aug 26, 2024
Contributor

@DrZoltanFazekas DrZoltanFazekas left a comment

I was only reviewing the deposit contract; I will continue with the rest tomorrow.

uint256 physical = queue.head + idx;
// Wrap the physical index in case it is out-of-bounds of the buffer.
if (physical >= queue.values.length) {
    return queue.values.length - physical;
Contributor

I'm sure this will fail due to underflow :-)
I guess you want to return physical - queue.values.length;
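The fix the reviewer suggests can be illustrated in Rust (the names `head`, `idx`, and `len` mirror `queue.head`, `idx`, and `queue.values.length` from the Solidity snippet; the function itself is just a sketch):

```rust
// Wrap-around logic for a circular buffer, per the review comment.
// Assumes idx < len, so head + idx < 2 * len.
fn physical_index(head: usize, idx: usize, len: usize) -> usize {
    let physical = head + idx;
    if physical >= len {
        // Correct wrap: subtract the buffer length. The original code
        // computed `len - physical`, which underflows (and, under
        // Solidity 0.8's checked arithmetic, reverts) whenever
        // physical > len.
        physical - len
    } else {
        physical
    }
}
```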

bytes[] committeeKeys;

KeysQueue[WAIT_EPOCHS] pendingQueues;
KeysQueue[WAIT_EPOCHS] unstakingQueues;
Contributor

We would need to use KeysQueue[UNBONDING_PERIOD_EPOCHS] unstakingQueues; instead, where UNBONDING_PERIOD_EPOCHS is set to e.g. 48 (2 days).

@@ -1,26 +1,138 @@
// SPDX-License-Identifier: MIT OR Apache-2.0
pragma solidity ^0.8.20;

// Implementation of a circular queue/buffer of validator public keys
Contributor

@DrZoltanFazekas DrZoltanFazekas Sep 23, 2024

I wonder if we could simply use fixed arrays of dynamic arrays of type Value[][WAIT_EPOCHS] instead, with elements being pushed during the epoch and popped until the respective dynamic array is empty when tickEpoch() is called.
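The suggested alternative can be sketched in Rust as a fixed-size array of growable vectors, pushed to during an epoch and drained at the epoch tick. All names here are hypothetical; this is only a sketch of the data structure being proposed.

```rust
// Fixed array of dynamic arrays, as suggested: elements are pushed
// during the epoch and drained when the epoch ticks over.
const WAIT_EPOCHS: usize = 2;

struct EpochSlots<T> {
    slots: [Vec<T>; WAIT_EPOCHS],
}

impl<T> EpochSlots<T> {
    fn new() -> Self {
        Self { slots: std::array::from_fn(|_| Vec::new()) }
    }

    fn push(&mut self, epoch: u64, value: T) {
        self.slots[epoch as usize % WAIT_EPOCHS].push(value);
    }

    // Empty this epoch's slot entirely, leaving it ready for reuse.
    fn drain(&mut self, epoch: u64) -> Vec<T> {
        std::mem::take(&mut self.slots[epoch as usize % WAIT_EPOCHS])
    }
}
```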

}

function currentQueueIndex() private view returns (uint8) {
    return uint8(currentEpoch() % WAIT_EPOCHS);
Contributor

This would work if the WAIT_PERIOD were the same for staking and unstaking, but for the latter we'll need a longer unbonding period.
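The point can be made concrete with a small sketch: once the two periods differ, each queue family needs its own modulus for its slot index. The 48-epoch figure follows the earlier review comment; the names are assumptions.

```rust
// Each queue family has its own ring size, so each needs its own modulus.
const WAIT_EPOCHS: u64 = 2;              // staking activation delay
const UNBONDING_PERIOD_EPOCHS: u64 = 48; // e.g. 48 epochs (~2 days)

fn pending_queue_index(epoch: u64) -> u64 {
    epoch % WAIT_EPOCHS
}

fn unstaking_queue_index(epoch: u64) -> u64 {
    epoch % UNBONDING_PERIOD_EPOCHS
}
```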

_stakerKeys.push(blsPubKey);
// If staker info isn't stored yet, do so
// peerId is guaranteed to be non-zero for all stakers
if (_stakersMap[blsPubKey].controlAddress == address(0)) {
Contributor

Due to this condition, the deposit() function can no longer be used to update a validator's rewardAddress, i.e. we should provide an extra function for that.

}

// Apply penalty, if any (might be 0 e.g. for voluntary suspension)
// TODO: collect these rewards somewhere? Right now they're effectively burned
Contributor

A fraction of the slashing penalty shall be paid as a whistleblower reward to whoever submitted the slashing evidence; the rest must be burned back to the zero account.
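The proposed split could look like the following sketch. The 10% whistleblower share is an illustrative assumption, not a decided parameter.

```rust
// Split a slashing penalty into (whistleblower reward, burned remainder).
// The share is expressed in basis points; 1_000 bps = 10% (illustrative).
const WHISTLEBLOWER_SHARE_BPS: u128 = 1_000;

fn split_penalty(penalty: u128) -> (u128, u128) {
    let reward = penalty * WHISTLEBLOWER_SHARE_BPS / 10_000;
    let burned = penalty - reward;
    (reward, burned)
}
```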

uint8 queueIndex = currentQueueIndex();

// Process pending queue for this epoch
for (uint i = 0; i < pendingQueues[queueIndex].length(); ++i) {
Contributor

Malicious validators could mount a DoS attack by calling depositTopup() many times with small values, so that tickEpoch() would need more than the block gas limit to iterate through the pendingQueue.
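One possible mitigation (not part of this PR, just a sketch) is to bound the work done per tick and carry leftovers over to later blocks, so a flood of tiny top-ups cannot push a single tickEpoch() call past the block gas limit. The constant and function below are hypothetical.

```rust
use std::collections::VecDeque;

// Process at most MAX_PER_TICK queue entries per call; the remainder
// stays queued and is handled by subsequent calls.
const MAX_PER_TICK: usize = 256; // illustrative bound

fn drain_bounded<T>(queue: &mut VecDeque<T>, mut apply: impl FnMut(T)) -> usize {
    let n = queue.len().min(MAX_PER_TICK);
    for _ in 0..n {
        if let Some(item) = queue.pop_front() {
            apply(item);
        }
    }
    n // entries beyond the bound remain queued for the next call
}
```

An alternative with the same goal would be to aggregate repeated top-ups per validator key at deposit time, so the queue length is bounded by the number of distinct keys rather than the number of calls.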

_stakersMap[blsPubKey].keyIndex = _stakerKeys.length + 1;
_stakerKeys.push(blsPubKey);
// If staker info isn't stored yet, do so
// peerId is guaranteed to be non-zero for all stakers
Contributor

Is it, or do we have to check? And even if it is, why is the comment relevant here?

// no associated amount
}

function eject(
Contributor

this is the successor of tempRemoveStaker() and will be removed once penalties are fully implemented, right?

@JamesHinshelwood JamesHinshelwood self-assigned this Sep 26, 2024
3 participants