Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(evmstaking): implement epoched staking #96

Merged
merged 10 commits into from
Oct 10, 2024

Conversation

Narangde
Copy link
Contributor

@Narangde Narangde commented Sep 9, 2024

Implemention of epoched staking in evmstaking module.

With epoched staking, the messages are handled in 2 steps: queueing & processing. The messages from EL related to validator set are not executed immediately, but every epoch.

Queueing messages

When events related to validator set (CreateValidator, Deposit, Withdraw, Redelegate, and Unjail) are emitted from EL, the messages are queued in k.MessageQueue.

Processing messages

In EndBlock of evmstaking module, it is checked every block if the next epoch is started or not. If the next epoch is started, it executes all queued messages and return the updated validator set.

Epoch info

The epoch is managed in epochs keeper, so it needs to be injected to evmstaking keeper. The epoch_identifier for epoched staking is added to params of evmstaking.

issue: none

@Narangde Narangde force-pushed the feat/implement-epoched-staking branch from 6d2c56c to 1954660 Compare September 16, 2024 07:03
Comment on lines -19 to -20
ProcessDeposit(ctx context.Context, ev *bindings.IPTokenStakingDeposit) error
ProcessWithdraw(ctx context.Context, ev *bindings.IPTokenStakingWithdraw) error
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These two are not used in evmengine module

@Narangde Narangde force-pushed the feat/implement-epoched-staking branch from 1954660 to 17ce356 Compare September 17, 2024 08:15
@Narangde Narangde changed the title feat(wip): implement epoched staking feat(evmstaking): implement epoched staking Sep 17, 2024
@Narangde
Copy link
Contributor Author

@zsystm With this implementation, I fixed some tests of evmstaking module. Please review this 4c344d2.

@zsystm
Copy link
Collaborator

zsystm commented Sep 18, 2024

@Narangde
Since I’m on vacation today, I won’t be able to review the updated test code. I’ll review it tomorrow.

client/x/evmstaking/types/params.proto Outdated Show resolved Hide resolved
client/x/evmstaking/types/evmstaking.proto Outdated Show resolved Hide resolved
client/x/evmstaking/keeper/abci.go Show resolved Hide resolved
client/x/evmstaking/keeper/abci.go Outdated Show resolved Hide resolved
@Narangde Narangde force-pushed the feat/implement-epoched-staking branch 2 times, most recently from d04622b to 4e25327 Compare September 18, 2024 09:31

if isNextEpoch {
// process all queued messages
if err := k.ProcessAllMsgs(ctx); err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what if we have a lot msgs? any risk of block timeout here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Epoch blocks usually take longer due to backlog calculations, e.g. Osmosis's daily epoch block takes up to a few minutes at worst. We should set settings to allow epoch blocks to exceed any timeout

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. we need to allow longer block time for epoch block.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

makes sense👍🏻

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How are we thinking about setting a longer timeout for epoch blocks? I think the current timeout in config is set for all blocks

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure we can set longer timeout for epoch block, but if possible, it could be good. Why don't we make improvement in another PR?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should definitely increase the timeout. I would imagine a lot of operations need to be processed during the epoch block. Can be in another PR

client/x/evmstaking/keeper/epochs.go Outdated Show resolved Hide resolved
Copy link
Collaborator

@zsystm zsystm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

comments for mocking usage

client/x/evmstaking/keeper/abci_test.go Outdated Show resolved Hide resolved
@Narangde Narangde force-pushed the feat/implement-epoched-staking branch from 99c399e to e3589d7 Compare September 21, 2024 08:54
@Narangde
Copy link
Contributor Author

I added validation of epoch duration for epoched staking in a6c4fd5.

This is because if the unbonding period is shorter than the epoch duration, the unbonding is mature, but the staking keeper's EndBlocker may not process completeUnbonding and fail to send coins from the account to the module.

@edisonz0718
Copy link
Contributor

edisonz0718 commented Sep 23, 2024

I added validation of epoch duration for epoched staking in a6c4fd5.

This is because if the unbonding period is shorter than the epoch duration, the unbonding is mature, but the staking keeper's EndBlocker may not process completeUnbonding and fail to send coins from the account to the module.

Could you elaborate this more? Why completeUnbonding is not processed if unbonding period is shorter than the epoch duration? I thought withdrawals (full or partial) should happen regardlessly.

@edisonz0718 edisonz0718 reopened this Sep 23, 2024
@jdubpark
Copy link
Contributor

FWIW, the current design implies that withdrawing a stake doesn't immediately lower the validator power, and only starts to unbond (for 21 days) at the start of each epoch.


if err := k.processMsg(ctx, &qMsg); err != nil {
log.Warn(ctx, "Failed to process queued message", err, "tx_id", string(qMsg.TxId))
return errors.Wrap(err, "process queued message")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to continue here, otherwise all queued messages after the error will not get processed. The same issue happened before while the evmstaking module processed events (incl. invalid ones) from the staking contract.

Also, we need to consider what it means for a queued message to fail. If a reward minting fails, why did it fail, and are there any side effects we need to resolve (e.g. bank module balance increasing)? In other words, gracefully handling failed messages in an epoch.

Copy link
Contributor Author

@Narangde Narangde Sep 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that we continue processing messages. I just returned error here because it is used for our test codes. I will make it continue processing messages.

Also totally agree that we need to handle failed messages. This is also the part that I had the most trouble with. Since the case where a message fails can be quite complex to consider right now, I'm thinking of storing failed messages separately rather than handling them right away, so that we can handle them later appropriately for example via upgrade. What do you think?

return errors.Wrap(err, "process CreateValidator msg")
}
case *stype.MsgDelegate:
if err := k.ProcessDepositMsg(ctx, unwrappedMsg); err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the future, maybe we can add an improvement where deposits are handled per block but only go into effect at the end of each epoch.

@Narangde
Copy link
Contributor Author

Could you elaborate this more? Why completeUnbonding is not processed if unbonding period is shorter than the epoch duration? I thought withdrawals (full or partial) should happen regardlessly.

Because completeUnbonding is called in staking keeper's EndBlocker, which is called every epoch.
I will give an example of this. Let's assume the unbonding period is 10s and epoch duration is 60s. And epoch starts at time t.

  1. t: epoch starts
  2. t+10 withdraw tx submitted and queued
  3. t+60 withdraw tx executed when the next epoch starts. Here, the unbonding complete time is t+70.
  4. t+70 unbonding is completed, but completeUnbonding is not called yet because epoch is not passed, leading to failure in submit coins from account to module.

@Narangde Narangde force-pushed the feat/implement-epoched-staking branch from a6c4fd5 to b4dc947 Compare September 28, 2024 10:12
@@ -144,7 +149,7 @@ func (k Keeper) ProcessStakingEvents(ctx context.Context, height uint64, logs []
continue
}
ev.StakeAmount.Div(ev.StakeAmount, gwei)
if err = k.ProcessCreateValidator(ctx, ev); err != nil {
if err = k.HandleCreateValidatorEvent(ctx, ev); err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we combine all these events to a single enqueue since all of them share the same msg type

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, all types of events are converted to types.QueuedMessage and enqueue to a single queue. Did you mean to combine all Handle*Event to a single func?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we can combine all Handle*Event function to a single function. Because each event has different fields. If we want to make a single function, we need to move the logics for handling events (converting each event to QueuedMessage and enqueuing) out to ProcessStakingEvents. If I misunderstand your comment, please correct me.

@@ -144,7 +149,7 @@ func (k Keeper) ProcessStakingEvents(ctx context.Context, height uint64, logs []
continue
}
ev.StakeAmount.Div(ev.StakeAmount, gwei)
if err = k.ProcessCreateValidator(ctx, ev); err != nil {
if err = k.HandleCreateValidatorEvent(ctx, ev); err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah

@@ -268,7 +270,7 @@ func (k Keeper) ProcessWithdraw(ctx context.Context, ev *bindings.IPTokenStaking

amountCoin, _ := IPTokenToBondCoin(ev.Amount)

log.Debug(ctx, "Processing EVM staking withdraw",
log.Info(ctx, "EVM staking withdraw detected",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should keep this as debug. I'm thinking all tx level information are more for debugging purpose.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For other events, the log level in Info when the event is detected, thus I changed this to Info as well for code consistency. I think Debug is enough when the event is detected. How about we changing all log for event detection to Debug level?

Copy link
Contributor

@0xHansLee 0xHansLee Oct 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@edisonz0718, could you check this comment? As you said, it seems like to be better that we use Debug for all tx log as done in 36a50a2

delEvmAddr,
entry.amount.Uint64(),
))
partialWithdrawals, err := k.ExpectedPartialWithdrawals(ctx)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the partial withdrawal should still happen per block.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right.

"min_partial_withdrawal_amount": 100000000
}
"min_partial_withdrawal_amount": 100000000,
"epoch_identifier": "minute"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do we add epoch duration in genesis

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only epoch identifier is needed in evmstaking module. Epoch info such as epoch duration is in epochs module, and it can be added in genesis.json for now. There is no way to add epoch info as cosmos-sdk do.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will be great to add configuration in genesis.json to change the number of blocks per epoch. It helps to perform tests in reasonable amount of time.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The epoch duration can be set in epochs module. For local or mininet test, we can use shorter epoch by setting shorter epoch. emvstaking module depends on epochs module

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we can set the duration in genesis.json, it will make setting different networks easier. Otherwise, different networks need to manage different binaries.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's ok if each network has different genesis.json file (to be exact, different epoch identifier for epoch staking).

If we add duration in evmstaking param (and add to genesis.json), we don't need epochs module in evmstaking module and we don't need epochs module in Story for now. evmstaking module can be independent with epochs module. The reason why I have added and used epochs module for epoch staking is that I wanna make epochs module manage epoch stuffs in our app for modularity. Do you think that it better than current design that evmstaking module have independent epoch info for epoch staking?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I think we should add epoch info we use for evmstaking to epochs module's genesis file. I guess what you meant is that it should be in another PR?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I got it. I may misunderstand your comment.
Sure! I will add it in the following PR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in #181

"min_partial_withdrawal_amount": 100000000,
"epoch_identifier": "minute"
},
"epoch_number": "0"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this the starting number of epoch?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is the epoch number currently used in evmstaking module for epoch staking. This value is used for checking the next epoch is started in epochs module by comparing epoch number.

"min_partial_withdrawal_amount": 100000000
}
"min_partial_withdrawal_amount": 100000000,
"epoch_identifier": "minute"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will be great to add configuration in genesis.json to change the number of blocks per epoch. It helps to perform tests in reasonable amount of time.

@0xHansLee 0xHansLee force-pushed the feat/implement-epoched-staking branch 2 times, most recently from 823ad83 to 22d3af8 Compare October 8, 2024 06:16
if err != nil {
return nil, errors.Wrap(err, "map delegator pubkey to evm address")
// init message queue
if err := k.MessageQueue.Initialize(ctx); err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will all processed msgs pruned after re-init the MessageQueue?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes right.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we initialize every epoch? I thought there is a max amount that one epoch can process and the rest will flow to the next epoch

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the current PR, there is no limit in size of queue, that is, process all queued msgs.
With limitation, we should not initialize queue as you said.

@0xHansLee 0xHansLee force-pushed the feat/implement-epoched-staking branch from 22d3af8 to 44d0595 Compare October 9, 2024 09:12
Comment on lines +79 to +83
unbondedEntries = append(unbondedEntries, UnbondedEntry{
validatorAddress: dvPair.ValidatorAddress,
delegatorAddress: dvPair.DelegatorAddress,
amount: amt.Amount,
})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought we have a limit on the number of operations per unbonding period or epoch. Do we know where that is enforced?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As commented here, no limit in this PR. Will do in the following PR.


if isNextEpoch {
// process all queued messages
if err := k.ProcessAllMsgs(ctx); err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should definitely increase the timeout. I would imagine a lot of operations need to be processed during the epoch block. Can be in another PR

if err != nil {
return nil, errors.Wrap(err, "map delegator pubkey to evm address")
// init message queue
if err := k.MessageQueue.Initialize(ctx); err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we initialize every epoch? I thought there is a max amount that one epoch can process and the rest will flow to the next epoch

@edisonz0718
Copy link
Contributor

edisonz0718 commented Oct 9, 2024

Based on the comments, there are still two changes to be made:

  1. We need a maximum length for the message queue. Can set it to 40,000 for now. Also, we only process X requests from the queue per epoch. We can set it to 4096 first.
  2. Increase the timeout for the epoch block.

You can create issues and add them in a separate PR if it's a bigger change.

@0xHansLee
Copy link
Contributor

Based on the comments, there are still two changes to be made:

We need a maximum length for the message queue. Can set it to 40,000 for now. Also, we only process X requests from the queue per epoch. We can set it to 4096 first.
Increase the timeout for the epoch block.
You can create issues and add them in a separate PR if it's a bigger change.

Thank you for your clarification. I will make another PR for the above things.

@0xHansLee 0xHansLee force-pushed the feat/implement-epoched-staking branch from 36a50a2 to 9c976f4 Compare October 10, 2024 08:05
@0xHansLee 0xHansLee force-pushed the feat/implement-epoched-staking branch from 81e0517 to 71fbbde Compare October 10, 2024 08:54
@0xHansLee 0xHansLee merged commit 91a4806 into main Oct 10, 2024
6 checks passed
@0xHansLee 0xHansLee deleted the feat/implement-epoched-staking branch October 10, 2024 08:57
Copy link

Binary uploaded successfully 🎉

📦 Version Name: 0.10.1-unstable-91a4806
📦 Download Source: AWS S3

0xHansLee added a commit that referenced this pull request Oct 13, 2024
@0xHansLee 0xHansLee restored the feat/implement-epoched-staking branch October 13, 2024 07:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants