Skip to content

Commit

Permalink
feat!: remove VSCMaturedPackets (provider changes) (#2098)
Browse files Browse the repository at this point in the history
* turn off expensive tests

* deprecate key prefixes

* tag protos for removal

* remove QueryOldestUnconfirmedVsc

* make AfterUnbondingInitiated a noop

* add migration

* remove completeMaturedUnbondingOps

* send VSCPacket only on valupdates

* remove VSC timeout logic

* remove unbonding ops logic

* remove VscUnbondingOps from ConsumerState

* remove unnecessary fields from GenesisState

* remove GetAllUnbondingOps

* remove deprecated protos

* remove TODO

* remove unbonding integration tests

* fix checkConsumerChainIsRemoved

* remove checkCCVUnbondingOp

* TestProviderStateIsCleanedAfterConsumerChainIsStopped

* fix TestRelayAndApplyDowntimePacket

* remove vsc_timeout_period param

* add TestUndelegationCompletion

* add TODO for init timeout

* fix integration tests

* avoid breaking protos

* remove TestTooManyLastValidators

* add ADR for removing vscmatured packets

* Update docs/docs/adrs/adr-018-remove-vscmatured.md

Co-authored-by: MSalopek <[email protected]>

* Update docs/docs/adrs/adr-018-remove-vscmatured.md

Co-authored-by: MSalopek <[email protected]>

* add consumer changes -- wip

* consumer changes

* add new prefix for ConsumerAddrsToPrune

* add consumer_addrs_to_prune_v2 to  GenesisState

* add migration for ConsumerAddrsToPrune index

* Update docs/docs/adrs/adr-018-remove-vscmatured.md

Co-authored-by: insumity <[email protected]>

* Update docs/docs/adrs/adr-018-remove-vscmatured.md

Co-authored-by: insumity <[email protected]>

* deprecate ConsumerAddrsToPruneBytePrefix

* change name of field for ConsumerAddrsToPruneV2

* reuse original ConsumerAddrsToPrune methods

* stop handling VSCMaturedPackets

* update ADR

* apply review suggestions

* Update docs/docs/adrs/adr-018-remove-vscmatured.md

Co-authored-by: Philip Offtermatt <[email protected]>

* Update docs/docs/adrs/adr-018-remove-vscmatured.md

Co-authored-by: Philip Offtermatt <[email protected]>

* Update docs/docs/adrs/adr-018-remove-vscmatured.md

Co-authored-by: Simon Noetzlin <[email protected]>

* describe how hermes uses UnbondingPeriod

* apply review suggestions

* apply review suggestions

* apply review suggestions

* remove ConsumerAddrsToPrune

* deprecate InitTimeoutTimestampBytePrefix

* remove init timeout logic

* migration to remove InitTimeoutTimestamp state

* update GenesisState

* fix tests

* Update x/ccv/provider/types/keys.go

Co-authored-by: insumity <[email protected]>

* apply review suggestions

* uncomment TestSimulatedAssignmentsAndUpdateApplication

* Fixed TestSimulatedAssignmentsAndUpdateApplication test

* formatting fix

* fix error message

* remove TestInitTimeout

* remove init_timeout_period param

* fix migration and tests

* update ADR 18 status

* remove InitTimeoutTimestamp proto

* add TestPreserveBytePrefix

* fix migration; wip

* remove throttle_legacy as not needed

* fix v7 migration

* remove GetBinaryCodec

* avoid calling ConsumerAddrsToPruneV2Key twice

* convert keys prefixes to byte

* add changelog entries

* update docs

* remove duplicate event type

* revert changes to test.yml

* apply review suggestions

* apply minor changes

* add test for MigrateConsumerAddrsToPrune

* apply review suggestion

---------

Co-authored-by: MSalopek <[email protected]>
Co-authored-by: insumity <[email protected]>
Co-authored-by: Philip Offtermatt <[email protected]>
Co-authored-by: Simon Noetzlin <[email protected]>
Co-authored-by: Bernd Mueller <[email protected]>
  • Loading branch information
6 people authored Jul 31, 2024
1 parent e5c567e commit 8e38556
Show file tree
Hide file tree
Showing 59 changed files with 1,488 additions and 6,025 deletions.
Original file line number Diff line number Diff line change
@@ -0,0 +1,7 @@
- The removal of `VSCMaturedPackets` entail several API breaking changes.
([\#2098](https://github.com/cosmos/interchain-security/pull/2098))

- Remove the `oldest_unconfirmed_vsc` query -- used to get
the send timestamp of the oldest unconfirmed VSCPacket.
- Deprecate the `init_timeout_period` and `vsc_timeout_period` parameters
from the provider module.
Original file line number Diff line number Diff line change
@@ -0,0 +1,8 @@
- Remove `VSCMaturedPackets` from the provider module, which entails the following
changes to the provider.
([\#2098](https://github.com/cosmos/interchain-security/pull/2098))

- Remove unbonding operations pausing.
- Remove the CCV channel initialization timeout.
- Remove `VSCPackets` timeout.
- Redesign key assignment pruning -- prune old consumer keys after the unbonding period elapses.
Original file line number Diff line number Diff line change
@@ -0,0 +1,2 @@
- Remove `VSCMaturedPackets` from the provider module.
([\#2098](https://github.com/cosmos/interchain-security/pull/2098))
6 changes: 5 additions & 1 deletion app/provider/app.go
Original file line number Diff line number Diff line change
Expand Up @@ -505,7 +505,11 @@ func New(
govtypes.NewMultiGovHooks(app.ProviderKeeper.Hooks()),
)

providerModule := ibcprovider.NewAppModule(&app.ProviderKeeper, app.GetSubspace(providertypes.ModuleName))
providerModule := ibcprovider.NewAppModule(
&app.ProviderKeeper,
app.GetSubspace(providertypes.ModuleName),
keys[providertypes.StoreKey],
)

app.TransferKeeper = ibctransferkeeper.NewKeeper(
appCodec,
Expand Down
188 changes: 188 additions & 0 deletions docs/docs/adrs/adr-018-remove-vscmatured.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,188 @@
---
sidebar_position: 19
title: Remove VSCMatured Packets
---
# ADR 018: Remove VSCMatured Packets

## Changelog
* 19/06/2024: Create initial draft

## Status

Accepted

## Context

> The idea of removing VSCMatured packets was first suggested by [Jae Kwon](https://github.com/jaekwon) over a conversation in 2022.
> As a result of that conversation, [an ADR was proposed](https://github.com/cosmos/interchain-security/pull/712) to halt consumer chain if VSC packets are no longer received.
> The ADR was not accepted due to other considerations regarding the safety of consumer chains.
> See [this blog post](https://informal.systems/blog/learning-to-live-with-unbonding-pausing) for more details.
The consumer module on the consumer chains is a representation of the Hub’s staking module, i.e., it provides an _asynchronous_ view of the voting powers and indirectly of the locked collateral.
The key word here is _asynchronous_, which means that (in theory) there is no bound on the lag between the Hub’s view of stake and the consumer’s view of stake.
The reasons for this asynchrony are relaying delays and chain liveness (e.g., a consumer could be down for a long period of time without affecting the liveness of the staking module on the Hub).

The current version of ICS uses `VSCMaturedPackets` to create on the consumers a _partially synchronous_ view of the Hub’s staking module.
Partially synchronous means that the lag between the Hub’s view of stake and the consumer’s view of stake is bounded, because consumers that exceed this lag are forcibly removed from the protocol.
Basically, unlocking collateral from the Hub is being delayed until the consumers’ `UnbondingPeriod` elapses.
The reason the view is only partially synchronous is that eventually the collateral is unlocked, i.e., if `VSCMaturedPackets` are not received from a consumer for `VscTimeoutPeriod` (default: 5 weeks), then the consumer is removed from ICS and the collateral is unlocked.
Note that keeping the stake locked “forever” would affect the Hub’s liveness, so it’s not a viable option.

The issue is that whatever attack is possible with an asynchronous view of the staking module, it is eventually possible with the partially synchronous view as well.
For example, an attacker could wait for `VscTimeoutPeriod` for the collateral to be unlocked and then send invalid headers to third-party chains that are not aware the consumer's collateral is no longer locked on the Hub (i.e., the consumer is no longer part of ICS).

Moreover, with the introduction of [PSS](./adr-015-partial-set-security.md), a consumer’s validator set could “lie” about its `UnbondingPeriod` elapsing by sending `VSCMaturedPackets` earlier.
This would result in a discrepancy between a light client’s view of the `UnbondingPeriod` and the actual Hub’s `UnbondingPeriod`.

## Decision

This ADR proposes the removal of `VSCMaturedPackets`. The reason is twofold.
First, `VSCMaturedPackets` provide a "false" sense of correctness as the attack discribed above is still possible.
Second, `VSCMaturedPackets` add considerable complexity to the ICS protocol -- an extra message plus the pausing of unbonding operations that can affect the UX.

To simplify the upgrading process, removing `VSCMaturedPackets` can be done in two releases:

* (R1) Update the provider to drop `VSCMaturedPackets`.
* (R2) Update the consumer to stop sending `VSCMaturedPackets`.

As a result, once the provider chain runs R1, the consumers can start upgrading to R2.

### Provider Changes (R1)

#### Parameters

Deprecate the `InitTimeoutPeriod` and `VscTimeoutPeriod` parameters.

#### State

Add the following key prefix to the state:

- `ConsumerAddrsToPruneV2BytePrefix` -- the byte prefix for storing consumer validators addresses that need to be pruned.
These are stored as
```
ConsumerAddrsToPruneV2BytePrefix | len(chainID) | chainID | ts -> (consumer_address1, consumer_address2, ...)
```
where `ts` is the timestamp at which the consumer validators addresses can be pruned.

Migrate the consumer validator addresses stored under the `ConsumerAddrsToPruneBytePrefix` to the new prefix `ConsumerAddrsToPruneV2BytePrefix`.
Note that these consumer validators addresses are stored as
```
ConsumerAddrsToPruneBytePrefix | len(chainID) | chainID | vscID -> (consumer_address1, consumer_address2, ...)
```
where `vscID` is the ID of the first VSCPacket sent after these consumer validator addresses were changed.
These means that we can use the `VscSendTimestamp`s to compute the timestamps when these consumer validator addresses can be pruned, i.e.,
```golang
func MigrateConsumerAddrsToPrune()
iterator := sdk.KVStorePrefixIterator(store, []byte{providertypes.ConsumerAddrsToPruneBytePrefix})
for ; iterator.Valid(); iterator.Next() {
chainID, vscID, _ := providertypes.ParseChainIdAndUintIdKey(providertypes.ConsumerAddrsToPruneBytePrefix, iterator.Key())
// use the VscSendTimestamp index to compute the timestamp at which this consumer addresses can be pruned
vscSendTimestampKey := providertypes.ChainIdAndUintIdKey(providertypes.VscSendTimestampBytePrefix, chainID, vscID)
timeBz := store.Get(vscSendTimestampKey)
sentTime, _ := sdk.ParseTimeBytes(timeBz)
pruneTs := sentTime.Add(sk.UnbondingTime(ctx))

var addrs providertypes.AddressList
addrs.Unmarshal(iterator.Value())

for _, addr := range addrs.Addresses {
consumerAddr := providertypes.NewConsumerConsAddress(addr)
pk.AppendConsumerAddrsToPrune(ctx, chainID, pruneTs, consumerAddr)
}
}
}
```

Remove the following key prefixes from the state. Note that these removals require state migration.

- `MaturedUnbondingOpsByteKey` -- the byte key that stores the list of all unbonding operations ids that have matured from a consumer chain perspective.
- `UnbondingOpBytePrefix` -- the byte prefix that stores a record of all the ids of consumer chains that need to unbond before a given unbonding operation can unbond on this chain.
- `UnbondingOpIndexBytePrefix` -- the byte prefix of the index for looking up which unbonding operations are waiting for a given consumer chain to unbond.
- `InitTimeoutTimestampBytePrefix` -- the byte prefix for storing the init timeout timestamp for a given consumer chainID.
- `VscSendTimestampBytePrefix` -- the byte prefix for storing the list of VSC sending timestamps for a given consumer chainID.
- `ConsumerAddrsToPruneBytePrefix` -- the byte prefix for storing the mapping from VSC IDs to consumer validators addresses that need to be pruned.

#### State Transitions

Removing `VSCMaturedPackets` affects three ICS sub-protocols (see [HandleVSCMaturedPacket](https://github.com/cosmos/interchain-security/blob/v4.2.0/x/ccv/provider/keeper/relay.go#L51)): unbonding operations pausing, `VSCPackets` timeout, and key assignment pruning.
The first two are no longer needed, while the third (key assignment pruning) needs to be redesigned to not depend on `VSCMaturedPackets`.

**Removing unbonding operations pausing:**

- Make the `AfterUnbondingInitiated` hook a no-op. As a result, unbonding operations are no longer paused.
- Stop calling the `UnbondingCanComplete` method from the staking keeper. This entails, it is no longer necessary to append `MaturedUnbondingOps` and the `completeMaturedUnbondingOps` method can be removed.
- Note, that during the upgrade, all unbonding operations stored under the `UnbondingOpBytePrefix` prefix need to be completed (via the `UnbondingCanComplete` method from the staking keeper).
- Remove the init timeout timestamp logic from the following methods: `CreateConsumerClient`, `SetConsumerChain`, and `EndBlockCCR`.

**Removing `VSCPackets` timeout:**

- Stop setting VSC send timestamps when sending `VSCPackets`.
- Stop removing the VSC send timestamps when receiving `VSCMaturedPackets`.
- Remove the logic from `EndBlockCCR` that checks if the first VSC send timestamp in iterator plus `VscTimeoutPeriod` exceeds the current block time.

**Redesign key assignment pruning.** The reason for keeping "old" consumer addresses is to enable slashing / jailing validators that misbehave on consumer chains,
i.e., the slashing logic uses the `GetProviderAddrFromConsumerAddr` method that accesses the mapping from validator addresses on consumer chains to validator addresses on the provider chain (`ValidatorsByConsumerAddrBytePrefix`).
Thus, "old" consumer addresses are no longer needed after the provider's `UnbondingPeriod` elapses.
This means that once a validator changes its key on a consumer, we can prune the address corresponding to the "old" key after `UnbondingPeriod`.
This requires the following changes:

- Adapt the `AppendConsumerAddrsToPrune()` method to use the timestamp at which it is safe to prune the consumer validator address (instead of the current `vscID`).
- Add a new method `ConsumeConsumerAddrsToPrune(ts)` that returns the list of consumer addresses that can be pruned at timestamp `ts`.
- Adapt the `PruneKeyAssignments()` method to call `ConsumeConsumerAddrsToPrune(ctx.BlockTime())`.
- Call the `PruneKeyAssignments()` method from every `EndBlock()` instead of calling it from `HandleVSCMaturedPacket()`.

#### Queries

Remove the `oldest_unconfirmed_vsc` query.

### Consumer Changes (R2)

#### Parameters

Given that currently relayers use the consumer `UnbondingPeriod` (see `ConsumerParams`), this param cannot be deprecated.
Note that Hermes queries the `UnbondingPeriod` for sanity checks and to set the default trusting period when it is not specified.
As a result, the `UnbondingTime` method from the staking interface will continue to be used to retrieve the consumer's `UnbondingPeriod`.

#### State

Remove the following key prefixes from the state:

- `PacketMaturityTimeBytePrefix` -- the byte prefix that will store maturity time for each received VSC packet

Note that these removals require state migration.

#### State Transitions

To stop the consumer chains from sending `VSCMaturedPackets`, it is sufficient to not store the maturity time of `VSCPacket`s when receiving them, i.e., do not call `SetPacketMaturityTime` from the `OnRecvVSCPacket()` method.
Note that eventually, no additional `VSCMaturedPackets` will be added to the sending queue as `QueueVSCMaturedPackets` iterates over elapsed maturity times.
In addition, to clean up the code, the `QueueVSCMaturedPackets` must be removed.

#### Messages

`VSCMaturedPacketData` is deprecated.
Note that this is a wire-breaking change -- older consumer versions will send `VSCMaturedPackets` and older provider versions will expect to receive `VSCMaturedPackets`.

## Consequences

### Positive

- Remove feature that provides a "false" sense of correctness.
- Remove unnecessary complexity, from both ICS and Cosmos SDK.
- Remove one IBC packet and, thus, reduce relaying cost.
- Remove unbonding pausing logic that could affect the UX.

### Negative

- Large refactor that might introduce unexpected bugs.
- Consumer chains are no longer removed if the duration between creating a client and creating the CCV channel exceeds `InitTimeoutPeriod`.
This means that if the CCV channel is not created on time, the client expires and the consumer chain can no longer start without a ClientUpdate proposal or re-submitting a `ConsumerAdditionProposal`.

### Neutral

- Consumer chains are no longer removed after a `VscTimeoutPeriod` of inactivity.
Note that consumers are still removed if their CCV channel expires, which usually happens after two weeks instead of five weeks (the default value for `VscTimeoutPeriod`).

## References

- [PR #712](https://github.com/cosmos/interchain-security/pull/712) -- Proposal to set an expiration date on the consumer chain updated on each VSCPacket received to ensure the chain is up-to-date.
- [Learning to Live with “Unbonding Pausing”](https://informal.systems/blog/learning-to-live-with-unbonding-pausing) blog post.
1 change: 1 addition & 0 deletions docs/docs/adrs/intro.md
Original file line number Diff line number Diff line change
Expand Up @@ -41,6 +41,7 @@ To suggest an ADR, please make use of the [ADR template](https://github.com/cosm
- [ADR 013: Slashing on the provider for consumer equivocation](./adr-013-equivocation-slashing.md)
- [ADR 014: Epochs](./adr-014-epochs.md)
- [ADR 015: Partial Set Security](./adr-015-partial-set-security.md)
- [ADR 018: Remove VSCMatured Packets](./adr-018-remove-vscmatured.md)

### Proposed

Expand Down
15 changes: 0 additions & 15 deletions docs/docs/introduction/params.md
Original file line number Diff line number Diff line change
Expand Up @@ -61,21 +61,6 @@ CCVTimeoutPeriod may have different values on the provider and consumer chains.
- `CCVTimeoutPeriod` on the provider **must** be larger than `ConsumerUnbondingPeriod`
- `CCVTimeoutPeriod` on the consumer is initial set via the `ConsumerAdditionProposal`

### InitTimeoutPeriod
`InitTimeoutPeriod` is the maximum allowed duration for CCV channel initialization to execute.

For any consumer chain, if the CCV channel is not established within `InitTimeoutPeriod` then the consumer chain will be removed and therefore will not be secured by the provider chain.

The countdown starts when the `spawn_time` specified in the `ConsumerAdditionProposal` is reached.

### VscTimeoutPeriod
`VscTimeoutPeriod` is the provider-side param that enables the provider to timeout VSC packets even when a consumer chain is not live.
If the `VscTimeoutPeriod` is ever reached for a consumer chain that chain will be considered not live and removed from interchain security.

:::tip
`VscTimeoutPeriod` MUST be larger than the `ConsumerUnbondingPeriod`.
:::

### BlocksPerDistributionTransmission
`BlocksPerDistributionTransmission` is the number of blocks between rewards transfers from the consumer to the provider.

Expand Down
36 changes: 20 additions & 16 deletions proto/interchain_security/ccv/provider/v1/genesis.proto
Original file line number Diff line number Diff line change
Expand Up @@ -11,6 +11,21 @@ import "interchain_security/ccv/provider/v1/provider.proto";

// GenesisState defines the CCV provider chain genesis state
message GenesisState {
// Reserve 3rd slot for removed unbonding_ops field
reserved 3;

// Reserve 4th slot for removed mature_unbonding_ops field
reserved 4;

// Reserve 11th slot for consumer_addrs_to_prune field
reserved 11;

// Reserve 12th slot for removed init_timeout_timestamps field
reserved 12;

// Reserve 13th slot for removed exported_vsc_send_timestamps field
reserved 13;

// strictly positive and set to 1 (DefaultValsetUpdateID) for a new chain
uint64 valset_update_id = 1;
// empty for a new chain
Expand All @@ -19,11 +34,6 @@ message GenesisState {
(gogoproto.moretags) = "yaml:\"consumer_states\""
];
// empty for a new chain
repeated interchain_security.ccv.provider.v1.UnbondingOp unbonding_ops = 3
[ (gogoproto.nullable) = false ];
// empty for a new chain
interchain_security.ccv.provider.v1.MaturedUnbondingOps mature_unbonding_ops = 4;
// empty for a new chain
repeated ValsetUpdateIdToHeight valset_update_id_to_height = 5
[ (gogoproto.nullable) = false ];
// empty for a new chain
Expand All @@ -39,21 +49,19 @@ message GenesisState {
// empty for a new chain
repeated ValidatorByConsumerAddr validators_by_consumer_addr = 10
[ (gogoproto.nullable) = false ];
// empty for a new chain
repeated ConsumerAddrsToPrune consumer_addrs_to_prune = 11
[ (gogoproto.nullable) = false ];

repeated interchain_security.ccv.provider.v1.InitTimeoutTimestamp init_timeout_timestamps = 12
[ (gogoproto.nullable) = false ];

repeated interchain_security.ccv.provider.v1.ExportedVscSendTimestamp exported_vsc_send_timestamps = 13
// empty for a new chain
repeated ConsumerAddrsToPruneV2 consumer_addrs_to_prune_v2 = 14
[ (gogoproto.nullable) = false ];
}

// The provider CCV module's knowledge of consumer state.
//
// Note this type is only used internally to the provider CCV module.
message ConsumerState {
// Reserve 8th slot for removed unbonding_ops_index field
reserved 8;

// ChainID defines the chain ID for the consumer chain
string chain_id = 1;
// ChannelID defines the IBC channel ID for the consumer chain
Expand All @@ -70,10 +78,6 @@ message ConsumerState {
repeated interchain_security.ccv.v1.ValidatorSetChangePacketData
pending_valset_changes = 6 [ (gogoproto.nullable) = false ];
repeated string slash_downtime_ack = 7;
// UnbondingOpsIndex defines the unbonding operations waiting on this consumer
// chain
repeated interchain_security.ccv.provider.v1.VscUnbondingOps
unbonding_ops_index = 8 [ (gogoproto.nullable) = false ];
}

// ValsetUpdateIdToHeight defines the genesis information for the mapping
Expand Down
Loading

0 comments on commit 8e38556

Please sign in to comment.