
Add storage bounds for pallet staking and clean up deprecated non paged exposure storages #6445

Open · wants to merge 137 commits into base: master

Conversation

@re-gius re-gius commented Nov 11, 2024

This is part of #6289 and necessary for the Asset Hub migration.

Building on the observations and suggestions from #255 .

Changes

  • Add MaxInvulnerables to bound Invulnerables Vec -> BoundedVec.
    • Set to constant 20 in the pallet (must be >= 17 for backward compatibility of runtime westend).
  • Add MaxDisabledValidators to bound DisabledValidators Vec -> BoundedVec
    • Set to constant 100 in the pallet (it should be <= 1/3 * MaxValidatorsCount according to the current disabling strategy)
  • Remove ErasStakers and ErasStakersClipped (see Tracker issue for cleaning up old non-paged exposure logic in staking pallet #433 )
  • Use MaxExposurePageSize to bound ErasStakersPaged mapping to exposure pages: each ExposurePage.others Vec is turned into a WeakBoundedVec to allow easy and quick changes to this bound
  • Add MaxBondedEras to bound BondedEras Vec -> BoundedVec
    • Set to BondingDuration::get() + 1 everywhere to include both time interval endpoints in [current_era - BondingDuration::get(), current_era]. Notice that this was done manually in every test and runtime, so I wonder if there is a better way to ensure that MaxBondedEras::get() == BondingDuration::get() + 1 everywhere.
  • Add MaxRewardPagesPerValidator to bound ClaimedRewards Vec of pages -> WeakBoundedVec
    • Set to constant 20 in the pallet. The vector of pages is now a WeakBoundedVec to allow easy and quick changes to this parameter
  • Remove the MaxValidatorsCount optional storage item and add a MaxValidatorsCount mandatory config parameter
    • Use it to bound EraRewardPoints.individual BTreeMap -> BoundedBTreeMap
    • Set to a dynamic parameter in runtime westend so that changing it does not require a migration
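To illustrate what the Vec -> BoundedVec changes above buy, here is a minimal self-contained sketch. This is a stand-in type, not the actual frame_support::BoundedVec (which also carries SCALE codec and TypeInfo machinery); it only shows the key property: construction or insertion past the bound is rejected instead of silently growing unbounded storage.

```rust
// Stand-in for frame_support's BoundedVec, for illustration only:
// a Vec whose length can never exceed the compile-time bound N.
struct BoundedVec<T, const N: usize>(Vec<T>);

impl<T, const N: usize> BoundedVec<T, N> {
    // Construction fails if the input already exceeds the bound.
    fn try_from(v: Vec<T>) -> Result<Self, Vec<T>> {
        if v.len() <= N {
            Ok(Self(v))
        } else {
            Err(v)
        }
    }

    // Insertion fails once the bound is reached.
    fn try_push(&mut self, item: T) -> Result<(), T> {
        if self.0.len() < N {
            self.0.push(item);
            Ok(())
        } else {
            Err(item)
        }
    }
}

fn main() {
    // With a bound of 20 (as MaxInvulnerables above), 20 entries fit...
    let mut full = BoundedVec::<u64, 20>::try_from((0..20).collect()).unwrap();
    // ...but a 21st is rejected rather than silently growing state.
    assert!(full.try_push(20).is_err());
    assert!(BoundedVec::<u64, 20>::try_from((0..21).collect()).is_err());
}
```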

TO DO
Slashing storage items will be bounded in another PR.

  • UnappliedSlashes
  • SlashingSpans

@@ -1184,7 +1184,6 @@ impl parachains_slashing::Config for Runtime {
ReportLongevity,
>;
type WeightInfo = parachains_slashing::TestWeightInfo;
type BenchmarkingConfig = parachains_slashing::BenchConfig<200>;
Contributor:

The overall observation is that we had temporarily added a lot of bounds to pallets just for benchmarks, but now that staking has proper bounds for all its storage, they are no longer needed ✅

invulnerables: BoundedVec::try_from(
initial_authorities.iter().map(|x| x.0.clone()).collect::<Vec<_>>()
)
.expect("Too many invulnerable validators!"),
Contributor:

The error message could be a bit more informative, hinting at which config should be tweaked if this expect is ever reached.

Contributor (author):

error message improved in 0d47dc5

@@ -1841,6 +1855,7 @@ pub mod migrations {
parachains_shared::migration::MigrateToV1<Runtime>,
parachains_scheduler::migration::MigrateV2ToV3<Runtime>,
pallet_staking::migrations::v16::MigrateV15ToV16<Runtime>,
pallet_staking::migrations::v17::MigrateV16ToV17<Runtime>,
Contributor:

You can check the logs for this migration in this CI job:

https://github.com/paritytech/polkadot-sdk/actions/runs/12412281493/job/34651846644?pr=6445

There seem to be some error-ish logs in there:

💸 Migration failed for ClaimedRewards from v16 to v17.

@re-gius (author), Dec 20, 2024:

Solved with force_from(...) for WeakBoundedVec. I noticed that old individual validator exposures have up to 205 pages, while we would expect them to be at most 20 in the future @Ank4n.
What do you think is the best approach: keep 20 as the limit for MaxRewardPagesPerValidator for future items and force old items into a WeakBoundedVec, or increase the MaxRewardPagesPerValidator limit to something like 250 to accommodate old items?

EDIT: I can also see a third option: merge pages for old items until there are fewer than 20.

Contributor:

I think using WeakBoundedVec with too many elements is error-prone: operations like remove followed by try_push will fail. We should have a clean state; WeakBoundedVec is more for emergencies.
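The failure mode described here can be illustrated with a minimal stand-in for WeakBoundedVec (illustrative only; the real frame_support type takes a warning label in force_from and logs defensively): once the contents exceed the bound, removing an element does not restore the ability to push.

```rust
// Stand-in for frame_support's WeakBoundedVec: the bound N is enforced only
// on `try_push`, while `force_from` accepts oversized input.
struct WeakBoundedVec<T, const N: usize>(Vec<T>);

impl<T, const N: usize> WeakBoundedVec<T, N> {
    // Accepts any length, even past the bound (the real type logs a warning).
    fn force_from(v: Vec<T>) -> Self {
        Self(v)
    }

    // Fails whenever the current length is already at or past the bound.
    fn try_push(&mut self, item: T) -> Result<(), T> {
        if self.0.len() < N {
            self.0.push(item);
            Ok(())
        } else {
            Err(item)
        }
    }

    fn remove(&mut self, idx: usize) -> T {
        self.0.remove(idx)
    }
}

fn main() {
    // Force 25 pages into a bound of 20, as the migration does for old exposures.
    let mut pages = WeakBoundedVec::<u32, 20>::force_from((0..25).collect());
    // Removing one element leaves 24, still over the bound...
    pages.remove(0);
    // ...so a subsequent try_push fails even though we "made room".
    assert!(pages.try_push(99).is_err());
}
```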

@@ -745,6 +746,10 @@ impl pallet_staking::Config for Runtime {
type WeightInfo = pallet_staking::weights::SubstrateWeight<Runtime>;
type BenchmarkingConfig = StakingBenchmarkingConfig;
type DisablingStrategy = pallet_staking::UpToLimitWithReEnablingDisablingStrategy;
type MaxInvulnerables = ConstU32<20>;
type MaxRewardPagesPerValidator = ConstU32<20>;
type MaxValidatorsCount = ConstU32<300>;
@kianenigma, Dec 19, 2024:

Other pallets like session, babe, grandpa and such also have a notion of MaxValidators (expressed as MaxAuthorities) that should be equal to the max validators of this pallet.

Within this file, you can use one pub const MaxValidators: u32 = 300 everywhere to unify it.

Taking it a step further, you can expose this as part of trait SessionManager: the trait can declare trait SessionManager { type MaxAuthorities = <set-by-staking-pallet> }. Then, within pallet-session, which consumes SessionManager, you can do:

fn integrity_check() {
    // A way to express within pallet-session that whoever implements
    // `SessionManager` should have a compatible `MaxAuthorities`.
    assert!(T::SessionManager::MaxAuthorities::get() >= T::MaxAuthorities::get())
}

This might be too much for this PR, but it is good for you to be familiar with the pattern.

Whenever there are multiple parameters in two pallets with a logical dependency (they have to be equal, or one has to be larger than the other), you can remove the implicitness like this.
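The pattern described in this comment can be sketched in plain Rust, stripped of FRAME machinery. The trait name and constant are hypothetical simplifications (the real FRAME check would live in the pallet's integrity_test hook and use Get<u32> types rather than associated consts): a provider trait exposes its bound, and the consumer asserts compatibility at startup instead of relying on two constants being kept equal by convention.

```rust
// Hypothetical sketch: the provider (e.g. the staking pallet) declares its
// maximum number of authorities through the trait it implements...
trait SessionManager {
    const MAX_AUTHORITIES: u32;
}

struct Staking;
impl SessionManager for Staking {
    const MAX_AUTHORITIES: u32 = 300;
}

// ...and the consumer (e.g. pallet-session) checks compatibility explicitly,
// turning an implicit cross-pallet invariant into an enforced one.
fn integrity_check<M: SessionManager>(session_max_authorities: u32) {
    assert!(
        M::MAX_AUTHORITIES >= session_max_authorities,
        "SessionManager's bound must cover the session pallet's own MaxAuthorities"
    );
}

fn main() {
    // Compatible: the staking-side bound covers the session-side bound.
    integrity_check::<Staking>(300);
}
```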

Contributor (author):

changed in a065a91

Opened issue #6972 to unify MaxValidators limit (I also see a MaxActiveValidators)

substrate/frame/staking/src/pallet/mod.rs (resolved)
Comment on lines +1631 to +1635
ensure!(
invulnerables.len() as u32 <= T::MaxInvulnerables::get(),
Error::<T>::BoundNotMet
);
<Invulnerables<T>>::put(BoundedVec::truncate_from(invulnerables));
Contributor:

Suggested change
ensure!(
invulnerables.len() as u32 <= T::MaxInvulnerables::get(),
Error::<T>::BoundNotMet
);
<Invulnerables<T>>::put(BoundedVec::truncate_from(invulnerables));
let invulnerables = BoundedVec::try_from(invulnerables)
.map_err(|_| Error::<T>::BoundNotMet)?;
<Invulnerables<T>>::put(invulnerables);

for individual in chunk.iter() {
page_total.saturating_accrue(individual.value);
others.push(IndividualExposure {
let _ = others.try_push(IndividualExposure {
Contributor:

We could use defensive, but it is OK to me.

Suggested change
let _ = others.try_push(IndividualExposure {
// Always successful as chunks are of size `MaxExposurePageSize`.
let _ = others.try_push(IndividualExposure {
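The suggested comment holds because of how slice chunking works: chunks(n) never yields a slice longer than n, so pushing one chunk's items into a page bounded by the same constant cannot overflow. A small sketch (the bound value here is hypothetical, not the runtime's actual MaxExposurePageSize):

```rust
fn main() {
    // Hypothetical page size; the real value comes from T::MaxExposurePageSize.
    const MAX_EXPOSURE_PAGE_SIZE: usize = 64;

    // 200 backers split into exposure pages.
    let backers: Vec<u32> = (0..200).collect();

    // `chunks` yields slices of at most MAX_EXPOSURE_PAGE_SIZE elements,
    // so `try_push` into a fresh page of that capacity always succeeds.
    for chunk in backers.chunks(MAX_EXPOSURE_PAGE_SIZE) {
        assert!(chunk.len() <= MAX_EXPOSURE_PAGE_SIZE);
    }
}
```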

@@ -400,7 +401,7 @@ impl pallet_staking::Config for Runtime {
type DisablingStrategy = pallet_staking::UpToLimitWithReEnablingDisablingStrategy;
type MaxInvulnerables = ConstU32<20>;
type MaxRewardPagesPerValidator = ConstU32<20>;
type MaxValidatorsCount = ConstU32<300>;
type MaxValidatorsCount = MaxAuthorities;
Contributor:

This is a huge jump; is there a reason why MaxAuthorities is so big in the test runtime?

@gui1117 left a comment:

Overall looks good, but there are still some comments to resolve.

Comment on lines +763 to +766
pub fn from_clipped(exposure: Exposure<AccountId, Balance>) -> Result<Self, ()> {
let old_exposures = exposure.others.len();
let others = WeakBoundedVec::try_from(exposure.others).unwrap_or_default();
defensive_assert!(old_exposures == others.len(), "Too many exposures for a page");
Contributor:

This function is not used; we can make it better or remove it.

Suggested change
pub fn from_clipped(exposure: Exposure<AccountId, Balance>) -> Result<Self, ()> {
let old_exposures = exposure.others.len();
let others = WeakBoundedVec::try_from(exposure.others).unwrap_or_default();
defensive_assert!(old_exposures == others.len(), "Too many exposures for a page");
pub fn try_from_clipped(exposure: Exposure<AccountId, Balance>) -> Result<Self, ()> {
let others = WeakBoundedVec::try_from(exposure.others).map_err(|_| ())?;

Comment on lines 1112 to 1114
let overview = <ErasStakersOverview<T>>::get(&era, validator);

if overview.is_none() {
Contributor:

Suggested change
let overview = <ErasStakersOverview<T>>::get(&era, validator);
if overview.is_none() {
let Some(overview) = <ErasStakersOverview<T>>::get(&era, validator) else { return Exposure::default(); };

@@ -1144,15 +1112,18 @@ impl<T: Config> EraInfo<T> {
let overview = <ErasStakersOverview<T>>::get(&era, validator);

if overview.is_none() {
return ErasStakers::<T>::get(era, validator)
return Exposure::default()
Contributor:

Suggested change
return Exposure::default()

Comment on lines 1116 to 1118
}

let overview = overview.expect("checked above; qed");
Contributor:

Suggested change
}
let overview = overview.expect("checked above; qed");

claimed_pages.push(page);
// try to add page to claimed entries
if claimed_pages.try_push(page).is_err() {
defensive!("Limit reached for maximum number of pages.");
Contributor:

The proof should be more precise: in what circumstance can this limit be reached, and why is it impossible in practice?

Comment on lines +75 to +88
let mut eras_stakers_keys =
v16::ErasStakers::<T>::iter_keys().map(|(k1, _k2)| k1).collect::<Vec<_>>();
eras_stakers_keys.dedup();
for k in eras_stakers_keys {
let mut removal_result =
v16::ErasStakers::<T>::clear_prefix(k, u32::max_value(), None);
while let Some(next_cursor) = removal_result.maybe_cursor {
removal_result = v16::ErasStakers::<T>::clear_prefix(
k,
u32::max_value(),
Some(&next_cursor[..]),
);
}
}
Contributor:

This seems to remove all the keys in one go; if we don't need a multi-block migration and we are sure that is OK, then we can simply do:

Suggested change
let mut eras_stakers_keys =
v16::ErasStakers::<T>::iter_keys().map(|(k1, _k2)| k1).collect::<Vec<_>>();
eras_stakers_keys.dedup();
for k in eras_stakers_keys {
let mut removal_result =
v16::ErasStakers::<T>::clear_prefix(k, u32::max_value(), None);
while let Some(next_cursor) = removal_result.maybe_cursor {
removal_result = v16::ErasStakers::<T>::clear_prefix(
k,
u32::max_value(),
Some(&next_cursor[..]),
);
}
}
v16::ErasStakers::<T>::clear(u32::max_value(), None);

Comment on lines +90 to +104
let mut eras_stakers_clipped_keys = v16::ErasStakersClipped::<T>::iter_keys()
.map(|(k1, _k2)| k1)
.collect::<Vec<_>>();
eras_stakers_clipped_keys.dedup();
for k in eras_stakers_clipped_keys {
let mut removal_result =
v16::ErasStakersClipped::<T>::clear_prefix(k, u32::max_value(), None);
while let Some(next_cursor) = removal_result.maybe_cursor {
removal_result = v16::ErasStakersClipped::<T>::clear_prefix(
k,
u32::max_value(),
Some(&next_cursor[..]),
);
}
}
Contributor:

Suggested change
let mut eras_stakers_clipped_keys = v16::ErasStakersClipped::<T>::iter_keys()
.map(|(k1, _k2)| k1)
.collect::<Vec<_>>();
eras_stakers_clipped_keys.dedup();
for k in eras_stakers_clipped_keys {
let mut removal_result =
v16::ErasStakersClipped::<T>::clear_prefix(k, u32::max_value(), None);
while let Some(next_cursor) = removal_result.maybe_cursor {
removal_result = v16::ErasStakersClipped::<T>::clear_prefix(
k,
u32::max_value(),
Some(&next_cursor[..]),
);
}
}
v16::ErasStakersClipped::<T>::clear(u32::max_value(), None);

} else {
log!(info, "v17 applied successfully.");
}
T::DbWeight::get().reads_writes(1, 1)
Contributor:

We can count the number of operations as we do them.
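One way to do that, sketched with a stand-in Weight type (the real migration would use frame_support's Weight and T::DbWeight::get(); the entry counts and method names here are illustrative): accumulate reads and writes as the migration clears entries, instead of returning a fixed reads_writes(1, 1).

```rust
// Stand-in for frame_support's Weight, tracking only read/write counts.
#[derive(Default, Clone, Copy, Debug, PartialEq)]
struct Weight {
    reads: u64,
    writes: u64,
}

impl Weight {
    fn accrue_reads(&mut self, n: u64) {
        self.reads = self.reads.saturating_add(n);
    }
    fn accrue_writes(&mut self, n: u64) {
        self.writes = self.writes.saturating_add(n);
    }
}

// Hypothetical migration body: charge for each operation as it happens.
fn migrate(num_cleared_entries: u64) -> Weight {
    let mut weight = Weight::default();
    // One read for the on-chain storage version check.
    weight.accrue_reads(1);
    // Each cleared entry costs one read and one write.
    weight.accrue_reads(num_cleared_entries);
    weight.accrue_writes(num_cleared_entries);
    // One final write bumping the storage version.
    weight.accrue_writes(1);
    weight
}

fn main() {
    assert_eq!(migrate(10), Weight { reads: 11, writes: 11 });
}
```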

Labels
T1-FRAME This PR/Issue is related to core FRAME, the framework. T2-pallets This PR/Issue is related to a particular pallet.
Projects
Status: In review
Status: Scheduled

6 participants