Release 5.0.5 (#895)
* changelog

* hardcode test upgrade the previous version on mainnet
0o-de-lally authored Dec 11, 2021
1 parent 9d34378 commit ba3c7d6
Showing 3 changed files with 118 additions and 3 deletions.
116 changes: 116 additions & 0 deletions ol/changelog/5_0_5.md
@@ -0,0 +1,116 @@
## 5.0.5

This upgrade has changes to the network Move bytecode, diem-node JSON RPC, and command-line tools.

TL;DR: Submit an upgrade vote with the hash below, then update `diem-node` to v5.0.5.

```
# do a lazy vote
txs oracle-upgrade -v -h 3f46ba5768165dfb4503e5ab2474c3bbf81cbf0cf2550f7ff5252b15a5b985e4
# install new binaries
cd ~/libra
git checkout v5.0.5 -f
make bins install
<stop diem-node>
<restart diem-node>
```

### Summary

5.0.5 predominantly includes bugfixes to system policies (Move code) which had diverged from the specification.

The upgrade can be applied as a hot upgrade, without halting the network.

The upgrade includes an autonomous state migration, which will occur on-the-fly. THIS IS THE FIRST TIME such a state migration has been attempted on mainnet.

While this upgrade was tested extensively in CI, and went through manual QA with pre-flight checks, THERE IS A NON-ZERO RISK of network halt due to the change in data structures and APIs.

The diem-node is backwards compatible with the current network bytecode (5.0.4).
The Move bytecode is backwards compatible with diem-node 5.0.4.

A hot network upgrade is required before the tools can be used. [Read about network upgrades.](../documentation/network-upgrades/upgrades.md)

The stdlib payload hash for voting is:

3f46ba5768165dfb4503e5ab2474c3bbf81cbf0cf2550f7ff5252b15a5b985e4

You can build and check the hash of stdlib from the project root with: `make stdlib`
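
For an independent check of a locally built payload against the published hash, a minimal sketch follows. It assumes the hash is a SHA-256 digest of the compiled stdlib file (the 64 hex characters are consistent with that, but this changelog does not state the algorithm), and the file path in the usage note is a placeholder.

```python
import hashlib
from pathlib import Path

# Hash published in this changelog for the 5.0.5 stdlib payload.
EXPECTED_HASH = "3f46ba5768165dfb4503e5ab2474c3bbf81cbf0cf2550f7ff5252b15a5b985e4"


def file_digest(path: Path, algorithm: str = "sha256") -> str:
    """Return the hex digest of a file, reading it in chunks."""
    hasher = hashlib.new(algorithm)  # assumption: SHA-256 unless told otherwise
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def matches_expected(path: Path, expected: str = EXPECTED_HASH) -> bool:
    """Compare case-insensitively, ignoring surrounding whitespace."""
    return file_digest(path) == expected.strip().lower()
```

Usage would look like `matches_expected(Path("path/to/compiled/stdlib"))`, where the path is hypothetical and depends on where your build writes the payload. If the result is `False`, do not vote with the hash until the difference is understood.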

### Changes

##### Move Changes
##### - Carpe underpayment
5.0.5 fixes an unfortunate code regression present since mainnet launch: Carpe alpha users have had their Identity Subsidy grossly undercounted, with the amount decreasing each day.

Post-mortem:
An investigation showed that the root cause of the error was that continuous integration tests were presenting a false positive for the behavior of the Move code policy. The relevant counters were not being reset at epoch boundaries in production code, only in test code. This prevented the dev team from catching the issue pre-flight. Additionally, insufficient metrics were being collected until the Web Explorer was developed to demonstrate the divergence in the calculations. It was only when a Carpe user presented the conflicting data that the developers behind the Explorer, Carpe, and Move core could identify the cause. Many thanks to @mannybothanz who correctly identified the issue, @gnudrew25 who rapidly deployed the analytics, @jamesm who identified the source of the bug, and @0o-de-lally for developing the fixes.

Fixes:
This upgrade solves the issue going forward. It does not correct historical account balances that are below what was expected; a future upgrade will do this.

The PR resolves the underpayment (incorrect resetting of proofs per epoch), which caused Carpe miners to receive exponentially smaller rewards with every new epoch.

These changes include modifications to how TowerState collects proof counts, and also require a state migration. The state migration needs to happen at the start of the epoch (after the upgrade writeset takes place).

The state migration is necessary because the TowerState struct will be deprecated in favor of TowerCounter.

https://github.com/OLSF/libra/pull/890
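
To illustrate what an on-the-fly migration at the start of an epoch can look like, here is a minimal run-once sketch. It is not the Move implementation: the changelog does not describe the fields of either struct, so the records are opaque placeholders, the dictionary stands in for on-chain state, and `migrate_once`, `on_new_epoch` and the `convert` callback are hypothetical names.

```python
from typing import Any, Callable, Dict


def migrate_once(
    store: Dict[str, Any],
    old_key: str,
    new_key: str,
    convert: Callable[[Any], Any],
) -> bool:
    """Create the new record from the old one if that has not happened yet."""
    if new_key in store or old_key not in store:
        return False  # already migrated, or nothing to migrate
    store[new_key] = convert(store[old_key])
    return True


def on_new_epoch(store: Dict[str, Any]) -> bool:
    # Called at the start of every epoch; only the first call after the
    # upgrade does any work. The converted payload is a placeholder.
    return migrate_once(
        store, "TowerState", "TowerCounter", lambda old: {"migrated_from": old}
    )
```

The guard makes the migration idempotent, so it is safe to call on every epoch boundary. The old record is left in place here, matching "deprecated" rather than "deleted".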

##### - Epochs validator compliant

This upgrade also solves the issue with rate-limiting of validator account creation being miscounted. There was an edge case where end-users that had a Tower height, and later changed their account type to validator, were able to onboard users before the appropriate amount of time had passed.

The issue was relatively simple: on the epoch boundary, the updating of epoch statistics for Tower was happening for all miners, instead of the subset which were validators. This is a regression, likely due to the inclusion of Carpe use cases and a faulty merge. The tests were also not robust enough for these cases.

In practice, the corner case is a miner (not yet a validator) who builds a tower for n periods and is then upgraded to validator (not the workflow we anticipate for validator onboarding). Instead of epochs_mining_and_validating being 0, it was n.
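
The corner case can be sketched in a few lines. This is an illustration only, assuming a simplified model in which the counter is incremented once per epoch boundary; apart from `epochs_mining_and_validating`, every name here is hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Set


@dataclass
class Tower:
    epochs_mining_and_validating: int = 0


def update_stats_buggy(towers: Dict[str, Tower]) -> None:
    """Regressed behavior: every miner accrues the counter."""
    for tower in towers.values():
        tower.epochs_mining_and_validating += 1


def update_stats_fixed(towers: Dict[str, Tower], validators: Set[str]) -> None:
    """Intended behavior: only current validators accrue the counter."""
    for address, tower in towers.items():
        if address in validators:
            tower.epochs_mining_and_validating += 1
```

With the buggy version, a miner who is not a validator reaches a count of n after n epoch boundaries; with the fixed version the same miner stays at 0 until it actually joins the validator set.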

We take the opportunity in this patch to reset the epoch proof count lazily, instead of iterating through the entire list of miners, for a relatively small benefit.

https://github.com/OLSF/libra/pull/880
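
A lazy reset of the per-epoch proof count can be sketched as follows. This is a simplified model and not the Move code: the field and function names are assumptions. The idea is that each record remembers the epoch it last counted in, so a stale count is discarded when the record is next touched instead of in a loop at the epoch boundary.

```python
from dataclasses import dataclass


@dataclass
class MinerState:
    latest_epoch_mining: int = 0      # hypothetical field name
    count_proofs_in_epoch: int = 0    # hypothetical field name


def record_proof(state: MinerState, current_epoch: int) -> int:
    """Count a proof, first discarding a count left over from an older epoch."""
    if current_epoch > state.latest_epoch_mining:
        state.count_proofs_in_epoch = 0
        state.latest_epoch_mining = current_epoch
    state.count_proofs_in_epoch += 1
    return state.count_proofs_in_epoch


def proofs_in_epoch(state: MinerState, current_epoch: int) -> int:
    """Read path applies the same rule, so stale counts are never reported."""
    if state.latest_epoch_mining != current_epoch:
        return 0
    return state.count_proofs_in_epoch
```

The trade-off is that readers must go through `proofs_in_epoch` (or an equivalent check) rather than reading the raw field, since the stored value may belong to an earlier epoch.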

##### - End-user transfers enabled
As a matter of policy, end-user transfers have been enabled since genesis. However, the code for extracting the "withdrawal capability token" was implementing a deprecated policy from the experimental network (no transfers were possible). The functional tests gave a false positive because the Test settings use a different payment logic.

This patch then allows for transfers from end users in an unrestricted manner. Functional tests were updated. CLI tools were already implemented and are correct. Carpe could also introduce this feature.

https://github.com/OLSF/libra/pull/887

##### Tests

- A functional test was created for the Move tx script.

- An integration test for the command-line tools was created: ./ol/integration-tests/test-tx-tools.mk

- QA of the transfer functionality was conducted on Devnet.
##### Compatibility
The Move framework changes are backwards compatible with `diem-node` from v5.0.1

### Rust changes

#### Diem Node
The `diem-node` service also serves JSON-RPC requests. A new method for RPC requests related to TowerState was developed with the change to lazy computation of the current proofs in epoch.
Queries to that method will return an error if the node serving responses is not on v5.0.5.

All Carpe fullnodes should update to v5.0.5.
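
A client that calls the new method should be prepared for an error reply from nodes that have not upgraded. The sketch below assumes the endpoint follows JSON-RPC 2.0 framing; the method name used in the usage note is a placeholder, since this changelog does not name the method, and `RpcError` is a hypothetical helper type.

```python
import json
from typing import Any, List


class RpcError(Exception):
    """Raised when the node answers with a JSON-RPC error object."""


def build_request(method: str, params: List[Any], request_id: int = 1) -> str:
    """Serialize a JSON-RPC 2.0 request body."""
    return json.dumps(
        {"jsonrpc": "2.0", "method": method, "params": params, "id": request_id}
    )


def parse_response(raw: str) -> Any:
    """Return the result, or raise RpcError if the node reported an error."""
    reply = json.loads(raw)
    if "error" in reply:
        err = reply["error"]
        raise RpcError(f"{err.get('code')}: {err.get('message')}")
    return reply["result"]
```

For example, `build_request("example_tower_method", ["<account address>"])` produces the body to POST, and catching `RpcError` lets a tool fall back or prompt the operator to point at a v5.0.5 node.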


NOTE: Previous versions (e.g. v5.0.2) contain critical upgrades to `diem-node`. If you have not yet applied them, it is safe to skip 5.0.2 and go directly to 5.0.5.



# Preflight checks on Rex
```
- [x] ol: web monitor starts
- [x] tower: miner tx submit
- [x] txs: set community wallet
- [x] confirm autopay values
- [x] txs: create end user account "eve"
- [x] txs: eve submits miner proof
- [x] epoch change
- [x] txs: send stdlib upgrade tx
- [x] second epoch change after upgrade vote
- [x] stdlib upgrade
- [x] txs: validator onboarding
```
3 changes: 1 addition & 2 deletions ol/documentation/network-upgrades/pre-flight-checks.md
@@ -2,7 +2,6 @@ Each Network upgrade is previously tested in Devnet (aka Rex net). The following

- [ ] ol: web monitor starts
- [ ] tower: miner tx submit
- [ ] ol: user account creation
- [ ] txs: set community wallet
- [ ] confirm autopay values
- [ ] txs: create end user account "eve"
@@ -11,4 +10,4 @@ Each Network upgrade is previously tested in Devnet (aka Rex net). The following
- [ ] txs: send stdlib upgrade tx
- [ ] second epoch change after upgrade vote
- [ ] stdlib upgrade
- [ ] txs: validator onboarding
- [ ] txs: validator onboarding
2 changes: 1 addition & 1 deletion ol/integration-tests/test-upgrade.mk
@@ -31,7 +31,7 @@ NUM_NODES = 2

ifndef PREV_VERSION
#TODO: decide how to programmatically tell the tests what version is in production.
PREV_VERSION = $(shell git branch --show-current)
PREV_VERSION = v5.0.4
endif

ifndef BRANCH_NAME