panic: loading latest version: failed to load latest version: version of store qgb mismatch root store's version; expected 1 got 0 #3462

Closed
rootulp opened this issue May 10, 2024 · 9 comments · Fixed by #3477
Labels
bug Something isn't working WS: Maintenance 🔧 includes bugs, refactors, flakes, and tech debt etc WS: V2 ✌️ lemongrass hardfork related

rootulp commented May 10, 2024

Problem

On current main, I can't start a node, stop it, and start it again. The restart panics:

$ celestia-appd start
1:02PM INF starting node with ABCI Tendermint in-process
1:02PM INF service start impl=multiAppConn module=proxy msg={}
1:02PM INF service start connection=query impl=localClient module=abci-client msg={}
1:02PM INF service start connection=snapshot impl=localClient module=abci-client msg={}
1:02PM INF service start connection=mempool impl=localClient module=abci-client msg={}
1:02PM INF service start connection=consensus impl=localClient module=abci-client msg={}
1:02PM INF service start impl=EventBus module=events msg={}
1:02PM INF service start impl=PubSub module=pubsub msg={}
1:02PM INF service start impl=IndexerService module=txindex msg={}
panic: loading latest version: failed to load latest version: version of store qgb mismatch root store's version; expected 1 got 0

goroutine 8 [running]:
github.com/celestiaorg/celestia-app/v2/app.(*App).Info(0x140012ba008, {{0x1041410ea?, 0x14001033748?}, 0x1027e7ab4?, 0x0?})
	/Users/rootulp/git/rootulp/celestiaorg/celestia-app/app/app.go:528 +0x370
github.com/tendermint/tendermint/abci/client.(*localClient).InfoSync(0x14000113020, {{0x1041410ea?, 0x14001033878?}, 0x1027dc1e8?, 0x10413f3f6?})
	/Users/rootulp/go/pkg/mod/github.com/celestiaorg/[email protected]/abci/client/local_client.go:250 +0x10c
github.com/tendermint/tendermint/proxy.(*appConnQuery).InfoSync(0x14001728560?, {{0x1041410ea?, 0x40?}, 0x12dfdce68?, 0x70?})
	/Users/rootulp/go/pkg/mod/github.com/celestiaorg/[email protected]/proxy/app_conn.go:170 +0x2c
github.com/tendermint/tendermint/consensus.(*Handshaker).HandshakeWithContext(0x14001033c90, {0x105375888, 0x1067bb940}, {0x10538a4f0, 0x140012ad860})
	/Users/rootulp/go/pkg/mod/github.com/celestiaorg/[email protected]/consensus/replay.go:250 +0x7c
github.com/tendermint/tendermint/consensus.(*Handshaker).Handshake(...)
	/Users/rootulp/go/pkg/mod/github.com/celestiaorg/[email protected]/consensus/replay.go:243
github.com/tendermint/tendermint/node.doHandshake({_, _}, {_, _}, {{{0xb, 0x2}, {0x1400169af30, 0x7}}, {0x1400169af37, 0x7}, ...}, ...)
	/Users/rootulp/go/pkg/mod/github.com/celestiaorg/[email protected]/node/node.go:337 +0x124
github.com/tendermint/tendermint/node.NewNode(0x140010fde00, {0x105369730, 0x14001740320}, 0x1400113ff10, {0x105353140, 0x1400066f680}, 0x14001035970, 0x1053457b8, 0x14001035960, {0x105375690, ...}, ...)
	/Users/rootulp/go/pkg/mod/github.com/celestiaorg/[email protected]/node/node.go:831 +0x4f8
github.com/cosmos/cosmos-sdk/server.startInProcess(_, {{0x0, 0x0, 0x0}, {0x105396b58, 0x140013fb650}, 0x0, {0x140013ef018, 0x7}, {0x10538dec0, ...}, ...}, ...)
	/Users/rootulp/go/pkg/mod/github.com/celestiaorg/[email protected]/server/start.go:301 +0x5a0
github.com/cosmos/cosmos-sdk/server.StartCmd.func2.2()
	/Users/rootulp/go/pkg/mod/github.com/celestiaorg/[email protected]/server/start.go:147 +0x48
github.com/cosmos/cosmos-sdk/server.wrapCPUProfile.func2()
	/Users/rootulp/go/pkg/mod/github.com/celestiaorg/[email protected]/server/start.go:535 +0x30
created by github.com/cosmos/cosmos-sdk/server.wrapCPUProfile in goroutine 1
	/Users/rootulp/go/pkg/mod/github.com/celestiaorg/[email protected]/server/start.go:534 +0x1d0
@rootulp rootulp added bug Something isn't working WS: Maintenance 🔧 includes bugs, refactors, flakes, and tech debt etc WS: V2 ✌️ lemongrass hardfork related labels May 10, 2024

rootulp commented May 10, 2024

We likely would have caught the PR that introduced this regression if we had done #3337.

rootulp commented May 10, 2024

I repro'ed this behavior with d0d2a3f

rootulp commented May 10, 2024

I'm on block height 4 and it looks like the qgb and/or blob substore has a commit ID of 0 when it should have a commit ID of 4.

$ celestia-appd start
mounting KVStoreKey{0x1400038fbc0, params}
loadVersion 4
storesKeys [KVStoreKey{0x1400038fbc0, params} TransientStoreKey{0x1400038fc90, transient_params} MemoryStoreKey{0x1400038fcb0, mem_capability}]
key KVStoreKey{0x1400038fbc0, params}commitID CommitID{[40 73 237 50 132 157 113 185 147 198 212 66 229 192 193 35 233 56 36 185 148 88 177 189 89 116 223 221 222 46 185 153]:4}
key TransientStoreKey{0x1400038fc90, transient_params}commitID CommitID{[]:0}
key MemoryStoreKey{0x1400038fcb0, mem_capability}commitID CommitID{[]:0}
2:17PM INF starting node with ABCI Tendermint in-process
2:17PM INF service start impl=multiAppConn module=proxy msg={}
2:17PM INF service start connection=query impl=localClient module=abci-client msg={}
2:17PM INF service start connection=snapshot impl=localClient module=abci-client msg={}
2:17PM INF service start connection=mempool impl=localClient module=abci-client msg={}
2:17PM INF service start connection=consensus impl=localClient module=abci-client msg={}
2:17PM INF service start impl=EventBus module=events msg={}
2:17PM INF service start impl=PubSub module=pubsub msg={}
2:17PM INF service start impl=IndexerService module=txindex msg={}
2:17PM INF Inside setCheckState height 4
2:17PM INF app.checkState &{{0x14000423980 map[0x1400038fbc0:0x14000423a80 0x1400038fcb0:0x14000423b40 0x1400038fc90:0x14000423ac0] map[mem_capability:0x1400038fcb0 params:0x1400038fbc0 transient_params:0x1400038fc90] <nil> map[]} {{{}} {0x14000423980 map[0x1400038fbc0:0x14000423a80 0x1400038fcb0:0x14000423b40 0x1400038fc90:0x14000423ac0] map[mem_capability:0x1400038fcb0 params:0x1400038fbc0 transient_params:0x1400038fc90] <nil> map[]} {{0 0}  4 {0 0 <nil>} {[] {0 []}} [] [] [] [] [] [] [] [] []} []  [] {{{{0x140001ac028 false  [] [] [] <nil> <nil> <nil> <nil> <nil> <nil> <nil> <nil> <nil> <nil>}} 1 <nil> [123] [{}] false <nil>}} [] 0x14001803aa0 <nil> true false [{utia {0x1400164bf40}}] <nil> 0x14000632240 0 {1000 1000 1000 3 2000 30 30} {100 100 100 0 200 3 3}}}
resp.AppVersion 1
got: map[acc:KVStoreKey{0x1400038fb40, acc} authz:KVStoreKey{0x1400038fb50, authz} bank:KVStoreKey{0x1400038fb60, bank} blob:KVStoreKey{0x1400038fc70, blob} capability:KVStoreKey{0x1400038fc00, capability} distribution:KVStoreKey{0x1400038fb90, distribution} evidence:KVStoreKey{0x1400038fbf0, evidence} feegrant:KVStoreKey{0x1400038fbe0, feegrant} gov:KVStoreKey{0x1400038fbb0, gov} ibc:KVStoreKey{0x1400038fc30, ibc} mint:KVStoreKey{0x1400038fb80, mint} qgb:KVStoreKey{0x1400038fc10, qgb} slashing:KVStoreKey{0x1400038fba0, slashing} staking:KVStoreKey{0x1400038fb70, staking} transfer:KVStoreKey{0x1400038fc20, transfer} upgrade:KVStoreKey{0x1400038fbd0, upgrade}]mounting KVStoreKey{0x1400038fc30, ibc}
mounting KVStoreKey{0x1400038fba0, slashing}
mounting KVStoreKey{0x1400038fb50, authz}
mounting KVStoreKey{0x1400038fbb0, gov}
mounting KVStoreKey{0x1400038fc20, transfer}
mounting KVStoreKey{0x1400038fb70, staking}
mounting KVStoreKey{0x1400038fbd0, upgrade}
mounting KVStoreKey{0x1400038fb60, bank}
mounting KVStoreKey{0x1400038fc70, blob}
mounting KVStoreKey{0x1400038fc00, capability}
mounting KVStoreKey{0x1400038fb90, distribution}
mounting KVStoreKey{0x1400038fbe0, feegrant}
mounting KVStoreKey{0x1400038fb40, acc}
mounting KVStoreKey{0x1400038fc10, qgb}
mounting KVStoreKey{0x1400038fbf0, evidence}
mounting KVStoreKey{0x1400038fb80, mint}
latestVersion 4
loadVersion 4
storesKeys [KVStoreKey{0x1400038fc20, transfer} KVStoreKey{0x1400038fbd0, upgrade} KVStoreKey{0x1400038fb40, acc} KVStoreKey{0x1400038fc10, qgb} KVStoreKey{0x1400038fb80, mint} KVStoreKey{0x1400038fc30, ibc} KVStoreKey{0x1400038fb50, authz} KVStoreKey{0x1400038fc00, capability} KVStoreKey{0x1400038fbe0, feegrant} KVStoreKey{0x1400038fbf0, evidence} MemoryStoreKey{0x1400038fcb0, mem_capability} KVStoreKey{0x1400038fbb0, gov} KVStoreKey{0x1400038fb70, staking} KVStoreKey{0x1400038fc70, blob} KVStoreKey{0x1400038fb90, distribution} KVStoreKey{0x1400038fbc0, params} TransientStoreKey{0x1400038fc90, transient_params} KVStoreKey{0x1400038fba0, slashing} KVStoreKey{0x1400038fb60, bank}]
key KVStoreKey{0x1400038fc20, transfer}commitID CommitID{[37 154 206 51 243 216 25 253 156 199 81 150 140 200 61 80 58 146 232 167 230 253 94 50 152 6 130 73 86 95 215 6]:4}
key KVStoreKey{0x1400038fbd0, upgrade}commitID CommitID{[210 140 218 250 41 229 225 106 36 24 184 61 250 194 9 132 254 175 244 106 139 250 172 251 242 208 27 123 92 234 223 210]:4}
key KVStoreKey{0x1400038fb40, acc}commitID CommitID{[42 211 202 73 49 171 10 100 3 156 16 170 43 44 124 98 35 49 66 58 151 253 161 24 143 247 144 47 12 129 248 122]:4}
key KVStoreKey{0x1400038fc10, qgb}commitID CommitID{[]:0}
panic: loading latest version: failed to load latest version: version of store qgb mismatch root store's version; expected 4 got 0
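
The panic above comes from a consistency check: each persistent substore is expected to report the same version as the root store. Below is a minimal Go sketch of that kind of check. The `commitID` type and the `checkStoreVersions` function are hypothetical stand-ins for illustration, not the cosmos-sdk implementation. Note that the log shows the transient and memory stores at version 0 without an error, so the real check evidently treats those differently; this sketch ignores that distinction.

```go
package main

import (
	"fmt"
	"sort"
)

// commitID is a hypothetical, simplified stand-in for a store's commit ID.
// Only the version matters for this sketch.
type commitID struct {
	Version int64
}

// checkStoreVersions returns an error if any store's committed version
// differs from the root store's version. Names are visited in sorted order
// so that the first reported mismatch is deterministic.
func checkStoreVersions(rootVersion int64, stores map[string]commitID) error {
	names := make([]string, 0, len(stores))
	for name := range stores {
		names = append(names, name)
	}
	sort.Strings(names)
	for _, name := range names {
		if got := stores[name].Version; got != rootVersion {
			return fmt.Errorf("version of store %s mismatch root store's version; expected %d got %d",
				name, rootVersion, got)
		}
	}
	return nil
}

func main() {
	// Mirrors the log above: qgb was never committed, so its version is 0
	// while the root store is at height 4.
	stores := map[string]commitID{
		"acc":      {Version: 4},
		"transfer": {Version: 4},
		"upgrade":  {Version: 4},
		"qgb":      {Version: 0},
	}
	if err := checkStoreVersions(4, stores); err != nil {
		fmt.Println(err)
	}
}
```

With the sample data, the only mismatching store is `qgb`, which matches the panic message in the log.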

rootulp commented May 10, 2024

A few stores are missing from this list:

calling commitStores with version 2 and stores map[
    KVStoreKey{0x1400118a560, acc}:0x1400000fb90 
    KVStoreKey{0x1400118a570, authz}:0x14000180ee8 
    KVStoreKey{0x1400118a580, bank}:0x1400000efd8 
    KVStoreKey{0x1400118a590, staking}:0x1400000fda0 
    KVStoreKey{0x1400118a5a0, mint}:0x14000181848 
    KVStoreKey{0x1400118a5b0, distribution}:0x140001812a8 
    KVStoreKey{0x1400118a5c0, slashing}:0x1400000e9a8 
    KVStoreKey{0x1400118a5d0, gov}:0x14000181668 
    KVStoreKey{0x1400118a5e0, params}:0x140011a8618 
    KVStoreKey{0x1400118a5f0, upgrade}:0x1400000e2a0 
    KVStoreKey{0x1400118a600, feegrant}:0x1400000f350 
    KVStoreKey{0x1400118a610, evidence}:0x140001810c8 
    KVStoreKey{0x1400118a620, capability}:0x14000181c08 
    KVStoreKey{0x1400118a640, transfer}:0x14000181488 
    KVStoreKey{0x1400118a650, ibc}:0x14000181de8 
    KVStoreKey{0x1400118a660, packetfowardmiddleware}:0x1400000f908 
    KVStoreKey{0x1400118a670, icahost}:0x14000181a28 
    KVStoreKey{0x1400118a680, signal}:0x1400000f698 
    MemoryStoreKey{0x1400118a6d0, mem_capability}:0x140011af9a0 
    TransientStoreKey{0x1400118a6b0, transient_params}:0x140011afe30]

Notably, blob and qgb are missing.

rootulp commented May 10, 2024

#3320 missed adding the blob key for v2. The qgb key should be absent from v2.
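
One way to picture the blob key fix is a table that maps each store name to the app versions in which it should be mounted. The Go sketch below is an illustration only: `versionedStoreKeys` and `keysForVersion` are hypothetical names, and the table lists just a few modules, based on what this thread shows (blob in v1 and v2, qgb in v1 only, signal in v2 only).

```go
package main

import (
	"fmt"
	"sort"
)

// versionedStoreKeys is a hypothetical table: store name -> app versions in
// which that store should be mounted. Only a few modules are listed.
var versionedStoreKeys = map[string][]uint64{
	"bank":   {1, 2},
	"blob":   {1, 2}, // the entry that was missing for v2
	"qgb":    {1},    // intentionally absent from v2
	"signal": {2},
}

// keysForVersion returns the sorted store names to mount for appVersion.
func keysForVersion(appVersion uint64) []string {
	var names []string
	for name, versions := range versionedStoreKeys {
		for _, v := range versions {
			if v == appVersion {
				names = append(names, name)
				break
			}
		}
	}
	sort.Strings(names)
	return names
}

func main() {
	fmt.Println(keysForVersion(1)) // [bank blob qgb]
	fmt.Println(keysForVersion(2)) // [bank blob signal]
}
```

If the node mounts the v1 set against a database written by a v2 chain, it asks for a `qgb` store that was never committed, which is the mismatch reported in the panic.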

rootulp commented May 10, 2024

Okay, after fixing the blob key issue, this is still broken because https://github.com/rootulp/celestia-app/blob/d7b03f260b3bde92cb661454004822f1daff8135/app/app.go#L523 returns res.AppVersion=1, even though my local devnet started on v2, stopped on v2, and is now being restarted. Since the app thinks it's on AppVersion=1, it tries to register store keys for QGB, which it shouldn't.

@rootulp rootulp self-assigned this May 10, 2024

rootulp commented May 12, 2024

The app version in the param store isn't actually set yet

got app version from param store 0

even though it should be v1.

rootulp commented May 13, 2024

On the main branch and rp/fix-export branch I can only repro this bug if the chain starts at app version 2.

If the chain starts at app version 1 and then upgrades to v2 via an upgrade height, restarting works.

#3405 is broken for chains that start on app version 2.

rootulp commented May 13, 2024

This goes all the way back to celestiaorg/cosmos-sdk#347: InitChain does not actually persist the v2 app version in consensus params. The fix is to set the consensus params in InitChain for v2:

// InitChain implements the ABCI interface. This method is a wrapper around
// baseapp's InitChain so we can take the app version and set up the
// multi-commit store.
//
// Side-effect: calls baseapp.Init()
func (app *App) InitChain(req abci.RequestInitChain) (res abci.ResponseInitChain) {
	// The genesis must always contain the consensus params. The validator set, however, is derived
	// from the initial genesis state. The genesis must always contain a non-zero app version, which
	// is the initial version that the chain starts on.
	if req.ConsensusParams == nil || req.ConsensusParams.Version == nil {
		panic("no consensus params set")
	}
	if req.ConsensusParams.Version.AppVersion == 0 {
		panic("app version 0 is not accepted. Please set an app version in the genesis")
	}
	appVersion := req.ConsensusParams.Version.AppVersion

	// mount the stores for the provided app version if it has not already been mounted
	if app.AppVersion() == 0 && !app.IsSealed() {
		app.mountKeysAndInit(appVersion)
	}

	res = app.BaseApp.InitChain(req)

	ctx := app.NewContext(false, tmproto.Header{})
	if appVersion != v1 {
		// set the initial app version in the consensus params
		app.SetInitialAppVersionInConsensusParams(ctx, appVersion)
		app.SetAppVersion(ctx, appVersion)
	}
	return res
}
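
The effect of this change on a restart can be sketched with a toy model in Go. Everything here is hypothetical and simplified: `persisted` stands in for state that survives a restart, and `initChain`, `upgrade`, and `versionOnRestart` only mimic the behavior described in this thread (a stored app version of 0 falls back to 1). It is not the celestia-app or cosmos-sdk code.

```go
package main

import "fmt"

// persisted is a hypothetical stand-in for on-disk state that survives a
// restart. AppVersion == 0 means "never written".
type persisted struct {
	AppVersion uint64
}

// initChain models genesis. With fixed == false it mimics the behavior
// described in this issue: the genesis app version is never written.
// With fixed == true, versions other than 1 are written, matching the
// `appVersion != v1` branch above.
func initChain(db *persisted, genesisVersion uint64, fixed bool) {
	if fixed && genesisVersion != 1 {
		db.AppVersion = genesisVersion
	}
}

// upgrade models an in-protocol upgrade, which does write the new version.
func upgrade(db *persisted, newVersion uint64) {
	db.AppVersion = newVersion
}

// versionOnRestart models what the node believes after a restart: a stored
// value of 0 falls back to app version 1.
func versionOnRestart(db *persisted) uint64 {
	if db.AppVersion == 0 {
		return 1
	}
	return db.AppVersion
}

func main() {
	// Chain that starts at v2, before the fix: restart wrongly reports v1.
	buggy := &persisted{}
	initChain(buggy, 2, false)
	fmt.Println("genesis v2, unfixed:", versionOnRestart(buggy))

	// Chain that starts at v1 and upgrades to v2: restart reports v2.
	upgraded := &persisted{}
	initChain(upgraded, 1, false)
	upgrade(upgraded, 2)
	fmt.Println("genesis v1 + upgrade:", versionOnRestart(upgraded))

	// Chain that starts at v2, after the fix: restart reports v2.
	fixedChain := &persisted{}
	initChain(fixedChain, 2, true)
	fmt.Println("genesis v2, fixed:", versionOnRestart(fixedChain))
}
```

In this model only the unfixed chain that starts at v2 comes back as v1 after a restart, which is consistent with the bug reproducing only for chains whose genesis app version is 2.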

rootulp added a commit that referenced this issue May 14, 2024
Closes #3462

Previously, `InitChain` for a genesis with app version 2 wouldn't save
the app version to consensus params. This means that if the node stopped
and started up again, it would fetch `0` from consensus params and
default to using app version 1 because of [these
lines](https://github.com/rootulp/celestia-app/blob/dd6b188c0553d7cdd81161c62558db88a353d173/app/app.go#L520-L525).

The fix in this PR is to save app version 2 and above to consensus
params in `InitGenesis`.

## Testing


```shell
./scripts/single-node.sh

# wait a block then stop the node with CTRL + C

celestia-appd start # works and doesn't panic
```
0xchainlover pushed a commit to celestia-org/celestia-app that referenced this issue Aug 1, 2024
Closes celestiaorg/celestia-app#3462
