perf: simplify cachekv store with copy-on-write btree #14350

yihuang · 2022-12-17T16:03:49Z

Description

This PR fixes the cachekv deadlock issue while keeping it thread-safe.

One tricky part of cachekv store is to handle modification while iteration in a thread-safe way, previously it separates unsorted and sorted cache, and sync them when starting iteration, which is complicated and leads to deadlock issues.

This PR unifies the cache using tidwall/btree, and leverage the copy-on-write feature to do iteration on an isolated view, which allows safe modifications while iteration.

This PR is consensus-breaking because in previous version, iteration is not done solely on isolated view, it also checks the store.deleted.

Benchmarks

TODO

Some performance regression is expected in cases iterations are not used, because btree is slower than golang map (see difference here).
But iteration would be faster, because it don't need to sort the cache.

Alternative

One alternative is still using the unsorted cache trick, we can still fix the concurrency/deadlock issues with the copy-on-write btree.

Author Checklist

All items are required. Please add a note to the item if the item is not applicable and
please add links to any relevant follow up issues.

I have...

Reviewers Checklist

All items are required. Please add a note if the item is not applicable and please add
your handle next to the items reviewed if you only reviewed selected items.

I have...

confirmed the correct type prefix in the PR title
confirmed ! in the type prefix if API or client breaking change
confirmed all author checklist items have been addressed
reviewed state machine logic
reviewed API design and naming
reviewed documentation is accurate
reviewed tests and test coverage
manually tested (if applicable)

This PR fixes the cachekv deadlock issue while keeping it thread-safe. One tricky part of cachekv store is to handle modification while iteration in a thread-safe way, previously it separates unsorted and sorted cache, and sync them when starting iteration, which is complicated and leads to deadlock issues. This PR unifies the cache using tidwall/btree, and leverage the copy-on-write feature to do iteration on an isolated view, which allows safe modifications while iteration. This PR is consensus-breaking because in previous version, iteration is not done solely on isolated view, it also checks the `store.deleted`.

CHANGELOG.md

alexanderbez · 2022-12-19T16:29:27Z

Just to understand what's going on here, because there isn't a lot of discussion and issues on the CacheKVStore (which I really really prefer to create issues prior to opening PRs).

So what I'd like more clarity on is the difference between this PR and the original PR?

store/cachekv/internal/memcache.go

alexanderbez · 2022-12-19T16:33:16Z

store/cachekv/internal/memcache.go

+// item represents a cached key-value pair and the entry of the cache btree.
+// If dirty is true, it indicates the cached value is newly set, maybe different from the underlying value.
+type item struct {
+	key   []byte
+	value []byte
+	dirty bool
+}
+
+// byKeys compares the items by key
+func byKeys(a, b item) bool {
+	return bytes.Compare(a.key, b.key) == -1
+}


nit: move these types to the top

yihuang · 2022-12-20T02:41:41Z

Just to understand what's going on here, because there isn't a lot of discussion and issues on the CacheKVStore (which I really really prefer to create issues prior to opening PRs).

So what I'd like more clarity on is the difference between this PR and the original PR?

Yeah I should've explained clearer, it starts with optimizing cachekv store for nested cases in #13881, then it end up with replacing MemDB with tidwall/btree to optimize more general cases, then deadlock issues happened which sparks the thread-safety discussions of cachekv store, eventually we find the current version already has the deadlock issue, so we eventually removed the btree lock in #13881 as a fix of the deadlock issue, but it's not thread-safe anymore.
After that, when I read more about tidwall/btree, I find that it support copy-on-write, I realize it can fix the deadlock issue while being thread-safe, thus this PR.

angbrav · 2022-12-20T16:29:30Z

store/cachekv/internal/memcache.go

+}
+
+// Copy the cache. This is a copy-on-write operation and is very fast because
+// it only performs a shadowed copy.


Can you please explain a bit how this work? I understand that the copy is not done until the btree is modified. This means either the original or the copy?

https://github.com/tidwall/btree/blob/master/btreeg.go#L296
I think the writer will check a seq number on nodes and do the copy when needed.

angbrav

I think this PR addresses too many things:

Simplifies cachekv a lot, especially iterators.
Gets rid of locks.
Allows for nested, concurrent iterators.
Allows for modifications while iterating.

I would rather do this iteratively. Also, as suggested by @alexanderbez I would rather create issues to discuss the problems and possible solutions before PRs. Things that I would like to have better understanding:

How important is allowing modifications while iterating? This seems to have a significant impact on performance. First, the cache size is unbonded (and it has to be). Thus, copying the entire cache when modifying while iterating could be really expensive. Second, the solution favors iterators' performance over simple get/set operations. Is this justified?
Are we sure that removing locks is safe?

angbrav · 2022-12-20T16:30:24Z

store/cachekv/store.go

 	}

+	value := store.parent.Get(key)
+	store.setCacheValue(key, value, false)


What happens if the key is not in the parent store?

It'll cache the nil value.

Would make sense to avoid caching the nil value by checking the value returned by store.parent.Get(key)?

I think the non-existence result is cacheable as well?

Makes sense, thanks!

angbrav · 2022-12-20T16:53:56Z

store/cachekv/internal/memcache.go

+const (
+	// The approximate number of items and children per B-tree node. Tuned with benchmarks.
+	// copied from memdb.
+	bTreeDegree = 32


What is the impact of misconfiguring this?

I believe it only affect performance

yihuang · 2022-12-20T17:02:14Z

Are we sure that removing locks is safe?

Actually the lock is removed in the previous PR #13881 to address the deadlock issues in released version.
This PR adds locks back in a safe way. There are locks inside the tidwal/btree, and we enabled it in the constructor.

EDIT: I guess you mean the mutex in the store, yeah, it seems it should not be removed. Although both parent and btree are thread-safe, but some operations on store sre supposed to be atomic.
I'll mark the PR as draft for now and make the change later.

Co-authored-by: Aleksandr Bezobchuk <[email protected]>

yihuang · 2022-12-20T17:27:58Z

Second, the solution favors iterators' performance over simple get/set operations. Is this justified?

I'm not sure, I guess it depends on how much slower it is, and do we want to pay the little price for simpler logic and thread-safety. Need benchmarks to justify it.

angbrav · 2022-12-20T17:50:15Z

EDIT: I guess you mean the mutex in the store, yeah, it seems it should not be removed. Although both parent and btree are thread-safe, but some operations on store sre supposed to be atomic. I'll mark the PR as draft for now and make the change later.

Yeah, I meant the mutex in the store. It may not be needed, but removing it makes cachekv non-thread-safe.

yihuang · 2022-12-20T17:51:53Z

I think I'll make a smaller PR later to fix the thread-safety issue only, without the controversial refactoring part, closing for now, thanks for reviewing ;)

yihuang requested a review from a team as a code owner December 17, 2022 16:03

github-actions bot added the C:Store label Dec 17, 2022

yihuang commented Dec 17, 2022

View reviewed changes

CHANGELOG.md Outdated Show resolved Hide resolved

Update CHANGELOG.md

bae4ee3

yihuang mentioned this pull request Dec 17, 2022

docs: Spec on current cachekv implementation #13977

Merged

13 tasks

alexanderbez requested review from angbrav and alexanderbez December 19, 2022 15:28

julienrbrt assigned angbrav and alexanderbez Dec 19, 2022

alexanderbez approved these changes Dec 19, 2022

View reviewed changes

angbrav reviewed Dec 20, 2022

View reviewed changes

yihuang marked this pull request as draft December 20, 2022 17:18

Update store/cachekv/internal/memcache.go

8fd7011

Co-authored-by: Aleksandr Bezobchuk <[email protected]>

yihuang closed this Dec 20, 2022

yihuang deleted the copy-cache-store branch December 21, 2022 15:22

nicolaslara added a commit to osmosis-labs/cosmos-sdk that referenced this pull request Apr 26, 2023

backport cosmos#14350

ac286eb

nicolaslara mentioned this pull request Apr 26, 2023

Backport cachekv deadlock fix osmosis-labs/cosmos-sdk#434

Closed

nicolaslara added a commit to osmosis-labs/cosmos-sdk that referenced this pull request Apr 26, 2023

backport cosmos#14350

d054a71

nicolaslara mentioned this pull request Apr 26, 2023

backport cachekv iterator fix osmosis-labs/cosmos-sdk#435

Closed

nicolaslara added a commit to osmosis-labs/cosmos-sdk that referenced this pull request Apr 26, 2023

backport cosmos#14350

a412f86

nicolaslara mentioned this pull request Apr 26, 2023

Backport CacheKV iterator fix v15 osmosis-labs/cosmos-sdk#436

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

perf: simplify cachekv store with copy-on-write btree #14350

perf: simplify cachekv store with copy-on-write btree #14350

yihuang commented Dec 17, 2022 •

edited

Loading

alexanderbez commented Dec 19, 2022

alexanderbez Dec 19, 2022

yihuang commented Dec 20, 2022 •

edited

Loading

angbrav Dec 20, 2022

yihuang Dec 20, 2022

angbrav left a comment •

edited

Loading

angbrav Dec 20, 2022

yihuang Dec 20, 2022

angbrav Dec 20, 2022

yihuang Dec 20, 2022 •

edited

Loading

angbrav Dec 20, 2022

angbrav Dec 20, 2022

yihuang Dec 20, 2022

yihuang commented Dec 20, 2022 •

edited

Loading

yihuang commented Dec 20, 2022

angbrav commented Dec 20, 2022 •

edited

Loading

yihuang commented Dec 20, 2022

perf: simplify cachekv store with copy-on-write btree #14350

perf: simplify cachekv store with copy-on-write btree #14350

Conversation

yihuang commented Dec 17, 2022 • edited Loading

Description

Benchmarks

TODO

Alternative

Author Checklist

Reviewers Checklist

alexanderbez commented Dec 19, 2022

Choose a reason for hiding this comment

yihuang commented Dec 20, 2022 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

angbrav left a comment • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

yihuang Dec 20, 2022 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

yihuang commented Dec 20, 2022 • edited Loading

yihuang commented Dec 20, 2022

angbrav commented Dec 20, 2022 • edited Loading

yihuang commented Dec 20, 2022

yihuang commented Dec 17, 2022 •

edited

Loading

yihuang commented Dec 20, 2022 •

edited

Loading

angbrav left a comment •

edited

Loading

yihuang Dec 20, 2022 •

edited

Loading

yihuang commented Dec 20, 2022 •

edited

Loading

angbrav commented Dec 20, 2022 •

edited

Loading