crimson/common: don't assume pointer-from-SharedLRU can't outlive it.
Initially, we were assuming that no pointer obtained from SharedLRU
can outlive the lru itself. However, since going with the interruption
concept for handling shutdowns, this is no longer valid.
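
The lifetime issue described above can be sketched in isolation. The snippet below is a minimal illustration and not the actual crimson code: `TinyCache`, `get_or_create`, `tracked` and `demo_outlives_cache` are hypothetical names. It assumes values are handed out as `std::shared_ptr` with the default deleter, which never calls back into the cache. Under those assumptions, a pointer obtained from the cache stays valid after the cache is destroyed, provided the destructor drops its weak bookkeeping instead of asserting that it is empty.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <utility>

// Hypothetical, heavily simplified stand-in for a shared cache. This is NOT
// the real crimson SharedLRU, only an illustration of the lifetime issue.
template <class K, class V>
class TinyCache {
  // Non-owning bookkeeping for values that were handed out to callers.
  std::map<K, std::weak_ptr<V>> weak_refs;

public:
  ~TinyCache() {
    // An assert(weak_refs.empty()) here would abort whenever a caller still
    // holds a pointer. Dropping the bookkeeping instead lets outstanding
    // pointers keep their values alive on their own.
    weak_refs.clear();
  }

  // Returns the live value for `key`, creating it from `initial` if needed.
  std::shared_ptr<V> get_or_create(const K& key, V initial) {
    auto it = weak_refs.find(key);
    if (it != weak_refs.end()) {
      if (auto live = it->second.lock()) {
        return live;
      }
    }
    auto fresh = std::make_shared<V>(std::move(initial));
    weak_refs[key] = fresh;
    return fresh;
  }

  // Number of entries currently tracked (live or expired).
  std::size_t tracked() const { return weak_refs.size(); }
};

// Simulates an interrupted shutdown: the cache is destroyed while a caller
// still holds a pointer obtained from it.
inline int demo_outlives_cache() {
  std::shared_ptr<int> survivor;
  {
    TinyCache<unsigned, int> cache;
    survivor = cache.get_or_create(1u, 42);
    assert(cache.tracked() == 1);  // the cache still tracks a live pointer
  }  // ~TinyCache() runs here with a non-empty weak_refs
  return *survivor;  // still valid: the shared_ptr owns the int
}
```

Calling `demo_outlives_cache()` destroys the cache while `survivor` is still held. With an emptiness assertion in the destructor, the same sequence would abort.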

This patch addresses crashes like the following one:

```
ceph-osd: /home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.0.0-8898-ge57ad63c/rpm/el8/BUILD/ceph-17.0.0-8898-ge57ad63c/src/crimson/common/shared_lru.h:46: SharedLRU<K, V>::~SharedLRU() [with K = unsigned int; V = OSDMap]: Assertion `weak_refs.empty()' failed.
Aborting on shard 0.
Backtrace:
Reactor stalled for 1162 ms on shard 0. Backtrace: 0xb14ab 0x46e57428 0x46bc450d 0x46be03bd 0x46be0782 0x46be0946 0x46be0bf6 0x12b1f 0xc8e3b 0x3fdd77e2 0x3fddccdb 0x3fdde1ee 0x3fdde8b3 0x3fdd3f2b 0x3fdd4442 0x3fdd4c3a 0x12b1f 0x3737e 0x21db4 0x21c88 0x2fa75 0x3a5ae1b9 0x3a38c5e2 0x3a0c823d 0x3a1771f1 0x3a1796f5 0x46ff92c9 0x46ff9525 0x46ff9e93 0x46ff8eae 0x46ff8bd9 0x3a160e67 0x39f50c83 0x39f51cd0 0x46b96271 0x46bde51a 0x46d6891b 0x46d6a8f0 0x4681a7d2 0x4681f03b 0x39fd50f2 0x23492 0x39b7a7dd
 0# gsignal in /lib64/libc.so.6
 1# abort in /lib64/libc.so.6
 2# 0x00007F9535E04C89 in /lib64/libc.so.6
 3# 0x00007F9535E12A76 in /lib64/libc.so.6
 4# crimson::osd::OSD::~OSD() in ceph-osd
 5# seastar::shared_ptr_count_for<crimson::osd::OSD>::~shared_ptr_count_for() in ceph-osd
 6# seastar::shared_ptr<crimson::osd::OSD>::~shared_ptr() in ceph-osd
 7# seastar::futurize<std::result_of<seastar::sharded<crimson::osd::OSD>::stop()::{lambda(seastar::future<void>)#2}::operator()(seastar::future<void>) const::{lambda(unsigned int)#1}::operator()(unsigned int) const::{lambda()#1} ()>::type>::type seastar::smp::submit_to<seastar::sharded<crimson::osd::OSD>::stop()::{lambda(seastar::future<void>)#2}::operator()(seastar::future<void>) const::{lambda(unsigned int)#1}::operator()(unsigned int) const::{lambda()#1}>(unsigned int, seastar::smp_submit_to_options, seastar::sharded<crimson::osd::OSD>::stop()::{lambda(seastar::future<void>)#2}::operator()(seastar::future<void>) const::{lambda(unsigned int)#1}::operator()(unsigned int) const::{lambda()#1}&&) in ceph-osd
 8# std::_Function_handler<seastar::future<void> (unsigned int), seastar::sharded<crimson::osd::OSD>::stop()::{lambda(seastar::future<void>)#2}::operator()(seastar::future<void>) const::{lambda(unsigned int)#1}>::_M_invoke(std::_Any_data const&, unsigned int&&) in ceph-osd
 9# 0x0000562DA18162CA in ceph-osd
10# 0x0000562DA1816526 in ceph-osd
11# 0x0000562DA1816E94 in ceph-osd
12# 0x0000562DA1815EAF in ceph-osd
13# 0x0000562DA1815BDA in ceph-osd
14# seastar::noncopyable_function<seastar::future<void> (seastar::future<void>&&)>::direct_vtable_for<seastar::future<void>::then_wrapped_maybe_erase<true, seastar::future<void>, seastar::sharded<crimson::osd::OSD>::stop()::{lambda(seastar::future<void>)#2}>(seastar::sharded<crimson::osd::OSD>::stop()::{lambda(seastar::future<void>)#2}&&)::{lambda(seastar::future<void>&&)#1}>::call(seastar::noncopyable_function<seastar::future<void> (seastar::future<void>&&)> const*, seastar::future<void>&&) in ceph-osd
15# 0x0000562D9476DC84 in ceph-osd
16# 0x0000562D9476ECD1 in ceph-osd
17# 0x0000562DA13B3272 in ceph-osd
18# 0x0000562DA13FB51B in ceph-osd
19# 0x0000562DA158591C in ceph-osd
20# 0x0000562DA15878F1 in ceph-osd
21# 0x0000562DA10377D3 in ceph-osd
22# 0x0000562DA103C03C in ceph-osd
23# main in ceph-osd
24# __libc_start_main in /lib64/libc.so.6
25# _start in ceph-osd
```

Signed-off-by: Radoslaw Zarzynski <[email protected]>
rzarzynski committed Nov 23, 2021
1 parent 5e56258 commit 82bfb2e
Showing 1 changed file with 4 additions and 2 deletions.
6 changes: 4 additions & 2 deletions src/crimson/common/shared_lru.h
@@ -42,8 +42,10 @@ class SharedLRU {
   {}
   ~SharedLRU() {
     cache.clear();
-    // use plain assert() in utiliy classes to avoid dependencies on logging
-    assert(weak_refs.empty());
+    // initially, we were assuming that no pointer obtained from SharedLRU
+    // can outlive the lru itself. However, since going with the interruption
+    // concept for handling shutdowns, this is no longer valid.
+    weak_refs.clear();
   }
   /**
    * Returns a reference to the given key, and perform an insertion if such