Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

produce the double lock/unlock error #102

Merged
merged 3 commits into from
Jul 1, 2024
Merged

Conversation

wsong83
Copy link
Member

@wsong83 wsong83 commented Jun 28, 2024

The current implementation is buggy. A cache line can be double locked or unlocked.
For example (in probe_resp() of coherence_multi.hpp):

    bool hit = cache->hit(addr, &ai, &s, &w, Priority::probe);
    if(hit){
      std::tie(meta, data) = cache->access_line(ai, s, w);

Here, both hit and access_line would set the cache set state and lock the cache line, twice!

By adding a flag in the new MetaLock wrapper for MT supported metadata, this error is explicitly produced:

Thread 6 "multi-l2-msi" received signal SIGABRT, Aborted.
[Switching to Thread 0x7fffefe00640 (LWP 1298828)]
__pthread_kill_implementation (no_tid=0, signo=6, threadid=140737217824320) at ./nptl/pthread_kill.c:44
44	./nptl/pthread_kill.c: No such file or directory.
(gdb) bt
#0  __pthread_kill_implementation (no_tid=0, signo=6, threadid=140737217824320) at ./nptl/pthread_kill.c:44
#1  __pthread_kill_internal (signo=6, threadid=140737217824320) at ./nptl/pthread_kill.c:78
#2  __GI___pthread_kill (threadid=140737217824320, signo=signo@entry=6) at ./nptl/pthread_kill.c:89
#3  0x00007ffff7242476 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#4  0x00007ffff72287f3 in __GI_abort () at ./stdlib/abort.c:79
#5  0x00007ffff722871b in __assert_fail_base (fmt=0x7ffff73dd130 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", 
    assertion=0x555555706300 "locked.load() || 0 == \"This cache line has already be unlocked and should not be unlocked again!\"", file=0x555555706221 "./cache/metadata.hpp", line=242, function=<optimized out>) at ./assert/assert.c:92
#6  0x00007ffff7239e96 in __GI___assert_fail (
    assertion=0x555555706300 "locked.load() || 0 == \"This cache line has already be unlocked and should not be unlocked again!\"", file=0x555555706221 "./cache/metadata.hpp", line=242, 
    function=0x5555557063d0 "void MetaLock<MT>::unlock() [with MT = MetadataMixer<48, 4, 10, MetadataMSIBase<CMMetadataBase> >]") at ./assert/assert.c:101
#7  0x00005555556055c3 in MetaLock<MetadataMixer<48, 4, 10, MetadataMSIBase<CMMetadataBase> > >::unlock (this=0x5555557c89f0)
    at ./cache/metadata.hpp:242
#8  0x000055555560e5f7 in CacheBase::unlock_line (w=0, s=6, ai=0, this=0x5555557c6ae0) at ./cache/cache.hpp:173
#9  OuterCohPortMultiThreadT<OuterCohPortMultiThreadUncached<CacheSkewedMultiThread<4, 4, 1, MetadataMixer<48, 4, 10, MetadataMSIBase<CMMetadataBase> >, Data64B, IndexNorm<4, 6>, ReplaceLRU<4, 4, true, true, true>, void, true, true> >, CoreMultiThreadInterface<OuterCohPortMultiThreadUncached<CacheSkewedMultiThread<4, 4, 1, MetadataMixer<48, 4, 10, MetadataMSIBase<CMMetadataBase> >, Data64B, IndexNorm<4, 6>, ReplaceLRU<4, 4, true, true, true>, void, true, true> >, CacheSkewedMultiThread<4, 4, 1, MetadataMixer<48, 4, 10, MetadataMSIBase<CMMetadataBase> >, Data64B, IndexNorm<4, 6>, ReplaceLRU<4, 4, true, true, true>, void, true, true>, MSIMultiThreadPolicy<MetadataMixer<48, 4, 10, MetadataMSIBase<CMMetadataBase> >, true, false> >, CacheSkewedMultiThread<4, 4, 1, MetadataMixer<48, 4, 10, MetadataMSIBase<CMMetadataBase> >, Data64B, IndexNorm<4, 6>, ReplaceLRU<4, 4, true, true, true>, void, true, true> >::probe_resp (this=0x5555557c6a90, addr=8284141782400, meta_outer=0x5555557f1990, data_outer=0x5555557f7690, 
    outer_cmd=..., delay=0x0) at ./cache/coherence_multi.hpp:179

According to https://en.cppreference.com/w/cpp/thread/mutex/lock

If lock is called by a thread that already owns the mutex, the behavior is undefined: for example, the program may deadlock. An implementation that can detect the invalid usage is encouraged to throw a [std::system_error](https://en.cppreference.com/w/cpp/error/system_error) with error condition resource_deadlock_would_occur instead of deadlocking.

It seems like GCC on Linux should not raise an error and simply let to lock and unlock silently! However, this is obviously not safe.

Since I am currently reviewing and refactoring the multithread extension, this PR is pure for your reference. No fix is required as I will find a way to fix it hopefully very soon.

@wsong83 wsong83 added bug Something isn't working wontfix This will not be worked on labels Jun 28, 2024
@wsong83 wsong83 requested a review from HanJinChi June 28, 2024 03:11
@wsong83
Copy link
Member Author

wsong83 commented Jun 28, 2024

Well, I have made a mistake to think that cache->access_line(ai, s, w) would lock a cache line. But the guard in MetaLock caching double unlock indicate an even more delicate bug.

@HanJinChi
Copy link
Member

问题应该是出在了probe_resp的函数中没有对cache line进行上锁,在coherence_multi.hpp中的probe_resp函数的代码应为:

    bool hit = cache->hit(addr, &ai, &s, &w, Priority::probe);
    if(hit){
      std::tie(meta, data) = cache->access_line(ai, s, w);
      cache->lock_line(ai, s, w);

此外,对于meta的应该是可以二次上锁的(第二次上锁的线程会等待,直到另一个线程解锁),因此MetaLock类中的lock函数不应该有assert存在,在将上述两处改动修改后代码可以正常运行

@wsong83 wsong83 removed the wontfix This will not be worked on label Jul 1, 2024
@wsong83
Copy link
Member Author

wsong83 commented Jul 1, 2024

OK. seems this is a lonely bug I have introduced in PR #99.
The heck for double lock is still required but I need to check whether the lock holder is the same thread.

@wsong83 wsong83 added this pull request to the merge queue Jul 1, 2024
Merged via the queue into master with commit ded7aa9 Jul 1, 2024
1 check passed
@wsong83 wsong83 deleted the verify-cache-line-double-lock branch July 1, 2024 01:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants