[BUG] Legion Multinode Crash UBSAN Error #1726

Open
Jacobfaib opened this issue Jul 25, 2024 · 1 comment

Comments

@Jacobfaib
Contributor

The remote_task member is used in the InnerContext base-class initializer before it is initialized:

    RemoteContext::RemoteContext(DistributedID id, Runtime *rt,
                                 CollectiveMapping *mapping)
      : InnerContext(rt, NULL, -1, false/*full inner*/, remote_task.regions, // <<< used here
                     remote_task.output_regions, local_parent_req_indexes, // <<< and here
                     local_virtual_mapped, ApEvent::NO_AP_EVENT, id,
                     false, false, false, mapping),
        parent_ctx(NULL), shard_manager(NULL), provenance(NULL),
        top_level_context(false), remote_task(RemoteTask(this)), // <<< initialized here
        remote_uid(0), repl_id(0)
    //--------------------------------------------------------------------------
    {
    }

UBSAN output:

legion-src/runtime/legion/legion_context.cc:22561:57: runtime error: member access within address 0x000110b55c00 which does not point to an object of type 'Legion::Task'
0x000110b55c00: note: object has invalid vptr
 00 00 00 00  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  00 00 00 00
              ^~~~~~~~~~~~~~~~~~~~~~~
              invalid vptr
    #0 0x1257bb650 in Legion::Internal::RemoteContext::RemoteContext(unsigned long long, Legion::Internal::Runtime*, Legion::Internal::CollectiveMapping*) (/Users/jfaibussowit/soft/nv/legate.core.internal/arch-darwin-debug/cmake_build/_deps/legion-build/lib/liblegion.1.dylib:arm64+0x1ba7650)
    #1 0x1257bc1f0 in Legion::Internal::RemoteContext::RemoteContext(unsigned long long, Legion::Internal::Runtime*, Legion::Internal::CollectiveMapping*) (/Users/jfaibussowit/soft/nv/legate.core.internal/arch-darwin-debug/cmake_build/_deps/legion-build/lib/liblegion.1.dylib:arm64+0x1ba81f0)
    #2 0x1257db05c in Legion::Internal::RemoteContext::handle_context_response(Legion::Deserializer&, Legion::Internal::Runtime*) (/Users/jfaibussowit/soft/nv/legate.core.internal/arch-darwin-debug/cmake_build/_deps/legion-build/lib/liblegion.1.dylib:arm64+0x1bc705c)
    #3 0x1285c2400 in Legion::Internal::Runtime::handle_remote_context_response(Legion::Deserializer&) (/Users/jfaibussowit/soft/nv/legate.core.internal/arch-darwin-debug/cmake_build/_deps/legion-build/lib/liblegion.1.dylib:arm64+0x49ae400)
    #4 0x1285ad3c8 in Legion::Internal::VirtualChannel::handle_messages(unsigned int, Legion::Internal::Runtime*, unsigned int, char const*, unsigned long) const (/Users/jfaibussowit/soft/nv/legate.core.internal/arch-darwin-debug/cmake_build/_deps/legion-build/lib/liblegion.1.dylib:arm64+0x49993c8)
    #5 0x1285a6fb0 in Legion::Internal::VirtualChannel::process_message(void const*, unsigned long, Legion::Internal::Runtime*, unsigned int) (/Users/jfaibussowit/soft/nv/legate.core.internal/arch-darwin-debug/cmake_build/_deps/legion-build/lib/liblegion.1.dylib:arm64+0x4992fb0)
    #6 0x1285d5e74 in Legion::Internal::MessageManager::receive_message(void const*, unsigned long) (/Users/jfaibussowit/soft/nv/legate.core.internal/arch-darwin-debug/cmake_build/_deps/legion-build/lib/liblegion.1.dylib:arm64+0x49c1e74)
    #7 0x128745a88 in Legion::Internal::Runtime::process_message_task(void const*, unsigned long) (/Users/jfaibussowit/soft/nv/legate.core.internal/arch-darwin-debug/cmake_build/_deps/legion-build/lib/liblegion.1.dylib:arm64+0x4b31a88)
    #8 0x1287ebc34 in Legion::Internal::Runtime::legion_runtime_task(void const*, unsigned long, void const*, unsigned long, Realm::Processor) (/Users/jfaibussowit/soft/nv/legate.core.internal/arch-darwin-debug/cmake_build/_deps/legion-build/lib/liblegion.1.dylib:arm64+0x4bd7c34)
    #9 0x13a6dc540 in Realm::LocalTaskProcessor::execute_task(unsigned int, Realm::ByteArrayRef const&) (/Users/jfaibussowit/soft/nv/legate.core.internal/arch-darwin-debug/cmake_build/_deps/legion-build/lib/librealm.1.dylib:arm64+0x2a64540)
    #10 0x13aadc9e4 in Realm::Task::execute_on_processor(Realm::Processor) (/Users/jfaibussowit/soft/nv/legate.core.internal/arch-darwin-debug/cmake_build/_deps/legion-build/lib/librealm.1.dylib:arm64+0x2e649e4)
    #11 0x13ab0bd98 in Realm::KernelThreadTaskScheduler::execute_task(Realm::Task*) (/Users/jfaibussowit/soft/nv/legate.core.internal/arch-darwin-debug/cmake_build/_deps/legion-build/lib/librealm.1.dylib:arm64+0x2e93d98)
    #12 0x13ab0137c in Realm::ThreadedTaskScheduler::scheduler_loop() (/Users/jfaibussowit/soft/nv/legate.core.internal/arch-darwin-debug/cmake_build/_deps/legion-build/lib/librealm.1.dylib:arm64+0x2e8937c)
    #13 0x13ab06070 in Realm::ThreadedTaskScheduler::scheduler_loop_wlock() (/Users/jfaibussowit/soft/nv/legate.core.internal/arch-darwin-debug/cmake_build/_deps/legion-build/lib/librealm.1.dylib:arm64+0x2e8e070)
    #14 0x13ab8d6bc in void Realm::Thread::thread_entry_wrapper<Realm::ThreadedTaskScheduler, &Realm::ThreadedTaskScheduler::scheduler_loop_wlock()>(void*) (/Users/jfaibussowit/soft/nv/legate.core.internal/arch-darwin-debug/cmake_build/_deps/legion-build/lib/librealm.1.dylib:arm64+0x2f156bc)
    #15 0x13aba8340 in Realm::KernelThread::pthread_entry(void*) (/Users/jfaibussowit/soft/nv/legate.core.internal/arch-darwin-debug/cmake_build/_deps/legion-build/lib/librealm.1.dylib:arm64+0x2f30340)
    #16 0x18f19ef90 in _pthread_start (/usr/lib/system/libsystem_pthread.dylib:arm64e+0x6f90)
    #17 0x18f199d30 in thread_start (/usr/lib/system/libsystem_pthread.dylib:arm64e+0x1d30)
@lightsighter
Contributor

lightsighter commented Jul 25, 2024

This is one of those false positives that I hate from UBSAN. Just because I give out a pointer to an object before it is done being constructed does not mean that I'm using the object before initialization.

I'm not planning on fixing this.
