Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[LLDB][SBSaveCore] Add Extension to Save a thread and N pointers deep #111601

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

Jlalond
Copy link
Contributor

@Jlalond Jlalond commented Oct 8, 2024

This adds an extension to allow the saving of a Thread and it's pointers N layers deep via BFS. This lets SBSaveCore 'automagically' figure out everything important for a given thread to save on your behalf.

#extend lldb::SBSaveCoreOptions {
#ifdef SWIGPYTHON
%pythoncode% {
'''Add a thread to the SaveCoreOptions thread list, and follow it's children N pointers deep, adding each memory region to the SaveCoreOptions Memory region list.'''
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The docstring contains a grammatical error. The possessive form of "it" should be "its" rather than "it's". The corrected version should read:

'''Add a thread to the SaveCoreOptions thread list, and follow its children N pointers deep, adding each memory region to the SaveCoreOptions Memory region list.'''

This change ensures proper usage of the possessive pronoun and improves the overall clarity of the documentation.

Spotted by Graphite Reviewer

Is this helpful? React 👍 or 👎 to let us know.

@llvmbot
Copy link
Collaborator

llvmbot commented Oct 8, 2024

@llvm/pr-subscribers-lldb

Author: Jacob Lalonde (Jlalond)

Changes

This adds an extension to allow the saving of a Thread and it's pointers N layers deep via BFS. This lets SBSaveCore 'automagically' figure out everything important for a given thread to save on your behalf.


Full diff: https://github.com/llvm/llvm-project/pull/111601.diff

3 Files Affected:

  • (added) lldb/bindings/interface/SBSaveCoreOptionsExtensions.i (+31)
  • (modified) lldb/bindings/interfaces.swig (+1)
  • (modified) lldb/test/API/functionalities/process_save_core_minidump/TestProcessSaveCoreMinidump.py (+22)
diff --git a/lldb/bindings/interface/SBSaveCoreOptionsExtensions.i b/lldb/bindings/interface/SBSaveCoreOptionsExtensions.i
new file mode 100644
index 00000000000000..5d20aacacadad4
--- /dev/null
+++ b/lldb/bindings/interface/SBSaveCoreOptionsExtensions.i
@@ -0,0 +1,31 @@
+#extend lldb::SBSaveCoreOptions {
+#ifdef SWIGPYTHON
+    %pythoncode% {
+        '''Add a thread to the SaveCoreOptions thread list, and follow it's children N pointers deep, adding each memory region to the SaveCoreOptions Memory region list.'''
+        def save_thread_with_heaps(self, thread, num_pointers_deep = 3):
+            self.AddThread(thread)
+            frame = thread.GetFrameAtIndex(0)
+            process = thread.GetProcess()
+            queue = []
+            for var in frame.locals:
+                if var.TypeIsPointerType():
+                    queue.append(var.Dereference())
+            
+            while (num_pointers_deep > 0 and len(queue) > 0):
+                queue_size = len(queue)
+                for i in range(0, queue_size):
+                    var = queue.pop(0)
+                    memory_region = lldb.SBMemoryRegionInfo()
+                    process.GetMemoryRegionInfo(var.GetAddress().GetOffset(), memory_region)
+                    self.AddMemoryRegionToSave(memory_region)
+                    /* 
+                    We only check for direct pointer children T*, should probably scan 
+                    internal to the children themselves. 
+                    */
+                    for x in var.children:
+                        if x.TypeIsPointerType():
+                            queue.append(x.Dereference())
+
+                num_pointers_deep -= 1
+}
+#endif SWIGPYTHON
diff --git a/lldb/bindings/interfaces.swig b/lldb/bindings/interfaces.swig
index 8a6fed95f0b729..8a82ba28d3f2ae 100644
--- a/lldb/bindings/interfaces.swig
+++ b/lldb/bindings/interfaces.swig
@@ -26,6 +26,7 @@
 %include "./interface/SBCommunicationDocstrings.i"
 %include "./interface/SBCompileUnitDocstrings.i"
 %include "./interface/SBSaveCoreOptionsDocstrings.i"
+#include "./interface/SBSaveCoreOptionsExtensions.i"
 %include "./interface/SBDataDocstrings.i"
 %include "./interface/SBDebuggerDocstrings.i"
 %include "./interface/SBDeclarationDocstrings.i"
diff --git a/lldb/test/API/functionalities/process_save_core_minidump/TestProcessSaveCoreMinidump.py b/lldb/test/API/functionalities/process_save_core_minidump/TestProcessSaveCoreMinidump.py
index 03cc415924e0bb..cb5d625f4bffa4 100644
--- a/lldb/test/API/functionalities/process_save_core_minidump/TestProcessSaveCoreMinidump.py
+++ b/lldb/test/API/functionalities/process_save_core_minidump/TestProcessSaveCoreMinidump.py
@@ -3,6 +3,7 @@
 """
 
 import os
+
 import lldb
 from lldbsuite.test.decorators import *
 from lldbsuite.test.lldbtest import *
@@ -470,6 +471,27 @@ def save_core_with_region(self, process, region_index):
             if os.path.isfile(custom_file):
                 os.unlink(custom_file)
 
+    def save_core_one_thread_one_heap(self, process, region_index):
+        try:
+            custom_file = self.getBuildArtifact("core.one_thread_one_heap.dmp")
+            options = lldb.SBSaveCoreOptions()
+            options.SetOutputFile(lldb.SBFileSpec(custom_file))
+            options.SetPluginName("minidump")
+            options.SetStyle(lldb.eSaveCoreCustomOnly)
+            thread = process.GetThreadAtIndex(0)
+            options.save_thread_with_heaps(thread)
+
+            error = process.SaveCore(options)
+            self.assertTrue(error.Success())
+            core_target = self.dbg.CreateTarget(None)
+            core_proc = core_target.LoadCore(custom_file)
+            # proc/pid maps prevent us from checking the number of regions, but
+            # this is mostly a smoke test for the extension.
+            self.assertEqual(core_proc.GetNumThreads(), 1)
+        finally:
+            if os.path.isfile(custom_file):
+                os.unlink(custom_file)
+
     @skipUnlessArch("x86_64")
     @skipUnlessPlatform(["linux"])
     def test_save_minidump_custom_save_style_duplicated_regions(self):

@@ -470,6 +471,27 @@ def save_core_with_region(self, process, region_index):
if os.path.isfile(custom_file):
os.unlink(custom_file)

def save_core_one_thread_one_heap(self, process, region_index):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Who is invoking this method?
Is region_index used?

Copy link
Contributor

@jeffreytan81 jeffreytan81 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can do better than save at memory region fidelity which is at least at page boundary if I am not wrong.

We can use type information to query and save the object's size without including hundreds of unrelated objects in the heap happen to be in the same page. This should greatly reduce the size of the dump.

Comment on lines +19 to +20
process.GetMemoryRegionInfo(var.GetAddress().GetOffset(), memory_region)
self.AddMemoryRegionToSave(memory_region)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if the pointer points to inaccessible memory or the pointer is corrupted? You need to check the validity of the memory region?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The save-core impl will actually just omit a non readable/corrupted memory.

We could have the concern that we get an a memory range that is massive, like 2^60-2^64, but I think that's a generic concern of this API

process = thread.GetProcess()
queue = []
for var in frame.locals:
if var.TypeIsPointerType():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This API is deprecated per doc. You should use SBType::IsPointerType() and also check reference type.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I didn't know that. Thanks!

for i in range(0, queue_size):
var = queue.pop(0)
memory_region = lldb.SBMemoryRegionInfo()
process.GetMemoryRegionInfo(var.GetAddress().GetOffset(), memory_region)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you need to perform memory region merging? For example, if two pointers point to the same objects or different parts of the same memory region. Is this a problem?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new RangeMap will deduplicate memory based on permissions.

@Jlalond
Copy link
Contributor Author

Jlalond commented Oct 8, 2024

I think we can do better than save at memory region fidelity which is at least at page boundary if I am not wrong.

We can use type information to query and save the object's size without including hundreds of unrelated objects in the
heap happen to be in the same page. This should greatly reduce the size of the dump.

I'm okay with this, when @clayborg and I talked about this the original assumption is we should do the entire memory region it's in, because with deduplication we will likely get multiple pointers to the same heap. I think we should leave this heap-region mode, and then also build a 'just save my objects' extension.

@Jlalond Jlalond changed the title [LLDB][SBSaveCore] Add Extension to Save a thread and N pointers dead [LLDB][SBSaveCore] Add Extension to Save a thread and N pointers deep Oct 8, 2024
@jeffreytan81
Copy link
Contributor

I think we can do better than save at memory region fidelity which is at least at page boundary if I am not wrong.

We can use type information to query and save the object's size without including hundreds of unrelated objects in the
heap happen to be in the same page. This should greatly reduce the size of the dump.

I'm okay with this, when @clayborg and I talked about this the original assumption is we should do the entire memory region it's in, because with deduplication we will likely get multiple pointers to the same heap. I think we should leave this heap-region mode, and then also build a 'just save my objects' extension.

The value of this option lies in reducing the size to a minimum while retaining useful information. In such cases, saving object closures is a superior choice compared to saving memory regions. You can easily achieve minidumps that are 2 to 3 times smaller in some cases.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants