-
Notifications
You must be signed in to change notification settings - Fork 3.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
libbpf-tools: Add new feature ALSan #4120
base: master
Are you sure you want to change the base?
Conversation
3473249
to
5b90929
Compare
There is a similar PR in #3828 . |
@chenhengqi, well, lsan is pretty different from memleak. |
Fine, could you describe how does it work ? |
Followings are the way it work.(Refer to this document https://github.com/google/sanitizers/wiki/AddressSanitizerLeakSanitizerDesignDocument) About root set: Indirect leak means not reachable allocation node from the root but can be reachable from some unreachable block. |
ef2f5aa
to
d1b49ec
Compare
The memory leak problem is an old problem and has not yet been conquered but very important. To solve memory leak problem, various methods have been tried and the best way is using a tool so far. There are two types of tools: static tools and dynamic tools. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi.
First of all, I would like to thank you for your contribution, it seems really cool.
I have some comments regarding the eBPF code.
I will take a look later at the C userland code (even though I am not 100% I will be able to understand the logic).
Best regards.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi.
This time, I took a look at the userspace code.
I am not 100% sure I understand the whole code particularly all the hash maps you need but it worked!
I also share your opinion about being able to quickly spot memory leak at runtime while not slowing the application (contrary to valgrind
even though I really like this tool).
I tried to run it and I had to modify the following:
diff --git a/libbpf-tools/lsan.c b/libbpf-tools/lsan.c
index dcb3862e..1cd9538b 100644
--- a/libbpf-tools/lsan.c
+++ b/libbpf-tools/lsan.c
@@ -358,11 +358,11 @@ static int get_libc_path(char *path)
continue;
file_name = strrchr(buf, '/') + 1;
-#ifdef __aarch64__
+// #ifdef __aarch64__
if (sscanf(file_name, "libc-%f.so", &version) == 1) {
-#else
- if (sscanf(file_name, "libc.so.%f", &version) == 1) {
-#endif
+// #else
+// if (sscanf(file_name, "libc.so.%f", &version) == 1) {
+// #endif
memcpy(path, buf, strlen(buf));
fclose(f);
return 0;
Indeed, on my system, I have:
francis@pwmachine:~/Codes/kinvolk/bcc/libbpf-tools$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 20.04.5 LTS
Release: 20.04
Codename: focal
francis@pwmachine:~/Codes/kinvolk/bcc/libbpf-tools$ uname -a
Linux pwmachine 5.15.0-48-generic #54~20.04.1-Ubuntu SMP Thu Sep 1 16:17:26 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
francis@pwmachine:~/Codes/kinvolk/bcc/libbpf-tools$ ls -l /usr/lib/x86_64-linux-gnu/libc.so.6
lrwxrwxrwx 1 root root 12 avril 7 03:24 /usr/lib/x86_64-linux-gnu/libc.so.6 -> libc-2.31.so
After this modification, I was able to get the same output as you (plus the value 99 printed by a.out
).
Best regards.
Sorry for the long delay, I will try finishing this review this week. |
Ping me if this PR is ready for review. |
a8db792
to
bb2970c
Compare
@chenhengqi |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi!
Thank you for the latest modifications!
I only found nits.
I will test it again but from code inspection it looks OK.
Best regards.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tested it again and it works fine:
$ sudo ./lsan -c /tmp/test -s /tmp/suppr.txt (remotes/bojun/lsan) *%
[sudo] Mot de passe de francis :
Info: Execute child process: /tmp/test
Info: execute command: /tmp/test(pid 94525)
99
99
[2023-11-08 16:15:57] Print leaks:
4 bytes direct leak found in 1 allocations from stack id(41605)
#1 0x00557fb2fee1bf foo+0x16 (/tmp/test+0x11bf)
#2 0x00557fb2fee21b main+0x1e (/tmp/test+0x121b)
#3 0x007f4794629d90 __libc_init_first+0x90 (/usr/lib/x86_64-linux-gnu/libc.so.6+0x29d90)
99
[2023-11-08 16:16:07] Print leaks:
8 bytes direct leak found in 2 allocations from stack id(41605)
#1 0x00557fb2fee1bf foo+0x16 (/tmp/test+0x11bf)
#2 0x00557fb2fee21b main+0x1e (/tmp/test+0x121b)
#3 0x007f4794629d90 __libc_init_first+0x90 (/usr/lib/x86_64-linux-gnu/libc.so.6+0x29d90)
99
[2023-11-08 16:16:17] Print leaks:
12 bytes direct leak found in 3 allocations from stack id(41605)
#1 0x00557fb2fee1bf foo+0x16 (/tmp/test+0x11bf)
#2 0x00557fb2fee21b main+0x1e (/tmp/test+0x121b)
#3 0x007f4794629d90 __libc_init_first+0x90 (/usr/lib/x86_64-linux-gnu/libc.so.6+0x29d90)
Apart from the two nits I found earlier, it is, for me, ready to go!
Thank you for this cool tool!
5ebfbd2
to
5badd76
Compare
Difference between Followings are the test code #include <stdlib.h>
#include <unistd.h>
int *arr[10000];
int *foo(size_t size) {
int *tmp = malloc(size);
*tmp = 99;
return tmp;
}
int *bar(size_t nmemb, size_t size) {
int *tmp = calloc(nmemb, size);
*tmp = 22;
return tmp;
}
int *baz(size_t size) {
int *tmp = valloc(size);
*tmp = 11;
return tmp;
}
int main(int argc, char* argv[]) {
int *a;
int i = 0;
while (1) {
a = foo(4);
arr[i++] = a;
a = bar(4, 4);
free(a);
a = baz(44);
sleep(5);
}
return 0;
} memory allocated by So,
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi!
I already approved it, so it should be OK.
Reading the code again, I just have one question with regard to avoid having several loops, but I am doubtful this is possible.
Best regards.
libbpf-tools/lsan.c
Outdated
for_each_tid_ptrace(PTRACE_SEIZE); | ||
for_each_tid_ptrace(PTRACE_INTERRUPT); | ||
for_each_tid_waitpid(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it make sense to have the code inside one loop and doing all these operations inside this loop?
Something like this:
save_roots();
for (i = 0; i < cvector_size(tids); ++i) {
ret = ptrace(PTRACE_SEIZE, tids[i], NULL, NULL);
if // ...
ret = ptrace(PTRACE_INTERRUPT, tids[i], NULL, NULL);
if // ...
ret = waitpid(tids[i], NULL, __WALL);
if // ...
}
ret = read_table();
if (ret < 0) {
p_warn("Failed to read_table, retry after %d seconds",
env.interval);
for_each_tid_ptrace(PTRACE_DETACH);
return 0;
}
HASH_ITER(hh, allocs, curr, next) {
mark_indirectly_leaked_cb(curr->id);
collect_leaks_cb(curr->id);
}
for_each_tid_ptrace(PTRACE_DETACH);
report_leaks(syms_cache);
return 0;
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it could be possible to have the code inside one loop and doing all ptrace
attach
, interrupt
and waitpid
inside the loop, which will mitigate loop overhead.
I think it would be better to factorize it for readability.
However, despite the intention to factorize it for readability, it currently doesn't seem to be very readable.
I think that's why you think it would be better to make one loop.
I'll revise this function to enhance its readability.
Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried to change the code as you suggested, but because of the env.stop_the_world
flag, the code became more complicated. This means the current code has better readability in my opinion. So I decided to stick with it. However, your suggestion is very good, and I'll change the code if I find a more readable way.
Thanks!
0b3b594
to
3d87358
Compare
0b2ecc7
to
0d25050
Compare
I split my whole patch into three parts.
|
I changed In case of tracing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi!
I have one comment with regard to cvector.h
and uthash.h
.
Best regards.
Add submodule c-vector https://github.com/eteran/c-vector This project is header only library project that contains interfaces and implementations of vector container which can be used in C The license of this project is MIT
Add submodule uthash https://github.com/troydhanson/uthash This project is header only library project that contains interfaces and implementations of hash map container which can be used in C The license is written on the top of the file uthash.h
Add ALSan(Attachable Leak Sanitizer) feature on libbpf-tools ALSan feature originally comes from the llvm-project lsan https://github.com/llvm/llvm-project This tool detect and report unreachable memory periodically USAGE: $ ./alsan -h Usage: alsan [OPTION...] Detect memory leak resulting from unreachable pointers. Either -c or -p is a mandatory option EXAMPLES: alsan -p 1234 # Detect leaks on process id 1234 alsan -c a.out # Detect leaks on a.out alsan -c 'a.out arg' # Detect leaks on a.out with argument -c, --command=COMMAND Execute and detect memory leak on the specified command -i, --interval=INTERVAL Set interval in second to detect leak -p, --pid=PID Detect memory leak on the specified process -s, --suppressions=SUPPRESSIONS Suppressions file name -T, --top=TOP Report only specified amount of backtraces -v, --verbose Verbose debug output -w, --stop-the-world Stop the target process during tracing -?, --help Give this help list --usage Give a short usage message -V, --version Print program version Mandatory or optional arguments to long options are also mandatory or optional for any corresponding short options. Report bugs to https://github.com/iovisor/bcc/tree/master/libbpf-tools. Report example: $ sudo ./alsan -p 28346 [2024-05-22 14:44:58] Print leaks: 44 bytes direct leak found in 1 allocations from stack id(57214) iovisor#1 0x00583bca1b2250 baz+0x1c (/home/bojun/alsan/libbpf-tools/a.out+0x1250) iovisor#2 0x00583bca1b22d7 main+0x73 (/home/bojun/alsan/libbpf-tools/a.out+0x12d7) iovisor#3 0x007470c7c2a1ca [unknown] (/usr/lib/x86_64-linux-gnu/libc.so.6+0x2a1ca) iovisor#4 0x007470c7c2a28b __libc_start_main+0x8b (/usr/lib/x86_64-linux-gnu/libc.so.6+0x2a28b) iovisor#5 0x00583bca1b2105 _start+0x25 (/home/bojun/alsan/libbpf-tools/a.out+0x1105) [2024-05-22 14:45:08] Print leaks: 132 bytes direct leak found in 3 allocations from stack id(57214) iovisor#1 0x00583bca1b2250 baz+0x1c (/home/bojun/alsan/libbpf-tools/a.out+0x1250) iovisor#2 0x00583bca1b22d7 main+0x73 (/home/bojun/alsan/libbpf-tools/a.out+0x12d7) iovisor#3 0x007470c7c2a1ca [unknown] (/usr/lib/x86_64-linux-gnu/libc.so.6+0x2a1ca) iovisor#4 0x007470c7c2a28b __libc_start_main+0x8b (/usr/lib/x86_64-linux-gnu/libc.so.6+0x2a28b) iovisor#5 0x00583bca1b2105 _start+0x25 (/home/bojun/alsan/libbpf-tools/a.out+0x1105) Source code of test program: $ cat leak_test.c #include <stdlib.h> #include <unistd.h> int *arr[10000]; int *foo(size_t size) { int *tmp = malloc(size); *tmp = 99; return tmp; } int *bar(size_t nmemb, size_t size) { int *tmp = calloc(nmemb, size); *tmp = 22; return tmp; } int *baz(size_t size) { int *tmp = valloc(size); *tmp = 11; return tmp; } int main(int argc, char* argv[]) { int *a; int i = 0; while (1) { a = foo(4); arr[i++] = a; a = bar(4, 4); free(a); a = baz(44); sleep(5); } return 0; }
Rebase and change the tool name from |
There was a talk about this feature in C++ Now conference. https://youtu.be/9f5hd-8suVE?si=lOli99myFpExgzac I hope this video helps people understand this patch. |
Add ALSan(Attachable Leak Sanitizer) feature on libbpf-tools
ALSan feature originally comes from the llvm-project lsan
https://github.com/llvm/llvm-project
cvector.h comes from c-vector project,
commit d3f3156373b0587336ac7ee1568755d6cf93dd39
https://github.com/eteran/c-vector
uthash.h comes from uthash project,
commit bf15263081be6229be31addd48566df93921cb46
https://github.com/troydhanson/uthash
This tool detect and report unreachable memory periodically
Usage:
Report example:
Source code of test program: