OOM: scan linux kernel bytecode #624
The size of vmlinux.bc is about 387 MB, and phasar could read the LLVM IR successfully with `./phasar-cli -m vmlinux.bc --emit-ir`.
After debugging with lldb, I found that the program trapped in the function
Hi @small-cat, it's interesting that you intend to analyze the kernel with phasar. In order to reproduce your issue, the following would be helpful:

a) Which phasar version are you using?
b) Which kernel configuration did you choose?
c) What command lines did you use to configure/build the kernel?

Your invocation of phasar computes interprocedural whole-program points-to information as a pre-analysis. The compiler does not do this and thus does not need that much memory. Your issue could be related to some other bug, but it could also be "just" the OS killing the process due to memory exhaustion. The challenge for me in reproducing this issue may actually be the memory size of my machine, which is below 80 GB. I might get a machine with more memory to try myself. Best regards
I use phasar v0323 and LLVM 14.0.0. I used allyesconfig to build the kernel (5.10.59): `make ARCH=arm64 LLVM=1 LLVM_IAS=1 allyesconfig`. My computer has 32 GB of memory; when the process was killed, it had run for about 8-10 minutes. I checked the logs from dmesg and confirmed it was killed by the oom-killer.
What happens if you use a kernel config that includes fewer parts of the kernel? I just failed to build the kernel with clang 14.0.6 and allyesconfig (though not with ARCH=arm64, because it seems my clang is not built with that backend). I'm trying a defconfig build now.
With a defconfig build it also runs out of memory for me. The culprit is the points-to analysis; it looks like it is grossly overapproximating and thus needs lots of memory. With tinyconfig it succeeds. What is your use case? Are you interested in analyzing the whole kernel? We need to check whether the exploding memory usage is a bug or a forced consequence of the points-to analysis.
Yes, I am trying to analyze the whole kernel. If I tailor the kernel down to a few parts, phasar looks fine.
In computeValuesAliasSet in LLVMAliasSet.cpp, the program runs out of memory. I found that many of the function calls in the LLVM IR are to LLVM APIs. Are these necessary for the analysis, and do they affect the analysis process? I wonder if I can skip these functions and only pay attention to the code (functions) we care about, instead of the LLVM IR builtin functions.
Hi @small-cat, I could reproduce the error on my system (I compiled with defconfig and ARCH=x86_64, but that should not really matter). I could boil the problem down to a single alias query:

```cpp
#include <cassert>
#include <filesystem>
#include <tuple>

#include <llvm/Analysis/BasicAliasAnalysis.h>
#include <llvm/Analysis/CFLAndersAliasAnalysis.h>
#include <llvm/Analysis/TypeBasedAliasAnalysis.h>
#include <llvm/IR/LLVMContext.h>
#include <llvm/IRReader/IRReader.h>
#include <llvm/Passes/PassBuilder.h>
#include <llvm/Support/SourceMgr.h>
#include <llvm/Support/raw_ostream.h>

int main(int Argc, const char **Argv) {
  if (Argc < 2 || !std::filesystem::exists(Argv[1]) ||
      std::filesystem::is_directory(Argv[1])) {
    llvm::errs() << "myphasartool\n"
                    "A small PhASAR-based example program\n\n"
                    "Usage: myphasartool path/to/vmlinux.bc\n";
    return 1;
  }
  llvm::LLVMContext Ctx;
  llvm::SMDiagnostic Diag;
  auto Mod = llvm::parseIRFile(Argv[1], Diag, Ctx);
  if (!Mod) {
    Diag.print(nullptr, llvm::errs());
    return 1;
  }
  auto *Fun = Mod->getFunction("x86_64_start_kernel");
  assert(Fun);
  auto *Arg = Fun->getArg(0);
  assert(Arg);
  auto *Glob = Mod->getNamedGlobal("pgdir_shift");
  assert(Glob);
  llvm::PassBuilder PB;
  llvm::FunctionAnalysisManager FAM;
  FAM.registerPass([&] {
    llvm::AAManager AA;
    AA.registerFunctionAnalysis<llvm::CFLAndersAA>();
    AA.registerFunctionAnalysis<llvm::TypeBasedAA>();
    AA.registerFunctionAnalysis<llvm::BasicAA>();
    return AA;
  });
  PB.registerFunctionAnalyses(FAM);
  llvm::FunctionPassManager FPM;
  std::ignore = FPM.run(*Fun, FAM);
  llvm::AAResults &AAR = FAM.getResult<llvm::AAManager>(*Fun);
  // This single alias query triggers the memory explosion.
  std::ignore = AAR.alias(Arg, Glob);
}
```

Just running the above code snippet leads to the same error (you may need to adjust the function/global names). This leads to the strong conclusion that the error is indeed within LLVM.

In #626 we provide an option to disable the CFL alias analysis; you may want to try this out.
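For reference, a standalone snippet like the one above can typically be built against an LLVM 14 installation roughly as follows. Treat this as a sketch: the component names passed to `llvm-config` and the file name `repro.cpp` are assumptions about your setup, not a verified recipe.

```shell
clang++ -std=c++17 repro.cpp \
  $(llvm-config --cxxflags --ldflags --system-libs \
                --libs core irreader passes analysis support) \
  -o repro
./repro vmlinux.bc
```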
Small addition: I have tried running phasar with

Note that we are working on alias/pointer analyses implemented completely within PhASAR, but getting this right is a challenging task that will still take quite a while.
@fabianbs96 Thank you very much for your reply. I tried the call-graph analysis with cha instead of otf (the default) and ran an empty analysis on vmlinux.bc with the following command: `./phasar-cli -m vmlinux.bc -C cha -D ifds-solvertest --entry-points=irq_enter`. The program terminated with a coredump; I also tried ifds-uninit and ide-solvertest, which coredump too. I read the paper to understand the principle of the IFDS framework and debugged the program: the bug occurs when computing the PathEdge in the tabulation algorithm, in the propagate() function, but not always. Each time I run the program, the coredump point in the backtrace is different. The ESG is too big to analyze the bug in, and I have not managed to reproduce it with a small case so far. May I send you an email, or does phasar have a forum/community for discussion?
I found the cause of the coredump: it is a stack overflow. Phasar implements the IFDS/IDE framework recursively, and when phasar analyzes the LLVM IR of the kernel, the recursion gets so deep that it overflows the stack. Increasing the stack size avoids it.
Hi @small-cat, |
@fabianbs96 Sorry, I did not notice the commits after v0323. I will upgrade and give it a try, thanks very much.
I used wllvm to build kernel 5.10.59 and extracted the bitcode vmlinux.bc. When I use phasar,
the process was killed after a while; I could see from the top command that the process needed very large amounts of memory, 80 GB+ even on macOS, before the OS killed it.
The kernel can be built with Clang LTO, which aggregates all the LLVM IR into a single file and optimizes it, so why does phasar need so much memory to read the bitcode and do the analysis?
Is this a bug?
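For reference, the wllvm-based build described above usually looks roughly like this. Treat it as a sketch: the exact make flags depend on the kernel version and toolchain, and combining wllvm with the kernel's LLVM switches may need adjustment on your system.

```shell
export LLVM_COMPILER=clang
make CC=wllvm ARCH=arm64 LLVM=1 LLVM_IAS=1 defconfig
make CC=wllvm ARCH=arm64 LLVM=1 LLVM_IAS=1 -j"$(nproc)" vmlinux
extract-bc vmlinux          # wllvm tool; produces vmlinux.bc
./phasar-cli -m vmlinux.bc --emit-ir
```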