Initialize local roots with Val_unit to fix segfault #29
Long story (below) short: local GC roots (in this case in an array) should be initialized with Val_unit instead of 0, like the CAMLlocal* macros do.

In the case described below, GC may be triggered in the middle of passing multiple out arguments. It then sees the improperly initialized parts of the output tuple and segfaults.
Originally this PR was a draft because I had made the fix in only one place in camlidl code generation, the one that mattered for the specific case (where it fixes the segfault). If my understanding of the OCaml runtime is correct and such a fix is indeed needed, I can try to extend it to other parts of the code generation.

Update: this PR now fixes all such places in camlidl I managed to find.
Long story
In Goblint I have been fighting with spurious segfaults (goblint/analyzer#1520 (comment)) for a while. I finally managed to get a backtrace from gdb:
It looks like it has something to do with Apron, although it could be any heap corruption that just happens to crash during allocations in the Apron bindings.
The top of this stack corresponds to this "Local C roots" code in the runtime: https://github.com/ocaml/ocaml/blob/8eb41f72ded84df884c3671734c947f612091f84/runtime/roots_nat.c#L509-L517.
After prodding around with gdb for a while, I realized that the problematic pointer (0x7fffffffb728) corresponded exactly to the second item in a two-element top-level local C root.

This comes from the following camlidl-generated code (with SEGFAULT location marked):

At that point, some inlined functions allocate on the OCaml heap and trigger GC, which scans the top-level local roots in _vres. The first one is fine, because a proper array was already allocated for it. The second one is the problem: it still has value 0 from its initialization. caml_darken checks that this is a block and tries to dereference stuff around the NULL pointer.

If this were Val_unit, like CAMLlocal* would initialize it (before adding a local root), everything would be fine.
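
To illustrate why 0 is dangerous and Val_unit is not, here is a minimal standalone sketch of the value tagging involved. It does not use the OCaml headers: the macros are simplified re-definitions modelled on caml/mlvalues.h, and count_roots_followed is a hypothetical helper standing in for a root scanner. The real runtime performs additional checks before dereferencing, so treat this as a model of the tagging rule only.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified model of the OCaml value encoding. These macros mirror the
   ones in caml/mlvalues.h but are redefined here (an assumption of this
   sketch) so that it compiles without the OCaml headers. */
typedef intptr_t value;
#define Is_block(v) (((v) & 1) == 0)  /* low bit clear: treated as a pointer */
#define Val_long(n) ((value)(((uintptr_t)(n) << 1) + 1))
#define Val_unit    Val_long(0)       /* immediate value, low bit set */

/* Hypothetical stand-in for a root scanner: counts how many roots would be
   treated as heap pointers (and hence followed) rather than skipped. */
static size_t count_roots_followed(const value *roots, size_t n)
{
    size_t followed = 0;
    for (size_t i = 0; i < n; i++) {
        if (Is_block(roots[i]))
            followed++;               /* a real GC would dereference here */
    }
    return followed;
}
```

With a root array initialized to { 0, 0 }, both entries pass the Is_block test, so a scanner would try to follow a NULL pointer. With { Val_unit, Val_unit }, both are immediates and are skipped.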