This is an addon library for Ctypes that provides stronger guarantees about the lifetimes of foreign objects allocated from OCaml. See this blog post for the original motivation.
Consider the following plain Ctypes code:
open Ctypes
(** Returns a pointer into the argument character string that points to the first
    instance of the argument character. *)
let strchr : char ptr -> char -> char ptr =
  Foreign.foreign "strchr" (ptr char @-> char @-> returning (ptr char))
let () =
  let p = CArray.of_string "abc" |> CArray.start in
  let q = strchr p 'a' in
  let () = Gc.compact () in
  let c = !@ q in
  if Char.(equal c 'a') then print_endline "yay!" else print_endline "boo!"
This code will almost always print "boo!" (run dune test if you don't believe me!). The issue is that p does not appear after the Gc.compact (), and since the garbage collector has no knowledge of the implicit dependency between p, q, and c, in that q points into p (and so its dereference c is only valid as long as p continues to live), it collects p during compaction.
The idea of this library is to provide a way of encoding this dependency semi-automatically.
The first step is to define a strchr that knows about the dependency of its return value on its argument.
open Living
open Living_ctypes
let strchr : char ptr -> char -> char ptr Living_core.t =
  let strchr_unsafe = Foreign.foreign "strchr" (ptr char @-> char @-> returning (ptr char)) in
  fun s c -> Living_core.(strchr_unsafe s c => s)
Here, we have used the (=>) operator to encode the information that strchr_unsafe s c being alive implies that s must be alive too, in order for the program to be correct. Next, we replace any operations that can create dependent values in Ctypes with their Living counterparts, and replace the lets that bind them with let*s instead. We also need to return a value of type 'a Living_core.t, so we just return a unit wrapped in this type.
let _ =
  let open Living_core.Let_syntax in
  let* p = CArray.of_string "abc" |> Living_core.bind CArray.start in
  let* q = strchr p 'a' in
  let () = Gc.compact () in
  let* c = !@ q in
  (* The trailing semicolon sequences the print before the final return. *)
  if Char.(equal c 'a') then print_endline "yay!" else print_endline "boo!";
  Living_core.return ()
With those changes, the code is now correct. Run it and it will always print "yay!". That's all there is to it!
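To build some intuition for what (=>) and let* are doing, here is a minimal toy model of a dependency-tracking wrapper, written against the standard library only. It is a sketch under assumptions, not the Living library's actual implementation: the types dep and t, the helper first_char, and the exact way bind accumulates dependencies are all invented for illustration (only the field name unsafe_value and the operator names echo the text).

```ocaml
(* Toy model only: NOT the Living library's implementation. *)

(* An existential wrapper so that dependencies of any type fit in one list. *)
type dep = Dep : 'a -> dep

(* A value bundled with everything that must stay alive while it is in use. *)
type 'a t = { unsafe_value : 'a; dependencies : dep list }

let return x = { unsafe_value = x; dependencies = [] }

(* [v => d] records that [v] is only valid while [d] is alive. *)
let ( => ) v d = { unsafe_value = v; dependencies = [ Dep d ] }

(* [bind f m] runs [f] on the wrapped value and accumulates dependencies:
   the input value, its own dependencies, and those of the result. *)
let bind f m =
  let r = f m.unsafe_value in
  { r with
    dependencies = (Dep m.unsafe_value :: m.dependencies) @ r.dependencies }

let ( let* ) m f = bind f m

(* Hypothetical stand-in for a foreign call whose result is derived from [b]. *)
let first_char (b : Bytes.t) : char t = Bytes.get b 0 => b

let result : char t =
  let* b = return (Bytes.of_string "abc") in
  let* c = first_char b in
  return (Char.uppercase_ascii c)

let () =
  Printf.printf "%c (with %d tracked dependencies)\n" result.unsafe_value
    (List.length result.dependencies)
```

Because the final record still references every intermediate value, none of them can be collected while result is reachable. That is the property the real library is described as providing for foreign pointers.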
The key step we needed to do manually is to encode the dependence of strchr_unsafe's return value on its argument. This can take many forms, as C has many ways of returning values. The details are up to the FFI binding author to get right, but here are some hints:
- Always record that a pointer into a structure depends on that structure.
- Rewrite functions that take "output" pointers to return tuples instead, so that the dependency is easier to track.
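As an illustration of the second hint, here is a hedged sketch using plain Ctypes and the C standard library function frexp, which returns a mantissa and writes an exponent through an output pointer. The choice of frexp and the names frexp_unsafe and frexp_tuple are not from the original text. The sketch assumes the ctypes and ctypes-foreign packages are available and that frexp can be resolved at run time.

```ocaml
open Ctypes

(* Raw binding: double frexp(double x, int *exp); *)
let frexp_unsafe : float -> int ptr -> float =
  Foreign.foreign "frexp" (double @-> ptr int @-> returning double)

(* Wrapper that hides the output pointer and returns a tuple instead.
   The cell is allocated, passed to C, and read back inside one function,
   so callers only ever see plain OCaml values. *)
let frexp_tuple (x : float) : float * int =
  let exp_out = allocate int 0 in
  let mantissa = frexp_unsafe x exp_out in
  (* Reading [exp_out] after the call also keeps it referenced across the call. *)
  (mantissa, !@ exp_out)

let () =
  let m, e = frexp_tuple 8.0 in
  Printf.printf "8.0 = %g * 2^%d\n" m e
```

With this shape, the only foreign allocation lives and dies inside frexp_tuple, so there is no dependency left for the caller to track.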
It is often useful to allow the GC to collect garbage, so you generally don't want your whole program to be wrapped in a Living_core.t containing every dependency in it. Remember, however, that dropping dependencies is an optimization, and should only be attempted once you have measured performance and know you need it. If you mess this up, you can get segfaults, and often it's good enough to just let stuff fall out of scope.
If you have measured performance and found you need to drop dependencies, you can do so by calling the Living_core.unsafe_free function. This returns the current value of the computation without its dependencies. Some care must be taken, however.
The process looks like this:
1. Take your 'a Living_core.t and figure out if it has any pointers or structures that have been allocated by malloc, Ctypes.allocate, Ctypes.allocate_n or the like.
2. If it does not, proceed to step 4.
3. If it does, then copy all that data into OCaml heap objects that can't be GC'd from underneath you like an off-heap pointer can be, by using Living_core.map or Living_core.bind to map the 'a to a new, safer 'b.
4. Call Living_core.unsafe_free.
It is important to do step 3 properly. Here are two examples; the first is something you should never do.
(* Bad Example *)
(*** NEVER DO THIS ***)
let _ =
  let my_dependencyless_char =
    CArray.of_string "abc"
    |> Living_core.bind CArray.start
    |> Living_core.bind (fun p -> strchr p 'a')
    (* Bad assumption: We don't care about the pointer after we dereference it,
       since the char is copied to OCaml, so we use the non-wrapped version of
       !@ from base Ctypes on just the value. *)
    |> fun my_dependencyful_char_ptr ->
         Ctypes.(!@) (Living_core.unsafe_free my_dependencyful_char_ptr)
  in
  Printf.printf "%c\n" my_dependencyless_char
The problem is that, at least theoretically, Ctypes.(!@) could call the garbage collector before it dereferences the bare pointer returned by Living_core.unsafe_free. By that point the dependencies have already been dropped, which would land us back in hot water. Instead, prefer the following idiom:
(* Good Example *)
(*** DO THIS ***)
let _ =
  let my_dependencyless_char =
    CArray.of_string "abc"
    |> Living_core.bind CArray.start
    |> Living_core.bind (fun p -> strchr p 'a')
    |> Living_core.map Ctypes.(!@) (* Key idea: map with !@ _inside_ the Living_core.t context! *)
    |> Living_core.unsafe_free (* And only then call unsafe_free *)
  in
  Printf.printf "%c\n" my_dependencyless_char
That is, you should do all the mapping you need to do to get to a safe, OCaml-copied value within the context of the Living_core.t, before finally calling Living_core.unsafe_free.
In this second example, even if Ctypes.(!@) calls the garbage collector, the Living library ensures that the C string "abc" will not be collected out from under you.
Living_core can be configured in a variety of ways using the Living_core.Make functor.
- One may provide their own logging function log_leak, which is passed a string option possibly containing the name of the leaked Living_core.t.
- One may disable leak logging entirely by setting should_log to false.
- One may disable the safety net of preventing leaking of Living_core.ts which haven't been unsafe_freed by setting should_prevent_leaks to false. This is an optimization and only recommended if you're sure you've got stuff right. Segfaults await the uncareful programmer.
One may use default implementations of all of these things by accessing Living_core.Living_config_default, or even more simply by using the instantiated functor Living_core.Default.
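The following self-contained sketch shows the general shape of this kind of functor-based configuration. It is a toy, not the library's code: the signature CONFIG, the functor Make_toy, and the function report_leak are hypothetical, and only the field names log_leak, should_log, and should_prevent_leaks come from the list above (their types here are guesses based on that description).

```ocaml
(* Hypothetical configuration signature; field names follow the text,
   types are assumptions. *)
module type CONFIG = sig
  val log_leak : string option -> unit
  val should_log : bool
  val should_prevent_leaks : bool
end

(* Toy stand-in for a configurable core; NOT Living_core.Make. *)
module Make_toy (C : CONFIG) = struct
  (* Only forwards to the user's logger when logging is enabled. *)
  let report_leak (name : string option) : unit =
    if C.should_log then C.log_leak name
end

(* Leak names collected by the custom logger below, most recent first. *)
let leaks : string list ref = ref []

(* A configuration that records leak names instead of printing them. *)
module Collecting_config : CONFIG = struct
  let log_leak name = leaks := Option.value name ~default:"<unnamed>" :: !leaks
  let should_log = true
  let should_prevent_leaks = true
end

(* A configuration with logging switched off entirely. *)
module Quiet_config : CONFIG = struct
  let log_leak _ = ()
  let should_log = false
  let should_prevent_leaks = true
end

module Collecting = Make_toy (Collecting_config)
module Quiet = Make_toy (Quiet_config)

let () =
  Collecting.report_leak (Some "p");
  Quiet.report_leak (Some "q");
  Collecting.report_leak None
```

In the toy, the quiet instantiation never reaches its logger, so only the two leaks reported through Collecting are recorded.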
If you choose to use the Living library in a project Foo, then please also make it a functor of type module Living_core_intf.LIVING_CORE -> module FOO. This way users of your module can configure the LIVING_CORE implementation used to agree with other libraries they are using. Some users might prefer to disable logging, to log to some special logger, or to disable safety after optimizing their usage, for example. However, I would imagine many bindings would prefer to not expose their explicit dependence on Living at all, so if this is the case feel free to configure the module yourself -- just know you might be limiting some specific class of users. An alternative is to provide a default implementation, as Living_ctypes and Living_core do.
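A minimal sketch of that layout, using only the standard library, might look like the following. The signature LIVING_CORE here is a heavily reduced, hypothetical stand-in for Living_core_intf.LIVING_CORE, and Toy_core, Foo.Make, greeting, and greeting_length are invented for illustration.

```ocaml
(* Hypothetical, reduced stand-in for Living_core_intf.LIVING_CORE. *)
module type LIVING_CORE = sig
  type 'a t
  val return : 'a -> 'a t
  val map : ('a -> 'b) -> 'a t -> 'b t
  val unsafe_free : 'a t -> 'a
end

(* Trivial core used as the default in this sketch; it tracks nothing. *)
module Toy_core : LIVING_CORE = struct
  type 'a t = 'a
  let return x = x
  let map f x = f x
  let unsafe_free x = x
end

module Foo = struct
  (* The library proper is a functor over the core implementation... *)
  module Make (L : LIVING_CORE) = struct
    let greeting (name : string) : string L.t = L.return ("hello, " ^ name)

    (* Map to a plain OCaml value inside the context, then free. *)
    let greeting_length (name : string) : int =
      greeting name |> L.map String.length |> L.unsafe_free
  end

  (* ...and a default instantiation is included for users who don't care. *)
  include Make (Toy_core)
end

(* Users with their own core can instantiate the functor themselves. *)
module Foo_custom = Foo.Make (Toy_core)

let default_length = Foo.greeting_length "abc"
let custom_length = Foo_custom.greeting_length "abc"
```

Users who just want the defaults call Foo.greeting_length directly, while users who need a specific configuration apply Foo.Make to their own core.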