Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Purge Obuilder Docker store when Docker is pruned #242

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
15 changes: 11 additions & 4 deletions worker/cluster_worker.ml
Original file line number Diff line number Diff line change
Expand Up @@ -220,7 +220,7 @@ let check_docker_partition t =
if max_df_size < gb then Error `Disk_space_low
else Ok ()

let rec maybe_prune t queue =
let rec maybe_prune obuilder t queue =
check_docker_partition t >>= function
| Ok () -> Lwt.return_unit
| Error `Disk_space_low ->
Expand All @@ -236,6 +236,13 @@ let rec maybe_prune t queue =
(fun () ->
Lwt_process.exec ("", [| "docker"; "system"; "prune"; "-af" |]) >>= function
| Unix.WEXITED 0 ->
(match obuilder with
| None -> Lwt.return 0
| Some obuilder ->
match Obuilder_build.backend obuilder with
| `Native _ -> Lwt.return 0
| `Docker _ -> Obuilder_build.purge obuilder) >>= fun n ->
Comment on lines +243 to +244
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that we are just ignoring the values wrapped in the Native | Docker tags here, why do we need to expose them in the Boulder_build API at all?

Comment on lines +239 to +244
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If my suggestion about keeping the sandbox configuration type private makes sense, then this could be something like

Suggested change
(match obuilder with
| None -> Lwt.return 0
| Some obuilder ->
match Obuilder_build.backend obuilder with
| `Native _ -> Lwt.return 0
| `Docker _ -> Obuilder_build.purge obuilder) >>= fun n ->
(match obuilder with
| None -> Lwt.return 0
| Some obuilder -> Obuilder_build.purge obuilder) >>= fun n ->

Log.info (fun f -> f "%i items prune from Obuilder" n);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Log.info (fun f -> f "%i items prune from Obuilder" n);
Log.info (fun f -> f "%i items pruned from Obuilder" n);

Lwt_process.exec ("", [| "docker"; "builder"; "prune"; "-af" |])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is out of scope for your changes, but seems worth clarifying any how:

According to https://docs.docker.com/reference/cli/docker/builder/prune/, this will

Remove all unused build cache, not just dangling ones

Whereas, according to https://docs.docker.com/reference/cli/docker/system/prune/, docker system prune (which we should have just called on line 237) will

Remove all unused containers, networks, images (both dangling and unused), and optionally, volumes.

Based on the warning in the docks, this includes all

unused build cache

So I'm wondering whether we actually need this additional step, or if docker system prune should not already cover it...

| e -> Lwt.return e
)
Expand All @@ -249,12 +256,12 @@ let rec maybe_prune t queue =
| Error `Disk_space_low ->
Log.warn (fun f -> f "Disk-space still low after pruning! Will retry in one hour.");
Unix.sleep (60 * 60);
maybe_prune t queue
maybe_prune obuilder t queue
end
| _ ->
Log.warn (fun f -> f "docker prune command failed! Will retry in one hour.");
Unix.sleep (60 * 60);
maybe_prune t queue
maybe_prune obuilder t queue

let healthcheck obuilder =
let t0 = Unix.gettimeofday () in
Expand Down Expand Up @@ -306,7 +313,7 @@ let loop ~switch ?obuilder t queue =
Log.info (fun f -> f "At capacity. Waiting for a build to finish before requesting more…");
Lwt_condition.wait t.cond >>= loop
) else (
maybe_prune t queue >>= fun () ->
maybe_prune obuilder t queue >>= fun () ->
check_health t ~last_healthcheck ~queue obuilder >>= fun () ->
let outcome, set_outcome = Lwt.wait () in
let log = Log_data.create () in
Expand Down
2 changes: 1 addition & 1 deletion worker/context.ml
Original file line number Diff line number Diff line change
Expand Up @@ -197,7 +197,7 @@ let build_context t ~log ~tmpdir descr =
if include_git descr then (
let cmd, is_success =
if Sys.win32 then
["robocopy"; clone / ".git"; tmpdir / ".git"; "/COPY:DATSO"; "/E"; "/R:0"; "/DCOPY:T"],
["robocopy"; clone / ".git"; tmpdir / ".git"; "/COPY:DATSO"; "/E"; "/R:0"; "/DCOPY:T"; "/NDL"; "/NFL"],
fun s -> s = 1
else
["cp"; "-a"; clone / ".git"; tmpdir / ".git"],
Expand Down
13 changes: 13 additions & 0 deletions worker/obuilder_build.ml
Original file line number Diff line number Diff line change
Expand Up @@ -21,6 +21,8 @@ type t = {
prune_threshold : float option;
prune_item_threshold : int64 option; (* Threshold number of items to hold in obuilder store *)
prune_limit : int option; (* Number of items to prune from obuilder when threshold is reached *)
sandbox_config : [ `Native of Obuilder.Native_sandbox.config
| `Docker of Obuilder.Docker_sandbox.config ]
}

let ( / ) = Filename.concat
Expand Down Expand Up @@ -65,6 +67,7 @@ let create ?prune_threshold ?prune_item_threshold ?prune_limit config =
prune_item_threshold;
prune_limit;
cond = Lwt_condition.create ();
sandbox_config;
}

(* Prune [t] until free space rises above [prune_threshold]
Expand Down Expand Up @@ -143,3 +146,13 @@ let healthcheck t =
let cache_stats t =
let Builder ((module Builder), builder) = t.builder in
Builder.cache_stats builder

let purge t =
let Builder ((module Builder), builder) = t.builder in
let before = Unix.gettimeofday () +. prune_margin |> Unix.gmtime in
(* set a future time and a big number to ensure everything is deleted *)
Builder.prune builder ~before Int.max_int >>= fun n ->
Lwt.return n
Comment on lines +154 to +155
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Builder.prune builder ~before Int.max_int : int Lwt.t and we don't need to operate in the int, here, so we no need to take it of and return it back to the Lwt.t context here:

Suggested change
Builder.prune builder ~before Int.max_int >>= fun n ->
Lwt.return n
Builder.prune builder ~before Int.max_int


let backend t =
t.sandbox_config
Comment on lines +157 to +158
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

iiuc, the only use of this accessor is at https://github.com/ocurrent/ocluster/pull/242/files#diff-5b7638a06391f25c3c893df03fe8754d5c0702dfc2aa4e12c7881dcf077e106eR244 to classify the kind of configuration being used, which then let's us decide whether to actually call purge or to return 0 in a no-op. As such, it looks to me like there is no reason to leak this out of the Obuilder_build API. Instead, we can just handle that logic in the purge function itself. WDYT?

4 changes: 4 additions & 0 deletions worker/obuilder_build.mli
Original file line number Diff line number Diff line change
Expand Up @@ -20,3 +20,7 @@ val build : t ->
val healthcheck : t -> (unit, [> `Msg of string]) Lwt_result.t

val cache_stats : t -> int * int

val purge : t -> int Lwt.t
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A dock comment explaining what this is meant to be used for would be helpful.


val backend : t -> [`Native of Obuilder.Native_sandbox.config | `Docker of Obuilder.Docker_sandbox.config ]
Loading