Skip to content
This repository has been archived by the owner on May 15, 2024. It is now read-only.

feat: force packing if threshold is not reached after a configurable duration #178

Merged
merged 3 commits into from
Oct 26, 2023

Conversation

elijaharita
Copy link
Contributor

@elijaharita elijaharita commented Oct 24, 2023

the default duration is 24h.

this PR doesn't introduce a test (yet?). i've only verified its function by looking at logs so far.

i'm thinking about how this could be tested. maybe could add a routine to the beginning of the integration test file that ensures no pack jobs have run yet, then post a small amount of data below the threshold, wait a bit longer than the configured timeout (set to 5 seconds for the test, or something like that), and then check that a pack job has run.

thoughts? i haven't looked into whether it's possible / how to inspect pack jobs yet

Fixes #169

@elijaharita elijaharita requested review from masih and xinaxu October 24, 2023 00:25
@codecov-commenter
Copy link

codecov-commenter commented Oct 24, 2023

Codecov Report

Merging #178 (07eb38d) into main (23dd6f7) will increase coverage by 0.83%.
Report is 11 commits behind head on main.
The diff coverage is 43.75%.

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main     #178      +/-   ##
==========================================
+ Coverage   30.82%   31.66%   +0.83%     
==========================================
  Files           5        5              
  Lines         743      777      +34     
==========================================
+ Hits          229      246      +17     
- Misses        479      494      +15     
- Partials       35       37       +2     
Files Coverage Δ
integration/singularity/options.go 24.13% <16.66%> (-0.27%) ⬇️
integration/singularity/store.go 36.53% <50.00%> (+1.24%) ⬆️

Copy link
Collaborator

@willscott willscott left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • the most common pattern for testing is that you take i a 'clock' object, and for the tests have a mock implementation that can jump ahead to later times as part of the test.
  • rather than packing as a timeout, we have considered, and it may be worthwhile to have an exposed 'flush' endpoint to allow a client to force packing to occur and indicate they're done with a dataset and would like to immediately force all unsaved state out to durable storage - e.g. before shutting down the docker compose instance.

@xinaxu
Copy link
Collaborator

xinaxu commented Oct 24, 2023

  • rather than packing as a timeout, we have considered, and it may be worthwhile to have an exposed 'flush' endpoint to allow a client to force packing to occur and indicate they're done with a dataset and would like to immediately force all unsaved state out to durable storage - e.g. before shutting down the docker compose instance.

I think the "flush" API makes DeStore less simple than it should be, as it exposes some level of internal implmentation to user. They can, alternatively, use internal Singularity API to achieve the same goal so I don't strongly feel we need this "flush" API now.
https://data-programs.gitbook.io/singularity/web-api-reference/job#preparation-id-source-name-finalize

@elijaharita elijaharita merged commit 51e4d29 into main Oct 26, 2023
7 checks passed
@elijaharita elijaharita deleted the feat/pack-max-wait branch October 26, 2023 21:48
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Last pack job should get prepared after a certain duration
4 participants