
switch lazyseq to iteration #1443

Merged
merged 4 commits into from
Oct 15, 2024
Conversation

ereteog
Contributor

@ereteog ereteog commented Sep 18, 2024

Switch the migration task to iteration instead of the custom crawl approach.

§ QA

No QA is needed.

Contributor

@frenchy64 frenchy64 left a comment


Does this have any memory implications? This gets passed to seque which seems to behave the same based on my reading.

:vf identity
:initk read-params
:kf #(dissoc % :documents)
:somef seq))

I think both :vf and :somef can be left at their default values.

:somef => some?, because it's testing a (s/maybe BatchParams).

:vf identity is already the default.
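For illustration only (this is not the migration code), the redundancy can be checked at a REPL with a stand-in step function that returns nil when exhausted, mirroring a (s/maybe BatchParams):

```clojure
;; Illustrative sketch: iteration's defaults are :vf identity and
;; :somef some?, so spelling them out changes nothing.
(let [step (fn [k] (when (< k 3) {:k k :next (inc k)}))]
  (= (vec (iteration step :kf :next :initk 0))
     (vec (iteration step :vf identity :somef some? :kf :next :initk 0))))
;; => true
```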

@ereteog
Contributor Author

ereteog commented Oct 9, 2024

Does this have any memory implications? This gets passed to seque which seems to behave the same based on my reading.

I tried to reproduce a memory problem and couldn't trigger one.
However, a very similar approach did cause a memory issue in IROH for reports, and this code actually appeared to leak memory during the last migration.
I haven't reworked it since then because I could not reproduce the memory issue, but it still bothers me :-).

@frenchy64
Contributor

frenchy64 commented Oct 9, 2024

I think they are similarly lazy; I'm guessing iteration is slightly leaner since it doesn't need to cache its own seq.
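As an aside on the "no cached seq" point (assumed semantics, not code from this PR): the object iteration returns is both seqable and directly reducible, so a reduce over it consumes values without ever building a lazy seq:

```clojure
;; Illustrative: reducing an iteration directly, no lazy seq involved.
;; The step function returns its key as the value, and nil when done.
(reduce + 0 (iteration (fn [k] (when (< k 5) k))
                       :kf inc      ;; next key = (inc previous-value)
                       :initk 0))
;; => 10
```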

I ended up finding a bug which might be relevant: seque forces n+2 elements of the producing seq.

https://ask.clojure.org/index.php/14178/seque-forces-n-2-items-ahead-of-consumer-instead-of-n

(let [producer (fn [i]
                 (prn "producer" i)
                 (inc i))
      s (seque 1 (iteration producer :initk 0))]
  (Thread/sleep 1000))
;"producer" 0
;"producer" 1
;"producer" 2
nil

(let [producer (fn step [i]
                 (lazy-seq
                   (prn "producer" i)
                   (cons i (step (inc i)))))
      s (seque 1 (producer 0))]
  (Thread/sleep 1000)
  nil)
;"producer" 0
;"producer" 1
;"producer" 2
nil

@frenchy64
Contributor

frenchy64 commented Oct 9, 2024

In addition, seque has a memory leak where it will hold onto the n+1 and n+2 elements even after the seque is garbage collected. This is because it sends off to an agent, which has a known memory leak since it conveys itself in an *agent* thread binding: https://clojure.atlassian.net/browse/CLJ-2619

@ereteog
Contributor Author

ereteog commented Oct 10, 2024

In addition, seque has a memory leak where it will hold onto the n+1 and n+2 elements even after the seque is garbage collected.

That would explain a lot!
I could not reproduce the issue with a similar test I ran on another implementation that I had rewritten with iteration. Now I think I know what the main difference is, thx!
The seque is in migrate-query, which is called with a reduce here.
What you said would explain what happened last time for the events store, whose migration showed a memory leak and had to be restarted several times (fortunately the process can restart where it stopped :-) ). The migration is split per week because we need to sort, and sorting one week is fast while sorting the whole index is slow (of course :-) ).
Since we migrated about 5 years of ~8 billion documents for events, that means ~250 seques were not garbage collected :-)!

Contributor

@frenchy64 frenchy64 left a comment

Nice!

@frenchy64
Contributor

frenchy64 commented Oct 11, 2024

On reflection, we consume each seque entirely via reduce (right?). This means the agents that are leaking memory will not contain any migration data. The migration might have failed because our buffer size was too large. Now that it's effectively 1, if we still find memory issues we might need to reduce the size of the queries (and/or pagination?).

@ereteog
Contributor Author

ereteog commented Oct 14, 2024

We have two reduces.

A reduce on each query (1 week of data): we have a seque on the lazy seq in migrate-query:

data-queue (seque buffer-size
                  (read-source read-params))
new-migrated-count (reduce #(write-target %1 %2 services)
                           migrated-count
                           data-queue)]

And we reduce over queries:

(reduce #(migrate-query %1 %2 services)
        base-params
        queries)

We did observe memory growth over hours, which spans several queries.

In any case it's worth migrating to an idiomatic abstraction, right?

@frenchy64
Contributor

In any case that's worth migrating to an idiomatic abstraction, right?

Yes, I prefer reducing over an iteration. It will at least reduce the possible explanations if we continue to see memory issues.
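A minimal sketch of that shape (all names here are illustrative stand-ins, not the actual migration API): the step function returns a batch plus the params for the next read, or nil when exhausted, and reduce consumes the iteration directly with no seque in between:

```clojure
;; Illustrative sketch, assuming a paginated read that returns nil at the end.
(defn read-batch
  "Pretend paginated read: batches of 2 documents up to offset 6, then nil."
  [{:keys [offset]}]
  (when (< offset 6)
    {:documents   (range offset (+ offset 2))
     :next-params {:offset (+ offset 2)}}))

(defn migrate-sketch
  "Reduce directly over the iteration: one batch in flight, no agent."
  [init-count]
  (reduce (fn [n {:keys [documents]}]   ;; stand-in for write-target
            (+ n (count documents)))
          init-count
          (iteration read-batch
                     :somef some?       ;; default: stop when step returns nil
                     :kf :next-params   ;; next key comes from the batch
                     :initk {:offset 0})))

(migrate-sketch 0) ;; => 6
```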

@ereteog ereteog merged commit 62eabdf into master Oct 15, 2024
14 checks passed