switch lazyseq to iteration #1443
Does this have any memory implications? This gets passed to `seque`, which seems to behave the same based on my reading.
```clojure
:vf identity
:initk read-params
:kf #(dissoc % :documents)
:somef seq))
```
I think both `:vf` and `:somef` should be their default values. `:somef` => `some?`, because it's testing a `(s/maybe BatchParams)`; `:vf` is already `identity`.
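To make the suggestion concrete, here is a minimal sketch of an `iteration` that leaves `:somef` and `:vf` at their defaults (`some?` and `identity`). The `store` data, the `read-batch` step function, and the `:offset`/`:limit` params are hypothetical stand-ins, not the PR's actual code; only the `:kf #(dissoc % :documents)` shape is taken from the diff above.

```clojure
;; Hypothetical data source standing in for the real store.
(def store (vec (range 5)))

(defn read-batch
  "Hypothetical step fn. Takes params like {:offset 0 :limit 2} and returns
  the params for the next call plus :documents, or nil when exhausted."
  [{:keys [offset limit] :as params}]
  (let [n    (count store)
        docs (subvec store (min offset n) (min (+ offset limit) n))]
    (when (seq docs)
      (assoc params
             :offset (+ offset limit)
             :documents docs))))

;; :somef defaults to some?, so a nil batch ends the iteration.
;; :vf defaults to identity, so each emitted value is the whole batch.
(def batches
  (iteration read-batch
             :initk {:offset 0 :limit 2}
             :kf #(dissoc % :documents)))

;; Reducing over the iteration collects the documents batch by batch.
(def all-docs (into [] (mapcat :documents) batches))
;; all-docs => [0 1 2 3 4]
```

Because the step function returns `nil` rather than an empty collection when there is nothing left, `some?` is enough as the continuation test in this sketch.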
I tried to generate a memory problem and it didn't trigger any.
I think they are similarly lazy, I'm guessing. I ended up finding a bug which might be relevant: `seque` forces n+2 elements of the producing seq. https://ask.clojure.org/index.php/14178/seque-forces-n-2-items-ahead-of-consumer-instead-of-n

```clojure
(let [producer (fn [i]
                 (prn "producer" i)
                 (inc i))
      s (seque 1 (iteration producer :initk 0))]
  (Thread/sleep 1000))
;"producer" 0
;"producer" 1
;"producer" 2
nil
```

```clojure
(let [producer (fn step [i]
                 (lazy-seq
                   (prn "producer" i)
                   (cons i (step (inc i)))))
      s (seque 1 (producer 0))]
  (Thread/sleep 1000)
  nil)
;"producer" 0
;"producer" 1
;"producer" 2
nil
```
In addition,
That would explain a lot!
Nice!
Reported memory leak: https://ask.clojure.org/index.php/14185/memory-leak-in-seque-via-agents
On reflection, we consume each `seque` entirely via `reduce` (right?). This means the agents that are leaking memory will not contain any migration data. The migration might have failed because our buffer size was too large. Now that it's effectively 1, if we still find memory issues we might need to reduce the size of the queries (and/or pagination?).
We have two reduces: one on each query (1 week of data), where we have a `seque` on the lazy seq in `migrate-query`, and one over the queries.
We did observe an increase in memory over hours, which spans several queries. In any case, it's worth migrating to an idiomatic abstraction, right?
Yes, I prefer reducing over an iteration. It will at least reduce the possible explanations if we continue to see memory issues.
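As a rough illustration of the two consumption styles discussed in this thread, the sketch below reduces over the same `iteration` directly and through a `seque` with a buffer size of 1. The `step` function is a hypothetical toy producer, not the migration's query code.

```clojure
(defn step
  "Hypothetical step fn: yields 1 through 5, then nil to signal the end."
  [i]
  (when (< i 5)
    (inc i)))

;; Reduce directly over the iteration. This goes through its IReduceInit
;; implementation, so no intermediate seq is built.
(def direct-sum
  (reduce + 0 (iteration step :initk 0)))

;; Same source, produced ahead of the consumer through a seque with buffer size 1.
(def buffered-sum
  (reduce + 0 (seque 1 (iteration step :initk 0))))

;; Both are 15. seque is agent-backed, so a standalone script
;; should call (shutdown-agents) before exiting.
```

Both forms consume the source completely and give the same result; the direct reduce takes `seque` and its agents out of the picture when chasing a memory issue.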
Switch migration task to `iteration` instead of custom crawl approach.

§ QA
No QA is needed.