Does ozo need to fetch all data from postgres before it starts to e.g. push it back to a vector? #295

Open · psmaron opened this issue Apr 20, 2021 · 2 comments
psmaron commented Apr 20, 2021

I'm wondering how ozo exchanges data with PostgreSQL. I know the two communicate over sockets, and that if the transferred data is big enough, it's divided and sent in chunks, e.g. of 16 KB each, so that when we fetch 2 MB of data, 2048/16 = 128 packets will be sent.
Assuming I want to push the fetched rows into a vector, my question is: will ozo wait for all packets and only then start pushing into the vector, or will it push into the vector on the fly, releasing the memory of already-consumed packets? The difference between these two approaches is that in the first case we'd need (for some small time window) about 2×2 MB of memory on the application side (2 MB for the vector and 2 MB for all the data transferred from postgres), while the latter requires only roughly 2 MB (2 MB for the vector, plus some memory for the not-yet-pushed packets).
I'm guessing ozo needs to have all the data from postgres transferred before it starts pushing it into the vector, am I right?

psmaron changed the title from "Does ozo" to "Does ozo need to fetch all data from postgres before it starts to e.g. push it back to a vector?" on Apr 20, 2021
thed636 (Contributor) commented Apr 21, 2021

Hi!

The library doesn't provide a stream interface, so the result has to be received completely before the operation's continuation is called. This is a limit not only of the library but of the underlying libpq as well. The best approach is to combine the application logic with a proper database query that fetches data in limited batches of rows (see the sketch below). Streaming is not the best choice because it holds a transaction open for the duration of the operation, and long transactions aren't good for database performance. We tried to utilize single-row mode for a kind of streaming, but it was slow and led to long transactions with the associated database performance issues.
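For illustration, here's a minimal sketch of that batched approach. The `items(id, payload)` table, connection string, and batch size are placeholders, not anything from ozo itself; the request API follows the example in ozo's README. It uses keyset pagination (`WHERE id > last_id ORDER BY id LIMIT n`), so at most one batch is buffered per round trip:

```cpp
#include <ozo/request.hpp>
#include <ozo/connection_info.hpp>
#include <ozo/shortcuts.hpp>
#include <boost/asio/io_context.hpp>

#include <cstdint>
#include <iostream>
#include <tuple>
#include <vector>

int main() {
    boost::asio::io_context io;
    // Placeholder connection string; a real application would likely use a
    // connection pool rather than a fresh connection per batch.
    ozo::connection_info conn_info("host=... dbname=...");

    using namespace ozo::literals;

    std::vector<std::tuple<std::int64_t, std::string>> all_rows;
    std::int64_t last_id = 0;         // keyset cursor: largest id seen so far
    const std::int64_t batch = 10000; // rows per round trip
    bool more = true;

    while (more) {
        ozo::rows_of<std::int64_t, std::string> rows; // one batch at a time
        const auto query = "SELECT id, payload FROM items WHERE id > "_SQL
                + last_id + " ORDER BY id LIMIT "_SQL + batch;

        ozo::request(conn_info[io], query, ozo::into(rows),
                [&](ozo::error_code ec, auto conn) {
            if (ec) {
                std::cerr << ec.message() << " | "
                          << ozo::error_message(conn) << '\n';
                more = false;
                return;
            }
            for (auto& row : rows) {
                last_id = std::get<0>(row);         // advance the keyset cursor
                all_rows.push_back(std::move(row)); // only this batch is buffered
            }
            more = !rows.empty(); // an empty batch means we're done
        });

        io.run();     // drive the async request to completion
        io.restart(); // reset so the next iteration can run() again
    }
}
```

Each query runs in its own short transaction, so nothing stays open between batches, and keyset pagination (rather than `LIMIT`/`OFFSET`) keeps every batch query cheap on an indexed `id` column.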

Please also see #230 for a short discussion of CURSOR and COPY.

Hope that helps.

psmaron (Author) commented Apr 22, 2021

All is clear, thank you!
