-
Hi all! Appreciate this fantastic library, which allowed me to easily port ETL code from BigQuery to Snowflake. Now I have a question about working with the latter. Without being able to increase computational resources, what is the best way to download a large-ish table from Snowflake? Right now I am executing the query and pulling the entire result down in one go, which is why I would like to avoid having all of the table in memory at once. I thought of two potential solutions, and I wonder if they are possible using ibis?
Thanks in advance!!
-
Hey @evlaw-ea -- glad to hear that the ETL code porting is working well!! We are thinking about ways to move data around, but it's a tricky problem (actually a bunch of tricky problems).
I think batching the result will get around your out-of-memory issues, but it will not be performant. It's going to pull down N tuples at a time, where N is the batch size, and then you can operate on those batches in sequence.
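Here's a minimal sketch of that batched approach using `Table.to_pyarrow_batches`; the connection parameters and the `events` table name are hypothetical placeholders.

```python
import ibis

# Hypothetical connection details -- substitute your own.
con = ibis.snowflake.connect(
    user="me",
    password="...",
    account="myorg-myaccount",
    database="mydb/myschema",
)

t = con.table("events")  # placeholder table name

# Stream the result N rows at a time instead of materializing it all.
for batch in t.to_pyarrow_batches(chunk_size=100_000):
    df = batch.to_pandas()  # each pyarrow RecordBatch fits in memory
    ...  # process df, e.g. reduce it or append it to a local file
```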
I don't believe there's anything you can pass in to change that. I think the best way to get data out of Snowflake is to export it to Parquet on a user stage, and then pull down those Parquet files. There's not currently a way to do that with Ibis, although we'll probably add that functionality at some point.
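In the meantime, a hedged sketch of that stage-export route, issuing raw Snowflake SQL through the connection (the stage path and table name are hypothetical):

```python
# Unload the table to Parquet files on the user stage (@~).
con.raw_sql(
    "COPY INTO @~/my_export/ FROM my_table "
    "FILE_FORMAT = (TYPE = PARQUET) HEADER = TRUE"
)

# GET runs client-side in the Snowflake connector and downloads the
# staged files to a local directory.
con.raw_sql("GET @~/my_export/ file:///tmp/my_export/")
```

From there you can read the local Parquet files one at a time with pyarrow or pandas.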
-
Hi @evlaw-ea 👋🏻! Thanks for the kind words and for raising this discussion.
Any chance you can do that groupby in ibis instead of in pandas? That might solve the problem without having to think too hard about which materialization API to use.
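For instance, a sketch of pushing the group-by into Snowflake so only the aggregated rows are materialized (table and column names are hypothetical):

```python
t = con.table("orders")  # placeholder table with customer_id and amount columns

# The aggregation executes in Snowflake; only the reduced result
# comes back to the client.
summary = (
    t.group_by("customer_id")
    .aggregate(
        n_orders=t.count(),
        total_spend=t.amount.sum(),
    )
    .to_pandas()
)
```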