Arrow output #140

Open
alippai opened this issue Jun 29, 2020 · 5 comments

alippai commented Jun 29, 2020

Is it possible - similarly to Turbodbc - to get the resultset in Arrow format efficiently?

Owner

Koka commented Jul 7, 2020

Hello, I guess you could try to do this with the arrow crate (https://docs.rs/crate/arrow/0.16.0), starting from this example: https://github.com/Koka/odbc-rs/blob/master/examples/custom_get_data.rs

Contributor

pacman82 commented Jul 7, 2020

Hi, I'm the original creator of odbc-safe and a maintainer of turbodbc.

To achieve this you need to be able to fetch many rows at once, in bulk.

Sadly, I think you'd have to go all the way down to odbc-sys. The reason is that the ODBC concept of a buffer shared between the consumer and the prepared statement is hard to express in safe Rust (shared mutable ownership is forbidden). So a safe abstraction has to be higher level and built around that use case specifically. I consider it a mistake of mine to have made odbc-safe so low level. Without introducing reference counting and interior mutability, it won't ever support this.
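To make the ownership problem concrete, here is a minimal sketch of the "reference counting and interior mutability" route using only the standard library. It makes no real ODBC calls: `MockStatement`, `SharedBuffer`, and `collect_all` are hypothetical names invented for illustration, and the "fetch" just writes counter values into the buffer. The point is only that both the statement and the consumer hold a handle to the same buffer, which plain `&mut` borrows would not allow.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Hypothetical stand-in for a row-set buffer bound to one integer column.
type SharedBuffer = Rc<RefCell<Vec<i64>>>;

/// Hypothetical stand-in for a prepared statement (no real ODBC involved).
struct MockStatement {
    bound: SharedBuffer, // the statement keeps its own handle to the buffer
    next_value: i64,
    remaining_rows: usize,
}

impl MockStatement {
    fn new(bound: SharedBuffer, total_rows: usize) -> Self {
        MockStatement { bound, next_value: 0, remaining_rows: total_rows }
    }

    /// Simulates one bulk fetch: overwrites the bound buffer with up to
    /// `batch_size` rows and returns how many rows were written.
    fn fetch(&mut self, batch_size: usize) -> usize {
        let n = batch_size.min(self.remaining_rows);
        let mut buf = self.bound.borrow_mut(); // borrow checked at runtime
        buf.clear();
        for _ in 0..n {
            buf.push(self.next_value);
            self.next_value += 1;
        }
        self.remaining_rows -= n;
        n
    }
}

/// Drains the mock statement batch by batch and collects every value.
fn collect_all(total_rows: usize, batch_size: usize) -> Vec<i64> {
    let buffer: SharedBuffer = Rc::new(RefCell::new(Vec::with_capacity(batch_size)));
    let mut stmt = MockStatement::new(Rc::clone(&buffer), total_rows);
    let mut out = Vec::new();
    while stmt.fetch(batch_size) > 0 {
        // The consumer reads through its own handle to the same buffer.
        out.extend_from_slice(&buffer.borrow());
    }
    out
}

fn main() {
    println!("{:?}", collect_all(5, 2));
}
```

The cost of this design is that borrow rules are enforced at runtime instead of compile time: holding a `borrow()` across a `fetch` call would panic rather than fail to compile.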

I started a piece of private code to encapsulate that use case. I haven't published it, though, because I fear I won't have the resources to maintain it. I can share it with you if you are interested.

Author

alippai commented Jul 7, 2020

I have 10+ years of experience with JS, TS, Python, PHP, and Java, but I'm a complete beginner with Rust and I don't have C++ experience either. This exercise requires low-level knowledge of those languages :/
I can test and benchmark it, if somebody wants to develop it.

Contributor

pacman82 commented Jul 7, 2020

No promises here. To summarize the answer to your original question: right now there is no off-the-shelf solution to efficiently fill an Arrow array from an ODBC data source.

@pacman82
Contributor

@alippai I've since written https://github.com/pacman82/odbc2parquet, which demonstrates how to bind columnar buffers and retrieve the results efficiently. It uses the parquet type system directly rather than Arrow (mostly because Rust's parquet library does not support writing from Arrow yet), but it should give you an idea of how to bind such buffers if you want to go down that route.
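For readers unfamiliar with column-wise binding, the following is a rough sketch of the data layout only; it is not code from odbc2parquet and makes no ODBC calls. Each column gets one contiguous value array plus a parallel indicator array, and the driver marks NULLs by writing `SQL_NULL_DATA` (-1 in the ODBC headers) into the indicator. `I64Column` and `simulate_fetch` are hypothetical names, with `simulate_fetch` standing in for the driver filling the buffers.

```rust
use std::mem::size_of;

/// Indicator value ODBC uses to mark a NULL cell.
const SQL_NULL_DATA: isize = -1;

/// Hypothetical column-wise buffer for one 64-bit integer column.
struct I64Column {
    values: Vec<i64>,       // one slot per row in the row set
    indicators: Vec<isize>, // parallel array: byte length or SQL_NULL_DATA
}

impl I64Column {
    fn with_capacity(rows: usize) -> Self {
        I64Column { values: vec![0; rows], indicators: vec![0; rows] }
    }

    /// Interprets the first `num_rows` fetched rows, mapping NULL
    /// indicators to `None`.
    fn to_options(&self, num_rows: usize) -> Vec<Option<i64>> {
        self.values
            .iter()
            .zip(&self.indicators)
            .take(num_rows)
            .map(|(&v, &ind)| if ind == SQL_NULL_DATA { None } else { Some(v) })
            .collect()
    }
}

/// Stand-in for the driver writing one batch into the bound buffers.
/// Returns the number of rows written (at most the buffer capacity).
fn simulate_fetch(col: &mut I64Column, rows: &[Option<i64>]) -> usize {
    let n = rows.len().min(col.values.len());
    for (i, row) in rows.iter().take(n).enumerate() {
        match row {
            Some(v) => {
                col.values[i] = *v;
                col.indicators[i] = size_of::<i64>() as isize;
            }
            None => col.indicators[i] = SQL_NULL_DATA,
        }
    }
    n
}

fn main() {
    let mut col = I64Column::with_capacity(4);
    let n = simulate_fetch(&mut col, &[Some(7), None, Some(-3)]);
    println!("{:?}", col.to_options(n));
}
```

A values-plus-indicators layout like this is close to how Arrow stores a primitive array (a value buffer plus a validity bitmap), which is why bulk-fetched columnar buffers are a natural starting point for Arrow output.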
