I tried to connect to Apache Hive via JDBC and faced a few issues, which I would like to share here. Feel free to discuss my ideas or split them into multiple issues.
I found a workaround (see below) to connect to Hive and read data from it. However, I had to deal with a lot of SQLFeatureNotSupportedExceptions from the Hive JDBC library, because it does not implement some features, especially regarding metadata. (Are they optional? I'm not a JDBC expert, so I'm not sure how "optional" these things are, i.e. whether the driver should implement them or client code should be able to deal with the missing SQL features.) -> Maybe there could be some kind of fallback implementation that might not be as performant, but allows working with such incomplete JDBC driver implementations.
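The fallback idea raised above could look roughly like the following. This is only a minimal sketch, not anything Kotlin DataFrame provides: `orFallback` and `nullabilityOf` are hypothetical helper names, and the choice of `isNullable` as the example metadata call is an assumption made for illustration. Only `java.sql` types from the JDK are used.

```kotlin
import java.sql.ResultSetMetaData
import java.sql.SQLFeatureNotSupportedException

/**
 * Hypothetical helper: runs [primary]; if the driver reports the feature
 * as unsupported, runs [fallback] instead. Any other exception propagates.
 */
inline fun <T> orFallback(primary: () -> T, fallback: () -> T): T =
    try {
        primary()
    } catch (e: SQLFeatureNotSupportedException) {
        fallback()
    }

/**
 * Example use (assumption: the driver may reject isNullable()).
 * Falls back to "nullability unknown" instead of failing the whole read.
 */
fun nullabilityOf(meta: ResultSetMetaData, column: Int): Int =
    orFallback(
        primary = { meta.isNullable(column) },
        fallback = { ResultSetMetaData.columnNullableUnknown },
    )

fun main() {
    // Stand-in for a metadata call that an incomplete driver rejects.
    val unsupported: () -> Boolean = {
        throw SQLFeatureNotSupportedException("Method not supported")
    }
    println(orFallback(primary = unsupported, fallback = { true }))
}
```

The fallback value is deliberately conservative: treating nullability as unknown loses some schema precision but lets the read continue.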
Thanks for sharing! There are a few user requests in the Issues about the possibility of registering custom SQL dialects, and I believe this will be possible no earlier than the 0.15 release (we are now finishing 0.14).
But we have some bottlenecks in our plugin for schema generation, and this is the reason why we closed the hierarchy of DB classes. We hope to solve this problem or suggest a workaround for it, and to be open to new data sources.
Hive instance of DbType, the when-statements in https://github.com/Kotlin/dataframe/blob/master/dataframe-jdbc/src/main/kotlin/org/jetbrains/kotlinx/dataframe/io/db/util.kt checking the DbType are not extensible. -> The implementation should be easier to extend, as new database technologies arise and supporting them in Kotlin Dataframe takes a while or never happens at all if it's a rather exotic one. I found the issues "Add Apache Pinot® as supported database" (#637) and "Redshift not supported" (#549), which also ask for further databases to be supported. Maybe there could be something like a "DbTypeRegistry", where users can add custom DbTypes at runtime instead of these static when-statements.
Finally, my workaround to connect to Hive:
This way I managed to get my own Hive DbType into readResultSet(). It definitely has limitations and I'm very unsure about my minimal implementations of the methods of the DbType. But at least it seems to work so far.
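The "DbTypeRegistry" suggestion above could be sketched roughly as follows. Everything here is hypothetical: `SketchDbType` is a stand-in interface and not the library's real DbType class, `jdbcUrlKey` and `fromJdbcUrl` are made-up names, and the lookup by JDBC URL prefix is just one assumed way to replace a static when-statement with a runtime lookup.

```kotlin
import java.util.concurrent.ConcurrentHashMap

// Hypothetical stand-in for the library's DbType; NOT the real Kotlin DataFrame class.
interface SketchDbType {
    /** The part that follows "jdbc:" in a JDBC URL, e.g. "hive2". */
    val jdbcUrlKey: String
    val driverClassName: String
}

// Hypothetical registry: users add custom types at runtime.
object DbTypeRegistry {
    private val types = ConcurrentHashMap<String, SketchDbType>()

    fun register(type: SketchDbType) {
        types[type.jdbcUrlKey] = type
    }

    /** Resolves a registered type from a URL such as "jdbc:hive2://host:10000/db". */
    fun fromJdbcUrl(url: String): SketchDbType {
        val key = url.removePrefix("jdbc:").substringBefore(':')
        return types[key]
            ?: throw IllegalArgumentException("No DbType registered for '$key' (URL: $url)")
    }
}

// A user-defined type for Hive, registered by client code rather than the library.
object Hive : SketchDbType {
    override val jdbcUrlKey = "hive2"
    override val driverClassName = "org.apache.hive.jdbc.HiveDriver"
}

fun main() {
    DbTypeRegistry.register(Hive)
    val resolved = DbTypeRegistry.fromJdbcUrl("jdbc:hive2://localhost:10000/default")
    println(resolved.driverClassName)
}
```

A map lookup like this keeps the set of supported databases open, at the cost of losing the exhaustiveness checks that a closed hierarchy gives to when-statements.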