Is there a good way to convert an Arc<dyn Array> to either Vec<Option<T>>, where T is a Rust native type, or else straight to a polars::series::Series?
So far the rough solution I've come up with looks as follows:
use std::sync::Arc;

fn arrow_array_to_polars_series(
    name: &str,
    array: &Arc<dyn arrow::array::Array>,
) -> Result<polars::series::Series, String> {
    match array.data_type() {
        // Each arm pairs a DataType with its concrete arrow array type and native Rust type.
        arrow::datatypes::DataType::Binary => {
            match array.as_any().downcast_ref::<arrow::array::BinaryArray>() {
                Some(downcast) => Ok(polars::series::Series::new(
                    name,
                    downcast.iter().collect::<Vec<Option<&[u8]>>>(),
                )),
                _ => Err("Couldn't downcast!".into()),
            }
        }
        arrow::datatypes::DataType::Int32 => {
            match array.as_any().downcast_ref::<arrow::array::Int32Array>() {
                Some(downcast) => Ok(polars::series::Series::new(
                    name,
                    downcast.iter().collect::<Vec<Option<i32>>>(),
                )),
                _ => Err("Couldn't downcast!".into()),
            }
        }
        // Numerous other arrow::datatypes::DataTypes to be filled in below here...
        _ => Err("Unhandled data type!".into()),
    }
}
However, there are at least 3 problems with fully implementing this approach:
1. It requires a separate match arm for each variant of arrow::datatypes::DataType, of which there are quite a few.
2. Each match arm involves correlating the DataType with the correct arrow Array type and native Rust type. I know this can be improved using macros, but it's still slightly complex and error-prone.
3. There may not be an equivalent Rust native type for every DataType (e.g. I'm not sure how to handle the match arm for DataType::List), so it may not be possible to handle every DataType.
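To make the macro idea concrete, here is a minimal sketch of how one macro could generate the per-type match arms from a table of DataType/array pairs. Everything in it is an assumption for illustration: DataType, Array, the array structs, and Series are simplified stand-ins defined locally so the snippet compiles on its own, not the real arrow or polars types, and convert_arms! and array_to_series are hypothetical names. Types without a native equivalent (List here) simply fall through to the "unhandled" arm.

```rust
use std::any::Any;
use std::sync::Arc;

// Stand-in for arrow::datatypes::DataType (hypothetical, heavily simplified).
#[derive(Debug, Clone, Copy, PartialEq)]
enum DataType { Int32, Float64, Utf8, List }

// Stand-in for arrow::array::Array: only the two methods the conversion needs.
trait Array {
    fn data_type(&self) -> DataType;
    fn as_any(&self) -> &dyn Any;
}

// Stand-ins for concrete arrow arrays; each wraps a Vec of nullable values.
struct Int32Array(Vec<Option<i32>>);
struct Float64Array(Vec<Option<f64>>);
struct StringArray(Vec<Option<String>>);
struct ListArray; // nested type with no simple native equivalent

// Implements the stand-in Array trait for each (array type => DataType) pair.
macro_rules! impl_array {
    ($($arr:ty => $dt:ident),+ $(,)?) => {
        $(impl Array for $arr {
            fn data_type(&self) -> DataType { DataType::$dt }
            fn as_any(&self) -> &dyn Any { self }
        })+
    };
}
impl_array!(Int32Array => Int32, Float64Array => Float64, StringArray => Utf8, ListArray => List);

// Stand-in for polars::series::Series: a name plus typed nullable values.
#[derive(Debug, PartialEq)]
enum Series {
    Int32(String, Vec<Option<i32>>),
    Float64(String, Vec<Option<f64>>),
    Utf8(String, Vec<Option<String>>),
}

// Expands to one match arm per (DataType => array type) pair, so the pairing
// is written once in a table instead of once per hand-written arm.
macro_rules! convert_arms {
    ($name:expr, $array:expr, { $($dt:ident => $arr:ty),+ $(,)? }) => {
        match $array.data_type() {
            $(DataType::$dt => $array
                .as_any()
                .downcast_ref::<$arr>()
                .map(|a| Series::$dt($name.to_string(), a.0.clone()))
                .ok_or_else(|| format!("Couldn't downcast to {}", stringify!($arr))),)+
            // Any DataType missing from the table ends up here.
            other => Err(format!("Unhandled data type: {:?}", other)),
        }
    };
}

fn array_to_series(name: &str, array: &Arc<dyn Array>) -> Result<Series, String> {
    convert_arms!(name, array, {
        Int32 => Int32Array,
        Float64 => Float64Array,
        Utf8 => StringArray,
    })
}
```

With this shape, supporting another type means adding one line to the table in array_to_series. The pairing can still be written wrongly, but a mismatch shows up as a downcast error at runtime rather than silently producing the wrong values.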
I feel like there may well be a better approach than the one sketched above, so if anyone has any pointers or suggestions I'd be very grateful.
(For context, the arrow arrays come from a SQL query executed by connectorx, and I'm trying to convert this column data into a polars::frame::DataFrame for subsequent manipulation. I know connectorx has a .polars() solution, and that it can also return data as arrow2 arrays (arrow2 being the lightweight arrow implementation used by Polars). Either way, though, I'd be tied to the version of Polars that connectorx uses, which is very old at this point. I think the direction I'm currently going in will let me decouple the Polars version from connectorx and use more recent and powerful versions of Polars.)
Thanks!
Dan