Panic when inserting array literal with NULL element into a table where the matching column has a non-nullable element #12598

Open
rroelke opened this issue Sep 23, 2024 · 3 comments
Labels
bug Something isn't working waiting-on-upstream PR is waiting on an upstream dependency to be updated

Comments

@rroelke

rroelke commented Sep 23, 2024

Describe the bug

I have a custom data sink whose schema contains an array whose element type is non-nullable. If I attempt to insert an array literal with a null element into this data sink via SQL, the process exits with a panic. This behavior can be reproduced using MemTable.

To Reproduce

The following example code can be added to datafusion-examples/src and run via cargo run --example array-non-nullable-element:

use std::sync::Arc;

use arrow::datatypes::{DataType, Field, Schema};
use arrow::record_batch::RecordBatch;
use datafusion::common::Result;
use datafusion::prelude::*;

#[tokio::main]
async fn main() -> Result<()> {
    let schema = Arc::new(Schema::new(vec![
        Field::new("id", DataType::Int64, false),
        Field::new("arr", DataType::new_list(DataType::Int64, false), true),
    ]));

    let ctx = SessionContext::new();

    ctx.register_batch("t", RecordBatch::new_empty(Arc::clone(&schema)))?;
    let _ = ctx.table("t").await?;

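    // Succeeds: the array literal has no NULL elements.
    let _ = ctx
        .sql("INSERT INTO t VALUES (1, [1, 2, 3]);")
        .await?
        .collect()
        .await?;

    // Panics: the NULL element conflicts with the non-nullable element type of `arr`.
    let _ = ctx
        .sql("INSERT INTO t VALUES (2, [1, NULL, 3]);")
        .await?
        .collect()
        .await?;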

    Ok(())
}

Running with RUST_BACKTRACE=1 shows that the panic comes from an unwrap() inside ListArray::new, which is reached from the arrow-cast list cast:

   3: core::result::Result<T,E>::unwrap
             at /rustc/eeb90cda1969383f56a2637cbd3037bdf598841c/library/core/src/result.rs:1102:23
   4: arrow_array::array::list_array::GenericListArray<OffsetSize>::new
             at $HOME/.cargo/registry/src/index.crates.io-6f17d22bba15001f/arrow-array-53.0.0/src/array/list_array.rs:228:9
   5: arrow_cast::cast::list::cast_list_values
             at $HOME/.cargo/registry/src/index.crates.io-6f17d22bba15001f/arrow-cast-53.0.0/src/cast/list.rs:144:17
   6: arrow_cast::cast::cast_with_options
             at $HOME/.cargo/registry/src/index.crates.io-6f17d22bba15001f/arrow-cast-53.0.0/src/cast/mod.rs:750:32
   7: datafusion_expr_common::columnar_value::ColumnarValue::cast_to
             at $DATAFUSION/datafusion/expr-common/src/columnar_value.rs:195:17

Expected behavior

I would expect this query to bubble a DataFusionError up to the call site instead of panicking the process.

Additional context

No response

@rroelke rroelke added the bug Something isn't working label Sep 23, 2024
@jonahgao
Member

It looks like an arrow-rs problem: GenericListArray::try_new should be used instead of GenericListArray::new 🤔
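
For readers unfamiliar with the new/try_new distinction, the pattern can be sketched with a std-only toy model. This is not arrow-rs code: ToyList, ToyListError, and cast_checked are hypothetical names invented for illustration, and the only assumption carried over from the backtrace above is that the panicking constructor unwraps the result of a fallible one.

```rust
use std::fmt;

/// Hypothetical stand-in for a list column of Int64 elements (not an arrow-rs type).
#[derive(Debug, PartialEq)]
pub struct ToyList {
    values: Vec<Option<i64>>,
    element_nullable: bool,
}

/// Hypothetical error type, playing the role of the library's error.
#[derive(Debug, PartialEq)]
pub struct ToyListError(String);

impl fmt::Display for ToyListError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.0)
    }
}

impl std::error::Error for ToyListError {}

impl ToyList {
    /// Fallible constructor: rejects NULL elements when the element
    /// type is declared non-nullable, and reports that as an error.
    pub fn try_new(
        values: Vec<Option<i64>>,
        element_nullable: bool,
    ) -> Result<Self, ToyListError> {
        if !element_nullable && values.iter().any(Option::is_none) {
            return Err(ToyListError(
                "non-nullable element field cannot contain nulls".to_string(),
            ));
        }
        Ok(Self { values, element_nullable })
    }

    /// Panicking constructor: same validation, but a failure aborts
    /// the caller instead of being returned to it.
    pub fn new(values: Vec<Option<i64>>, element_nullable: bool) -> Self {
        Self::try_new(values, element_nullable).unwrap()
    }

    /// Number of elements, including NULLs.
    pub fn len(&self) -> usize {
        self.values.len()
    }

    /// Whether the element type admits NULLs.
    pub fn element_nullable(&self) -> bool {
        self.element_nullable
    }
}

/// A cast-like helper targeting a non-nullable element type. Because it
/// goes through `try_new`, invalid input surfaces as an `Err` that the
/// caller can propagate with `?` rather than as a panic.
pub fn cast_checked(values: Vec<Option<i64>>) -> Result<ToyList, ToyListError> {
    ToyList::try_new(values, false)
}
```

In this sketch, cast_checked(vec![Some(1), None, Some(3)]) returns an Err, whereas ToyList::new with the same arguments panics. That difference is what would let the failing INSERT be reported as an error at the call site.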

@rroelke
Author

rroelke commented Sep 25, 2024

Is the right approach to close this issue after opening a new one against arrow-rs?

@jonahgao
Member

> Is the right approach to close this issue after opening a new one against arrow-rs?

I think we can keep it open and wait for an upstream fix.

@jonahgao jonahgao added the waiting-on-upstream PR is waiting on an upstream dependency to be updated label Sep 26, 2024

2 participants