[SPARK-46535][SQL] Fix NPE when describe extended a column without col stats

### What changes were proposed in this pull request?

### Why are the changes needed?

Currently, executing `DESCRIBE TABLE EXTENDED` on a column of a v2 table that has no column statistics throws a `NullPointerException`:

```text
Cannot invoke "org.apache.spark.sql.connector.read.colstats.ColumnStatistics.min()" because the return value of "scala.Option.get()" is null
java.lang.NullPointerException: Cannot invoke "org.apache.spark.sql.connector.read.colstats.ColumnStatistics.min()" because the return value of "scala.Option.get()" is null
	at org.apache.spark.sql.execution.datasources.v2.DescribeColumnExec.run(DescribeColumnExec.scala:63)
	at org.apache.spark.sql.execution.datasources.v2.V2CommandExec.result$lzycompute(V2CommandExec.scala:43)
	at org.apache.spark.sql.execution.datasources.v2.V2CommandExec.result(V2CommandExec.scala:43)
	at org.apache.spark.sql.execution.datasources.v2.V2CommandExec.executeCollect(V2CommandExec.scala:49)
	at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.$anonfun$applyOrElse$1(QueryExecution.scala:118)
	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId0$6(SQLExecution.scala:150)
	at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:241)
	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId0$1(SQLExecution.scala:116)
	at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:918)
```

This PR fixes it by wrapping the column statistics lookup in `Option(...)` instead of `Some(...)`.
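
To illustrate why the one-line change matters: in Scala, `Some(x)` keeps a null `x` as `Some(null)`, whereas `Option(x)` turns it into `None`. The following is a minimal sketch only; `FakeColStats`, `OptionVsSome`, and the plain `java.util.HashMap` are hypothetical stand-ins for illustration and are not Spark's statistics API.

```scala
import java.util.{HashMap => JHashMap}

object OptionVsSome {
  // Hypothetical stand-in for a column statistics object; illustration only.
  final case class FakeColStats(min: Int)

  // A Java map returns null for a missing key, similar to the lookup in the diff.
  val statsByColumn = new JHashMap[String, FakeColStats]()
  statsByColumn.put("col", FakeColStats(1))

  // Old pattern: Some(...) wraps the null, producing Some(null).
  // Calling a method on the wrapped value later fails with an NPE.
  def lookupWithSome(name: String): Option[FakeColStats] =
    Some(statsByColumn.get(name))

  // Fixed pattern: Option(...) converts a null result into None.
  def lookupWithOption(name: String): Option[FakeColStats] =
    Option(statsByColumn.get(name))
}
```

With this sketch, `OptionVsSome.lookupWithSome("key")` is defined but holds null, while `OptionVsSome.lookupWithOption("key")` is `None`, so downstream code that pattern-matches on the `Option` never dereferences a null.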

### Does this PR introduce _any_ user-facing change?

### How was this patch tested?

Added a new test: `describe extended (formatted) a column without col stats`.

### Was this patch authored or co-authored using generative AI tooling?

Closes apache#44524 from Zouxxyy/dev/fix-stats.

Lead-authored-by: zouxxyy <[email protected]>
Co-authored-by: Kent Yao <[email protected]>
Signed-off-by: Max Gekk <[email protected]>
2 people authored and MaxGekk committed Dec 28, 2023
1 parent 5db6824 commit af8228c
Showing 2 changed files with 22 additions and 1 deletion.
@@ -53,7 +53,7 @@ case class DescribeColumnExec(
           read.newScanBuilder(CaseInsensitiveStringMap.empty()).build() match {
             case s: SupportsReportStatistics =>
               val stats = s.estimateStatistics()
-              Some(stats.columnStats().get(FieldReference.column(column.name)))
+              Option(stats.columnStats().get(FieldReference.column(column.name)))
             case _ => None
           }
         case _ => None
@@ -175,4 +175,25 @@ class DescribeTableSuite extends command.DescribeTableSuiteBase
           Row("max_col_len", "NULL")))
     }
   }
+
+  test("SPARK-46535: describe extended (formatted) a column without col stats") {
+    withNamespaceAndTable("ns", "tbl") { tbl =>
+      sql(
+        s"""
+          |CREATE TABLE $tbl
+          |(key INT COMMENT 'column_comment', col STRING)
+          |$defaultUsing""".stripMargin)
+
+      val descriptionDf = sql(s"DESCRIBE TABLE EXTENDED $tbl key")
+      assert(descriptionDf.schema.map(field => (field.name, field.dataType)) === Seq(
+        ("info_name", StringType),
+        ("info_value", StringType)))
+      QueryTest.checkAnswer(
+        descriptionDf,
+        Seq(
+          Row("col_name", "key"),
+          Row("data_type", "int"),
+          Row("comment", "column_comment")))
+    }
+  }
 }
