
[BUG] ParquetCachedBatchSerializer does not grab the GPU semaphore and does not have retry blocks #11989

Open
revans2 opened this issue Jan 21, 2025 · 2 comments · May be fixed by #11991
Assignees: revans2
Labels: bug (Something isn't working)

Comments

revans2 (Collaborator) commented Jan 21, 2025

Describe the bug
The ParquetCachedBatchSerializer has no code in it to grab the GPU semaphore before it reads in data using the GPU, and GpuInMemoryTableScanExec does not do this either. This means we can end up with more tasks allocating GPU memory at once than the semaphore is supposed to allow, increasing the memory load on the GPU.

To make things worse, if we do run out of memory, the serializer code has no retry block in it, so we are relying entirely on spilling to solve the problem.

Steps/Code to reproduce bug

// produce 32 GiB of uncompressed data (RLE should make the cached version very small)
val df = spark.range(4294967296L).cache()
// Generate the cached data and process it...
df.filter("id > 100").selectExpr("COUNT(DISTINCT id)").show()
// First run passes, so run it again to read the cached data...
df.filter("id > 100").selectExpr("COUNT(DISTINCT id)").show()
// The second run fails with an OOM:
25/01/21 19:09:55 ERROR Executor: Exception in task 6.0 in stage 3.0 (TID 23)
java.lang.OutOfMemoryError: Could not allocate native memory: std::bad_alloc: out_of_memory: RMM failure at:/home/roberte/src/spark-rapids-jni/target/libcudf/cmake-build/_deps/rmm-src/include/rmm/mr/device/limiting_resource_adaptor.hpp:152: Exceeded memory limit
	at ai.rapids.cudf.Table.readParquet(Native Method)
	at ai.rapids.cudf.Table.readParquet(Table.java:1433)
	at ai.rapids.cudf.Table.readParquet(Table.java:1400)
	at ai.rapids.cudf.Table.readParquet(Table.java:1413)
	at com.nvidia.spark.rapids.ParquetCachedBatchSerializer.$anonfun$convertCachedBatchToColumnarInternal$1(ParquetCachedBatchSerializer.scala:500)
	at scala.collection.Iterator$$anon$10.next(Iterator.scala:461)
	at scala.collection.Iterator$$anon$10.next(Iterator.scala:461)
	at scala.collection.Iterator$$anon$11.nextCur(Iterator.scala:486)
	at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:492)
	at com.nvidia.spark.rapids.CollectTimeIterator.$anonfun$hasNext$1(GpuMetrics.scala:282)
	at com.nvidia.spark.rapids.CollectTimeIterator.$anonfun$hasNext$1$adapted(GpuMetrics.scala:281)
...

Expected behavior
The query should be able to run successfully even when GPU memory is low.
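
For illustration, here is a minimal sketch of the shape a fix could take, assuming the existing GpuSemaphore.acquireIfNecessary and RmmRapidsRetryIterator.withRetryNoSplit helpers from spark-rapids; the wrapper method readCachedBatchOnGpu and its parameters are hypothetical, and this is not the actual patch:

// Hypothetical sketch only, not the actual fix. Assumes the existing
// GpuSemaphore and RmmRapidsRetryIterator helpers from spark-rapids.
import ai.rapids.cudf.{HostMemoryBuffer, ParquetOptions, Table}
import com.nvidia.spark.rapids.GpuSemaphore
import com.nvidia.spark.rapids.RmmRapidsRetryIterator.withRetryNoSplit
import org.apache.spark.TaskContext

def readCachedBatchOnGpu(opts: ParquetOptions, buf: HostMemoryBuffer,
    offset: Long, len: Long): Table = {
  // Block until this task holds the GPU semaphore, so concurrent tasks
  // cannot oversubscribe GPU memory while decoding the cached batch.
  GpuSemaphore.acquireIfNecessary(TaskContext.get())
  // Wrap the allocation-heavy read in a retry block so an OOM can be
  // retried (after the framework spills other buffers) instead of
  // failing the task outright.
  withRetryNoSplit {
    Table.readParquet(opts, buf, offset, len)
  }
}

Making the retry actually effective presumably also requires the inputs it works over to be spillable, which may be why the retry half is the more complicated part (see the follow-up comments below).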

revans2 added the "? - Needs Triage (Need team to review and classify)" and "bug (Something isn't working)" labels Jan 21, 2025
revans2 (Collaborator, Author) commented Jan 21, 2025

Note that this reproduction only triggers the retry case on a GPU with less than the 32 GiB needed to hold the uncompressed data.

revans2 (Collaborator, Author) commented Jan 21, 2025

I think I have a fix for the semaphore problem, but I need some more time to evaluate it. I will probably then split this up and file a separate issue for the retry code, as that looks to be quite a bit more complicated.

revans2 self-assigned this Jan 21, 2025
mattahrens removed the "? - Needs Triage (Need team to review and classify)" label Jan 21, 2025